Saturday, November 16, 2019
Tuesday, October 8, 2019
Monday, October 7, 2019
[Figure] Left: original data distribution. Right: learned co-displacement (darker is lower). Notice the echoes around (10, -10) and (-10, 10).
Choosing axis-aligned cuts in RRCF creates minor artifacts, similar to what was noted with IsoForest.
Friday, October 4, 2019
I glanced through "Realizing the potential of astrostatistics and astroinformatics" by Eadie et al. (2019). While I do not feel qualified or informed enough to comment on the suggestions, I can summarize them quickly. The paper identifies three problems:
- Education: Most astronomers are not trained in software development, which results in code that may work well but is fragile. Similarly, most computer scientists lack the astronomy background or connections.
- Funding: Grants for methodology improvement are scarce. I wonder whether such work could be funded from the computer science side through collaborations.
- Quality: As it stands, astroinformatics lacks support for state-of-the-art methodology.
I was much more interested in the final section about potential themes in research:
- Nonlinear dimensionality reduction.
- Deep learning.
I find the last theme incredibly broad and am unclear on exactly what they mean by it. It seems they're most interested in hierarchical representations of data. I would also claim that anomaly detection/clustering is important for reducing the volume of data.
Tuesday, September 17, 2019
I am working on a project where we wish to use anomaly detection to find which image patches have structure and which don't. As an aside, I ran an experiment on MNIST. You have 500 images of fives. You have 5000 images that are pure noise. You train a deep convolutional autoencoder. What you end up with is the following reconstruction:
Monday, September 16, 2019
I stumbled upon a game called Flood. It's a simple enough game. You start with a grid of random colors. Then, you repeatedly change the color of the contiguous region that starts at the upper-left corner until you have flooded the entire grid with one color. I wrote some code and have been tinkering with it.
The most naive solver is a breadth-first search. So, I did that. Below you see the solution length for grids of varying size with only three colors.
This search breaks down at large grid sizes because it's so slow. Some kind of heuristic approach would perform better, but can you prove it's within some epsilon of optimal? What is the expected optimal solution length? I think that should be provable theoretically since you just have a uniformly random grid and can constrain the growth rate. I will likely return and do that.
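For concreteness, here is a minimal sketch of what such a breadth-first solver could look like. This is not the code I actually ran; the function names (`flood`, `is_solved`, `solve_bfs`) and the encoding of colors as integers `0..n_colors-1` are assumptions made for illustration. Each search state is the whole grid, so the number of states explodes quickly as the grid grows.

```python
from collections import deque


def flood(grid, color):
    """Recolor the region connected to the top-left cell; return a new grid."""
    rows, cols = len(grid), len(grid[0])
    old = grid[0][0]
    if old == color:
        return grid
    new = [list(row) for row in grid]
    new[0][0] = color
    stack = [(0, 0)]
    while stack:
        r, c = stack.pop()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            # Only cells still holding the old color belong to the region.
            if 0 <= nr < rows and 0 <= nc < cols and new[nr][nc] == old:
                new[nr][nc] = color
                stack.append((nr, nc))
    return tuple(tuple(row) for row in new)


def is_solved(grid):
    """True when every cell has the same color as the top-left cell."""
    target = grid[0][0]
    return all(cell == target for row in grid for cell in row)


def solve_bfs(grid, n_colors):
    """Return a shortest list of color choices that floods the whole grid."""
    start = tuple(tuple(row) for row in grid)
    if is_solved(start):
        return []
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, moves = queue.popleft()
        for color in range(n_colors):
            if color == state[0][0]:
                continue  # choosing the current color changes nothing
            nxt = flood(state, color)
            if nxt in seen:
                continue
            if is_solved(nxt):
                return moves + [color]
            seen.add(nxt)
            queue.append((nxt, moves + [color]))
    return None
```

Calling `solve_bfs(((0, 1), (1, 2)), 3)` on a 2x2 grid returns a two-move solution. Because breadth-first search explores states in order of move count, the first solved state it reaches has minimal length.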
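One candidate heuristic, sketched below, is a greedy rule: at each step pick the color that makes the flooded region largest. This is only an illustration of the kind of heuristic I have in mind, with hypothetical names (`region`, `apply_move`, `solve_greedy`) and colors assumed to be integers `0..n_colors-1`. It always terminates, because some neighboring color always grows the region, but it carries no optimality guarantee.

```python
def region(grid):
    """Set of cells connected to the top-left cell through equal colors."""
    rows, cols = len(grid), len(grid[0])
    color = grid[0][0]
    seen = {(0, 0)}
    stack = [(0, 0)]
    while stack:
        r, c = stack.pop()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if (0 <= nr < rows and 0 <= nc < cols
                    and (nr, nc) not in seen and grid[nr][nc] == color):
                seen.add((nr, nc))
                stack.append((nr, nc))
    return seen


def apply_move(grid, color):
    """Return a new grid with the top-left region recolored to `color`."""
    cells = region(grid)
    return [[color if (r, c) in cells else grid[r][c]
             for c in range(len(grid[0]))]
            for r in range(len(grid))]


def solve_greedy(grid, n_colors):
    """Greedy heuristic: always choose the color that grows the region most."""
    grid = [list(row) for row in grid]
    total = len(grid) * len(grid[0])
    moves = []
    while len(region(grid)) < total:
        candidates = [c for c in range(n_colors) if c != grid[0][0]]
        best = max(candidates,
                   key=lambda c: len(region(apply_move(grid, c))))
        grid = apply_move(grid, best)
        moves.append(best)
    return moves
```

Each greedy step costs a few flood fills, so the whole run is polynomial in the grid size, in contrast to the exhaustive search. How far its solutions are from optimal is exactly the open question above.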
Monday, September 9, 2019
The code for this is: