Random Research Walk
Notes from J. Marcus Hughes on various research topics.
Saturday, November 16, 2019
Tuesday, October 8, 2019
Monday, October 7, 2019
Axis aligned artifacts
Left: Original data distribution. Right: Learned co-displacement, darker is lower.
Notice the echoes around (10, -10) and (-10, 10).
Choosing axis-aligned cuts in RRCF creates minor artifacts, similar to what was noted with IsoForest.
Friday, October 4, 2019
State of Astro-informatics
I had a glance through "Realizing the potential of astrostatistics and astroinformatics" by Eadie et al. (2019). While I do not feel qualified or informed enough to comment on the suggestions, I can summarize them quickly. There are three problems:
- Education: Most astronomers are not trained in code development, which results in code that may be good but is fragile. Similarly, most computer scientists don't have the astronomy background or connections.
- Funding: Grants for methodology improvement are scarce. I wonder if these things can be funded from the computer science side through collaborations.
- Quality: As it stands, astro-informatics lacks support for state-of-the-art methodology.
I was much more interested in the final section about potential themes in research:
- Nonlinear dimensionality reduction.
- Sparsity.
- Deep learning.
I find the last theme incredibly broad and am unclear exactly how they mean it. It seems they're most interested in hierarchical representations of data. I would also claim that anomaly detection and clustering are important for reducing the volume of data.
Tuesday, September 17, 2019
Training an autoencoder with mostly noise
I am working on a project where we wish to use anomaly detection to find which image patches have structure and which don't. As an aside, I ran an experiment on MNIST. You have 500 images of fives. You have 5000 images that are pure noise. You train a deep convolutional autoencoder on all of them. What you end up with is the following reconstruction:
The top row shows the inputs and the bottom row shows the reconstructions. You find images of fives even when nothing is present.
Monday, September 16, 2019
Flood
I stumbled upon a game called Flood. It's a simple enough game. You start with a grid of random colors. Then, you repeatedly change the color of the contiguous region that grows from the upper left corner until you have flooded the entire grid with one color. I wrote some code and have been tinkering with it.
The most naive solver is a breadth-first search. So, I did that. Below you see the solution length for grids of varying size with only three colors.
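As an illustration of the idea, here is a minimal sketch of a breadth-first solver. It is not the code I actually ran. The names `flood`, `is_solved`, and `solve_bfs` are made up for this example, and it assumes colors are the integers 0 to `n_colors - 1` and that a move means recoloring the region connected to the upper left corner.

```python
from collections import deque


def flood(grid, color):
    """Return a new grid with the upper-left region recolored to `color`."""
    rows, cols = len(grid), len(grid[0])
    old = grid[0][0]
    if old == color:
        return grid
    cells = [list(row) for row in grid]
    cells[0][0] = color
    stack = [(0, 0)]
    while stack:
        r, c = stack.pop()
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and cells[nr][nc] == old:
                cells[nr][nc] = color
                stack.append((nr, nc))
    return tuple(tuple(row) for row in cells)


def is_solved(grid):
    """True when every cell has the same color."""
    first = grid[0][0]
    return all(cell == first for row in grid for cell in row)


def solve_bfs(grid, n_colors):
    """Return a shortest list of color moves that floods the grid."""
    start = tuple(tuple(row) for row in grid)
    if is_solved(start):
        return []
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, moves = queue.popleft()
        for color in range(n_colors):
            if color == state[0][0]:
                continue  # recoloring to the current color changes nothing
            nxt = flood(state, color)
            if nxt in seen:
                continue
            path = moves + [color]
            if is_solved(nxt):
                return path
            seen.add(nxt)
            queue.append((nxt, path))
    return None
```

Because the search visits states in order of move count, the first solved state it reaches gives a shortest solution. The set of visited grids is what makes it expensive as the grid grows.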
This search breaks down at large grid sizes because it's so slow. Some kind of heuristic approach would perform better, but can you prove it's within some epsilon of optimal? What is the expected optimal solution length? I think that should be provable theoretically since you just have a uniformly random grid and can constrain the growth rate. I will likely return and do that.
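One heuristic that could be tried is a greedy one: at each step pick the color that makes the flooded region largest. The sketch below is only an example of that kind of approach, with hypothetical names (`region`, `recolor`, `solve_greedy`), and it again assumes colors are the integers 0 to `n_colors - 1`. It gives an upper bound on the optimal solution length and says nothing about being within some epsilon of optimal.

```python
def region(grid):
    """Cells connected to the upper-left corner that share its color."""
    rows, cols = len(grid), len(grid[0])
    color = grid[0][0]
    seen = {(0, 0)}
    stack = [(0, 0)]
    while stack:
        r, c = stack.pop()
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if (0 <= nr < rows and 0 <= nc < cols
                    and (nr, nc) not in seen and grid[nr][nc] == color):
                seen.add((nr, nc))
                stack.append((nr, nc))
    return seen


def recolor(grid, color):
    """Return a copy of the grid with the upper-left region set to `color`."""
    cells = region(grid)
    return [[color if (r, c) in cells else value
             for c, value in enumerate(row)]
            for r, row in enumerate(grid)]


def solve_greedy(grid, n_colors):
    """Greedy heuristic: always take the move that floods the most cells."""
    grid = [list(row) for row in grid]
    total = len(grid) * len(grid[0])
    moves = []
    while len(region(grid)) < total:
        candidates = [c for c in range(n_colors) if c != grid[0][0]]
        best = max(candidates, key=lambda c: len(region(recolor(grid, c))))
        grid = recolor(grid, best)
        moves.append(best)
    return moves
```

Each greedy move strictly grows the flooded region, since some neighboring color always exists while the grid is unsolved, so the loop terminates. Comparing its move count to the breadth-first answer on small grids would show how loose the bound is.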
Monday, September 9, 2019
Goal of Anomaly Detection in Non-stationary Data
The code for this is: