How do you denoise images with an autoencoder if you don't have a clean version to train with? One option is to add more noise to your images! In this experiment, I trained an autoencoder with noisy MNIST data. I began with the noiseless MNIST images on the bottom row. To simulate observational data, I added Gaussian noise to them. In reality, we may never have access to these noiseless images. To train an autoencoder for denoising, we normally need a noisy input set paired with a noise-free output set so the autoencoder can learn the denoising procedure. However, an autoencoder could potentially also learn to denoise if we gave it extra-noisy images as input and less noisy images as output. To simulate this, I added more Gaussian noise to the observations to arrive at the top row. The top row is then the input and the second row is the output for training. When we want to denoise observations, we use this trained network with the observations as input and take the denoised row as our output.
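Below is a minimal sketch of this noisier-to-noisy training setup in tf.keras, assuming MNIST scaled to [0, 1]. The noise level sigma, the small convolutional architecture, and the training settings are illustrative choices, not the exact configuration used in the experiment.

```python
import numpy as np
import tensorflow as tf

# "Bottom row": the (normally unavailable) noiseless images.
(x_train, _), _ = tf.keras.datasets.mnist.load_data()
x_clean = x_train.astype("float32")[..., None] / 255.0

# "Second row": simulated observations; "top row": observations with extra noise added.
sigma = 0.2  # assumed noise level
rng = np.random.default_rng(0)
x_noisy = x_clean + rng.normal(0.0, sigma, x_clean.shape).astype("float32")
x_noisier = x_noisy + rng.normal(0.0, sigma, x_clean.shape).astype("float32")

# Small convolutional autoencoder (illustrative architecture).
inp = tf.keras.Input(shape=(28, 28, 1))
h = tf.keras.layers.Conv2D(32, 3, activation="relu", padding="same")(inp)
h = tf.keras.layers.MaxPooling2D(2)(h)
h = tf.keras.layers.Conv2D(32, 3, activation="relu", padding="same")(h)
h = tf.keras.layers.UpSampling2D(2)(h)
out = tf.keras.layers.Conv2D(1, 3, padding="same")(h)
model = tf.keras.Model(inp, out)
model.compile(optimizer="adam", loss="mse")

# Train: extra-noisy images in, observed (less noisy) images out.
model.fit(x_noisier, x_noisy, epochs=5, batch_size=128)

# At inference time, feed the actual observations to get denoised estimates.
denoised = model.predict(x_noisy[:16])
```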
I am not sure how sensitive this approach is to the accuracy of the noise model or to the amount of noise added. In the solar extreme ultraviolet setting, we suffer more from shot/Poisson noise than Gaussian noise, and I am unsure how well this approach works in that setting.
An arguably more elegant approach to this problem is the "Blind Denoising Autoencoder" by Majumdar (2018). It does not require this noise addition or noiseless images.
Wednesday, March 27, 2019
Direction specific errors and granularity
In solar image segmentation, we identify many categories of structures on the Sun: coronal hole, filament, flare, active region, quiet sun, prominence. In our use case, some mistakes are more egregious than others. For example, mistaking a filament for a coronal hole is not too bad, nowhere near as bad as calling it a flare. Assume we have a gold standard set for evaluation. (In reality, even this gold standard set may have errors, but we can ignore that for now.) It has a region labeled as filament. Ideally, we want our trained classifier to also label that region a filament. However, if it calls it quiet sun, we would be okay. Calling it coronal hole is also acceptable. Any other category is wrong, the most egregious being to call it outer space or flare. Now suppose another portion of the Sun is labeled quiet sun in the gold standard. It is not okay for the classifier to then call it filament. In this way, it is acceptable to mistakenly label a filament as quiet sun but unacceptable to label quiet sun as anything else. The severity of an error depends on the direction of the mistake.
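One way to make this direction dependence concrete is to score mistakes against an asymmetric cost matrix rather than a plain error count. The sketch below is a hypothetical illustration: the class list and the individual cost values are made up, and only the asymmetry mirrors the filament/quiet-sun example above.

```python
import numpy as np

classes = ["quiet_sun", "coronal_hole", "filament", "flare",
           "active_region", "prominence", "outer_space"]
idx = {c: i for i, c in enumerate(classes)}

# cost[true, predicted]: zero on the diagonal, small for tolerable confusions,
# large for egregious ones. Note the asymmetry: filament -> quiet sun is cheap,
# but quiet sun -> filament carries full cost.
cost = np.ones((len(classes), len(classes)))
np.fill_diagonal(cost, 0.0)
cost[idx["filament"], idx["quiet_sun"]] = 0.1     # acceptable mistake
cost[idx["filament"], idx["coronal_hole"]] = 0.2  # also acceptable
cost[idx["filament"], idx["flare"]] = 5.0         # egregious
cost[idx["filament"], idx["outer_space"]] = 5.0   # egregious
cost[idx["quiet_sun"], idx["filament"]] = 1.0     # not discounted in this direction

def weighted_error(y_true, y_pred):
    """Mean misclassification cost over pixels, given integer class labels."""
    return cost[y_true, y_pred].mean()

# Filament called quiet sun is cheap; quiet sun called filament is not.
print(weighted_error(np.array([idx["filament"]]), np.array([idx["quiet_sun"]])))   # 0.1
print(weighted_error(np.array([idx["quiet_sun"]]), np.array([idx["filament"]])))   # 1.0
```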
Similarly, our current evaluation measures errors on a pixel-by-pixel basis. In reality, we do not care about that granularity: we want coherent labeling, and small boundary disagreements are okay. We need a more robust evaluation metric.
TSS versus f1-measure
The above movie shows how accuracy, TSS, and f1-measure change under the assumption that a classifier produces no false positives until it has classified all of a class correctly. The vertical grey line marks the actual percentage of examples belonging to the class, while the horizontal axis shows the percentage of examples the model identifies as that class. For example, if the true class percentage is 0.1, as shown below, an aggressive classifier, one that prefers creating false positives, is punished much less by accuracy and TSS than by the f1-measure. If the model classifies 20% of the examples as positive, accuracy and TSS are around 0.9 while the f1-measure drops to around 0.65. Selecting your metric is very important depending on whether you prefer false positives or false negatives.
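Those numbers can be reproduced with a short calculation. The sketch below assumes, as in the movie, that the classifier makes no false positives until every member of the class has been found; p is the true class fraction and q is the fraction of examples the model labels as positive, with p = 0.1 and q = 0.2 matching the case above.

```python
def metrics(p, q):
    """Accuracy, TSS, and f1 when true positives are exhausted before false positives appear."""
    tp = min(p, q)
    fp = max(q - p, 0.0)
    fn = max(p - q, 0.0)
    tn = 1.0 - tp - fp - fn
    accuracy = tp + tn
    tss = tp / (tp + fn) - fp / (fp + tn)  # true skill statistic
    f1 = 2 * tp / (2 * tp + fp + fn)       # f1-measure
    return accuracy, tss, f1

print(metrics(0.1, 0.2))  # ~(0.90, 0.89, 0.67): accuracy and TSS stay high, f1 drops
```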
Thursday, March 21, 2019
Motivation for Denoising Solar images
Wednesday, March 13, 2019
Boulder Solar Day
I attended the second half of Boulder Solar Day yesterday and gave a talk on some of what I've worked on and plan to work on. The slides are available here.
Sunday, March 10, 2019
Efficient Dictionary Learning via Very Sparse Random Projections
For a course with Claire Monteleoni, I presented on Efficient Dictionary Learning via Very Sparse Random Projections by Farhad Pourkamali-Anaraki, Stephen Becker, and Shannon M. Hughes. You can access the slides here or some crude notes here.