In my thesis I am training a convolutional neural network for image classification. The task is to classify images of zebrafish embryos into two classes: alive and dead. There are three photos of every embryo, taken at 1, 2 and 3 days of age. Training a neural network to solve this task naturally requires a lot of samples, but we only have a few hundred images: 395 of alive embryos and 61 of dead ones. This is a very small data set, and since we cannot kill animals to create more samples, I instead generate new images similar to the originals to enlarge it.
I examined two main approaches to image generation. The first is pixel-based texture synthesis, which grows a larger image from a small sample image (or another image) by repeatedly choosing pixels whose neighborhoods match the sample. The second is Principal Component Analysis (PCA), which represents our images as linear combinations of principal components. The idea is to generate new coefficients for the linear combination, thereby obtaining new images similar to the originals.
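A minimal sketch of the PCA-based generation idea, using NumPy with random data standing in for the real embryo photos (the component count, the Gaussian sampling of coefficients, and all names are illustrative assumptions, not the thesis implementation):

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for the real photos: n flattened grayscale images
# (shapes here are illustrative only).
n_images, n_pixels = 50, 64 * 64
X = rng.random((n_images, n_pixels))

# PCA via SVD on the mean-centred data.
mean = X.mean(axis=0)
U, S, Vt = np.linalg.svd(X - mean, full_matrices=False)
k = 10                       # number of principal components kept (assumed)
components = Vt[:k]          # shape (k, n_pixels)

# Coefficients of each training image in the PCA basis.
coeffs = (X - mean) @ components.T    # shape (n_images, k)

# One simple way to obtain new coefficients: sample each component's
# coefficient from a Gaussian fitted to the training coefficients.
new_coeffs = rng.normal(coeffs.mean(axis=0), coeffs.std(axis=0), size=(5, k))

# New images = mean image + linear combination of principal components.
new_images = mean + new_coeffs @ components
print(new_images.shape)  # (5, 4096)
```

Each row of `new_images` can then be reshaped back to 64x64 and added to the training set; the choice of how to sample `new_coeffs` is exactly the part the thesis experiments with.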
I chose the second approach and implemented it. I tried several methods for generating new coefficients; although none of them was perfect, the generated images were still useful for training the neural network to solve the problem.