Scalable data mining techniques for mapping neuron connections

Supervisor:
Prekopcsák Zoltán
Department of Telecommunications and Media Informatics

The nervous system has a very complex structure, so observing it raises difficult problems. Imaging techniques have long been able to detect the activity of the nervous system, but only recently have both the spatial and the temporal resolution of the recorded signal become sufficient to observe individual action potentials in individual cells.

Both the complexity and the volume of the resulting data are huge, which makes data mining techniques a reasonable choice for further processing.

My project was to analyze data produced by calcium imaging of neurons. From the time series of their activity, I could infer the connections between the neurons. Since the real connections were known for part of the initial data, it was possible to use classification algorithms.

The first challenge was to produce appropriate features for the classification. These could be derived from pairs of neural activity time series, and they had to indicate the probability of a connection between the neuron pair under examination.
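To illustrate what a pairwise feature might look like, the sketch below computes a lagged Pearson correlation between two activity series. This is only an assumed example of such a feature, not necessarily one used in the thesis; the function names `pearson`, `lagged_correlation` and `pair_features`, and the choice of lags, are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def lagged_correlation(source, target, lag=1):
    """Correlate source[t] with target[t + lag] (assumed feature, for illustration)."""
    if lag <= 0 or lag >= len(source):
        raise ValueError("lag must be between 1 and len(source) - 1")
    return pearson(source[:-lag], target[lag:])


def pair_features(source, target, max_lag=3):
    """Feature vector for the ordered pair (source, target): one value per lag."""
    return [lagged_correlation(source, target, lag) for lag in range(1, max_lag + 1)]


# Toy data: neuron b repeats the activity of neuron a one time step later.
a = [0, 1, 0, 0, 1, 0, 0, 0, 1, 0]
b = [0, 0, 1, 0, 0, 1, 0, 0, 0, 1]
forward = lagged_correlation(a, b, lag=1)   # a leading b
backward = lagged_correlation(b, a, lag=1)  # b leading a
```

Because the feature is computed for an ordered pair, the value for (a, b) differs from the value for (b, a), which is what lets a classifier distinguish the direction of a candidate connection.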

After comparing many classification algorithms, Random Forest seemed to be the best choice. It is an ensemble learning method that constructs a multitude of decision trees at training time. I used a model consisting of 3000 decision trees to improve on the results of Antonio Sutera's team. The performance of the model was measured with the AUC (area under the ROC curve); my approach achieved an AUC of 0.94238.
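As a reminder of what the AUC measures, the sketch below computes it directly from its pairwise definition: the fraction of (connected, unconnected) pairs in which the connected pair receives the higher score, with ties counted as one half. The function name `roc_auc` and the toy labels and scores are assumptions for illustration; they are not data from the thesis.

```python
def roc_auc(labels, scores):
    """AUC via pairwise comparison of positive and negative scores.

    labels: 1 for a true connection, 0 for no connection.
    scores: classifier output, higher meaning "more likely connected".
    Quadratic in the number of examples, so suitable only for small inputs.
    """
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy example: three of the four (positive, negative) pairs are ranked correctly.
toy_auc = roc_auc([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1])  # 0.75
```

An AUC of 1.0 means every true connection is scored above every non-connection, while 0.5 corresponds to random ranking.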
