Cross-correlation analysis of depressed speech

Dr. Vicsi Klára
Department of Telecommunications and Media Informatics

Depression is one of the most frequent mental disorders of our century. People's lifestyle, continuous pressure from their environment, and stress factors push nearly one fifth of the population into a state of depression at some point during their lifetime. Diagnosis requires doctors, who do not have the capacity to treat everyone in need. Moreover, patients often stay away from doctors until the people around them convince them to seek treatment.

Psychologists say that they can hear the signs of mental disorders in their patients' speech. The changes in the produced speech are measurable, and in this paper I attempt to recognize depression from information extracted from the patients' speech.

In this paper I use the correlation structure of acoustic features extracted from the patients' speech to predict the severity of their depression. I used two speech databases in two different languages: the AVEC 2013 database and a Hungarian database. I derived new features from the acoustic features of the samples in the databases and selected the best ones using the FFS (Fast Forward Selection) algorithm.
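The feature construction and selection steps described above can be illustrated with a minimal sketch. It assumes that each recording is represented by frame-level acoustic feature tracks and that the correlation features are Pearson correlations between pairs of tracks, optionally at a frame delay; the exact features, delays, and selection criterion used in the thesis are not stated here. The selection routine is a plain greedy forward selection, which may differ from the FFS variant used. All function names and the toy data are hypothetical.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def lagged_correlation(x, y, lag):
    """Correlate x[t] with y[t + lag] over the overlapping frames (lag >= 0)."""
    if lag == 0:
        return pearson(x, y)
    return pearson(x[:-lag], y[lag:])


def correlation_features(tracks, lags=(0, 1)):
    """Map {feature name: frame values} to {(name_a, name_b, lag): correlation}."""
    feats = {}
    for a, b in combinations(sorted(tracks), 2):
        for lag in lags:
            feats[(a, b, lag)] = lagged_correlation(tracks[a], tracks[b], lag)
    return feats


def forward_selection(candidates, score, max_features):
    """Greedy forward selection: keep adding the candidate that most improves
    score(selected), and stop when no candidate improves it."""
    selected, best = [], float("-inf")
    remaining = list(candidates)
    while remaining and len(selected) < max_features:
        top = max(remaining, key=lambda c: score(selected + [c]))
        top_score = score(selected + [top])
        if top_score <= best:
            break
        selected.append(top)
        remaining.remove(top)
        best = top_score
    return selected


# Toy recording with three hypothetical frame-level tracks (assumed data).
tracks = {
    "f0": [1, 2, 3, 4, 5],
    "energy": [2, 4, 6, 8, 10],
    "jitter": [5, 3, 4, 1, 2],
}
feats = correlation_features(tracks, lags=(0,))
```

In practice the `score` callback would be something like the cross-validated performance of a regressor trained on the candidate subset, so that selection is driven by prediction quality rather than by the features alone.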

I tested the selected correlation features with SVR (Support Vector Regression) and evaluated the predictions by their correlation with the BDI (Beck Depression Inventory) scores, the MAE (Mean Absolute Error), and the RMSE (Root Mean Square Error). The results are promising: I obtained better results on every indicator than the winner of the AVEC 2013 challenge. My best results are a correlation of 0.8319, with an RMSE of 7.5241 and an MAE of 6.0997.
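For reference, the three indicators can be computed as in the following minimal sketch. It assumes the correlation is the Pearson correlation between predicted and reference BDI scores; the score lists are made-up illustrative values, not data from the thesis, and the predictions would in practice come from a regressor such as SVR.

```python
import math


def mae(y_true, y_pred):
    """Mean Absolute Error: average magnitude of the prediction errors."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root Mean Square Error: penalizes large errors more strongly than MAE."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def pearson(y_true, y_pred):
    """Pearson correlation between reference and predicted scores."""
    n = len(y_true)
    mt, mp = sum(y_true) / n, sum(y_pred) / n
    cov = sum((t - mt) * (p - mp) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mt) ** 2 for t in y_true)
    var_p = sum((p - mp) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)


# Hypothetical reference BDI scores and model predictions (illustrative only).
bdi_true = [10, 20, 30, 40]
bdi_pred = [12, 18, 33, 37]

print("MAE :", mae(bdi_true, bdi_pred))   # 2.5
print("RMSE:", rmse(bdi_true, bdi_pred))
print("corr:", pearson(bdi_true, bdi_pred))
```

A higher correlation and lower MAE and RMSE indicate better agreement between the predicted and the reference BDI scores.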

The results could serve as the basis for a real-time diagnostic system, which could be a big step forward in the diagnosis of mental disorders.

