Automatic detection of depression by SVR using French speech material

Dr. Vicsi Klára
Department of Telecommunications and Media Informatics

The topic of my thesis is the creation of a model that helps detect whether a speaker is suffering from depression. Depression is an extremely serious illness that affects millions of people around the world. An early diagnosis would improve patients' overall quality of life while also lowering the financial cost of treatment, and starting treatment at an early stage could potentially save lives. Numerous researchers have found that depressed speech differs from healthy speech: it can be monotonous, hoarse, slurred and slow. The goal of my thesis is to examine the basic features of speech and identify those that are helpful for the accurate detection of depression.

For my thesis, I used recorded speech from multiple sources. To create the model, I used a multi-language database. To detect depression, I used speech files recorded by French scientists working in Antarctica, at the Concordia station. Both databases contained recordings of people reading aloud the phonetically balanced tale “The North Wind and the Sun” in their mother tongue. To build the model and detect depression, I relied on the 21-question Beck Depression Inventory II (BDI-II) test. I segmented and annotated every speech sample at the phoneme level using the programs Praat and MAUS.

In the analysis I used both segmental and supra-segmental parameters. The segmental parameters examined were fundamental frequency, formant frequencies, jitter, shimmer, harmonics-to-noise ratio, intensity, and Mel filter and MFC (Mel-frequency cepstral) coefficients. Of the supra-segmental parameters, I examined the percentile range and variance of fundamental frequency and intensity.
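
As an illustration of the supra-segmental parameters named above, the following is a minimal sketch of how a percentile range and a variance could be computed from a frame-level contour (fundamental frequency or intensity). The function names, the 5th and 95th percentile bounds, and the convention that a value of 0 marks an unvoiced frame are assumptions made for this example; the abstract does not state which settings were used in the thesis.

```python
import statistics


def percentile(values, q):
    """Percentile with linear interpolation between neighbours, q in [0, 100]."""
    ordered = sorted(values)
    position = (len(ordered) - 1) * q / 100
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


def suprasegmental_features(contour, low_q=5, high_q=95):
    """Percentile range and population variance of a frame-level contour.

    Assumption: frames with value 0 are unvoiced and are skipped.
    The 5th/95th percentile bounds are illustrative, not the thesis settings.
    """
    voiced = [value for value in contour if value > 0]
    return {
        "percentile_range": percentile(voiced, high_q) - percentile(voiced, low_q),
        "variance": statistics.pvariance(voiced),
    }


# Made-up F0 values in Hz, for illustration only (not thesis data).
example_f0 = [0, 100, 110, 120, 130, 140, 0]
features = suprasegmental_features(example_f0)
print(features)
```

Using a percentile range instead of the plain minimum-to-maximum range makes the measure less sensitive to isolated pitch-tracking errors, which is one common reason for choosing it.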

To analyze the connection between the speech parameters and the severity of depression, I used Support Vector Regression (SVR). My results show that the best parameters for detecting depression are the change in the range of the fundamental frequency and the variance of intensity. The results are promising, but further research is needed with more languages and a larger group of subjects.
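
To make the idea of Support Vector Regression concrete, here is a minimal sketch of a linear SVR with a single input feature, trained by subgradient descent on the epsilon-insensitive loss. It is not the configuration used in the thesis: the kernel, the hyperparameters (`C`, `epsilon`), the learning-rate schedule and the toy data are all assumptions chosen for illustration, and in practice a library implementation such as scikit-learn's `sklearn.svm.SVR` would normally be used.

```python
def fit_linear_svr(xs, ys, C=10.0, epsilon=0.1, lr0=0.01, epochs=2000):
    """Fit y ~ w*x + b by subgradient descent on the primal objective
    0.5*w**2 + (C/n) * sum(max(0, |w*x + b - y| - epsilon)).

    Residuals inside the epsilon tube cost nothing, which is the
    defining property of the epsilon-insensitive loss.
    """
    w, b = 0.0, 0.0
    n = len(xs)
    for step in range(epochs):
        lr = lr0 / (1.0 + 0.01 * step)  # decaying step size (assumed schedule)
        grad_w, grad_b = w, 0.0  # gradient of the 0.5*w**2 regulariser
        for x, y in zip(xs, ys):
            residual = w * x + b - y
            if residual > epsilon:
                sign = 1.0
            elif residual < -epsilon:
                sign = -1.0
            else:
                continue  # inside the tube: no contribution
            grad_w += C * sign * x / n
            grad_b += C * sign / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Toy data following y = 2x + 1; purely synthetic, not thesis data.
toy_x = [0.0, 1.0, 2.0, 3.0, 4.0]
toy_y = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = fit_linear_svr(toy_x, toy_y)
print("slope:", w, "intercept:", b, "prediction at 2.5:", w * 2.5 + b)
```

In a setting like the one described, the input would be a vector of speech parameters per speaker and the target a depression severity score, so the scalar `w * x` would become a dot product over several features.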

