Diagnostical possibilities of pathological voices

Dr. Vicsi Klára
Department of Telecommunications and Media Informatics

This essay focuses on the diagnostic possibilities of pathological voices using automatic methods. For my research I used a sound database developed by the Laboratory of Speech Acoustics (LSA) at BUTE TMIT. The database contains both healthy and pathological voices. The quality of each patient's voice was assessed by a specialist on RBH, a four-point subjective voice-quality scale (0 = least abnormal, 3 = most abnormal). I used the H parameter, which represents overall hoarseness, to separate the healthy and pathological voices. Besides the LSA database, I also used some recordings from the Hungarian Reference Database (MRBA) to increase the number of healthy samples.

My first task was to compare one-class and two-class Support Vector Machines (SVM). For the comparison I used the healthy and the pathological samples, and I used the acoustic parameters of the vowel “e” from continuous speech as the SVM input vectors. These acoustic parameters were jitter ddp, shimmer dda, mean Harmonic-to-Noise Ratio (HNR) and Mel Frequency Cepstral Coefficients (MFCC). Both types of SVM were tested with two different testing methods: normal testing and full cross-validation. The results showed that the two-class SVM separates the two classes better (108 healthy and 108 pathological samples). With this type of SVM I achieved an accuracy of 86.1%.
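To make two of the perturbation features above concrete, here is a minimal sketch of how jitter ddp and shimmer dda could be computed. It assumes the definitions used in Praat's voice report (mean absolute difference between consecutive differences of consecutive cycles, divided by the mean); the abstract does not say which tool or exact formula was used. The function names and the sample values are hypothetical.

```python
from statistics import mean


def ddp_perturbation(values):
    """Mean absolute second difference of consecutive cycles, divided by the mean.

    Assumption: this follows the Praat-style 'ddp'/'dda' definition; the
    original work may have computed these measures differently.
    """
    if len(values) < 3:
        raise ValueError("need at least three consecutive cycles")
    second_diffs = [
        abs((values[i + 1] - values[i]) - (values[i] - values[i - 1]))
        for i in range(1, len(values) - 1)
    ]
    return mean(second_diffs) / mean(values)


def jitter_ddp(periods):
    """Relative cycle-to-cycle perturbation of the glottal period lengths."""
    return ddp_perturbation(periods)


def shimmer_dda(amplitudes):
    """Relative cycle-to-cycle perturbation of the peak amplitudes."""
    return ddp_perturbation(amplitudes)


# Hypothetical period lengths (seconds) and peak amplitudes for one vowel segment.
periods = [0.0100, 0.0102, 0.0099, 0.0101, 0.0100]
amplitudes = [0.81, 0.79, 0.83, 0.80, 0.82]
features = [jitter_ddp(periods), shimmer_dda(amplitudes)]
```

A perfectly periodic signal gives zero for both measures; larger values indicate more irregular vibration. In a setup like the one described, these values would be combined with mean HNR and the MFCCs into one input vector per sample.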

My second task was to investigate the classification of different types of illnesses. I chose functional dysphonia and vocal cord paralysis. I used the two-class SVM with various input vectors. In these experiments I used the acoustic parameters of the vowel “e” from continuous speech, and I also examined how the accuracy changes if I use the whole speech of each speaker. The best result was 78.9%, obtained with the whole speech of each person.

Finally, I examined one-step and two-step classification methods. My task was to find the best technique for diagnosing functional dysphonia and vocal cord paralysis. After comparing all the results, I concluded that vocal cord paralysis can be diagnosed with the highest accuracy (51.3%) using the one-step classification method.
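The abstract does not define the two schemes, so the following is only an illustrative sketch under one plausible reading: a one-step method assigns the final label with a single classifier, while a two-step method first separates healthy from pathological voices and then chooses between the two disorders. All function names are hypothetical, and the threshold rules are made-up stand-ins for trained SVMs, not clinical values.

```python
from typing import Callable, Sequence

Features = Sequence[float]  # assumed layout: [jitter_ddp, mean_hnr_db]


def one_step(x: Features, classify: Callable[[Features], str]) -> str:
    """Single classifier that outputs the final label directly."""
    return classify(x)


def two_step(
    x: Features,
    is_pathological: Callable[[Features], bool],
    which_disorder: Callable[[Features], str],
) -> str:
    """Cascade: healthy vs. pathological first, then disorder type."""
    if not is_pathological(x):
        return "healthy"
    return which_disorder(x)


# Stand-in decision rules with arbitrary thresholds (NOT from the thesis).
def toy_is_pathological(x: Features) -> bool:
    return x[0] > 0.02


def toy_which_disorder(x: Features) -> str:
    return "vocal cord paralysis" if x[1] < 10.0 else "functional dysphonia"


def toy_three_class(x: Features) -> str:
    if x[0] <= 0.02:
        return "healthy"
    return toy_which_disorder(x)


sample = [0.035, 8.5]  # hypothetical feature vector
label_one_step = one_step(sample, toy_three_class)
label_two_step = two_step(sample, toy_is_pathological, toy_which_disorder)
```

In a cascade, an error in the first step cannot be corrected by the second, which is one general reason a one-step scheme may score higher for a given class.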

