Construction of Hungarian Sound Gesture Database and Acoustical Analysis of Sound Gestures

Dr. Vicsi Klára
Department of Telecommunications and Media Informatics

In the previous semester, my “Project laboratory” task was to process the Hungarian Telephone Customer Service Speech Database from the acoustic and linguistic points of view. After annotating it, I studied the machine recognition of the emotional content of speech. The topic of my thesis project took shape while I was working on this database. Spontaneous conversation contains numerous nonverbal elements, as we often sigh, laugh or hum while we are talking. Various exclamations (“jaj, hí, há, ó, jé”) are also vocal gestures that reflect the emotional state of the speaker. Furthermore, a large number of vocal events that carry no meaning occur in conversations, for instance coughing, croaking and sneezing. My task is to collect the vocal gestures and nonverbal vocal events characteristic of Hungarian and to analyse them acoustically.

First, I give an overview of communication by introducing its levels and functions. I then review the literature on the subject, with a brief survey of articles published in the Hungarian specialist literature and of the results from these articles that I took into consideration during my work.

The main part of the project describes how I collected the vocal material and built the database. I processed five different spontaneous or nearly spontaneous speech databases, which were either available in the Laboratory of Speech Acoustics of the BME Department of Telecommunications and Media Informatics or collected by me from the media. After annotation, the vocal events marked in the databases were cut out and classified. In this way I created a so-called vocal gesture database, which contains, besides vocal gestures, delimited vocal events that are independent of linguistic content.

The next phase was the acoustic analysis of the elements of the gesture database, in which several examinations of the speech signals were carried out. My aim was to determine the acoustic features that characterise each class and to record them in the database. I carried out a statistical analysis of the fundamental frequency and the harmonicity parameter, and I present the intonation patterns characteristic of each class. Finally, the spectra were analysed and the intensity structures were examined.

Finally, the results are summarised, the next steps and goals are outlined, and my experiences are described.


