Application of sinusoidal coding for speech synthesis

Dr. Csapó Tamás Gábor
Department of Telecommunications and Media Informatics

In this thesis I deal with sinusoidal coding of audio signals, a special type of vocoding that I apply to speech signals. During sinusoidal coding, parameters specific to each sinusoidal component of the sound can be extracted. From these parameters it is possible to resynthesize audio that is very similar to the original.

First I describe the basics of sinusoidal coding. My work begins with a literature review, in which I provide insight into how sinusoidal vocoders work. I then present the design and implementation of a simple sinusoidal vocoder based on this knowledge. It can extract amplitude and frequency parameters from the sinusoidal components of the speech signal, and it can synthesize speech from these parameters.
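
As a minimal illustration of this analysis and resynthesis idea (a sketch under simplifying assumptions, not the implementation described in the thesis), the Python code below picks the strongest peaks of a windowed frame's spectrum, reads off their frequency, amplitude and phase, and rebuilds the frame as a sum of cosines. The function names `analyze_frame` and `synthesize_frame`, the `max_peaks` parameter, the Hann window and the two-tone test signal are all assumptions made for illustration. Frequencies are quantized to FFT bins here, so a practical vocoder would likely add peak interpolation and frame-to-frame tracking.

```python
import numpy as np


def analyze_frame(frame, sample_rate, max_peaks=20):
    """Estimate frequency (Hz), amplitude and phase of the strongest spectral peaks."""
    n = len(frame)
    window = np.hanning(n)
    spectrum = np.fft.rfft(frame * window)
    magnitude = np.abs(spectrum)

    # A bin is a peak if it is larger than its left neighbour
    # and at least as large as its right neighbour.
    interior = magnitude[1:-1]
    is_peak = (interior > magnitude[:-2]) & (interior >= magnitude[2:])
    peak_bins = np.nonzero(is_peak)[0] + 1

    # Keep only the strongest peaks, then restore ascending frequency order.
    order = np.argsort(magnitude[peak_bins])[::-1][:max_peaks]
    strongest = np.sort(peak_bins[order])

    freqs = strongest * sample_rate / n
    # Undo the window gain: a sinusoid of amplitude A gives |X| of about A * sum(window) / 2.
    amps = 2.0 * magnitude[strongest] / window.sum()
    phases = np.angle(spectrum[strongest])
    return freqs, amps, phases


def synthesize_frame(freqs, amps, phases, n, sample_rate):
    """Rebuild a frame of n samples as a sum of cosines with the given parameters."""
    t = np.arange(n) / sample_rate
    partials = amps[:, None] * np.cos(2 * np.pi * freqs[:, None] * t + phases[:, None])
    return partials.sum(axis=0)


# Hypothetical test signal: two stationary sinusoids standing in for a speech frame.
sample_rate = 16000
n = 512
t = np.arange(n) / sample_rate
frame = 0.5 * np.cos(2 * np.pi * 500 * t) + 0.25 * np.cos(2 * np.pi * 1500 * t)

freqs, amps, phases = analyze_frame(frame, sample_rate, max_peaks=2)
resynth = synthesize_frame(freqs, amps, phases, n, sample_rate)
```

The test frequencies were chosen to fall exactly on FFT bins, which is why this simple peak picking recovers them well. For real speech the components move between bins and change over time, so the analysis would be repeated frame by frame and the frames joined, for example by overlap-add.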

In the next chapter I take a closer look at an existing sinusoidal vocoder and describe its functionality. I present a weakness that occurs when it has to deal with glottalization, and I try to solve it with an earlier model.

To test the implemented methods and obtain subjective feedback, I evaluate them in an online listening test. Once several people have completed the test, I have a more comprehensive picture of the quality, and the results show that the corrected version performs better.

The vocoder I implemented can be developed further, so that it could be used in different areas of speech signal processing in the future (such as speech synthesis and speech transformation).


