Audio Fingerprinting in Managed Environment

Supervisor:
Simon Gábor
Department of Automation and Applied Informatics

Listening to music has become an integral part of our everyday lives. The metadata that belongs to our audio files (artist, title, album, etc.) is usually discovered from the corresponding ID3 tags or from the file names. For files missing these key data, however, the audio content itself needs to be analysed (unless we can find the lyrics). Since the binary representation of an audio file depends to a large extent on the bitrate, compression and format used, we need a method that takes the audio content into account and compares it with known songs. For this purpose, we create an extract based on the audio, the audio fingerprint, and build a database from these digests against which unknown files can be matched. Fingerprints also let us determine which redundant songs (even if they are stored in different formats) are using up the available space on our storage devices.

The primary objective of my thesis was to create an audio fingerprinting library based on the .NET framework. I purposefully chose to use the algorithm of an existing open source solution, so that the end result would be compatible with an existing fingerprint database. I chose Chromaprint since it seemed to be the most mature of the available implementations and because it is part of a well-established infrastructure. Its complex algorithm describes the input audio based on how the strength of the musical notes found in it changes over time. I studied this algorithm in depth and ported it, translating the code to its C# equivalent and using external .NET libraries wherever the original solution also relied on other libraries. Despite the many difficulties encountered, and although the output differs slightly from the original because of one of the libraries used, the result is a full-featured audio fingerprinting library. This demonstrates that the .NET platform is usable even for compute-intensive tasks. I then implemented an algorithm for comparing fingerprints, which can counter not only the differences arising from the storage method, but also the effect of a constant shift in the audio signal. Finally, I created a graphical user interface for it, which lets the user select and compare audio files and shows the degree of similarity between them.
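
To illustrate the idea of a shift-tolerant comparison, here is a minimal Python sketch. It is not the algorithm implemented in the thesis, which the abstract does not specify. It assumes that a fingerprint is a sequence of 32-bit integers (as Chromaprint fingerprints are), that the "constant shift" is a time offset of a whole number of fingerprint items, and that similarity is measured as one minus the bit error rate. The function names and the `max_offset` and `min_overlap` parameters are hypothetical.

```python
def bit_error_rate(a, b):
    """Fraction of differing bits between two equal-length lists of 32-bit ints."""
    differing = sum(bin((x ^ y) & 0xFFFFFFFF).count("1") for x, y in zip(a, b))
    return differing / (32 * len(a))


def best_alignment(fp1, fp2, max_offset=20, min_overlap=4):
    """Try every offset in [-max_offset, max_offset] and keep the one with
    the lowest bit error rate over the overlapping part.

    A positive offset means fp2 starts `offset` items into fp1;
    a negative offset means fp1 starts `-offset` items into fp2.
    Both parameters are illustrative assumptions, not values from the thesis.
    """
    best = (1.0, 0)  # (bit error rate, offset)
    for offset in range(-max_offset, max_offset + 1):
        if offset >= 0:
            a, b = fp1[offset:], fp2
        else:
            a, b = fp1, fp2[-offset:]
        n = min(len(a), len(b))
        if n < min_overlap:
            continue  # too little overlap to give a meaningful score
        ber = bit_error_rate(a[:n], b[:n])
        if ber < best[0]:
            best = (ber, offset)
    return best


def similarity(fp1, fp2, **kwargs):
    """Return (similarity in [0, 1], offset at which it was reached)."""
    ber, offset = best_alignment(fp1, fp2, **kwargs)
    return 1.0 - ber, offset
```

Calling `similarity(fp_a, fp_b)` on two fingerprints returns the best score together with the offset at which it was found. Small differences caused by re-encoding show up as a few flipped bits and lower the score only slightly, while a constant offset is absorbed by the alignment search.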
