My goal was to create an efficient Bloom Filter implementation. Bloom Filter structures can be used to represent bigger datasets while maintaining low memory requirements, and it has only two operations. One can insert elements into it, and one can test an element to see if it is present in the set or not. In the past few centuries the Bloom Filters applications became wider and wider, and new variations of Bloom Filters are created. First, I have created my solutions on ordinary computer CPUs, then I have created some other applications which are using CPU and GPU.
When I have created implementations on the CPU, I have used C++11, which offers tools for thread management, such as creating threads and synchronizing between them. These tools have a great adventage, they are standardized, which means that the application has the potential to be compiled on different platforms.
For the general-purpose GPU programming I have used the NVIDIA CUDA architecture. It can be learned easily, and the programmer can create applications in general programming languages just like C. The challange is to learn the GPU’s specialities, like different memory spaces and to parallelize algorithms effectively. Programming on GPUs has a great potential, if a problem parallelized effectively, one can achieve great performance gain.
In my thesis after a short introduction I will present the Bloom Filters and the CUDA architecture. After that I will show a simple Bloom Filter implementation, then different parallelized ones on the CPU. I will use the ideas used in these solutions, to create applications which are running on the CUDA architecture. At last I will compare the different solutions.