Where large amounts of data are stored, data mining is commonly used to reveal
new patterns that are invisible to the human eye. Because the input is large,
data processing algorithms often take a long time to finish.
One way to improve the performance of these algorithms is to use parallel
or distributed systems. Building a computer cluster takes a lot of time and
money, so it is out of reach for an individual. However, every PC has a
graphics card inside, which nowadays contains hundreds of cores.
In my thesis I adapt a cluster analysis algorithm to the GPU (graphics processing
unit). The task of cluster analysis is to sort a set of objects into groups
so that objects in the same group are as similar as possible and objects
in different groups are as distinct as possible. The algorithm I selected
is K-means. By writing several versions of the program, I explored
the capabilities of the GPU.
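To make the clustering task concrete, the following is a minimal sequential
sketch of the standard K-means iteration in plain Python. It is not the GPU
implementation developed in the thesis; the function names (squared_distance,
assign, update, kmeans), the random initialization, and the stopping rule are
assumptions chosen for illustration only.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(points, centroids):
    """Assignment step: index of the nearest centroid for every point."""
    return [
        min(range(len(centroids)), key=lambda c: squared_distance(p, centroids[c]))
        for p in points
    ]


def update(points, labels, centroids):
    """Update step: move each centroid to the mean of its assigned points."""
    new_centroids = []
    for c, old in enumerate(centroids):
        members = [p for p, label in zip(points, labels) if label == c]
        if not members:
            # Assumption: an empty cluster simply keeps its previous centroid.
            new_centroids.append(old)
            continue
        dim = len(old)
        new_centroids.append(
            tuple(sum(m[d] for m in members) / len(members) for d in range(dim))
        )
    return new_centroids


def kmeans(points, k, max_iter=100, seed=0):
    """Alternate assignment and update until the labels stop changing."""
    rng = random.Random(seed)
    # Assumption: initial centroids are k distinct input points chosen at random.
    centroids = [tuple(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(max_iter):
        new_labels = assign(points, centroids)
        if new_labels == labels:
            break  # converged: no point changed its cluster
        labels = new_labels
        centroids = update(points, labels, centroids)
    return centroids, labels


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    centroids, labels = kmeans(data, k=2)
    print(centroids, labels)
```

In this sketch the assignment step handles every point independently of the
others, which is the kind of work that lends itself to being spread over many
cores.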
To use the graphics card for non-graphics computations, I used and compared
two frameworks, CUDA and OpenCL. With OpenCL, one can even write multi-core programs that
run on the CPU instead of the GPU, without modifying the code. These
frameworks are only a few years old and are under constant development,
gaining new features and speedups. I used the newest versions available
at the beginning of my thesis; by the end, even newer versions had appeared.