Implementing a clustering algorithm on GPU using CUDA and OpenCL languages

Dr. Dudás Ákos
Department of Automation and Applied Informatics

Where large amounts of data are stored, data mining is commonly used to reveal
new patterns that are invisible to the human eye. Because the input is large,
data processing algorithms often take a long time to finish.
One way to improve the performance of these algorithms is to use parallel
or distributed systems. Building a computer cluster takes considerable time and
money, which puts it out of reach for an individual. However, nearly every PC
contains a graphics card, which nowadays has hundreds of cores.

In my thesis I adapt a cluster analysis algorithm to the GPU (graphics processing
unit). The task of cluster analysis is to sort a set of objects into groups
so that objects in the same group are as similar as possible and objects
in different groups are as distinct as possible. The algorithm I selected
was K-means. By writing several versions of the program, I explored
the capabilities of the GPU.
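
To illustrate the grouping task described above, here is a minimal sequential K-means sketch in C++. It is not the GPU code of the thesis. The 2-D `Point` type, the function names, the caller-supplied initial centroids and the stopping rule are all assumptions made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// A 2-D point. The dimensionality is an assumption made for this sketch.
struct Point {
    double x;
    double y;
};

// Squared Euclidean distance; the square root is not needed for comparisons.
double squaredDistance(const Point& a, const Point& b) {
    const double dx = a.x - b.x;
    const double dy = a.y - b.y;
    return dx * dx + dy * dy;
}

// Assignment step: each point receives the index of its nearest centroid.
// Each iteration of the outer loop is independent of all the others.
std::vector<std::size_t> assignClusters(const std::vector<Point>& points,
                                        const std::vector<Point>& centroids) {
    assert(!centroids.empty());
    std::vector<std::size_t> labels(points.size(), 0);
    for (std::size_t i = 0; i < points.size(); ++i) {
        double best = std::numeric_limits<double>::max();
        for (std::size_t c = 0; c < centroids.size(); ++c) {
            const double d = squaredDistance(points[i], centroids[c]);
            if (d < best) {
                best = d;
                labels[i] = c;
            }
        }
    }
    return labels;
}

// Update step: each centroid moves to the mean of the points assigned to it.
// A centroid with no assigned points keeps its previous position.
std::vector<Point> updateCentroids(const std::vector<Point>& points,
                                   const std::vector<std::size_t>& labels,
                                   const std::vector<Point>& centroids) {
    std::vector<Point> sums(centroids.size(), Point{0.0, 0.0});
    std::vector<std::size_t> counts(centroids.size(), 0);
    for (std::size_t i = 0; i < points.size(); ++i) {
        sums[labels[i]].x += points[i].x;
        sums[labels[i]].y += points[i].y;
        ++counts[labels[i]];
    }
    std::vector<Point> next = centroids;
    for (std::size_t c = 0; c < centroids.size(); ++c) {
        if (counts[c] > 0) {
            const double n = static_cast<double>(counts[c]);
            next[c] = Point{sums[c].x / n, sums[c].y / n};
        }
    }
    return next;
}

// Alternates the two steps until the labels stop changing
// or maxIterations is reached.
std::vector<std::size_t> kMeans(const std::vector<Point>& points,
                                std::vector<Point> centroids,
                                int maxIterations) {
    std::vector<std::size_t> labels = assignClusters(points, centroids);
    for (int it = 0; it < maxIterations; ++it) {
        centroids = updateCentroids(points, labels, centroids);
        std::vector<std::size_t> next = assignClusters(points, centroids);
        if (next == labels) break;
        labels = next;
    }
    return labels;
}
```

In the assignment step every point is handled independently of the others, which is why this step is a natural candidate for mapping one GPU thread to one point. The update step needs a reduction over the points of each cluster and is typically harder to parallelize.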

To use the graphics card for non-graphics computations, I used and compared
two frameworks, CUDA and OpenCL. With OpenCL, one can even write multi-core programs that
run on the CPU instead of the GPU without modifying the code. These
frameworks are only a few years old and are under constant development,
gaining new features and speedups. I used the newest versions available
when I began my thesis; by the time I finished, even newer versions had appeared.

