GPGPU support for exploratory data analysis

Kocsis Imre
Department of Measurement and Information Systems

Data visualization plays an important role in exploratory data analysis: it is the most effective tool for exploring datasets visually and forming hypotheses. Visualization can be either static or interactive; the latter makes hypothesis formation significantly more effective.

Visualizing small datasets is not a major challenge with most available tools; handling Big Data, however, complicates even static visualization. Recently, a canonical approach to the exploratory analysis of Big Data has emerged that addresses these visualization challenges, although it lacks interactivity.

The goal of my thesis is to give an overview of the main problems of Big Data visualization and to demonstrate an efficient approach for handling them when visualizing large datasets. I then introduce two data visualization tools (Bigvis and Datashader) and, building on them, propose a solution for GPU-accelerated data visualization with CUDA.

Building on this GPU acceleration, I implement selection and linked highlighting interactions in Datashader, offering a way to carry out interactive visualization of Big Data more easily. Finally, I show the initial steps of creating and managing a GPU cluster and of scaling out the GPU-accelerated data visualization to the cluster.

