GPGPU-based multidimensional heatmaps for general purpose data analysis

Supervisor:
Kocsis Imre
Department of Measurement and Information Systems

Heatmap visualization is a common part of Exploratory Data Analysis (EDA) on Big Data. These heatmaps are typically used only in two dimensions; the use of three-dimensional heatmaps for general data analysis has not been examined adequately.

In this work I present two solutions for a pipeline that produces three-dimensional heatmap visualizations of Big Data. One is fully GPGPU-based and therefore very fast, but limited to a single aggregation function; the other runs partially on the CPU and is much more flexible in how it summarizes the data. The resulting volumetric heatmaps can be displayed and examined with two rendering packages. The third dimension opens up the possibility of finding new patterns that are hard or impossible to see with the earlier two-dimensional solutions.
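To make the aggregation step concrete, the following is a minimal sketch of how three-dimensional points might be binned into a voxel grid and summarized per voxel. It is not the implementation from the thesis: it uses NumPy on the CPU as a stand-in, and the function name `volumetric_heatmap`, its parameters, and the choice of aggregations ("count", "sum", "mean") are assumptions made for illustration only.

```python
import numpy as np


def volumetric_heatmap(points, values=None, bins=(4, 4, 4), agg="count"):
    """Bin 3-D points into a dense voxel grid and aggregate per voxel.

    Hypothetical helper for illustration; not the thesis pipeline.
    points: array-like of shape (n, 3)
    values: optional array-like of shape (n,), needed for "sum" and "mean"
    """
    points = np.asarray(points, dtype=float)

    # Count of points per voxel; the bin edges are reused below so that
    # every aggregation refers to the same grid.
    counts, edges = np.histogramdd(points, bins=bins)
    if agg == "count":
        return counts, edges

    if values is None:
        raise ValueError("values are required for agg='sum' or agg='mean'")
    weights = np.asarray(values, dtype=float)

    # Weighted histogram gives the per-voxel sum of the values.
    sums, _ = np.histogramdd(points, bins=edges, weights=weights)
    if agg == "sum":
        return sums, edges

    if agg == "mean":
        # Empty voxels have no defined mean, so they are left as NaN.
        mean = np.full_like(sums, np.nan)
        np.divide(sums, counts, out=mean, where=counts > 0)
        return mean, edges

    raise ValueError(f"unsupported aggregation: {agg!r}")


# Small usage example with made-up data.
pts = [[0.0, 0.0, 0.0], [0.1, 0.1, 0.1], [1.0, 1.0, 1.0]]
vals = [2.0, 4.0, 10.0]
counts, _ = volumetric_heatmap(pts, bins=(2, 2, 2))
mean, _ = volumetric_heatmap(pts, vals, bins=(2, 2, 2), agg="mean")
```

In this sketch, counting is the simplest aggregation and is the kind of operation that maps naturally onto a massively parallel device, while the "sum" and "mean" branches hint at why a more flexible, partly CPU-side path can be attractive when arbitrary summaries are needed.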

I illustrate data analysis with three-dimensional heatmaps embedded in the general-purpose Python scripting environment, for which I provide a prototype in this work.

The analysis of the New York taxi dataset is by now a classic Big Data visualization exercise. I use this example to show that using three- and higher-dimensional heatmaps in EDA, instead of two-dimensional ones, accelerates data analysis and helps in finding better hypotheses.
