Over the last two decades, data mining has become one of the most important industries in the world. Beyond ordinary database queries, it has become possible to discover hidden knowledge and patterns in accumulated data. While preparing this work, I used data from the Centre for Budapest Transport (Budapest Közlekedési Központ), collected from its temporarily accessible database. My goal was to create predictive models for the arrivals and departures of two fundamentally different routes.
During the literature review, I became familiar with methods and techniques that help with time-series analysis and machine learning. Ultimately, my task was to create models that are as accurate as possible.
In the project I iterated through the steps of the CRISP-DM methodology and documented the engineering decisions I made. During data analysis I selected the attributes most relevant to the project's goals, so that the models could be as accurate as possible. Finally, I compared and evaluated the four models I created and recommended ways to improve or fine-tune them where needed.