Analysis and visualization of Bike-Sharing Systems' utilization

Marton József Ernő
Department of Telecommunications and Media Informatics

The topic of my MSc thesis is the analysis and visualization of the usage of the Budapest bicycle-sharing system (Bubi BSS). The popularity of bike-sharing systems is continuously increasing, and such systems appear in more and more cities. In the future they may relieve the burden on urban transportation. Because of the popularity and novelty of the topic, a wide selection of literature is available. Most of these works focus on logistics optimization, i.e. the redistribution of bikes. In my work I prepare the usage data of Bubi for analysis, then analyze the data on usage and users. I examine the use of stations in relation to their geographical location. Based on the daily usage of the stations, I create a static model for the once-a-day redistribution of bikes.

The analytical database consists of 4 tables, which contain the trip history from the launch of the system (September 2014) to August 2017, as well as station information, a calendar, and daily weather history for the same period.

I examined the distribution of trip durations. The result is in line with the BSS data of other cities presented in the related work: 94% of the trips lasted less than 30 minutes, and the most frequent duration is 8 minutes. In addition, I analyzed the popularity of the system on a daily basis. Beyond the expected seasonality (low winter and intense summer usage) and the periodicity within the week, I found a decreasing trend in the number of trips since June 2016. By examining the user population, I found that the decreasing number of trips is in line with the decreasing number of season ticket holders. The system is used mostly by Hungarian citizens rather than by tourists: 96.6% of the trips were made by Hungarian people.
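The two duration statistics quoted above (the share of trips under 30 minutes and the most frequent duration) can be computed along the following lines. This is only a minimal sketch: the function name `duration_summary`, the whole-minute bucketing, and the sample values are assumptions made for illustration, not the thesis's code or data.

```python
from collections import Counter


def duration_summary(durations_min, threshold_min=30):
    """Return (share of trips shorter than threshold, most frequent whole-minute duration)."""
    if not durations_min:
        raise ValueError("no trips given")
    # Share of trips strictly shorter than the threshold.
    short = sum(1 for d in durations_min if d < threshold_min)
    share = short / len(durations_min)
    # Bucket durations by whole minutes (assumption) and take the most common bucket.
    mode_min, _ = Counter(int(d) for d in durations_min).most_common(1)[0]
    return share, mode_min


# Made-up trip durations in minutes, for illustration only.
sample = [8, 8, 12, 5, 8, 45, 29, 31, 15, 8]
share, mode_min = duration_summary(sample)
print(f"{share:.0%} of trips under 30 minutes, most frequent duration: {mode_min} min")
```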

Using cluster analysis, I identified the less-used stations. They are located mainly in the western part of the city (Buda side).

I created several linear regression models to predict the daily traffic. According to the models, the strongest predictors are weather attributes (rain, temperature) and holidays. With a typical daily number of uses between 0.2 and 3 thousand, the best model achieved an RMSE of 463 and a correlation of 0.886.
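For reference, the two quality measures named above can be computed as in the sketch below. It assumes that the reported correlation is the Pearson coefficient between observed and predicted daily counts, which the abstract does not state explicitly. The function names and the sample numbers are hypothetical.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between observed and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Made-up daily trip counts and model predictions, for illustration only.
observed = [200, 1500, 3000]
predicted = [300, 1400, 2900]
error = rmse(observed, predicted)
r = pearson(observed, predicted)
print(f"RMSE = {error:.1f}, correlation = {r:.3f}")
```

Because RMSE is expressed in the same unit as the target (uses per day), it is easiest to judge against the typical daily range quoted above.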

Based on the system's 3-year trip history, I identified the criteria for operation without dynamic rebalancing. 100% availability would require 2548 bicycles. However, the current fleet size (1486) is sufficient on 95% of days.
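The relation between fleet size and the share of days covered can be illustrated with an empirical quantile over per-day bicycle requirements. The sketch below is only one possible reading: how the thesis derives the daily requirement is not described in this abstract, so `daily_need` (the number of bicycles needed on each day) is taken as a given input, and the function name and the numbers are hypothetical.

```python
import math


def fleet_for_coverage(daily_need, coverage):
    """Smallest fleet size that meets the daily need on at least `coverage` share of days."""
    if not daily_need:
        raise ValueError("no days given")
    if not 0 < coverage <= 1:
        raise ValueError("coverage must be in (0, 1]")
    ordered = sorted(daily_need)
    # Number of days that must be fully served; rounding guards against float noise.
    days_to_cover = math.ceil(round(coverage * len(ordered), 9))
    return ordered[days_to_cover - 1]


# Made-up daily requirements (bicycles needed per day), for illustration only.
need = [900, 1100, 1000, 1200, 950, 1300, 1050, 1150, 2000, 1250]
full = fleet_for_coverage(need, 1.0)     # every day covered
partial = fleet_for_coverage(need, 0.8)  # 8 of the 10 days covered
print(f"100% of days: {full} bikes, 80% of days: {partial} bikes")
```

In this toy data a single extreme day drives the requirement for full coverage, which mirrors the gap between the two fleet sizes reported above.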

