My task was to analyse YouTube, the world’s most popular video-sharing site. I had to produce statistics for the data set as a whole, by video category, and by the year in which the videos were uploaded.
Solving this task required new knowledge, first of all familiarity with the YouTube API (Application Programming Interface). This interface makes it possible to request information and statistics from YouTube. I created two requests: one returns information about the videos themselves, and the other returns the video IDs contained in a playlist, which I then use in the first request.
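The two requests can be sketched as URL builders. This is only an illustration and assumes the YouTube Data API v3 with its `playlistItems` and `videos` endpoints; the text does not say which API version or which parameters were actually used. The function names and the `"PL123"` and `"KEY"` values are hypothetical placeholders.

```python
from urllib.parse import urlencode

# Assumption: YouTube Data API v3 base address.
API_BASE = "https://www.googleapis.com/youtube/v3"


def playlist_items_url(playlist_id, api_key, page_token=None):
    """Build the request that lists the video IDs of one playlist."""
    params = {
        "part": "contentDetails",   # the video ID is in contentDetails.videoId
        "playlistId": playlist_id,
        "maxResults": 50,           # largest page size the endpoint accepts
        "key": api_key,
    }
    if page_token:
        # Further pages are requested with the token from the previous reply.
        params["pageToken"] = page_token
    return f"{API_BASE}/playlistItems?{urlencode(params)}"


def videos_url(video_ids, api_key):
    """Build the request that returns details for the given video IDs."""
    params = {
        "part": "snippet,contentDetails,statistics",
        "id": ",".join(video_ids),  # several IDs can be sent in one request
        "key": api_key,
    }
    return f"{API_BASE}/videos?{urlencode(params)}"


if __name__ == "__main__":
    print(playlist_items_url("PL123", "KEY"))
    print(videos_url(["abc", "def"], "KEY"))
```

The IDs collected from the first URL's replies would be passed, in batches, to the second.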
The reply to each of these requests is in JSON format. To process this output I wrote a Python program, which decodes the reply and fills a database with its contents. SQLite provides the connection between the program and the database, and later also the connection between the database and MATLAB.
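A minimal sketch of this decoding and storing step follows. The table layout, column names, and the `store_videos` helper are hypothetical, and the reply fields (`items`, `snippet`, `contentDetails`, `statistics`) assume the YouTube Data API v3 reply format; the original program may differ in all of these.

```python
import json
import sqlite3

# Hypothetical table layout, chosen only for this illustration.
CREATE_TABLE = """
CREATE TABLE IF NOT EXISTS videos (
    video_id     TEXT PRIMARY KEY,
    title        TEXT,
    category_id  TEXT,
    published_at TEXT,
    duration     TEXT,
    view_count   INTEGER
)
"""


def store_videos(conn, reply_text):
    """Decode one JSON reply and insert its videos; return the row count."""
    reply = json.loads(reply_text)
    conn.execute(CREATE_TABLE)
    rows = []
    for item in reply.get("items", []):
        snippet = item.get("snippet", {})
        details = item.get("contentDetails", {})
        stats = item.get("statistics", {})
        rows.append((
            item["id"],
            snippet.get("title"),
            snippet.get("categoryId"),
            snippet.get("publishedAt"),
            details.get("duration"),
            int(stats.get("viewCount", 0)),  # counts arrive as strings
        ))
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT OR REPLACE INTO videos VALUES (?, ?, ?, ?, ?, ?)", rows
        )
    return len(rows)
```

Using `INSERT OR REPLACE` with the video ID as primary key means a video that appears in several playlists is stored only once.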
I chose MATLAB to create my graphs and diagrams. I wrote a script that produces all the required plots and calculates some values that are important for my analysis.
While creating these diagrams I analysed them and wrote the documentation. First, I analysed the database as a whole and created graphs, mainly distribution functions, to depict it.
After the general analysis, I separated the videos in the “music” category from the rest and compared the two groups. I then divided the data by category and described all the video categories used by YouTube.
My main task was to study the length of the videos by category. I created many different graphs to represent the properties of these categories.
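As an illustration of the per-category length analysis, the sketch below converts video durations to seconds and takes the median per category. It assumes that durations are stored as ISO 8601 strings such as `PT4M13S`, which is how the YouTube Data API v3 reports them, and that the median is a statistic of interest; the original analysis was done in MATLAB and may have used other measures. All function names are hypothetical.

```python
import re
from collections import defaultdict
from statistics import median

# Matches durations such as "PT4M13S", "PT1H2M3S" or "P1DT1S".
_DURATION = re.compile(
    r"^P(?:(\d+)D)?(?:T(?:(\d+)H)?(?:(\d+)M)?(?:(\d+)S)?)?$"
)


def duration_seconds(text):
    """Convert an ISO 8601 duration (days to seconds) into seconds."""
    match = _DURATION.match(text)
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    days, hours, minutes, seconds = (int(g or 0) for g in match.groups())
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


def median_length_by_category(rows):
    """rows: iterable of (category_id, duration_text) pairs."""
    lengths = defaultdict(list)
    for category_id, duration_text in rows:
        lengths[category_id].append(duration_seconds(duration_text))
    return {cat: median(values) for cat, values in lengths.items()}
```

The median is used here because video lengths tend to be strongly skewed, so a few very long videos would dominate a mean.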
As a final step, I divided the videos by upload year, compared the video lengths across the years, and presented the results in similar diagrams.
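The grouping by upload year can be sketched in the same way. The example assumes that each video carries an upload timestamp beginning with a four-digit year, as in `2012-05-01T12:00:00Z`, and a length already expressed in seconds; the function name and the use of the mean are illustrative choices, not the original MATLAB script.

```python
from collections import defaultdict
from statistics import mean


def mean_length_by_year(videos):
    """videos: iterable of (published_at, length_seconds) pairs."""
    by_year = defaultdict(list)
    for published_at, length_seconds in videos:
        year = int(published_at[:4])  # leading "YYYY" of the timestamp
        by_year[year].append(length_seconds)
    # Sort by year so the result can be plotted directly as a time series.
    return {year: mean(values) for year, values in sorted(by_year.items())}
```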