The amount of multimedia content grows significantly every day. By 2017, all forms of Internet video traffic combined are expected to account for 80 to 90 percent of global consumer traffic. Organizing and searching this vast amount of video content is a hard task, and automatic annotation of videos can provide a solution. In my thesis I developed two slightly different face recognition systems that store the appearances and disappearances of all people in a given video. One system uses a face recognition technique and actually identifies the actors, while the other uses clustering methods to differentiate between them in situations where no prior information is available. For the implementation I used the OpenCV library and the C++ programming language.
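As an illustration of what storing appearances and disappearances can look like, the following is a minimal sketch using only the C++ standard library. It is not the implementation from the thesis: the names `Appearance`, `AppearanceLog`, `buildLog`, and `screenTimeSeconds` are hypothetical, and the sketch assumes that some earlier stage has already produced, for every frame, the labels of the people visible in it.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical record: one continuous on-screen interval, measured in frames.
struct Appearance {
    int firstFrame;  // frame in which the person appears
    int lastFrame;   // last frame before the person disappears (inclusive)
};

// Hypothetical log: person label -> list of on-screen intervals.
using AppearanceLog = std::map<std::string, std::vector<Appearance>>;

// Builds the log from per-frame labels; frames[i] lists the people seen in frame i.
AppearanceLog buildLog(const std::vector<std::vector<std::string>>& frames) {
    AppearanceLog log;
    for (int f = 0; f < static_cast<int>(frames.size()); ++f) {
        for (const std::string& person : frames[f]) {
            std::vector<Appearance>& intervals = log[person];
            if (!intervals.empty() && intervals.back().lastFrame == f - 1) {
                intervals.back().lastFrame = f;   // still on screen: extend interval
            } else if (intervals.empty() || intervals.back().lastFrame != f) {
                intervals.push_back({f, f});      // new appearance starts here
            }
            // Otherwise the label was repeated within the same frame: ignore it.
        }
    }
    return log;
}

// Total screen time in seconds for one person, assuming a constant frame rate.
double screenTimeSeconds(const std::vector<Appearance>& intervals, double fps) {
    assert(fps > 0.0);
    long totalFrames = 0;
    for (const Appearance& a : intervals) {
        totalFrames += a.lastFrame - a.firstFrame + 1;
    }
    return static_cast<double>(totalFrames) / fps;
}
```

In this sketch a person who is missed for a single frame gets two separate intervals; a practical system would probably tolerate short gaps before recording a disappearance.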
In the first part of my thesis I studied the field of face detection and recognition and described the algorithms I used in my work. After thorough testing of the OpenCV face recognition components, the face recognizer and face clustering systems took the following form.
To find faces in videos I used the Viola-Jones object detection algorithm with a Haar cascade trained for frontal faces. Then, using similar detectors, the software locates the eyes in each detected face and uses them to align the face. The face recognizer system uses the Fisherfaces model to identify the actors. When several face images of each person in a video cannot be obtained beforehand to train the recognizer, the face clustering system must be used instead. This program creates a Local Binary Pattern histogram for each detected face and uses hierarchical k-means clustering to determine which face belongs to which actor. Both systems then store the calculated screen time for each person found.
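To make the Local Binary Pattern histogram more concrete, here is a minimal sketch of the basic 3x3 operator in plain C++. It is an illustration under simplifying assumptions rather than the code used in the thesis: the `GrayImage` type and the functions `lbpCode` and `lbpHistogram` are hypothetical, the sketch builds a single histogram over the whole face, and OpenCV's own LBPH implementation additionally divides the face into a grid of cells and concatenates the per-cell histograms.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical minimal grayscale image: row-major, 8 bits per pixel.
struct GrayImage {
    int width;
    int height;
    std::vector<std::uint8_t> pixels;

    std::uint8_t at(int x, int y) const {
        return pixels[static_cast<std::size_t>(y) * width + x];
    }
};

// Basic 3x3 LBP code of the interior pixel (x, y): each of the 8 neighbours
// contributes one bit, set when the neighbour is at least as bright as the centre.
std::uint8_t lbpCode(const GrayImage& img, int x, int y) {
    assert(x >= 1 && y >= 1 && x < img.width - 1 && y < img.height - 1);
    // Neighbours are visited clockwise, starting at the top-left (bit 0).
    static const int dx[8] = {-1, 0, 1, 1, 1, 0, -1, -1};
    static const int dy[8] = {-1, -1, -1, 0, 1, 1, 1, 0};
    const std::uint8_t center = img.at(x, y);
    unsigned code = 0;
    for (int i = 0; i < 8; ++i) {
        if (img.at(x + dx[i], y + dy[i]) >= center) {
            code |= 1u << i;
        }
    }
    return static_cast<std::uint8_t>(code);
}

// Normalized 256-bin histogram of LBP codes over all interior pixels.
// Returns an all-zero histogram when the image has no interior pixels.
std::array<double, 256> lbpHistogram(const GrayImage& img) {
    std::array<double, 256> hist{};
    if (img.width < 3 || img.height < 3) {
        return hist;
    }
    for (int y = 1; y < img.height - 1; ++y) {
        for (int x = 1; x < img.width - 1; ++x) {
            hist[lbpCode(img, x, y)] += 1.0;
        }
    }
    const double interior =
        static_cast<double>(img.width - 2) * (img.height - 2);
    for (double& bin : hist) {
        bin /= interior;  // bins sum to 1 so faces of different size compare
    }
    return hist;
}
```

Each face then corresponds to one fixed-length feature vector, which is the kind of input a clustering step such as k-means can group by actor.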