Traffic classification aims to inform network operators about the mixture of traffic passing through their network. Analysis of the information provided by classification algorithms can support network management tasks such as capacity planning or billing. Most of the available methods have been developed by independent research communities, and each has its own strengths and weaknesses. Comparing and validating them is often problematic, because the process relies on many components.
The operation and scalability of the algorithms can be tested thoroughly only if a sufficient amount of test traffic is available. The validity of the results can be checked only if a reference is present. Comparing the algorithms requires transforming their various outputs into a common format. Evaluation can then rely on applications that provide quick numerical and graphical representations of the results.
In my thesis I introduce the software components of a framework that I have developed to aid the comparison process. First I present a traffic aggregation tool implemented in C. The tool uses realistic traffic in a realistic aggregation scenario to produce the aggregated output of a considerable number of users on a single commodity PC. Its operation is based on real traffic traces, and its purpose is to stress classification methods with realistic, high-speed traffic. I also evaluate the performance and the timing capabilities of the tool.
Another contribution of my work is a method comparison environment. It allows the comparison of five traffic classifiers, and it can also incorporate the results of a reference tool to allow credible validation. The environment is based on Ruby scripts, but I also discuss some parts developed in C, namely a special parser for Captool, Ericsson’s industrial classifier, and a reference extractor tool. The final part of my thesis introduces a website that I have implemented in Ruby on Rails. The website allows comparison tasks to be scheduled and supports efficient evaluation of the comparison results by providing a wide variety of information, ranging from detailed flow logs to automatically generated graphical presentations. I conclude by outlining possibilities for further development.