Designing a Test Network for Analyzing Botnet Detection Algorithms

Dr. Fehér Gábor
Department of Telecommunications and Media Informatics

Nowadays, most complex attacks on the Internet are carried out by botnets: networks of infected computers controlled by a person or a group, known as the botmaster, via a Command and Control (C&C) channel. Botnets are mostly used for distributed denial-of-service (DDoS) attacks, stealing personal data, and sending e-mails that contain spam, phishing attempts, viruses, or trojans.

Defending against these networks remains an open problem. Viruses spread by exploiting vulnerabilities in computer systems or the naivety of users. After infection, they hide and run as a background process, waiting for commands from their creator. Modern viruses can also modify their own code periodically, so not even an up-to-date antivirus program can guarantee perfect protection against them.

Botnet detection is an important task because these networks can cause enormous damage. In this thesis I present an architecture designed for effective and reliable botnet detection, with emphasis on distributed, scalable, and anonymous operation that is resistant to attacks.

My task was to test the efficiency of the algorithm responsible for botnet detection. To do this, I designed and implemented a network in which traffic can be captured and analyzed while the malicious traffic is kept under supervision, so that viruses cannot spread outside the network.

After the implementation, I tested the algorithm's ability to recognize malware-related traffic and measured the reliability of the results. For these tests I used viruses compiled from source or acquired from the honeypot connected to the system.

After analyzing the results, I concluded that although the algorithm can recognize malicious traffic, and can therefore detect botnets, the proportion of false positives (harmless traffic marked as malicious) is not low enough for the algorithm to be used extensively. Further tuning of the algorithm is necessary to achieve better results. In the future, this problem may also be addressed with whitelists or by extending the algorithm with adaptive learning methods. In its current state, however, any live deployment of the algorithm should be supervised by an administrator who can separate true positives from false positives.

