Visualizing HTTP traffic flows from packet data

Type
Master's thesis
Program
Computer systems and networks (MPCSN), MSc
Published
2015
Author
Shailaja, Mallick
Abstract
The Hypertext Transfer Protocol (HTTP), which browsers use to retrieve information from the Internet, accounts for a large part of the traffic generated by browsers, and browsers are the predominant source of data communication in networks. The content of a website ranges from static displays of a simple page to rich media applications accompanied by a large number of third-party advertisements. While websites have the advantage of gathering as much information as possible for a user, they also make it possible for an attacker to exploit their structure and design. Any susceptible website that an attacker manipulates by injecting malicious software (malware) has the potential to compromise a user's machine along with the user's sensitive information. To investigate such security incidents, one needs a thorough analysis of HTTP network traffic streams, since it provides a fuller picture of the request/response mechanism along with any embedded requests. However, given the complex structure of web pages, scanning through hundreds of megabytes of HTTP streams extracted with packet capture tools takes a huge manual effort, and distinguishing benign from malicious traffic in such large files is a slow and lengthy process. If this process can be automated by a tool that extracts HTTP streams from packet capture files, processes and filters them, and finally provides an efficient visual representation of the HTTP traffic flow over time, a forensic analyst can identify malicious traffic more easily. The main purpose of this thesis is to develop such a prototype, which can effectively and efficiently visualize the flow(s) of HTTP traffic from websites, with a partial focus on malicious advertising ('malvertising').
This paper describes the methodology used to extract HTTP traffic from packet data and then to use the data or metadata from the extracted information to visualize the traffic as a graph over time. The results show that, even for large files, the traffic flows can be displayed clearly and efficiently, with the ability to further analyze each node in the graph. They also show that malicious traffic, if found, can be traced back to its parent host, which makes it possible to understand the root cause.
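The idea of tracing a request back to its parent host can be illustrated with a minimal sketch. It is not the thesis prototype: it assumes that HTTP requests have already been extracted from a capture into simple records, and that a request's parent is the host named in its Referer header, a linking rule the abstract does not specify. The record layout, the `.example` hostnames, and the function names `host_of`, `build_parent_map`, and `trace_to_root` are all hypothetical.

```python
from urllib.parse import urlsplit


def host_of(url):
    """Return the lowercase hostname of a URL, or None if there is none."""
    return urlsplit(url).hostname if url else None


def build_parent_map(requests):
    """Map each host to the host that first referred to it.

    Assumption: the Referer header identifies the parent. A host first
    seen without a Referer (or referring to itself) is treated as a root.
    """
    parents = {}
    for req in sorted(requests, key=lambda r: r["time"]):
        child = host_of(req["url"])
        parent = host_of(req.get("referer"))
        if child is None or child in parents:
            continue
        parents[child] = parent if parent != child else None
    return parents


def trace_to_root(parents, host):
    """Follow parent links from a host up to a root, guarding against cycles."""
    path = [host]
    seen = {host}
    while True:
        parent = parents.get(path[-1])
        if parent is None or parent in seen:
            break
        path.append(parent)
        seen.add(parent)
    return path


# Hypothetical records, as they might look after extraction from a capture.
sample_requests = [
    {"time": 0.00, "url": "http://news.example/", "referer": None},
    {"time": 0.12, "url": "http://ads.example/banner.js",
     "referer": "http://news.example/"},
    {"time": 0.31, "url": "http://cdn.example/style.css",
     "referer": "http://news.example/"},
    {"time": 0.47, "url": "http://bad.example/payload",
     "referer": "http://ads.example/banner.js"},
]

parent_map = build_parent_map(sample_requests)
print(trace_to_root(parent_map, "bad.example"))
# → ['bad.example', 'ads.example', 'news.example']
```

Keeping only the first referrer per host keeps the structure a tree, which makes the trace unambiguous. A real tool would likely keep every edge and its timestamp so the graph can be drawn over time.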
Subject/keywords
Computer and Information Science