Payload-based network intrusion detection using LSTM autoencoders

Date

2020-12

Advisor

Kozat, Süleyman Serdar

Language

English

Abstract

The growing use of computer networks by vast numbers of different devices has allowed malicious entities to develop a plethora of diverse attacks targeting individuals and businesses. Defence systems must be kept up to date constantly, since new attacks emerge daily and have a wide range of characteristics. Intrusion detection is a branch of cyber-security that aims to prevent these attacks. Machine learning and deep learning approaches have gained popularity in this discipline, as they have in many others such as fraud detection and medicine. Given that network traffic usually displays normal behaviour, anomaly detection methods can pinpoint threats by identifying connections with abnormal properties. This task can be accomplished in a supervised or an unsupervised manner. Either way, constructing meaningful representations of network data is essential. In this thesis, we employ different types of feature extraction methods for computer network data, together with anomaly detection strategies that can detect malicious behaviour. For the feature extraction task, we aim to obtain vector representations of network payloads such that the core information is more accessible and irrelevant information is discarded. In our setting, the input size can vary due to the nature of computer network data. Considering this, we use feature extraction methods that map inputs of varying sizes into feature spaces of fixed dimensionality, so that machine learning approaches that would otherwise be unusable in this setting can be employed. For the anomaly detection task, we utilize both supervised and unsupervised approaches. The supervised methods make use of the aforementioned feature extraction strategies and operate on the reduced, fixed-dimensional representations of the computer network data. For the unsupervised case, we employ autoencoders that can extract information from sequential data. Recurrent neural networks (RNNs) can process sequential data of varying length. We specifically use autoencoders built on long short-term memory (LSTM) networks, a special form of RNN with a more complex structure that allows it to handle long-term dependencies in sequential data. Anomaly detection is then performed using the reconstruction error. We conduct experiments using dynamic and realistic data sets that contain various types of attacks. Finally, we evaluate the validity of our proposed approaches based on the AUC and F1 measures.
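The overall pipeline outlined in the abstract (variable-length payloads mapped to fixed-size vectors, an anomaly score per payload, evaluation with AUC and F1) can be illustrated with a minimal, dependency-free sketch. This is not the thesis implementation. The byte-frequency histogram and the distance-to-mean score are assumptions that stand in for the learned features and the LSTM autoencoder reconstruction error, and the payloads, function names, and threshold are hypothetical.

```python
from collections import Counter
from math import sqrt


def byte_histogram(payload: bytes) -> list:
    """Map a payload of any length to a fixed 256-dimensional vector
    of relative byte frequencies (assumed feature, not from the thesis)."""
    if not payload:
        return [0.0] * 256
    counts = Counter(payload)
    n = len(payload)
    return [counts.get(b, 0) / n for b in range(256)]


def mean_vector(vectors: list) -> list:
    """Element-wise mean of equally sized vectors: the 'normal' profile."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def anomaly_score(vector: list, profile: list) -> float:
    """Euclidean distance to the normal profile; a simple stand-in
    for an autoencoder's reconstruction error."""
    return sqrt(sum((v - p) ** 2 for v, p in zip(vector, profile)))


def roc_auc(scores: list, labels: list) -> float:
    """AUC as the probability that a random attack (label 1) scores
    higher than a random normal sample (label 0); ties count half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def f1_score(scores: list, labels: list, threshold: float) -> float:
    """F1 when every score above the threshold is flagged as an attack."""
    preds = [1 if s > threshold else 0 for s in scores]
    tp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 1)
    fp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 0)
    fn = sum(1 for p, y in zip(preds, labels) if p == 0 and y == 1)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical toy payloads: text-like traffic is treated as normal.
normal_train = [
    b"GET /index.html HTTP/1.1\r\nHost: example.com\r\n\r\n",
    b"GET /about.html HTTP/1.1\r\nHost: example.com\r\n\r\n",
]
test_payloads = [
    b"GET /contact.html HTTP/1.1\r\nHost: example.com\r\n\r\n",  # normal
    b"\x90" * 64 + b"\x31\xc0\x50\x68",                          # attack-like
]
labels = [0, 1]

profile = mean_vector([byte_histogram(p) for p in normal_train])
scores = [anomaly_score(byte_histogram(p), profile) for p in test_payloads]
auc = roc_auc(scores, labels)
f1 = f1_score(scores, labels, threshold=0.5)  # threshold is an arbitrary choice
print("scores:", scores, "AUC:", auc, "F1:", f1)
```

In this sketch only normal payloads are used to build the profile, mirroring the unsupervised setting in which the model learns normal traffic and flags inputs it represents poorly. AUC is computed without a threshold, while F1 requires one to be chosen.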

Degree Discipline

Electrical and Electronic Engineering

Degree Level

Master's

Degree Name

MS (Master of Science)
