Automated detection and classification of malware used in targeted attacks via machine learning
MetadataShow full item record
Please cite this item using this persistent URLhttp://hdl.handle.net/11693/29171
Targeted attacks pose a great threat to governments and commercial entities. Increasing number of targeted attacks, especially Advanced Persistent Threats, are being discovered and exposed in each year by various cyber security organizations. Key characteristics of these attacks are well-funded and skilled actors persistently targeting speci c entities, sophisticated tools and tactics, long-time presence in breached environments before detection and stealth operation. Malware plays a crucial role in a targeted attack for various tasks such as compromising systems, maintaining presence, communicating with the operators, carrying out commands, etc. Because of its stealthy nature, malware used in targeted attacks is expected to act di erent than the traditional malware when it is dynamically analyzed in a sandbox environment. In this thesis we focused on the malware used in targeted attacks and present a method to automatically detect and classify targeted malware through machine learning using behavioral and memory features. Its worth noting that it is a rst work published in the literature that classi es targeted malware and incorporates memory features into the dynamic features. The method comprises the steps of running both traditional and targeted malware in a dynamic analysis system along with a memory analysis tool, extracting features from behavioral and memory artifacts found in analysis results and employing machine learning on the extracted features. New behavioral and memory features were de ned in order to classify targeted malware more e ectively. Method is then evaluated over a dataset comprised of targeted and traditional malware with di erent supervised learning algorithms. The results show that machine learning can be employed successfully to automatically detect and classify targeted malware from dynamic analysis results using behavioral and memory features.