Transfer Learning Approach for Botnet Detection Based on Recurrent Variational Autoencoder
Published Web Location
https://sdm.lbl.gov/oapapers/snta20-kim.pdf
Abstract
Machine Learning (ML) methods have been widely used in Intrusion Detection Systems (IDS). In particular, many botnet detection methods are based on ML. However, due to the fast-evolving nature of network security threats, it is necessary to frequently retrain the ML tools with up-to-date data. This is hard to do in practice because data labeling takes a long time and requires a lot of effort, making it difficult to generate training data. We propose transfer learning as a more effective approach for botnet detection, as it can learn from well-curated source data and transfer the knowledge to a target problem domain not seen before. We devise an approach that is effective regardless of whether the data from the target domain is labeled. More specifically, we train a neural network with the Recurrent Variational Autoencoder (RVAE) structure on the source data, and use the RVAE to compute anomaly scores for data records from the target domain. In an evaluation of this transfer learning framework, we use the CTU-13 dataset as the source domain and a fresh set of network monitoring data as the target domain. Tests show that the proposed transfer learning method detects botnets better than a semi-supervised learning method trained on the target domain data. The area under the Receiver Operating Characteristic (ROC) curve is 0.810 for transfer learning, and 0.779 for directly using RVAE on the target domain data.
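The evaluation flow described in the abstract (fit a model on source-domain data, score target-domain records, then measure the area under the ROC curve) can be illustrated with a minimal Python sketch. It does not implement the RVAE: a simple per-feature Gaussian scorer stands in for it, and the function names (`fit_source_model`, `anomaly_score`, `roc_auc`) and the toy records are assumptions made purely for illustration, not data or code from the paper.

```python
from statistics import mean, pstdev


def fit_source_model(source_rows):
    """Fit per-feature mean and standard deviation on source-domain records.

    Stand-in for training the RVAE; NOT the model used in the paper.
    """
    columns = list(zip(*source_rows))
    means = [mean(c) for c in columns]
    stds = [pstdev(c) or 1.0 for c in columns]  # guard against zero variance
    return means, stds


def anomaly_score(model, row):
    """Sum of squared z-scores: larger means further from the source data."""
    means, stds = model
    return sum(((x - m) / s) ** 2 for x, m, s in zip(row, means, stds))


def roc_auc(scores, labels):
    """Area under the ROC curve, computed as the probability that a
    positive (label 1) record outscores a negative one; ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one record of each class")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: two made-up numeric features per record.
source_rows = [[1.0, 10.0], [1.2, 11.0], [0.8, 9.0], [1.0, 10.0]]
target_rows = [[1.1, 10.5], [0.9, 9.5], [4.0, 30.0], [3.0, 2.0]]
target_labels = [0, 0, 1, 1]  # 1 marks a botnet record; used only for evaluation

model = fit_source_model(source_rows)                     # learn from source domain
scores = [anomaly_score(model, r) for r in target_rows]   # score target domain
auc = roc_auc(scores, target_labels)
print(f"AUC on toy target data: {auc:.3f}")
```

Note that the target labels are used only to compute the AUC, never to fit the scorer, which mirrors the claim that the approach works whether or not the target data is labeled.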