Malware detection and classification using recurrent neural networks.

dc.contributor.author: Al Rahal Al Orabi, Wael Mohammad
dc.contributor.department: Department of Computer Science
dc.contributor.faculty: Faculty of Arts and Sciences
dc.contributor.institution: American University of Beirut
dc.date: 2018
dc.date.accessioned: 2020-03-28T11:50:09Z
dc.date.available: 2020-02
dc.date.available: 2020-03-28T11:50:09Z
dc.date.issued: 2018
dc.date.submitted: 2018
dc.description: Thesis. M.S. American University of Beirut. Department of Computer Science, 2018. T:6937.
dc.description: Advisor : Dr. Haidar Safa, Professor, Computer Science ; Members of Committee : Dr. Wassim El Hajj, Associate Professor, Chairperson, Computer Science ; Dr. Mohamed Nassar, Assistant Professor, Computer Science.
dc.description: Includes bibliographical references (leaves 86-92)
dc.description.abstract: Malware detection and classification is becoming one of the most active areas of research because the number of malware samples keeps increasing, which raises many security questions and concerns. For example, ransomware is a type of malware that has recently targeted large companies and infected many computing systems. Over the years, researchers have focused on automating the detection of malware in computing systems by designing approaches that rely on data mining and machine learning methodologies. These approaches have proved effective, achieving strong results in terms of accuracy. However, one of their limitations is that they are still considered shallow models compared to deep learning. Deep learning technologies rely on more complex computational architectures, which need more data: as the computational complexity of a model increases, a larger dataset is required to train, build, and validate it. To remedy the limitations of those shallow approaches, in this thesis we propose an automated solution for malware detection and classification in binary executable sequences based on deep learning. We define a new malware language designed around the concepts of a vocabulary, documents, and words. Each malware assembly instance is a document, and each assembly action in the malware document is a word. Consequently, a malware vocabulary is defined as a set of malware documents. This language design is used to extract features from executable binary sequences. We develop a hybrid classification model that consists of two main components: a feature extraction component and a classification component. The feature extraction component is based on the predefined malware language. For the classification component we consider several architectures: Long Short-Term Memory (LSTM), Gated Recurrent Units (GRU), one-dimensional Convolutional Neural Networks (1D-CNN), and a hybrid architecture that consists of 1D-CNN and LSTM.
We validated our models empirically by running a set of experiments on Micro
dc.format.extent: 1 online resource (xii, 92 leaves) : color illustrations
dc.identifier.other: b23273562
dc.identifier.uri: http://hdl.handle.net/10938/21717
dc.language.iso: en
dc.subject.classification: T:006937
dc.subject.lcsh: Neural networks (Computer science)
dc.subject.lcsh: Computer crimes -- Prevention.
dc.subject.lcsh: Hackers.
dc.subject.lcsh: Machine learning.
dc.subject.lcsh: Computer security.
dc.title: Malware detection and classification using recurrent neural networks.
dc.type: Thesis
