A review on machine learning–based approaches for Internet traffic classification
Publisher
Springer Science and Business Media Deutschland GmbH
Abstract
Traffic classification attracted the interest of the Internet community early on. Different approaches have been proposed to classify Internet traffic in order to manage both security and Quality of Service (QoS). However, traditional classification approaches that modify the Transmission Control Protocol/Internet Protocol (TCP/IP) scheme have not been adopted because they are complex to manage. In addition, port-based methods and deep packet inspection have limitations in dealing with new traffic characteristics (e.g., dynamic port allocation, tunneling, encryption). Conversely, machine learning (ML) solutions effectively classify traffic down to the device type and specific user action. Another research direction aims to anonymize Internet traffic and thwart classification to maintain user privacy. Existing traffic surveys focus on classification and do not consider anonymization. Here, we review Internet traffic classification and obfuscation techniques, focusing largely on ML-based solutions. In addition, this paper presents a comprehensive review of various data representation methods and of the different objectives of Internet traffic classification. Finally, we present the key findings, limitations, and recommendations for future research. © 2020, Institut Mines-Télécom and Springer Nature Switzerland AG.
Keywords
Classification, Data representation, Internet traffic, Machine learning, Obfuscation, Survey, Cryptography, Traffic surveys, Transmission control protocol, Classification approach, Data representations, Deep packet inspection, Internet communities, Internet traffic classifications, Traffic characteristics, Traffic classification, Transmission control protocol/internet protocols, Quality of service