Abstract:
Traffic classification is a key network function for managing both Quality of Service (QoS) and security. While some traffic classification applications (e.g., QoS-based path allocation) can tolerate delays, others (e.g., attack detection) are time-critical. In this context, early traffic classification based on the first few packets of a flow has been proposed. However, the number of packets to inspect is method-dependent and is usually chosen empirically, without considering the information carried by the features of these packets. In this paper, we aim to identify the sufficient number of packets, N, that guarantees high classification accuracy while optimizing the response time, based on both empirical classification results and information theory. We propose a confidence measure based on the variations in the model training accuracy and on the average mutual information between the packets’ features and the label vector. This measure is then used to define the value of N that optimizes the trade-off between time overhead and classification accuracy. In addition, we propose an ensemble Deep Learning (DL)-based classifier that enhances classification accuracy by training successive DL models on the traffic stream. The output of the proposed ensemble method is the average of the individual classifiers' predictions. The experimental results show that the proposed confidence measure achieves good classification accuracy at an early phase of the flow, and that the proposed ensemble method further improves early classification accuracy. Consequently, combining the ensemble method with the confidence measure criterion strikes a good balance between high accuracy and fast response time. © 2021 Elsevier B.V.
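As an illustration of the averaging step mentioned in the abstract, the following minimal Python sketch combines the per-class probability vectors of several classifiers by taking their mean and then selecting the most probable class. It is a sketch under stated assumptions, not the authors' implementation: the function names `average_ensemble` and `predict_label`, the three-class setup, and the example probabilities are all hypothetical, and the abstract does not specify whether the averaged quantities are probabilities or some other prediction score.

```python
def average_ensemble(prob_vectors):
    """Average per-class scores from several classifiers.

    Assumption: each classifier outputs one probability per class,
    and all classifiers use the same class ordering.
    """
    if not prob_vectors:
        raise ValueError("need at least one classifier output")
    n_classes = len(prob_vectors[0])
    if any(len(p) != n_classes for p in prob_vectors):
        raise ValueError("all classifiers must output the same number of classes")
    n_models = len(prob_vectors)
    # Mean of the individual predictions, computed class by class.
    return [sum(p[c] for p in prob_vectors) / n_models for c in range(n_classes)]


def predict_label(prob_vectors):
    """Return the index of the class with the highest averaged score."""
    avg = average_ensemble(prob_vectors)
    return max(range(len(avg)), key=avg.__getitem__)


# Hypothetical outputs of three successive models for a single flow.
outputs = [
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.1, 0.7, 0.2],
]
averaged = average_ensemble(outputs)
label = predict_label(outputs)
```

In this made-up example the first model alone would pick class 0, while the averaged scores favor class 1, which shows how averaging lets later models correct an early prediction.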