Abstract:
Network Address Translation (NAT) is present in many routers and Customer Premises
Equipment (CPE) devices. It is used to distribute internet access to several local hosts. Most
NAT devices implement Port Address Translation (PAT), which allows mapping multiple
private IP addresses to a single public IP address. The private network behind a NAT
is thus hidden from the public internet, and only a single outward-facing IP address is
visible to Internet Service Providers (ISPs). With the proliferation of unauthorized wired
and wireless NAT routers, internet subscribers can redistribute an internet connection or
deploy hidden devices, causing a problem known as shadow IT.
It is therefore in ISPs’ interest to know how their services are used. This study
proposes a method to detect NAT devices and to estimate the size of the network (number of
hosts) hidden behind them. A supervised Machine Learning (ML) algorithm that uses
aggregated network traffic flow features is proposed to detect NAT devices. Traffic
features are aggregated within multiple window sizes to study the effect of feature
aggregation on NAT detection. Host counting is likewise performed with a machine
learning approach applied to features of real network traffic. This research demonstrates that
eXtreme Gradient Boosting (XGBoost) performs best in NAT detection and hidden
network size detection, whereas the Random Forest (RF) classifier predicts the exact
number of hidden hosts better than any other algorithm evaluated. The XGBoost NAT
detection model detects NAT devices with an F1 score of 97.09%, which significantly
outperforms many state-of-the-art methods. The exact host counting model achieved an
F1 score of 65.53%, which increased to 90.63% after the problem was recast as
a binary one. Most previous methods focused on achieving a high detection rate on given
datasets rather than on the model’s generalizability. This thesis instead focuses
on the performance of the detection algorithms when the network data is
subjected to intentional obfuscation or when the network environment changes. The
performance of the detection models dropped below 70% when they were tested in a new
network environment. This thesis also focuses on interpreting the behavior of the
complex algorithms to enhance trust in the results, understand their generalizability, and
explain the importance of feature aggregation for NAT detection. Two eXplainable Artificial
Intelligence (XAI) methods are used to analyze how well a given feature set generalizes
to different network environments and to traffic altered by obfuscation techniques. These
methods are also used to study the sensitivity of the detection algorithms to the extracted
aggregated feature set. Finally, this study uses transfer learning to build an optimized model
that can adapt to feature changes in the network traffic data.