Contemporary Network Intrusion Detection Systems (NIDS) typically incorporate artificial intelligence techniques, i.e. machine learning and deep learning. Training such systems requires representative and commonly accepted network-based datasets. Such datasets are rare, however, and frequently lack the required volume of traffic and sufficient attack diversity. A fundamental cause of this scarcity is that reliable datasets often cannot be published due to privacy concerns. As a result, most publicly available datasets either do not reflect current attack techniques or have been anonymized, resulting in missing or incorrect metadata. Training an IDS on such suboptimal data is generally not very effective.
NeDaGen (Network traffic Dataset Generator) is a flexible and extensible tool that generates labelled network traffic datasets by simulating a real infrastructure and real attacks. It builds user-defined networks and simulates both benign and malicious network traffic, producing a labelled dataset that can support the evaluation of a Network-based Intrusion Detection System (NIDS).