This repository presents a novel approach for generating realistic network traffic using Large Language Models (LLMs), specifically OpenAI’s GPT-4.1 and GPT-5.
Our method, called the Large Language Model Network Traffic Generator (LLM-NTG), aims to bridge the gap between realistic traffic generation and the expressive capabilities of LLMs.
We employ a few-shot learning framework combined with a human-in-the-loop feedback mechanism, where generated traffic is continuously evaluated and refined.
Contains datasets required for traffic generation:
- One-way and two-way traffic datasets.
- Sample inputs in
.json,.pcapng, and.csvformats for traffic generation.
Experiments performed with GPT-4.1.
This folder has two subfolders: Experiment_1/ and Experiment_2/.
Each experiment includes:
Exp1_Sample_Packets_Extraction.ipynb
Extracts traffic data from the dataset and prepares sample packets for generation.Exp1_Traffic_Generation_gpt4.1.ipynb
Generates synthetic traffic and saves the output as.jsonfiles.Exp1_Statistics_of_Generated_Traffics_gpt4.1.ipynb
Computes statistics of the generated traffic.
(Experiment_2/ follows the same structure.)
Experiments performed with GPT-5.
This folder also has Experiment_1/ and Experiment_2/.
Since the same input samples are used as in GPT-4.1, there is no extraction notebook here.
Each experiment includes:
Exp1_Traffic_Generation_gpt5.ipynb
Generates synthetic traffic with GPT-5.Exp1_Statistics_of_Generated_Traffics_gpt5.ipynb
Computes statistics of the generated traffic.
(Experiment_2/ follows the same structure.)
Contains the generated traffic outputs in .json format and .pcap files.
- Each experiment’s results are stored here.
- The
JSON_files/subfolder contains details such as: .json format of generated traffic - The
PCAP_files/subfolder contains details such as: There are .pcap files of the generated traffi,c and Wireshark can be used to view them. - The
Results/subfolder contains details such as:- Token usage
- Computation time
A transformation script that converts generated .json traffic files into .pcap format, enabling further analysis with standard network traffic tools (e.g., Wireshark).
Wireshark representation of traffic generated by GPT 4.1 and 5 for Experiment~1.
- Explore the datasets under
Datasets/. - Run the notebooks in
GPT-4.1/orGPT-5/for traffic generation. - Generated traffic will be saved under
Generated_Traffic/. - Optionally, convert
.jsontraffic files into.pcapformat using:python pcap_converter.py input.json output.pcap