A lightweight, machine learning-enhanced system for detecting DNS tunneling attacks through statistical traffic analysis.
This project demonstrates detection of DNS tunneling attacks using both rule-based and machine learning approaches. The system analyzes DNS traffic patterns to identify covert data exfiltration attempts that bypass traditional security controls.
Key Achievements:
- 99.6% detection rate (rule-based) on the most conspicuous tunnel, dnscat2
- 100% accuracy (ML classifier) across all traffic types on the held-out test set
- Analyzed 1,389 DNS queries from 4 traffic types
- Lightweight solution using only open-source tools
| Tool | Type | Queries Captured | Detection (Rule-Based) | Detection (ML) |
|---|---|---|---|---|
| iodine | IP-over-DNS | 5 | 20.0% | 100% |
| dnscat2 | Encrypted C2 | 530 | 99.6% | 100% |
| dns2tcp | TCP-over-DNS | 471 | 17.8% | 100% |
| Metric | Normal | Iodine | Dnscat2 | Dns2tcp |
|---|---|---|---|---|
| Total Queries | 383 | 5 | 530 | 471 |
| Avg Subdomain Length | 6.71 | 14.40 | 36.46 | 18.68 |
| Avg Entropy | 1.92 | 2.78 | 3.62 | 3.22 |
| Query Frequency | 0.27/s | 0.06/s | 2.25/s | 2.62/s |
Query frequency emerged as the strongest ML feature (importance: 1.0000), which suggests that, in this dataset, temporal patterns are more discriminative than content-based features.
# Python 3.8 or higher
python3 --version
# Install dependencies
pip3 install -r requirements.txt

- Prepare PCAP files (place in the `captures/` directory):
  - `baseline_normal_traffic.pcap`
  - `iodine_tunnel_traffic.pcap`
  - `dnscat2_tunnel_traffic.pcap`
  - `dns2tcp_tunnel_traffic.pcap`
- Run the detection system: `python3 dns_detector.py`
- View results:
  - Statistical analysis printed to console
  - Graphs saved to `results/graphs/`
  - Data exported to `results/data/`
  - Decision tree visualization generated
Visualizations (`results/graphs/`):
- `subdomain_length_comparison.png` - Box plot comparison
- `entropy_comparison.png` - Entropy distribution
- `length_vs_entropy_scatter.png` - 2D feature space
- `detection_summary.png` - Detection rate comparison
- `decision_tree.png` - ML classifier visualization

Data Files (`results/data/`):
- `baseline_analysis.csv`
- `iodine_analysis.csv`
- `dnscat2_analysis.csv`
- `dns2tcp_analysis.csv`
- SEED Ubuntu 20.04 (10.0.2.15) - DNS Server / Target
- Kali Linux (10.0.2.7) - Attacker / Client
- Isolated NAT network for controlled testing
1. Feature Extraction
- Subdomain length (characters)
- Shannon entropy (randomness measure)
- Query frequency (queries/second)
- Character distribution patterns
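The features listed above can be sketched in a few lines of standard-library Python. This is an illustrative sketch, not the code in `dns_detector.py`: the function names, the `base_labels` parameter, and the choice to compute frequency as query count divided by capture duration are all assumptions.

```python
import math
from collections import Counter
from typing import List


def shannon_entropy(text: str) -> float:
    """Shannon entropy of the character distribution, in bits per character."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())


def extract_subdomain(qname: str, base_labels: int = 2) -> str:
    """Return the labels left of the base domain.

    Assumes the base domain is the last `base_labels` labels
    (e.g. "tunnel.test"); this is a simplification.
    """
    labels = qname.rstrip(".").split(".")
    if len(labels) <= base_labels:
        return ""
    return ".".join(labels[:-base_labels])


def query_frequency(timestamps: List[float]) -> float:
    """Queries per second over the capture window (count / duration)."""
    if len(timestamps) < 2:
        return 0.0
    span = max(timestamps) - min(timestamps)
    return len(timestamps) / span if span > 0 else 0.0


subdomain = extract_subdomain("aGVsbG8.tunnel.test")
features = {
    "subdomain_length": len(subdomain),
    "entropy": shannon_entropy(subdomain),
    "query_frequency": query_frequency([0.0, 1.0, 2.0, 4.0]),
}
```

Entropy here is measured over the characters of a single subdomain, so short labels such as `www` score near zero while encoded payloads score higher.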
2. Rule-Based Detection
Flagged as suspicious if:
- Subdomain length > 20 characters
- Shannon entropy > 3.5
- Query frequency > 10 queries/second
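A minimal sketch of these rules follows. The README does not say whether the conditions are combined with AND or OR; the sketch assumes that exceeding any single threshold is enough to flag a query. The function and constant names are hypothetical.

```python
LENGTH_THRESHOLD = 20     # characters
ENTROPY_THRESHOLD = 3.5   # bits per character
FREQUENCY_THRESHOLD = 10  # queries per second


def is_suspicious(subdomain_length: float, entropy: float, query_frequency: float) -> bool:
    """Flag a query if ANY threshold is exceeded (OR combination is an assumption)."""
    return (
        subdomain_length > LENGTH_THRESHOLD
        or entropy > ENTROPY_THRESHOLD
        or query_frequency > FREQUENCY_THRESHOLD
    )


# Per-class averages taken from the statistics table in this README.
class_averages = {
    "Normal": (6.71, 1.92, 0.27),
    "Iodine": (14.40, 2.78, 0.06),
    "Dnscat2": (36.46, 3.62, 2.25),
    "Dns2tcp": (18.68, 3.22, 2.62),
}
flags = {name: is_suspicious(*values) for name, values in class_averages.items()}
```

Applied to the class averages, only dnscat2 crosses a threshold, which is consistent with the low rule-based detection rates reported for iodine and dns2tcp. Individual queries vary around these averages, so this is an illustration rather than a reproduction of the reported rates.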
3. Machine Learning Classification
- Algorithm: Decision Tree (max depth: 4)
- Training: 1,111 samples (80%)
- Testing: 278 samples (20%)
- Features: subdomain_length, entropy, query_frequency
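The configuration above could be expressed with scikit-learn roughly as follows. This is a sketch under stated assumptions: the README does not name the ML library, the rows are synthetic placeholders generated around the per-class averages (not the project's data), and `stratify`, `random_state`, and the noise levels are illustrative choices.

```python
import random

from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

random.seed(0)

# Class sizes and feature averages from this README; the samples are synthetic.
CLASS_SIZES = {"Normal": 383, "Iodine": 5, "Dnscat2": 530, "Dns2tcp": 471}
CLASS_CENTERS = {  # (subdomain_length, entropy, query_frequency)
    "Normal": (6.71, 1.92, 0.27),
    "Iodine": (14.40, 2.78, 0.06),
    "Dnscat2": (36.46, 3.62, 2.25),
    "Dns2tcp": (18.68, 3.22, 2.62),
}

X, y = [], []
for label, size in CLASS_SIZES.items():
    length, entropy, freq = CLASS_CENTERS[label]
    for _ in range(size):
        X.append([
            random.gauss(length, 2.0),
            random.gauss(entropy, 0.3),
            random.gauss(freq, 0.05),
        ])
        y.append(label)

# 80/20 split; stratification keeps the tiny iodine class in both sets (assumption).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

clf = DecisionTreeClassifier(max_depth=4, random_state=42)
clf.fit(X_train, y_train)

accuracy = clf.score(X_test, y_test)
feature_names = ["subdomain_length", "entropy", "query_frequency"]
importances = dict(zip(feature_names, clf.feature_importances_))
```

With 1,389 rows, a 20% test split yields 1,111 training and 278 test samples, matching the figures above. The accuracy and importances from this synthetic run say nothing about the real captures.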
| Traffic Type | Rule-Based | ML Classifier | Improvement |
|---|---|---|---|
| Normal | 2.9% false positives | 100% correct | No false positives |
| Iodine | 20.0% detected | 100% detected | +80.0 points |
| Dnscat2 | 99.6% detected | 100% detected | +0.4 points |
| Dns2tcp | 17.8% detected | 100% detected | +82.2 points |
| Actual \ Predicted | Dns2tcp | Dnscat2 | Iodine | Normal |
|---|---|---|---|---|
| Dns2tcp | 94 | 0 | 0 | 0 |
| Dnscat2 | 0 | 106 | 0 | 0 |
| Iodine | 0 | 0 | 1 | 0 |
| Normal | 0 | 0 | 0 | 77 |
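The confusion matrix above can be checked with a few lines of arithmetic. The sketch below hard-codes the matrix from this README and derives overall accuracy, per-class recall, and the number of test samples per class; the variable names are illustrative.

```python
LABELS = ["Dns2tcp", "Dnscat2", "Iodine", "Normal"]

# Rows are actual classes, columns are predicted classes.
MATRIX = [
    [94, 0, 0, 0],
    [0, 106, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 77],
]

total = sum(sum(row) for row in MATRIX)
correct = sum(MATRIX[i][i] for i in range(len(LABELS)))
accuracy = correct / total

# Support: how many test samples each class contributed.
support = {label: sum(MATRIX[i]) for i, label in enumerate(LABELS)}
# Recall: share of each actual class that was predicted correctly.
recall = {label: MATRIX[i][i] / support[label] for i, label in enumerate(LABELS)}

print(total, accuracy)  # 278 1.0
```

The 278 samples match the 20% test split. Note that iodine contributes a single test query, so its 100% recall rests on one prediction.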
- Tool Detectability Varies:
  - Dnscat2 is easily detected (long subdomains, high entropy)
  - Dns2tcp and iodine are stealthy (efficient encoding)
- Query Frequency Matters Most:
  - ML identified temporal patterns as the strongest signal
  - Content features (length, entropy) were less important than expected
- ML Superiority:
  - Dramatically improves detection of stealthy tools
  - Reduces false positives on legitimate traffic
- Accessible Detection:
  - No expensive commercial solutions needed
  - Standard Python libraries sufficient
  - Can run on modest hardware
- Small dataset: 1,389 queries total (larger dataset needed for production)
- Lab environment: Needs validation on real-world networks
- Potential overfitting: 100% accuracy suggests model may be overfit
- Limited tools: Only tested 3 tunneling implementations
- Evasion possible: Attackers can slow traffic to evade frequency-based detection
This project was completed as a graduate research project for CMSC 5323 - Network Security at the University of Central Oklahoma.
Complete documentation available in the academic report (not included in repository for privacy).
dns-tunnel-project/
├── dns_detector.py      # Main detection script
├── requirements.txt     # Python dependencies
├── config/
│   └── dns2tcpd.conf    # DNS2TCP configuration
├── results/
│   ├── graphs/          # Visualization outputs
│   └── data/            # CSV analysis files
├── captures/            # PCAP files (not in repo)
├── dnscat2/             # Dnscat2 source code
└── docs/                # Documentation
listen = 0.0.0.0
port = 53
domain = tunnel.test
resources = ssh:127.0.0.1:22
- Test additional tunneling protocols
- Validate on diverse real-world networks
- Develop real-time detection capability
- Deploy in production environment
- Integrate with SIEM systems (Splunk, ELK)
- Expand dataset for better ML generalization
This project is licensed under the MIT License - see the LICENSE file for details.
- Dr. Grace Park - Project Advisor
- University of Central Oklahoma - Research Support
- SEED Labs - Virtual machine images
- iodine, dnscat2, dns2tcp - Open-source tunneling tools
This tool is for educational and research purposes only.
Use responsibly and only on networks you own or have explicit permission to test. Unauthorized use of DNS tunneling tools or detection systems may violate laws and regulations.
For questions about this research, please open an issue on GitHub.
Made for network security research