Maple Dataset
The Maple Dataset is an intrusion detection evaluation dataset aimed at enhancing the performance and reliability of anomaly-based intrusion detection systems (IDS) and intrusion prevention systems (IPS). With the increasing sophistication of network attacks, having a reliable and up-to-date dataset is crucial for testing and validating IDS and IPS solutions.
Download Dataset
Datasets included:
- DDoS: HTTP (Plain/gzip/random), TCP, UDP, ReCOIL, LOIC
- DNS: DoH, DoQ, DoT (coming soon)
- ICMP: Normal ICMP, Smuggled ICMP
- MySQL: CVE-2012-2122
- Nginx: CVE-2017-7529
- OpenSSL: CVE-2022-0778, HeartBleed, Normal traffic
- Windows OS: Windows 10 provision, Windows Update
- VPN: Cisco AnyConnect, DNS Leak, Trojan traffic (coming soon)
How to use
Directly using the CSV file
- The CSV file contains the packet data in exactly the CIC-IDS format
- Just change the *.csv dataset file name in your Python code
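As a minimal sketch of that drop-in use, the snippet below counts flows per label with only the Python standard library. It assumes a `Label` column as in CIC-IDS 2017; the sample rows and label values are illustrative and not taken from the Maple files. Header names are stripped because some CIC-IDS 2017 CSV files pad column names with a leading space.

```python
import csv
import io
from collections import Counter


def count_labels(csv_file, label_column="Label"):
    """Count flows per label in a CIC-IDS style CSV.

    csv_file is any open text file object. The label column name is an
    assumption based on the CIC-IDS 2017 layout.
    """
    reader = csv.reader(csv_file)
    # Strip padding so " Label" and "Label" are treated the same.
    header = [name.strip() for name in next(reader)]
    label_index = header.index(label_column)
    counts = Counter()
    for row in reader:
        if len(row) > label_index:  # skip blank or truncated rows
            counts[row[label_index].strip()] += 1
    return counts


# Tiny in-memory sample standing in for a real dataset file (illustrative only).
sample = io.StringIO(
    "Flow Duration, Total Fwd Packets, Label\n"
    "1200,4,BENIGN\n"
    "87,120,DDoS\n"
    "91,133,DDoS\n"
)
label_counts = count_labels(sample)
print(label_counts)
```

To run it on a real file, replace the in-memory sample with something like `open("your_file.csv", newline="")`, where the file name is whichever dataset CSV you downloaded.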
Manually export CSV file with custom columns
- Prepare raw pcap/pcapng file
- Open with CICFlowMeter (https://github.com/ahlashkari/CICFlowMeter)
- Export the CSV file with the columns you need
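Before opening a capture in CICFlowMeter, it can help to confirm whether the file is classic pcap or pcapng. The sketch below checks the first four bytes against the well-known magic numbers of the two formats; the function names are hypothetical helpers, not part of any Maple tool.

```python
# Magic numbers of the classic pcap file header, in both byte orders.
PCAP_MAGICS = {
    b"\xa1\xb2\xc3\xd4",  # microsecond timestamps, big-endian
    b"\xd4\xc3\xb2\xa1",  # microsecond timestamps, little-endian
    b"\xa1\xb2\x3c\x4d",  # nanosecond timestamps, big-endian
    b"\x4d\x3c\xb2\xa1",  # nanosecond timestamps, little-endian
}
# Block type of the pcapng Section Header Block that starts every pcapng file.
PCAPNG_MAGIC = b"\x0a\x0d\x0d\x0a"


def detect_capture_format(first_bytes):
    """Return "pcap", "pcapng", or "unknown" from the start of a file."""
    magic = first_bytes[:4]
    if magic in PCAP_MAGICS:
        return "pcap"
    if magic == PCAPNG_MAGIC:
        return "pcapng"
    return "unknown"


def detect_file_format(path):
    """Read the first four bytes of the file at `path` and classify it."""
    with open(path, "rb") as capture:
        return detect_capture_format(capture.read(4))


print(detect_capture_format(b"\xd4\xc3\xb2\xa1\x02\x00\x04\x00"))
```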
Background
Traditional evaluation datasets have shown inconsistencies and unreliability, mainly due to outdated content, lack of traffic diversity, insufficient attack variety, anonymized packet payload data, and inadequate feature sets and metadata. The Maple Dataset addresses these challenges by providing a comprehensive and contemporary dataset for intrusion detection research.
Compatible with Your Previous Work on CIC-IDS Dataset
The Maple Dataset is compatible with your previous work on the CIC-IDS 2017 dataset. It offers more comprehensive and more diverse traffic, so it fits directly into pipelines built for CIC-IDS 2017. You can use CICFlowMeter to generate the CSV files the same way you did before. No additional code or work is needed.
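One way to confirm compatibility with an existing pipeline is to compare the header of a Maple CSV against the header of a CIC-IDS 2017 CSV that your code already reads. The following is a minimal sketch; `compare_headers` is a hypothetical helper, and the column names in the sample are illustrative.

```python
import csv
import io


def normalized_header(csv_file):
    """Return the CSV header with surrounding whitespace removed from each name."""
    return [name.strip() for name in next(csv.reader(csv_file))]


def compare_headers(reference_file, candidate_file):
    """Return (missing, extra) column names of the candidate relative to the reference."""
    reference = normalized_header(reference_file)
    candidate = normalized_header(candidate_file)
    missing = [name for name in reference if name not in candidate]
    extra = [name for name in candidate if name not in reference]
    return missing, extra


# In-memory stand-ins for a CIC-IDS 2017 file and a Maple file (illustrative only).
cic_ids = io.StringIO("Flow Duration, Total Fwd Packets, Label\n")
maple = io.StringIO("Flow Duration,Total Fwd Packets,Label\n")
missing, extra = compare_headers(cic_ids, maple)
print("missing:", missing, "extra:", extra)
```

If both lists are empty, code that selects columns by name should work unchanged, although column order is not checked here.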
Dataset Category Overview (What’s inside?)
- Content: The dataset contains recent common attacks, resembling real-world network traffic (PCAP/PCAPNGs).
- Traffic Analysis: Results of network traffic analysis using CICFlowMeter with labeled flows based on timestamps, source and destination IPs, ports, protocols, and attack types are included in CSV files.
- DDoS Attacks: The dataset includes DDoS attacks, which are common in real-world network traffic. Randomized content makes the dataset more diverse. GET, POST, HEAD, and OPTIONS, the most common HTTP methods, are covered.
- N-day Vulnerabilities: The dataset includes n-day vulnerabilities, such as HeartBleed.
- More scenarios: Netflow in IoT devices, DNS tunneling, and more.
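The labeling described above (by timestamp, source and destination IP, port, protocol, and attack type) can be pictured as matching each flow against a list of known attack windows. The sketch below illustrates that idea only; it is not the labeling code used to build the dataset, and the `AttackWindow` type, field names, label strings, and addresses are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class AttackWindow:
    """One attack run: who attacked whom, when, and over which protocol."""
    label: str
    start: datetime
    end: datetime
    attacker_ip: str
    victim_ip: str
    victim_port: int
    protocol: int  # IANA protocol number: 6 = TCP, 17 = UDP


def label_flow(flow, windows, default="BENIGN"):
    """Return the label of the first window matching the flow, else the default."""
    for window in windows:
        if (
            window.start <= flow["timestamp"] <= window.end
            and flow["src_ip"] == window.attacker_ip
            and flow["dst_ip"] == window.victim_ip
            and flow["dst_port"] == window.victim_port
            and flow["protocol"] == window.protocol
        ):
            return window.label
    return default


# Hypothetical attack window using documentation-range addresses.
windows = [
    AttackWindow(
        label="DDoS-HTTP",
        start=datetime(2023, 1, 1, 10, 0, 0),
        end=datetime(2023, 1, 1, 10, 30, 0),
        attacker_ip="192.0.2.10",
        victim_ip="192.0.2.80",
        victim_port=80,
        protocol=6,
    )
]
attack_flow = {
    "timestamp": datetime(2023, 1, 1, 10, 5, 0),
    "src_ip": "192.0.2.10",
    "dst_ip": "192.0.2.80",
    "dst_port": 80,
    "protocol": 6,
}
# Same endpoints, but one hour after the attack window closed.
late_flow = dict(attack_flow, timestamp=datetime(2023, 1, 1, 11, 5, 0))
print(label_flow(attack_flow, windows), label_flow(late_flow, windows))
```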
More Features Coming Soon
- DPDK, PF_RING support
- More attacks and vulnerabilities
- More metadata for each flow
Please send us feedback if you have any questions or suggestions.
Data Generation
We profile the traffic according to the modes and patterns we have observed in real-world network traffic by mirroring the data flow. The dataset models the abstract behavior of users over HTTP and HTTPS with SM3/SM4 (People’s Republic of China), GOST (Russian Federation), and more. Modern protocols and technologies such as SSH, RESTful APIs, gRPC, and WASM, in a variety of implementations, were also constructed for this dataset.
Middleware and Tools Available
We developed and used a number of tools while creating the dataset. They are open source and available to download.
Tool | Description | Link |
---|---|---|
pcap2para | Extract HTTP payloads from pcap files | maple-nefu/pcap2para |
AnyConnect-Server | Script to generate SSLVPN encrypted traffic | maple-nefu/AnyConnect-Server |
ws-traffic-analyze-kit | High-performance traffic analysis toolkit written in Rust | maple-nefu/ws-traffic-analyze-kit |
OracleHTTPServer | Oracle HTTP Server on Docker | maple-nefu/OracleHTTPServer |
More coming soon | … | … |
Cite Us!
Q. Li, B. Wang, X. Wen, Y. Chen, Cybersecurity situational awareness framework based on ResNet modeling
Contact Us
If you have any questions or need assistance, please feel free to contact us:
- Email: maple@nefu.edu.cn
- GitHub: github.com/maple-nefu
- QQ Group: 631300176
- Telegram: @maple_dataset
- Discord: Maple Dataset