AnnoCTR: A Dataset for Detecting and Linking Entities, Tactics, and Techniques in Cyber Threat Reports
Authors:
Lukas Lange,
Marc Müller,
Ghazaleh Haratinezhad Torbati,
Dragan Milchevski,
Patrick Grau,
Subhash Pujari,
Annemarie Friedrich
Abstract:
Monitoring the threat landscape to be aware of actual or potential attacks is of utmost importance to cybersecurity professionals. Information about cyber threats is typically distributed using natural language reports. Natural language processing can help with managing this large amount of unstructured information, yet to date, the topic has received little attention. With this paper, we present AnnoCTR, a new CC-BY-SA-licensed dataset of cyber threat reports. The reports have been annotated by a domain expert with named entities, temporal expressions, and cybersecurity-specific concepts including implicitly mentioned techniques and tactics. Entities and concepts are linked to Wikipedia and the MITRE ATT&CK knowledge base, the most widely-used taxonomy for classifying types of attacks. Prior datasets linking to MITRE ATT&CK either provide a single label per document or annotate sentences out-of-context; our dataset annotates entire documents in a much finer-grained way. In an experimental study, we model the annotations of our dataset using state-of-the-art neural models. In our few-shot scenario, we find that for identifying the MITRE ATT&CK concepts that are mentioned explicitly or implicitly in a text, concept descriptions from MITRE ATT&CK are an effective source for training data augmentation.
Submitted 11 April, 2024;
originally announced April 2024.
The GRIFFIN Perception Dataset: Bridging the Gap Between Flapping-Wing Flight and Robotic Perception
Authors:
J. P. Rodríguez-Gómez,
R. Tapia,
J. L. Paneque,
P. Grau,
A. Gómez Eguíluz,
J. R. Martínez-de Dios,
A. Ollero
Abstract:
The development of automatic perception systems and techniques for bio-inspired flapping-wing robots is severely hampered by the high technical complexity of these platforms and the installation of onboard sensors and electronics. Besides, flapping-wing robot perception suffers from high vibration levels and abrupt movements during flight, which cause motion blur and strong changes in lighting conditions. This paper presents a perception dataset for bird-scale flapping-wing robots as a tool to help alleviate the aforementioned problems. The presented data include measurements from onboard sensors widely used in aerial robotics and suitable to deal with the perception challenges of flapping-wing robots, such as an event camera, a conventional camera, and two Inertial Measurement Units (IMUs), as well as ground truth measurements from a laser tracker or a motion capture system. A total of 21 datasets of different types of flights were collected in three different scenarios (one indoor and two outdoor). To the best of the authors' knowledge, this is the first dataset for flapping-wing robot perception.
Submitted 18 February, 2021; v1 submitted 25 January, 2021;
originally announced January 2021.