Exploring the Limits of Transfer Learning with Unified Model in the Cybersecurity Domain
Authors:
Kuntal Kumar Pal,
Kazuaki Kashihara,
Ujjwala Anantheswaran,
Kirby C. Kuznia,
Siddhesh Jagtap,
Chitta Baral
Abstract:
With the increase in cybersecurity vulnerabilities of software systems, the ways to exploit them are also increasing. Besides these, malware threats, irregular network interactions, and discussions about exploits in public forums are also on the rise. To identify these threats faster, to detect potentially relevant entities in any text, and to stay aware of software vulnerabilities, automated approaches are necessary. Applying natural language processing (NLP) techniques in the cybersecurity domain can help achieve this. However, there are challenges such as the diverse nature of the texts involved in the cybersecurity domain, the lack of large-scale publicly available datasets, and the significant cost of hiring subject matter experts for annotation. One solution is to build multi-task models that can be trained jointly with limited data. In this work, we introduce a generative multi-task model, Unified Text-to-Text Cybersecurity (UTS), trained on malware reports, phishing site URLs, programming code constructs, social media data, blogs, news articles, and public forum posts. We show that UTS improves performance on some cybersecurity datasets. We also show that, with a few examples, UTS can be adapted to novel unseen tasks and the nature of data.
Submitted 20 February, 2023;
originally announced February 2023.
Census Data Mining and Data Analysis using WEKA
Authors:
Sudhir B Jagtap,
Kodge B. G
Abstract:
Data mining (also known as knowledge discovery from databases) is the process of extracting hidden, previously unknown, and potentially useful information from databases. The extracted information can be analyzed for future planning and development. In this paper, we attempt to demonstrate how one can extract local (district) level census, socio-economic, and other population-related data for knowledge discovery and analysis using the powerful data mining tool Weka.
Submitted 17 October, 2013;
originally announced October 2013.