Semantic Ranking for Automated Adversarial Technique Annotation in Security Text

Authors: Udesh Kumarasinghe, Ahmed Lekssays, Husrev Taha Sencar, Sabri Boughorbel, Charitha Elvitigala, Preslav Nakov

Abstract: We introduce a new method for extracting structured threat behaviors from threat intelligence text. Our method is based on a multi-stage ranking architecture that allows jointly optimizing for efficiency and effectiveness. Therefore, we believe this problem formulation better aligns with the real-world nature of the task considering the large number of adversary techniques and the extensive body of threat intelligence created by security analysts. Our findings show that the proposed system yields state-of-the-art performance results for this task. Results show that our method has a top-3 recall performance of 81% in identifying the relevant technique among 193 top-level techniques. Our tests also demonstrate that our system performs significantly better (+40%) than the widely used large language models when tested under a zero-shot setting.

Submitted 25 March, 2024; originally announced March 2024.

arXiv:2208.08486

EmoMent: An Emotion Annotated Mental Health Corpus from two South Asian Countries

Authors: Thushari Atapattu, Mahen Herath, Charitha Elvitigala, Piyanjali de Zoysa, Kasun Gunawardana, Menasha Thilakaratne, Kasun de Zoysa, Katrina Falkner

Abstract: People often utilise online media (e.g., Facebook, Reddit) as a platform to express their psychological distress and seek support. State-of-the-art NLP techniques demonstrate strong potential to automatically detect mental health issues from text. Research suggests that mental health issues are reflected in emotions (e.g., sadness) indicated in a person's choice of language. Therefore, we developed a novel emotion-annotated mental health corpus (EmoMent), consisting of 2802 Facebook posts (14845 sentences) extracted from two South Asian countries - Sri Lanka and India. Three clinical psychology postgraduates were involved in annotating these posts into eight categories, including 'mental illness' (e.g., depression) and emotions (e.g., 'sadness', 'anger'). EmoMent corpus achieved 'very good' inter-annotator agreement of 98.3% (i.e. % with two or more agreement) and Fleiss' Kappa of 0.82. Our RoBERTa based models achieved an F1 score of 0.76 and a macro-averaged F1 score of 0.77 for the first task (i.e. predicting a mental health condition from a post) and the second task (i.e. extent of association of relevant posts with the categories defined in our taxonomy), respectively.

Submitted 17 August, 2022; originally announced August 2022.

Comments: This work has been accepted to appear at COLING 2022 Conference

arXiv:2202.07883

CGraph: Graph Based Extensible Predictive Domain Threat Intelligence Platform

Authors: Wathsara Daluwatta, Ravindu De Silva, Sanduni Kariyawasam, Mohamed Nabeel, Charith Elvitigala, Kasun De Zoysa, Chamath Keppitiyagama

Abstract: Ability to effectively investigate indicators of compromise and associated network resources involved in cyber attacks is paramount not only to identify affected network resources but also to detect related malicious resources. Today, most of the cyber threat intelligence platforms are reactive in that they can identify attack resources only after the attack is carried out. Further, these systems have limited functionality to investigate associated network resources. In this work, we propose an extensible predictive cyber threat intelligence platform called cGraph that addresses the above limitations. cGraph is built as a graph-first system where investigators can explore network resources utilizing a graph based API. Further, cGraph provides real-time predictive capabilities based on state-of-the-art inference algorithms to predict malicious domains from network graphs with a few known malicious and benign seeds. To the best of our knowledge, cGraph is the only threat intelligence platform to do so. cGraph is extensible in that additional network resources can be added to the system transparently.

Submitted 16 February, 2022; originally announced February 2022.

Comments: threat intelligence graph investigation

arXiv:2202.07882

PhishChain: A Decentralized and Transparent System to Blacklist Phishing URLs

Authors: Shehan Edirimannage, Mohamed Nabeel, Charith Elvitigala, Chamath Keppitiyagama

Abstract: Blacklists are a widely-used Internet security mechanism to protect Internet users from financial scams, malicious web pages and other cyber attacks based on blacklisted URLs. In this demo, we introduce PhishChain, a transparent and decentralized system for blacklisting phishing URLs. At present, public/private domain blacklists, such as PhishTank, CryptoScamDB, and APWG, are maintained by a centralized authority, but operate in a crowd sourcing fashion to create a manually verified blacklist periodically. In addition to being a single point of failure, the blacklisting process utilized by such systems is not transparent. We utilize blockchain technology to support transparency and decentralization, where no single authority is controlling the blacklist and all operations are recorded in an immutable distributed ledger. Further, we design a page rank based truth discovery algorithm to assign a phishing score to each URL based on crowd sourced assessment of URLs. As an incentive for voluntary participation, we assign skill points to each user based on their participation in URL verification.

Submitted 16 February, 2022; originally announced February 2022.

Comments: phishing blockchain blocklisting

arXiv:2102.12223

Malicious and Low Credibility URLs on Twitter during the AstraZeneca COVID-19 Vaccine Development

Authors: Sameera Horawalavithana, Ravindu De Silva, Mohamed Nabeel, Charitha Elvitigala, Primal Wijesekara, Adriana Iamnitchi

Abstract: We investigate the link sharing behavior of Twitter users following the temporary halt of AstraZeneca COVID-19 vaccine development in September 2020. During this period, we show the presence of malicious and low credibility information sources shared on Twitter messages in multiple languages. The malicious URLs, often in shortened forms, are increasingly hosted in content delivery networks and shared cloud hosting infrastructures not only to improve reach but also to avoid being detected and blocked. There are potential signs of coordination to promote both malicious and low credibility URLs on Twitter. Our findings suggest the need to develop a system that monitors the low-quality URLs shared in times of crisis.

Submitted 27 May, 2021; v1 submitted 24 February, 2021; originally announced February 2021.

Comments: Published in International Conference on Social Computing, Behavioral-Cultural Modeling, & Prediction and Behavior Representation in Modeling and Simulation (SBP-BRiMS), DC, USA, 2021. This paper has won the Grand Challenge, North American Social Network Conference, 2021

arXiv:1910.12244

Investigating MMM Ponzi scheme on Bitcoin

Authors: Yazan Boshmaf, Charitha Elvitigala, Husam Al Jawaheri, Primal Wijesekera, Mashael Al Sabah

Abstract: Cybercriminals exploit cryptocurrencies to carry out illicit activities. In this paper, we focus on Ponzi schemes that operate on Bitcoin and perform an in-depth analysis of MMM, one of the oldest and most popular Ponzi schemes. Based on 423K transactions involving 16K addresses, we show that: (1) Starting Sep 2014, the scheme goes through three phases over three years. At its peak, MMM circulated more than 150M dollars a day, after which it collapsed by the end of Jun 2016. (2) There is a high income inequality between MMM members, with the daily Gini index reaching more than 0.9. The scheme also exhibits a zero-sum investment model, in which one member's loss is another member's gain. The percentage of victims who never made any profit has grown from 0% to 41% in five months, during which the top-earning scammer has made 765K dollars in profit. (3) The scheme has a global reach with 80 different member countries but a highly-asymmetrical flow of money between them. While India and Indonesia have the largest pairwise flow in MMM, members in Indonesia have received 12x more money than they have sent to their counterparts in India.

Submitted 1 December, 2019; v1 submitted 27 October, 2019; originally announced October 2019.

Showing 1–6 of 6 results for author: Elvitigala, C