Search | arXiv e-print repository

Zero-shot and Few-shot Generation Strategies for Artificial Clinical Records

Authors: Erlend Frayling, Jake Lever, Graham McDonald

Abstract: The challenge of accessing historical patient data for clinical research, while adhering to privacy regulations, is a significant obstacle in medical science. An innovative approach to circumvent this issue involves utilising synthetic medical records that mirror real patient data without compromising individual privacy. The creation of these synthetic datasets, particularly without using actual p… ▽ More The challenge of accessing historical patient data for clinical research, while adhering to privacy regulations, is a significant obstacle in medical science. An innovative approach to circumvent this issue involves utilising synthetic medical records that mirror real patient data without compromising individual privacy. The creation of these synthetic datasets, particularly without using actual patient data to train Large Language Models (LLMs), presents a novel solution as gaining access to sensitive patient information to train models is also a challenge. This study assesses the capability of the Llama 2 LLM to create synthetic medical records that accurately reflect real patient information, employing zero-shot and few-shot prompting strategies for comparison against fine-tuned methodologies that do require sensitive patient data during training. We focus on generating synthetic narratives for the History of Present Illness section, utilising data from the MIMIC-IV dataset for comparison. In this work introduce a novel prompting technique that leverages a chain-of-thought approach, enhancing the model's ability to generate more accurate and contextually relevant medical narratives without prior fine-tuning. Our findings suggest that this chain-of-thought prompted approach allows the zero-shot model to achieve results on par with those of fine-tuned models, based on Rouge metrics evaluation. △ Less

Submitted 14 March, 2024; v1 submitted 13 March, 2024; originally announced March 2024.

Comments: 4 pages

arXiv:2403.01038 [pdf, other]

AutoAttacker: A Large Language Model Guided System to Implement Automatic Cyber-attacks

Authors: Jiacen Xu, Jack W. Stokes, Geoff McDonald, Xuesong Bai, David Marshall, Siyue Wang, Adith Swaminathan, Zhou Li

Abstract: Large language models (LLMs) have demonstrated impressive results on natural language tasks, and security researchers are beginning to employ them in both offensive and defensive systems. In cyber-security, there have been multiple research efforts that utilize LLMs focusing on the pre-breach stage of attacks like phishing and malware generation. However, so far there lacks a comprehensive study r… ▽ More Large language models (LLMs) have demonstrated impressive results on natural language tasks, and security researchers are beginning to employ them in both offensive and defensive systems. In cyber-security, there have been multiple research efforts that utilize LLMs focusing on the pre-breach stage of attacks like phishing and malware generation. However, so far there lacks a comprehensive study regarding whether LLM-based systems can be leveraged to simulate the post-breach stage of attacks that are typically human-operated, or "hands-on-keyboard" attacks, under various attack techniques and environments. As LLMs inevitably advance, they may be able to automate both the pre- and post-breach attack stages. This shift may transform organizational attacks from rare, expert-led events to frequent, automated operations requiring no expertise and executed at automation speed and scale. This risks fundamentally changing global computer security and correspondingly causing substantial economic impacts, and a goal of this work is to better understand these risks now so we can better prepare for these inevitable ever-more-capable LLMs on the horizon. On the immediate impact side, this research serves three purposes. First, an automated LLM-based, post-breach exploitation framework can help analysts quickly test and continually improve their organization's network security posture against previously unseen attacks. Second, an LLM-based penetration test system can extend the effectiveness of red teams with a limited number of human analysts. Finally, this research can help defensive systems and teams learn to detect novel attack behaviors preemptively before their use in the wild.... △ Less

Submitted 1 March, 2024; originally announced March 2024.

arXiv:2401.13434 [pdf, other]

Query Exposure Prediction for Groups of Documents in Rankings

Authors: Thomas Jaenich, Graham McDonald, Iadh Ounis

Abstract: The main objective of an Information Retrieval system is to provide a user with the most relevant documents to the user's query. To do this, modern IR systems typically deploy a re-ranking pipeline in which a set of documents is retrieved by a lightweight first-stage retrieval process and then re-ranked by a more effective but expensive model. However, the success of a re-ranking pipeline is heavi… ▽ More The main objective of an Information Retrieval system is to provide a user with the most relevant documents to the user's query. To do this, modern IR systems typically deploy a re-ranking pipeline in which a set of documents is retrieved by a lightweight first-stage retrieval process and then re-ranked by a more effective but expensive model. However, the success of a re-ranking pipeline is heavily dependent on the performance of the first stage retrieval, since new documents are not usually identified during the re-ranking stage. Moreover, this can impact the amount of exposure that a particular group of documents, such as documents from a particular demographic group, can receive in the final ranking. For example, the fair allocation of exposure becomes more challenging or impossible if the first stage retrieval returns too few documents from certain groups, since the number of group documents in the ranking affects the exposure more than the documents' positions. With this in mind, it is beneficial to predict the amount of exposure that a group of documents is likely to receive in the results of the first stage retrieval process, in order to ensure that there are a sufficient number of documents included from each of the groups. In this paper, we introduce the novel task of query exposure prediction (QEP). Specifically, we propose the first approach for predicting the distribution of exposure that groups of documents will receive for a given query. Our new approach, called GEP, uses lexical information from individual groups of documents to estimate the exposure the groups will receive in a ranking. Our experiments on the TREC 2021 and 2022 Fair Ranking Track test collections show that our proposed GEP approach results in exposure predictions that are up to 40 % more accurate than the predictions of adapted existing query performance prediction and resource allocation approaches. △ Less

Submitted 24 January, 2024; originally announced January 2024.

arXiv:2401.05144 [pdf, other]

SARA: A Collection of Sensitivity-Aware Relevance Assessments

Authors: Jack McKechnie, Graham McDonald

Abstract: Large archival collections, such as email or government documents, must be manually reviewed to identify any sensitive information before the collection can be released publicly. Sensitivity classification has received a lot of attention in the literature. However, more recently, there has been increasing interest in develo** sensitivity-aware search engines that can provide users with relevant… ▽ More Large archival collections, such as email or government documents, must be manually reviewed to identify any sensitive information before the collection can be released publicly. Sensitivity classification has received a lot of attention in the literature. However, more recently, there has been increasing interest in develo** sensitivity-aware search engines that can provide users with relevant search results, while ensuring that no sensitive documents are returned to the user. Sensitivity-aware search would mitigate the need for a manual sensitivity review prior to collections being made available publicly. To develop such systems, there is a need for test collections that contain relevance assessments for a set of information needs as well as ground-truth labels for a variety of sensitivity categories. The well-known Enron email collection contains a classification ground-truth that can be used to represent sensitive information, e.g., the Purely Personal and Personal but in Professional Context categories can be used to represent sensitive personal information. However, the existing Enron collection does not contain a set of information needs and relevance assessments. In this work, we present a collection of fifty information needs (topics) with crowdsourced query formulations (3 per topic) and relevance assessments (11,471 in total) for the Enron collection (mean number of relevant documents per topic = 11, variance = 34.7). The developed information needs, queries and relevance judgements are available on GitHub and will be available along with the existing Enron collection through the popular ir_datasets library. Our proposed collection results in the first freely available test collection for develo** sensitivity-aware search systems. △ Less

Submitted 10 January, 2024; originally announced January 2024.

arXiv:2308.14597 [pdf, other]

Adversarial Attacks on Foundational Vision Models

Authors: Nathan Inkawhich, Gwendolyn McDonald, Ryan Luley

Abstract: Rapid progress is being made in develo** large, pretrained, task-agnostic foundational vision models such as CLIP, ALIGN, DINOv2, etc. In fact, we are approaching the point where these models do not have to be finetuned downstream, and can simply be used in zero-shot or with a lightweight probing head. Critically, given the complexity of working at this scale, there is a bottleneck where relativ… ▽ More Rapid progress is being made in develo** large, pretrained, task-agnostic foundational vision models such as CLIP, ALIGN, DINOv2, etc. In fact, we are approaching the point where these models do not have to be finetuned downstream, and can simply be used in zero-shot or with a lightweight probing head. Critically, given the complexity of working at this scale, there is a bottleneck where relatively few organizations in the world are executing the training then sharing the models on centralized platforms such as HuggingFace and torch.hub. The goal of this work is to identify several key adversarial vulnerabilities of these models in an effort to make future designs more robust. Intuitively, our attacks manipulate deep feature representations to fool an out-of-distribution (OOD) detector which will be required when using these open-world-aware models to solve closed-set downstream tasks. Our methods reliably make in-distribution (ID) images (w.r.t. a downstream task) be predicted as OOD and vice versa while existing in extremely low-knowledge-assumption threat models. We show our attacks to be potent in whitebox and blackbox settings, as well as when transferred across foundational model types (e.g., attack DINOv2 with CLIP)! This work is only just the beginning of a long journey towards adversarially robust foundational vision models. △ Less

Submitted 28 August, 2023; originally announced August 2023.

arXiv:2304.08497 [pdf, other]

Agent-Based Modeling and its Tradeoffs: An Introduction & Examples

Authors: G. Wade McDonald, Nathaniel D. Osgood

Abstract: Agent-based modeling is a computational dynamic modeling technique that may be less familiar to some readers. Agent-based modeling seeks to understand the behaviour of complex systems by situating agents in an environment and studying the emergent outcomes of agent-agent and agent-environment interactions. In comparison with compartmental models, agent-based models offer simpler, more scalable and… ▽ More Agent-based modeling is a computational dynamic modeling technique that may be less familiar to some readers. Agent-based modeling seeks to understand the behaviour of complex systems by situating agents in an environment and studying the emergent outcomes of agent-agent and agent-environment interactions. In comparison with compartmental models, agent-based models offer simpler, more scalable and flexible representation of heterogeneity, the ability to capture dynamic and static network and spatial context, and the ability to consider history of individuals within the model. In contrast, compartmental models offer faster development time with less programming required, lower computational requirements that do not scale with population, and the option for concise mathematical formulation with ordinary, delay or stochastic differential equations supporting derivation of properties of the system behaviour. In this chapter, basic characteristics of agent-based models are introduced, advantages and disadvantages of agent-based models, as compared with compartmental models, are discussed, and two example agent-based infectious disease models are reviewed. △ Less

Submitted 6 April, 2023; originally announced April 2023.

ACM Class: I.6.8

arXiv:2302.10987 [pdf, other]

Towards a responsible machine learning approach to identify forced labor in fisheries

Authors: Rocío Joo, Gavin McDonald, Nathan Miller, David Kroodsma, Courtney Farthing, Dyhia Belhabib, Timothy Hochberg

Abstract: Many fishing vessels use forced labor, but identifying vessels that engage in this practice is challenging because few are regularly inspected. We developed a positive-unlabeled learning algorithm using vessel characteristics and movement patterns to estimate an upper bound of the number of positive cases of forced labor, with the goal of hel** make accurate, responsible, and fair decisions. 89%… ▽ More Many fishing vessels use forced labor, but identifying vessels that engage in this practice is challenging because few are regularly inspected. We developed a positive-unlabeled learning algorithm using vessel characteristics and movement patterns to estimate an upper bound of the number of positive cases of forced labor, with the goal of hel** make accurate, responsible, and fair decisions. 89% of the reported cases of forced labor were correctly classified as positive (recall) while 98% of the vessels certified as having decent working conditions were correctly classified as negative. The recall was high for vessels from different regions using different gears, except for trawlers. We found that as much as ~28% of vessels may operate using forced labor, with the fraction much higher in squid jiggers and longlines. This model could inform risk-based port inspections as part of a broader monitoring, control, and surveillance regime to reduce forced labor. * Translated versions of the English title and abstract are available in five languages in S1 Text: Spanish, French, Simplified Chinese, Traditional Chinese, and Indonesian. △ Less

Submitted 3 February, 2023; originally announced February 2023.

Comments: 24 pages, 2 figures, 2 supp files

arXiv:2302.10856 [pdf, other]

Overview of the TREC 2021 Fair Ranking Track

Authors: Michael D. Ekstrand, Graham McDonald, Amifa Raj, Isaac Johnson

Abstract: The TREC Fair Ranking Track aims to provide a platform for participants to develop and evaluate novel retrieval algorithms that can provide a fair exposure to a mixture of demographics or attributes, such as ethnicity, that are represented by relevant documents in response to a search query. For example, particular demographics or attributes can be represented by the documents' topical content or… ▽ More The TREC Fair Ranking Track aims to provide a platform for participants to develop and evaluate novel retrieval algorithms that can provide a fair exposure to a mixture of demographics or attributes, such as ethnicity, that are represented by relevant documents in response to a search query. For example, particular demographics or attributes can be represented by the documents' topical content or authors. The 2021 Fair Ranking Track adopted a resource allocation task. The task focused on supporting Wikipedia editors who are looking to improve the encyclopedia's coverage of topics under the purview of a WikiProject. WikiProject coordinators and/or Wikipedia editors search for Wikipedia documents that are in need of editing to improve the quality of the article. The 2021 Fair Ranking track aimed to ensure that documents that are about, or somehow represent, certain protected characteristics receive a fair exposure to the Wikipedia editors, so that the documents have an fair opportunity of being improved and, therefore, be well-represented in Wikipedia. The under-representation of particular protected characteristics in Wikipedia can result in systematic biases that can have a negative human, social, and economic impact, particularly for disadvantaged or protected societal groups. △ Less

Submitted 21 February, 2023; originally announced February 2023.

Comments: Published in The Thirtieth Text REtrieval Conference Proceedings (TREC 2021). arXiv admin note: substantial text overlap with arXiv:2302.05558

arXiv:2302.05558 [pdf, other]

Overview of the TREC 2022 Fair Ranking Track

Authors: Michael D. Ekstrand, Graham McDonald, Amifa Raj, Isaac Johnson

Abstract: The TREC Fair Ranking Track aims to provide a platform for participants to develop and evaluate novel retrieval algorithms that can provide a fair exposure to a mixture of demographics or attributes, such as ethnicity, that are represented by relevant documents in response to a search query. For example, particular demographics or attributes can be represented by the documents topical content or a… ▽ More The TREC Fair Ranking Track aims to provide a platform for participants to develop and evaluate novel retrieval algorithms that can provide a fair exposure to a mixture of demographics or attributes, such as ethnicity, that are represented by relevant documents in response to a search query. For example, particular demographics or attributes can be represented by the documents topical content or authors. The 2022 Fair Ranking Track adopted a resource allocation task. The task focused on supporting Wikipedia editors who are looking to improve the encyclopedia's coverage of topics under the purview of a WikiProject. WikiProject coordinators and/or Wikipedia editors search for Wikipedia documents that are in need of editing to improve the quality of the article. The 2022 Fair Ranking track aimed to ensure that documents that are about, or somehow represent, certain protected characteristics receive a fair exposure to the Wikipedia editors, so that the documents have an fair opportunity of being improved and, therefore, be well-represented in Wikipedia. The under-representation of particular protected characteristics in Wikipedia can result in systematic biases that can have a negative human, social, and economic impact, particularly for disadvantaged or protected societal groups. △ Less

Submitted 10 February, 2023; originally announced February 2023.

arXiv:2202.03276 [pdf, ps, other]

doi 10.3390/s22030953

Ransomware: Analysing the Impact on Windows Active Directory Domain Services

Authors: Grant McDonald, Pavlos Papadopoulos, Nikolaos Pitropakis, Jawad Ahmad, William J. Buchanan

Abstract: Ransomware has become an increasingly popular type of malware across the past decade and continues to rise in popularity due to its high profitability. Organisations and enterprises have become prime targets for ransomware as they are more likely to succumb to ransom demands as part of operating expenses to counter the cost incurred from downtime. Despite the prevalence of ransomware as a threat t… ▽ More Ransomware has become an increasingly popular type of malware across the past decade and continues to rise in popularity due to its high profitability. Organisations and enterprises have become prime targets for ransomware as they are more likely to succumb to ransom demands as part of operating expenses to counter the cost incurred from downtime. Despite the prevalence of ransomware as a threat towards organisations, there is very little information outlining how ransomware affects Windows Server environments, and particularly its proprietary domain services such as Active Directory. Hence, we aim to increase the cyber situational awareness of organisations and corporations that utilise these environments. Dynamic analysis was performed using three ransomware variants to uncover how crypto-ransomware affects Windows Server-specific services and processes. Our work outlines the practical investigation undertaken as WannaCry, TeslaCrypt, and Jigsaw were acquired and tested against several domain services. The findings showed that none of the three variants stopped the processes and decidedly left all domain services untouched. However, although the services remained operational, they became uniquely dysfunctional as ransomware encrypted the files pertaining to those services △ Less

Submitted 7 February, 2022; originally announced February 2022.

Journal ref: Sensors 22, no. 3: 953 (2022)

arXiv:1907.02956 [pdf, other]

The FACTS of Technology-Assisted Sensitivity Review

Authors: Graham McDonald, Craig Macdonald, Iadh Ounis

Abstract: At least ninety countries implement Freedom of Information laws that state that government documents must be made freely available, or opened, to the public. However, many government documents contain sensitive information, such as personal or confidential information. Therefore, all government documents that are opened to the public must first be reviewed to identify, and protect, any sensitive i… ▽ More At least ninety countries implement Freedom of Information laws that state that government documents must be made freely available, or opened, to the public. However, many government documents contain sensitive information, such as personal or confidential information. Therefore, all government documents that are opened to the public must first be reviewed to identify, and protect, any sensitive information. Historically, sensitivity review has been a completely manual process. However, with the adoption of born-digital documents, such as e-mail, human-only sensitivity review is not practical and there is a need for new technologies to assist human sensitivity reviewers. In this paper, we discuss how issues of fairness, accountability, confidentiality, transparency and safety (FACTS) impact technology-assisted sensitivity review. Moreover, we outline some important areas of future FACTS research that will need to be addressed within technology-assisted sensitivity review. △ Less

Submitted 5 July, 2019; originally announced July 2019.

Comments: 4 pages

arXiv:1904.01126 [pdf, other]

ScriptNet: Neural Static Analysis for Malicious JavaScript Detection

Authors: Jack W. Stokes, Rakshit Agrawal, Geoff McDonald, Matthew Hausknecht

Abstract: Malicious scripts are an important computer infection threat vector in the wild. For web-scale processing, static analysis offers substantial computing efficiencies. We propose the ScriptNet system for neural malicious JavaScript detection which is based on static analysis. We use the Convoluted Partitioning of Long Sequences (CPoLS) model, which processes Javascript files as byte sequences. Lower… ▽ More Malicious scripts are an important computer infection threat vector in the wild. For web-scale processing, static analysis offers substantial computing efficiencies. We propose the ScriptNet system for neural malicious JavaScript detection which is based on static analysis. We use the Convoluted Partitioning of Long Sequences (CPoLS) model, which processes Javascript files as byte sequences. Lower layers capture the sequential nature of these byte sequences while higher layers classify the resulting embedding as malicious or benign. Unlike previously proposed solutions, our model variants are trained in an end-to-end fashion allowing discriminative training even for the sequential processing layers. Evaluating this model on a large corpus of 212,408 JavaScript files indicates that the best performing CPoLS model offers a 97.20% true positive rate (TPR) for the first 60K byte subsequence at a false positive rate (FPR) of 0.50%. The best performing CPoLS model significantly outperform several baseline models. △ Less

Submitted 1 April, 2019; originally announced April 2019.

arXiv:1805.05603 [pdf, other]

Neural Classification of Malicious Scripts: A study with JavaScript and VBScript

Authors: Jack W. Stokes, Rakshit Agrawal, Geoff McDonald

Abstract: Malicious scripts are an important computer infection threat vector. Our analysis reveals that the two most prevalent types of malicious scripts include JavaScript and VBScript. The percentage of detected JavaScript attacks are on the rise. To address these threats, we investigate two deep recurrent models, LaMP (LSTM and Max Pooling) and CPoLS (Convoluted Partitioning of Long Sequences), which pr… ▽ More Malicious scripts are an important computer infection threat vector. Our analysis reveals that the two most prevalent types of malicious scripts include JavaScript and VBScript. The percentage of detected JavaScript attacks are on the rise. To address these threats, we investigate two deep recurrent models, LaMP (LSTM and Max Pooling) and CPoLS (Convoluted Partitioning of Long Sequences), which process JavaScript and VBScript as byte sequences. Lower layers capture the sequential nature of these byte sequences while higher layers classify the resulting embedding as malicious or benign. Unlike previously proposed solutions, our models are trained in an end-to-end fashion allowing discriminative training even for the sequential processing layers. Evaluating these models on a large corpus of 296,274 JavaScript files indicates that the best performing LaMP model has a 65.9% true positive rate (TPR) at a false positive rate (FPR) of 1.0%. Similarly, the best CPoLS model has a TPR of 45.3% at an FPR of 1.0%. LaMP and CPoLS yield a TPR of 69.3% and 67.9%, respectively, at an FPR of 1.0% on a collection of 240,504 VBScript files. △ Less

Submitted 15 May, 2018; originally announced May 2018.

Showing 1–13 of 13 results for author: McDonald, G