Showing 1–13 of 13 results for author: McDonald, G

Searching in archive cs.
  1. arXiv:2403.08664  [pdf, other]

    cs.CL cs.LG

    Zero-shot and Few-shot Generation Strategies for Artificial Clinical Records

    Authors: Erlend Frayling, Jake Lever, Graham McDonald

    Abstract: The challenge of accessing historical patient data for clinical research, while adhering to privacy regulations, is a significant obstacle in medical science. An innovative approach to circumvent this issue involves utilising synthetic medical records that mirror real patient data without compromising individual privacy. The creation of these synthetic datasets, particularly without using actual p…

    Submitted 14 March, 2024; v1 submitted 13 March, 2024; originally announced March 2024.

    Comments: 4 pages

  2. arXiv:2403.01038  [pdf, other]

    cs.CR cs.AI

    AutoAttacker: A Large Language Model Guided System to Implement Automatic Cyber-attacks

    Authors: Jiacen Xu, Jack W. Stokes, Geoff McDonald, Xuesong Bai, David Marshall, Siyue Wang, Adith Swaminathan, Zhou Li

    Abstract: Large language models (LLMs) have demonstrated impressive results on natural language tasks, and security researchers are beginning to employ them in both offensive and defensive systems. In cyber-security, there have been multiple research efforts that utilize LLMs focusing on the pre-breach stage of attacks like phishing and malware generation. However, so far there lacks a comprehensive study r…

    Submitted 1 March, 2024; originally announced March 2024.

  3. arXiv:2401.13434  [pdf, other]

    cs.IR

    Query Exposure Prediction for Groups of Documents in Rankings

    Authors: Thomas Jaenich, Graham McDonald, Iadh Ounis

    Abstract: The main objective of an Information Retrieval system is to provide a user with the most relevant documents to the user's query. To do this, modern IR systems typically deploy a re-ranking pipeline in which a set of documents is retrieved by a lightweight first-stage retrieval process and then re-ranked by a more effective but expensive model. However, the success of a re-ranking pipeline is heavi…

    Submitted 24 January, 2024; originally announced January 2024.

  4. arXiv:2401.05144  [pdf, other]

    cs.IR

    SARA: A Collection of Sensitivity-Aware Relevance Assessments

    Authors: Jack McKechnie, Graham McDonald

    Abstract: Large archival collections, such as email or government documents, must be manually reviewed to identify any sensitive information before the collection can be released publicly. Sensitivity classification has received a lot of attention in the literature. However, more recently, there has been increasing interest in developing sensitivity-aware search engines that can provide users with relevant…

    Submitted 10 January, 2024; originally announced January 2024.

  5. arXiv:2308.14597  [pdf, other]

    cs.CV cs.CR cs.LG

    Adversarial Attacks on Foundational Vision Models

    Authors: Nathan Inkawhich, Gwendolyn McDonald, Ryan Luley

    Abstract: Rapid progress is being made in developing large, pretrained, task-agnostic foundational vision models such as CLIP, ALIGN, DINOv2, etc. In fact, we are approaching the point where these models do not have to be finetuned downstream, and can simply be used in zero-shot or with a lightweight probing head. Critically, given the complexity of working at this scale, there is a bottleneck where relativ…

    Submitted 28 August, 2023; originally announced August 2023.

  6. arXiv:2304.08497  [pdf, other]

    cs.MA cs.CE

    Agent-Based Modeling and its Tradeoffs: An Introduction & Examples

    Authors: G. Wade McDonald, Nathaniel D. Osgood

    Abstract: Agent-based modeling is a computational dynamic modeling technique that may be less familiar to some readers. Agent-based modeling seeks to understand the behaviour of complex systems by situating agents in an environment and studying the emergent outcomes of agent-agent and agent-environment interactions. In comparison with compartmental models, agent-based models offer simpler, more scalable and…

    Submitted 6 April, 2023; originally announced April 2023.

    ACM Class: I.6.8

  7. arXiv:2302.10987  [pdf, other]

    cs.LG cs.CY stat.AP

    Towards a responsible machine learning approach to identify forced labor in fisheries

    Authors: Rocío Joo, Gavin McDonald, Nathan Miller, David Kroodsma, Courtney Farthing, Dyhia Belhabib, Timothy Hochberg

    Abstract: Many fishing vessels use forced labor, but identifying vessels that engage in this practice is challenging because few are regularly inspected. We developed a positive-unlabeled learning algorithm using vessel characteristics and movement patterns to estimate an upper bound of the number of positive cases of forced labor, with the goal of helping make accurate, responsible, and fair decisions. 89%…

    Submitted 3 February, 2023; originally announced February 2023.

    Comments: 24 pages, 2 figures, 2 supp files

  8. arXiv:2302.10856  [pdf, other]

    cs.IR

    Overview of the TREC 2021 Fair Ranking Track

    Authors: Michael D. Ekstrand, Graham McDonald, Amifa Raj, Isaac Johnson

    Abstract: The TREC Fair Ranking Track aims to provide a platform for participants to develop and evaluate novel retrieval algorithms that can provide a fair exposure to a mixture of demographics or attributes, such as ethnicity, that are represented by relevant documents in response to a search query. For example, particular demographics or attributes can be represented by the documents' topical content or…

    Submitted 21 February, 2023; originally announced February 2023.

    Comments: Published in The Thirtieth Text REtrieval Conference Proceedings (TREC 2021). arXiv admin note: substantial text overlap with arXiv:2302.05558

  9. arXiv:2302.05558  [pdf, other]

    cs.IR

    Overview of the TREC 2022 Fair Ranking Track

    Authors: Michael D. Ekstrand, Graham McDonald, Amifa Raj, Isaac Johnson

    Abstract: The TREC Fair Ranking Track aims to provide a platform for participants to develop and evaluate novel retrieval algorithms that can provide a fair exposure to a mixture of demographics or attributes, such as ethnicity, that are represented by relevant documents in response to a search query. For example, particular demographics or attributes can be represented by the documents topical content or a…

    Submitted 10 February, 2023; originally announced February 2023.

  10. Ransomware: Analysing the Impact on Windows Active Directory Domain Services

    Authors: Grant McDonald, Pavlos Papadopoulos, Nikolaos Pitropakis, Jawad Ahmad, William J. Buchanan

    Abstract: Ransomware has become an increasingly popular type of malware across the past decade and continues to rise in popularity due to its high profitability. Organisations and enterprises have become prime targets for ransomware as they are more likely to succumb to ransom demands as part of operating expenses to counter the cost incurred from downtime. Despite the prevalence of ransomware as a threat t…

    Submitted 7 February, 2022; originally announced February 2022.

    Journal ref: Sensors 22, no. 3: 953 (2022)

  11. arXiv:1907.02956  [pdf, other]

    cs.CY cs.IR

    The FACTS of Technology-Assisted Sensitivity Review

    Authors: Graham McDonald, Craig Macdonald, Iadh Ounis

    Abstract: At least ninety countries implement Freedom of Information laws that state that government documents must be made freely available, or opened, to the public. However, many government documents contain sensitive information, such as personal or confidential information. Therefore, all government documents that are opened to the public must first be reviewed to identify, and protect, any sensitive i…

    Submitted 5 July, 2019; originally announced July 2019.

    Comments: 4 pages

  12. arXiv:1904.01126  [pdf, other]

    cs.CR cs.AI cs.LG

    ScriptNet: Neural Static Analysis for Malicious JavaScript Detection

    Authors: Jack W. Stokes, Rakshit Agrawal, Geoff McDonald, Matthew Hausknecht

    Abstract: Malicious scripts are an important computer infection threat vector in the wild. For web-scale processing, static analysis offers substantial computing efficiencies. We propose the ScriptNet system for neural malicious JavaScript detection which is based on static analysis. We use the Convoluted Partitioning of Long Sequences (CPoLS) model, which processes Javascript files as byte sequences. Lower…

    Submitted 1 April, 2019; originally announced April 2019.

  13. arXiv:1805.05603  [pdf, other]

    cs.CR cs.AI

    Neural Classification of Malicious Scripts: A study with JavaScript and VBScript

    Authors: Jack W. Stokes, Rakshit Agrawal, Geoff McDonald

    Abstract: Malicious scripts are an important computer infection threat vector. Our analysis reveals that the two most prevalent types of malicious scripts include JavaScript and VBScript. The percentage of detected JavaScript attacks are on the rise. To address these threats, we investigate two deep recurrent models, LaMP (LSTM and Max Pooling) and CPoLS (Convoluted Partitioning of Long Sequences), which pr…

    Submitted 15 May, 2018; originally announced May 2018.