Showing 1–8 of 8 results for author: Fleshman, W

  1. arXiv:2406.14764  [pdf, other]

    cs.IR cs.AI cs.CL cs.LG

    RE-AdaptIR: Improving Information Retrieval through Reverse Engineered Adaptation

    Authors: William Fleshman, Benjamin Van Durme

    Abstract: Large language models (LLMs) fine-tuned for text-retrieval have demonstrated state-of-the-art results across several information retrieval (IR) benchmarks. However, supervised training for improving these models requires numerous labeled examples, which are generally unavailable or expensive to acquire. In this work, we explore the effectiveness of extending reverse engineered adaptation to the co…

    Submitted 20 June, 2024; originally announced June 2024.

  2. arXiv:2405.15007  [pdf, other]

    cs.CL cs.AI cs.LG

    RE-Adapt: Reverse Engineered Adaptation of Large Language Models

    Authors: William Fleshman, Benjamin Van Durme

    Abstract: We introduce RE-Adapt, an approach to fine-tuning large language models on new domains without degrading any pre-existing instruction-tuning. We reverse engineer an adapter which isolates what an instruction-tuned model has learned beyond its corresponding pretrained base model. Importantly, this requires no additional data or training. We can then fine-tune the base model on a new domain and read…

    Submitted 23 May, 2024; originally announced May 2024.

  3. arXiv:2404.08417  [pdf, other]

    cs.LG cs.AI cs.CL

    AdapterSwap: Continuous Training of LLMs with Data Removal and Access-Control Guarantees

    Authors: William Fleshman, Aleem Khan, Marc Marone, Benjamin Van Durme

    Abstract: Large language models (LLMs) are increasingly capable of completing knowledge intensive tasks by recalling information from a static pretraining corpus. Here we are concerned with LLMs in the context of evolving data requirements. For instance: batches of new data that are introduced periodically; subsets of data with user-based access controls; or requirements on dynamic removal of documents with…

    Submitted 12 April, 2024; originally announced April 2024.

  4. arXiv:2311.08620  [pdf, other]

    cs.CL cs.LG

    Toucan: Token-Aware Character Level Language Modeling

    Authors: William Fleshman, Benjamin Van Durme

    Abstract: Character-level language models obviate the need for separately trained tokenizers, but efficiency suffers from longer sequence lengths. Learning to combine character representations into tokens has made training these models more efficient, but they still require decoding characters individually. We propose Toucan, an augmentation to character-level models to make them "token-aware". Comparing ou…

    Submitted 14 November, 2023; originally announced November 2023.

  5. arXiv:2012.09390  [pdf, other]

    stat.ML cs.AI cs.LG

    Classifying Sequences of Extreme Length with Constant Memory Applied to Malware Detection

    Authors: Edward Raff, William Fleshman, Richard Zak, Hyrum S. Anderson, Bobby Filar, Mark McLean

    Abstract: Recent works within machine learning have been tackling inputs of ever-increasing size, with cybersecurity presenting sequence classification problems of particularly extreme lengths. In the case of Windows executable malware detection, inputs may exceed $100$ MB, which corresponds to a time series with $T=100,000,000$ steps. To date, the closest approach to handling such a task is MalConv, a conv…

    Submitted 16 December, 2020; originally announced December 2020.

    Comments: To appear in AAAI 2021

  6. arXiv:2011.01331  [pdf]

    cs.SI cs.CY

    Deception and the Strategy of Influence

    Authors: Brian B., William Fleshman, Kevin H., Ryan Kaliszewski, Shawn R

    Abstract: Organizations have long used deception as a means to exert influence in pursuit of their agendas. In particular, information operations such as propaganda distribution, support of antigovernment protest, and revelation of politically and socially damaging secrets were abundant during World War II and the Cold War. A key component of each of these efforts is deceiving the targets by obscuring inten…

    Submitted 2 November, 2020; originally announced November 2020.

    Comments: The Next Wave article pre-release, full issue soon available on www.nsa.gov/thenextwave

  7. arXiv:1806.06108  [pdf, other]

    stat.ML cs.AI cs.LG

    Non-Negative Networks Against Adversarial Attacks

    Authors: William Fleshman, Edward Raff, Jared Sylvester, Steven Forsyth, Mark McLean

    Abstract: Adversarial attacks against neural networks are a problem of considerable importance, for which effective defenses are not yet readily available. We make progress toward this problem by showing that non-negative weight constraints can be used to improve resistance in specific scenarios. In particular, we show that they can provide an effective defense for binary classification problems with asymme…

    Submitted 3 January, 2019; v1 submitted 15 June, 2018; originally announced June 2018.

    Report number: AICS/2019/08

  8. arXiv:1806.04773  [pdf]

    cs.CR cs.LG stat.ML

    Static Malware Detection & Subterfuge: Quantifying the Robustness of Machine Learning and Current Anti-Virus

    Authors: William Fleshman, Edward Raff, Richard Zak, Mark McLean, Charles Nicholas

    Abstract: As machine-learning (ML) based systems for malware detection become more prevalent, it becomes necessary to quantify the benefits compared to the more traditional anti-virus (AV) systems widely used today. It is not practical to build an agreed upon test set to benchmark malware detection systems on pure classification performance. Instead we tackle the problem by creating a new testing methodolog…

    Submitted 12 June, 2018; originally announced June 2018.