Showing 1–16 of 16 results for author: Olson, M

Searching in archive cs.
  1. arXiv:2407.02333

    cs.CL cs.CV

    Why do LLaVA Vision-Language Models Reply to Images in English?

    Authors: Musashi Hinck, Carolin Holtermann, Matthew Lyle Olson, Florian Schneider, Sungduk Yu, Anahita Bhiwandiwalla, Anne Lauscher, Shaoyen Tseng, Vasudev Lal

    Abstract: We uncover a surprising multilingual bias occurring in a popular class of multimodal vision-language models (VLMs). Including an image in the query to a LLaVA-style VLM significantly increases the likelihood of the model returning an English response, regardless of the language of the query. This paper investigates the causes of this loss with a two-pronged approach that combines extensive ablatio…

    Submitted 2 July, 2024; originally announced July 2024.

    Comments: Pre-print

  2. arXiv:2404.03118

    cs.CV

    LVLM-Interpret: An Interpretability Tool for Large Vision-Language Models

    Authors: Gabriela Ben Melech Stan, Estelle Aflalo, Raanan Yehezkel Rohekar, Anahita Bhiwandiwalla, Shao-Yen Tseng, Matthew Lyle Olson, Yaniv Gurwicz, Chenfei Wu, Nan Duan, Vasudev Lal

    Abstract: In the rapidly evolving landscape of artificial intelligence, multi-modal large language models are emerging as a significant area of interest. These models, which combine various forms of data input, are becoming increasingly popular. However, understanding their internal mechanisms remains a complex task. Numerous advancements have been made in the field of explainability tools and mechanisms, y…

    Submitted 24 June, 2024; v1 submitted 3 April, 2024; originally announced April 2024.

  3. arXiv:2404.01331

    cs.CL cs.AI

    LLaVA-Gemma: Accelerating Multimodal Foundation Models with a Compact Language Model

    Authors: Musashi Hinck, Matthew L. Olson, David Cobbley, Shao-Yen Tseng, Vasudev Lal

    Abstract: We train a suite of multimodal foundation models (MMFM) using the popular LLaVA framework with the recently released Gemma family of large language models (LLMs). Of particular interest is the 2B parameter Gemma model, which provides opportunities to construct capable small-scale MMFMs. In line with findings from other papers in this space, we test the effect of ablating three design features: pre…

    Submitted 10 June, 2024; v1 submitted 29 March, 2024; originally announced April 2024.

    Comments: CVPR 2024, MMFM workshop. Authors 1 and 2 contributed equally. Models available at https://huggingface.co/intel/llava-gemma-2b/ and https://huggingface.co/intel/llava-gemma-7b/ Training code at https://github.com/IntelLabs/multimodal_cognitive_ai/tree/main/LLaVA-Gemma

  4. arXiv:2312.04494

    cs.HC cs.AI cs.CV cs.GR

    AVA: Towards Autonomous Visualization Agents through Visual Perception-Driven Decision-Making

    Authors: Shusen Liu, Haichao Miao, Zhimin Li, Matthew Olson, Valerio Pascucci, Peer-Timo Bremer

    Abstract: With recent advances in multi-modal foundation models, the previously text-only large language models (LLM) have evolved to incorporate visual input, opening up unprecedented opportunities for various applications in visualization. Our work explores the utilization of the visual perception ability of multi-modal LLMs to develop Autonomous Visualization Agents (AVAs) that can interpret and accompli…

    Submitted 7 December, 2023; originally announced December 2023.

    Report number: LLNL-CONF-857838

  5. arXiv:2312.03642

    cs.LG

    Transformer-Powered Surrogates Close the ICF Simulation-Experiment Gap with Extremely Limited Data

    Authors: Matthew L. Olson, Shusen Liu, Jayaraman J. Thiagarajan, Bogdan Kustowski, Weng-Keen Wong, Rushil Anirudh

    Abstract: Recent advances in machine learning, specifically transformer architecture, have led to significant advancements in commercial domains. These powerful models have demonstrated superior capability to learn complex relationships and often generalize better to new data and problems. This paper presents a novel transformer-powered approach for enhancing prediction accuracy in multi-modal output scenar…

    Submitted 28 May, 2024; v1 submitted 6 December, 2023; originally announced December 2023.

    Comments: MLST

  6. arXiv:2303.10774

    cs.LG cs.CV

    Cross-GAN Auditing: Unsupervised Identification of Attribute Level Similarities and Differences between Pretrained Generative Models

    Authors: Matthew L. Olson, Shusen Liu, Rushil Anirudh, Jayaraman J. Thiagarajan, Peer-Timo Bremer, Weng-Keen Wong

    Abstract: Generative Adversarial Networks (GANs) are notoriously difficult to train especially for complex distributions and with limited data. This has driven the need for tools to audit trained networks in human intelligible format, for example, to identify biases or ensure fairness. Existing GAN audit tools are restricted to coarse-grained, model-data comparisons based on summary statistics such as FID o…

    Submitted 2 May, 2023; v1 submitted 19 March, 2023; originally announced March 2023.

    Comments: CVPR 2023. Source code is available at https://github.com/mattolson93/cross_gan_auditing

  7. arXiv:2302.12689

    cs.LG cs.AI

    GANterfactual-RL: Understanding Reinforcement Learning Agents' Strategies through Visual Counterfactual Explanations

    Authors: Tobias Huber, Maximilian Demmler, Silvan Mertes, Matthew L. Olson, Elisabeth André

    Abstract: Counterfactual explanations are a common tool to explain artificial intelligence models. For Reinforcement Learning (RL) agents, they answer "Why not?" or "What if?" questions by illustrating what minimal change to a state is needed such that an agent chooses a different action. Generating counterfactual explanations for RL agents with visual input is especially challenging because of their large…

    Submitted 24 February, 2023; originally announced February 2023.

  8. arXiv:2209.13129

    cs.AI

    Deep Generative Multimedia Children's Literature

    Authors: Matthew L. Olson

    Abstract: Artistic work leveraging Machine Learning techniques is an increasingly popular endeavour for those with a creative lean. However, most work is done in a single domain: text, images, music, etc. In this work, I design a system for a machine learning created multimedia experience, specifically in the genre of children's literature. We detail the process for exclusively using publicly available pret…

    Submitted 10 January, 2023; v1 submitted 26 September, 2022; originally announced September 2022.

    Comments: AAAI 2023 Workshop on Creative AI Across Modalities

  9. arXiv:2110.02150

    cs.PF

    Online Application Guidance for Heterogeneous Memory Systems

    Authors: M. Ben Olson, Brandon Kammerdiener, Kshitij A. Doshi, Terry Jones, Michael R. Jantz

    Abstract: Many high-end and next-generation computing systems have incorporated alternative memory technologies to meet performance goals. Since these technologies present distinct advantages and tradeoffs compared to conventional DDR* SDRAM, such as higher bandwidth with lower capacity or vice versa, they are typically packaged alongside conventional SDRAM in a heterogeneous memory architecture. To utilize t…

    Submitted 5 October, 2021; originally announced October 2021.

  10. arXiv:2108.08000

    cs.LG cs.AI cs.HC

    Contrastive Identification of Covariate Shift in Image Data

    Authors: Matthew L. Olson, Thuy-Vy Nguyen, Gaurav Dixit, Neale Ratzlaff, Weng-Keen Wong, Minsuk Kahng

    Abstract: Identifying covariate shift is crucial for making machine learning systems robust in the real world and for detecting training data biases that are not reflected in test data. However, detecting covariate shift is challenging, especially when the data consists of high-dimensional images, and when multiple types of localized covariate shift affect different subspaces of the data. Although automated…

    Submitted 19 August, 2021; v1 submitted 18 August, 2021; originally announced August 2021.

    Comments: IEEE VIS 2021

  11. Counterfactual State Explanations for Reinforcement Learning Agents via Generative Deep Learning

    Authors: Matthew L. Olson, Roli Khanna, Lawrence Neal, Fuxin Li, Weng-Keen Wong

    Abstract: Counterfactual explanations, which deal with "why not?" scenarios, can provide insightful explanations to an AI agent's behavior. In this work, we focus on generating counterfactual explanations for deep reinforcement learning (RL) agents which operate in visual input environments like Atari. We introduce counterfactual state explanations, a novel example-based approach to counterfactual explanati…

    Submitted 29 January, 2021; originally announced January 2021.

    Comments: Full source code available at https://github.com/mattolson93/counterfactual-state-explanations

    Journal ref: Artificial Intelligence, 2021, 103455, ISSN 0004-3702

  12. arXiv:2012.08113

    cs.CL cs.LG

    Enriched Annotations for Tumor Attribute Classification from Pathology Reports with Limited Labeled Data

    Authors: Nick Altieri, Briton Park, Mara Olson, John DeNero, Anobel Odisho, Bin Yu

    Abstract: Precision medicine has the potential to revolutionize healthcare, but much of the data for patients is locked away in unstructured free-text, limiting research and delivery of effective personalized treatments. Generating large annotated datasets for information extraction from clinical notes is often challenging and expensive due to the high level of expertise needed for high quality annotations.…

    Submitted 15 December, 2020; originally announced December 2020.

  13. arXiv:1909.12969

    cs.LG cs.AI cs.HC stat.ML

    Counterfactual States for Atari Agents via Generative Deep Learning

    Authors: Matthew L. Olson, Lawrence Neal, Fuxin Li, Weng-Keen Wong

    Abstract: Although deep reinforcement learning agents have produced impressive results in many domains, their decision making is difficult to explain to humans. To address this problem, past work has mainly focused on explaining why an action was chosen in a given state. A different type of explanation that is useful is a counterfactual, which deals with "what if?" scenarios. In this work, we introduce the…

    Submitted 27 September, 2019; originally announced September 2019.

    Comments: IJCAI XAI Workshop 2019

  14. arXiv:1812.05792

    stat.ML cs.LG

    Making Sense of Random Forest Probabilities: a Kernel Perspective

    Authors: Matthew A. Olson, Abraham J. Wyner

    Abstract: A random forest is a popular tool for estimating probabilities in machine learning classification tasks. However, the means by which this is accomplished is unprincipled: one simply counts the fraction of trees in a forest that vote for a certain class. In this paper, we forge a connection between random forests and kernel regression. This places random forest probability estimation on more sound…

    Submitted 14 December, 2018; originally announced December 2018.

  15. arXiv:1504.07676

    stat.ML cs.LG stat.ME

    Explaining the Success of AdaBoost and Random Forests as Interpolating Classifiers

    Authors: Abraham J. Wyner, Matthew Olson, Justin Bleich, David Mease

    Abstract: There is a large literature explaining why AdaBoost is a successful classifier. The literature on AdaBoost focuses on classifier margins and boosting's interpretation as the optimization of an exponential likelihood function. These existing explanations, however, have been pointed out to be incomplete. A random forest is another popular ensemble method for which there is substantially less explana…

    Submitted 29 April, 2017; v1 submitted 28 April, 2015; originally announced April 2015.

    Comments: 40 pages, 11 figures, 2 algorithms

  16. Phase Transitions on Fixed Connected Graphs and Random Graphs in the Presence of Noise

    Authors: Jialing Liu, Vikas Yadav, Hullas Sehgal, Joshua M. Olson, Haifeng Liu, Nicola Elia

    Abstract: In this paper, we study the phase transition behavior emerging from the interactions among multiple agents in the presence of noise. We propose a simple discrete-time model in which a group of non-mobile agents form either a fixed connected graph or a random graph process, and each agent, taking bipolar value either +1 or -1, updates its value according to its previous value and the noisy measur…

    Submitted 24 August, 2008; originally announced August 2008.

    Comments: 15 pages, 3 figures. To appear in the IEEE Transactions on Automatic Control

    Journal ref: IEEE TRANSACTIONS ON AUTOMATIC CONTROL, VOL. 53, NO. 8, 1817-1825, SEPTEMBER 2008