Skip to main content

Showing 1–50 of 2,860 results for author: Jason

Searching in archive cs. Search in all archives.
.
  1. arXiv:2407.06023  [pdf, other

    cs.CL cs.AI

    Distilling System 2 into System 1

    Authors: ** Yu, **g Xu, Jason Weston, Ilia Kulikov

    Abstract: Large language models (LLMs) can spend extra compute during inference to generate intermediate thoughts, which helps to produce better final responses. Since Chain-of-Thought (Wei et al., 2022), many such System 2 techniques have been proposed such as Rephrase and Respond (Deng et al., 2023a), System 2 Attention (Weston and Sukhbaatar, 2023) and Branch-Solve-Merge (Saha et al., 2023). In this work… ▽ More

    Submitted 8 July, 2024; originally announced July 2024.

  2. arXiv:2407.05832  [pdf, other

    astro-ph.EP cs.LG

    A Data-Driven Machine Learning Approach for Detecting Albedo Anomalies on the Lunar Surface

    Authors: Sofia Strukova, Sergei Gleyzer, Patrick Peplowski, Jason P. Terry

    Abstract: This study introduces a data-driven approach using machine learning (ML) techniques to explore and predict albedo anomalies on the Moon's surface. The research leverages diverse planetary datasets, including high-spatial-resolution albedo maps and element maps (LPFe, LPK, LPTh, LPTi) derived from laser and gamma-ray measurements. The primary objective is to identify relationships between chemical… ▽ More

    Submitted 8 July, 2024; originally announced July 2024.

  3. arXiv:2407.05504  [pdf

    cs.HC

    Multimedia and Immersive Training Materials Influence Impressions of Learning But Not Learning Outcomes

    Authors: Benjamin A. Clegg, Alex Karduna, Ethan Holen, Jason Garcia, Matthew G. Rhodes, Francisco R. Ortega

    Abstract: Although the use of technologies like multimedia and virtual reality (VR) in training offer the promise of improved learning, these richer and potentially more engaging materials do not consistently produce superior learning outcomes. Default approaches to such training may inadvertently mimic concepts like naive realism in display design, and desirable difficulties in the science of learning - fo… ▽ More

    Submitted 7 July, 2024; originally announced July 2024.

    Comments: published and presented in I/ITSEC 2022

  4. arXiv:2407.05467  [pdf, other

    cs.DC cs.AI

    The infrastructure powering IBM's Gen AI model development

    Authors: Talia Gershon, Seetharami Seelam, Brian Belgodere, Milton Bonilla, Lan Hoang, Danny Barnett, I-Hsin Chung, Apoorve Mohan, Ming-Hung Chen, Lixiang Luo, Robert Walkup, Constantinos Evangelinos, Shweta Salaria, Marc Dombrowa, Yoonho Park, Apo Kayi, Liran Schour, Alim Alim, Ali Sydney, Pavlos Maniotis, Laurent Schares, Bernard Metzler, Bengi Karacali-Akyamac, Sophia Wen, Tatsuhiro Chiba , et al. (121 additional authors not shown)

    Abstract: AI Infrastructure plays a key role in the speed and cost-competitiveness of develo** and deploying advanced AI models. The current demand for powerful AI infrastructure for model training is driven by the emergence of generative AI and foundational models, where on occasion thousands of GPUs must cooperate on a single training job for the model to be trained in a reasonable time. Delivering effi… ▽ More

    Submitted 7 July, 2024; originally announced July 2024.

    Comments: Corresponding Authors: Talia Gershon, Seetharami Seelam,Brian Belgodere, Milton Bonilla

  5. arXiv:2407.04976  [pdf, other

    cs.DS

    Congestion-Approximators from the Bottom Up

    Authors: Jason Li, Satish Rao, Di Wang

    Abstract: We develop a novel algorithm to construct a congestion-approximator with polylogarithmic quality on a capacitated, undirected graph in nearly-linear time. Our approach is the first *bottom-up* hierarchical construction, in contrast to previous *top-down* approaches including that of Racke, Shah, and Taubig (SODA 2014), the only other construction achieving polylogarithmic quality that is implement… ▽ More

    Submitted 6 July, 2024; originally announced July 2024.

    Comments: 46 pages

  6. arXiv:2407.04181  [pdf, other

    cs.AI cs.CL

    Orchestrating LLMs with Different Personalizations

    Authors: ** Peng Zhou, Katie Z Luo, **gwen Gu, Jason Yuan, Kilian Q. Weinberger, Wen Sun

    Abstract: This paper presents a novel approach to aligning large language models (LLMs) with individual human preferences, sometimes referred to as Reinforcement Learning from \textit{Personalized} Human Feedback (RLPHF). Given stated preferences along multiple dimensions, such as helpfulness, conciseness, or humor, the goal is to create an LLM without re-training that best adheres to this specification. St… ▽ More

    Submitted 4 July, 2024; originally announced July 2024.

  7. arXiv:2407.02723  [pdf, other

    cs.CL

    e-Health CSIRO at "Discharge Me!" 2024: Generating Discharge Summary Sections with Fine-tuned Language Models

    Authors: **ghui Liu, Aaron Nicolson, Jason Dowling, Bevan Koopman, Anthony Nguyen

    Abstract: Clinical documentation is an important aspect of clinicians' daily work and often demands a significant amount of time. The BioNLP 2024 Shared Task on Streamlining Discharge Documentation (Discharge Me!) aims to alleviate this documentation burden by automatically generating discharge summary sections, including brief hospital course and discharge instruction, which are often time-consuming to syn… ▽ More

    Submitted 2 July, 2024; originally announced July 2024.

    Comments: BioNLP @ ACL 2024

  8. arXiv:2407.02626  [pdf

    cs.DB

    The text2term tool to map free-text descriptions of biomedical terms to ontologies

    Authors: Rafael S. Gonçalves, Jason Payne, Amelia Tan, Carmen Benitez, Jamie Haddock, Robert Gentleman

    Abstract: There is an ongoing need for scalable tools to aid researchers in both retrospective and prospective standardization of discrete entity types -- such as disease names, cell types or chemicals -- that are used in metadata associated with biomedical data. When metadata are not well-structured or precise, the associated data are harder to find and are often burdensome to reuse, analyze or integrate w… ▽ More

    Submitted 2 July, 2024; originally announced July 2024.

  9. arXiv:2407.01222  [pdf, other

    cs.RO

    Deep Learning Models for Flap** Fin Unmanned Underwater Vehicle Control System Gait Optimization

    Authors: Brian Zhou, Kamal Viswanath, Jason Geder, Alisha Sharma, Julian Lee

    Abstract: The last few decades have led to the rise of research focused on propulsion and control systems for bio-inspired unmanned underwater vehicles (UUVs), which provide more maneuverable alternatives to traditional UUVs in underwater missions. Recent work has explored the use of time-series neural network surrogate models to predict thrust and power from vehicle design and fin kinematics. We develop a… ▽ More

    Submitted 1 July, 2024; originally announced July 2024.

    Comments: 28 pages, 20 figures. arXiv admin note: text overlap with arXiv:2310.14135

  10. arXiv:2407.00908  [pdf, other

    cs.CL cs.AI

    FineSurE: Fine-grained Summarization Evaluation using LLMs

    Authors: Hwanjun Song, Hang Su, Igor Shalyminov, Jason Cai, Saab Mansour

    Abstract: Automated evaluation is crucial for streamlining text summarization benchmarking and model development, given the costly and time-consuming nature of human evaluation. Traditional methods like ROUGE do not correlate well with human judgment, while recently proposed LLM-based metrics provide only summary-level assessment using Likert-scale scores. This limits deeper model analysis, e.g., we can onl… ▽ More

    Submitted 30 June, 2024; originally announced July 2024.

    Comments: Accepted at ACL 2024 (main, long)

  11. arXiv:2407.00438  [pdf, other

    cs.CV

    AI Age Discrepancy: A Novel Parameter for Frailty Assessment in Kidney Tumor Patients

    Authors: Rikhil Seshadri, Jayant Siva, Angelica Bartholomew, Clara Goebel, Gabriel Wallerstein-King, Beatriz López Morato, Nicholas Heller, Jason Scovell, Rebecca Campbell, Andrew Wood, Michal Ozery-Flato, Vesna Barros, Maria Gabrani, Michal Rosen-Zvi, Resha Tejpaul, Vidhyalakshmi Ramesh, Nikolaos Papanikolopoulos, Subodh Regmi, Ryan Ward, Robert Abouassaly, Steven C. Campbell, Erick Remer, Christopher Weight

    Abstract: Kidney cancer is a global health concern, and accurate assessment of patient frailty is crucial for optimizing surgical outcomes. This paper introduces AI Age Discrepancy, a novel metric derived from machine learning analysis of preoperative abdominal CT scans, as a potential indicator of frailty and postoperative risk in kidney cancer patients. This retrospective study of 599 patients from the 20… ▽ More

    Submitted 2 July, 2024; v1 submitted 29 June, 2024; originally announced July 2024.

    Comments: 10 pages, 3 figures, 2 tables

  12. arXiv:2406.19967  [pdf, other

    cs.CL cs.AI

    Into the Unknown: Generating Geospatial Descriptions for New Environments

    Authors: Tzuf Paz-Argaman, John Palowitch, Sayali Kulkarni, Reut Tsarfaty, Jason Baldridge

    Abstract: Similar to vision-and-language navigation (VLN) tasks that focus on bridging the gap between vision and language for embodied navigation, the new Rendezvous (RVS) task requires reasoning over allocentric spatial relationships (independent of the observer's viewpoint) using non-sequential navigation instructions and maps. However, performance substantially drops in new environments with no training… ▽ More

    Submitted 28 June, 2024; originally announced June 2024.

    Journal ref: ACL 2024 Findings

  13. arXiv:2406.19617  [pdf, ps, other

    cs.LG cs.IT math.OC

    Stochastic Zeroth-Order Optimization under Strongly Convexity and Lipschitz Hessian: Minimax Sample Complexity

    Authors: Qian Yu, Yining Wang, Baihe Huang, Qi Lei, Jason D. Lee

    Abstract: Optimization of convex functions under stochastic zeroth-order feedback has been a major and challenging question in online learning. In this work, we consider the problem of optimizing second-order smooth and strongly convex functions where the algorithm is only accessible to noisy evaluations of the objective function it queries. We provide the first tight characterization for the rate of the mi… ▽ More

    Submitted 27 June, 2024; originally announced June 2024.

  14. arXiv:2406.19538  [pdf, other

    cs.CL

    Context Matters: An Empirical Study of the Impact of Contextual Information in Temporal Question Answering Systems

    Authors: Dan Schumacher, Fatemeh Haji, Tara Grey, Niharika Bandlamudi, Nupoor Karnik, Gagana Uday Kumar, Jason Cho-Yu Chiang, Paul Rad, Nishant Vishwamitra, Anthony Rios

    Abstract: Large language models (LLMs) often struggle with temporal reasoning, crucial for tasks like historical event analysis and time-sensitive information retrieval. Despite advancements, state-of-the-art models falter in handling temporal information, especially when faced with irrelevant or noisy contexts. This paper addresses this gap by empirically examining the robustness of temporal question-answe… ▽ More

    Submitted 27 June, 2024; originally announced June 2024.

  15. arXiv:2406.17957  [pdf, other

    cs.SD cs.AI eess.AS

    Improving Robustness of LLM-based Speech Synthesis by Learning Monotonic Alignment

    Authors: Paarth Neekhara, Shehzeen Hussain, Subhankar Ghosh, Jason Li, Rafael Valle, Rohan Badlani, Boris Ginsburg

    Abstract: Large Language Model (LLM) based text-to-speech (TTS) systems have demonstrated remarkable capabilities in handling large speech datasets and generating natural speech for new speakers. However, LLM-based TTS models are not robust as the generated output can contain repeating words, missing words and mis-aligned speech (referred to as hallucinations or attention errors), especially when the text c… ▽ More

    Submitted 25 June, 2024; originally announced June 2024.

    Comments: Published as a conference paper at INTERSPEECH 2024

  16. arXiv:2406.17744  [pdf, other

    cs.CL

    Following Length Constraints in Instructions

    Authors: Weizhe Yuan, Ilia Kulikov, ** Yu, Kyunghyun Cho, Sainbayar Sukhbaatar, Jason Weston, **g Xu

    Abstract: Aligned instruction following models can better fulfill user requests than their unaligned counterparts. However, it has been shown that there is a length bias in evaluation of such models, and that training algorithms tend to exploit this bias by learning longer responses. In this work we show how to train models that can be controlled at inference time with instructions containing desired length… ▽ More

    Submitted 25 June, 2024; originally announced June 2024.

    Comments: 13 pages

  17. arXiv:2406.16955  [pdf, other

    eess.SP cs.CV cs.LG

    SRViT: Vision Transformers for Estimating Radar Reflectivity from Satellite Observations at Scale

    Authors: Jason Stock, Kyle Hilburn, Imme Ebert-Uphoff, Charles Anderson

    Abstract: We introduce a transformer-based neural network to generate high-resolution (3km) synthetic radar reflectivity fields at scale from geostationary satellite imagery. This work aims to enhance short-term convective-scale forecasts of high-impact weather events and aid in data assimilation for numerical weather prediction over the United States. Compared to convolutional approaches, which have limite… ▽ More

    Submitted 28 June, 2024; v1 submitted 20 June, 2024; originally announced June 2024.

    Comments: Published as a workshop paper at "Machine Learning for Earth System Modeling", ICML 2024; added acknowledgements and github link

  18. A multitask learning framework for leveraging subjectivity of annotators to identify misogyny

    Authors: Jason Angel, Segun Taofeek Aroyehun, Grigori Sidorov, Alexander Gelbukh

    Abstract: Identifying misogyny using artificial intelligence is a form of combating online toxicity against women. However, the subjective nature of interpreting misogyny poses a significant challenge to model the phenomenon. In this paper, we propose a multitask learning approach that leverages the subjectivity of this task to enhance the performance of the misogyny identification systems. We incorporated… ▽ More

    Submitted 22 June, 2024; originally announced June 2024.

  19. arXiv:2406.15677  [pdf, other

    cs.RO

    Open-vocabulary Pick and Place via Patch-level Semantic Maps

    Authors: Mingxi Jia, Haojie Huang, Zhewen Zhang, Chenghao Wang, Linfeng Zhao, Dian Wang, Jason Xinyu Liu, Robin Walters, Robert Platt, Stefanie Tellex

    Abstract: Controlling robots through natural language instructions in open-vocabulary scenarios is pivotal for enhancing human-robot collaboration and complex robot behavior synthesis. However, achieving this capability poses significant challenges due to the need for a system that can generalize from limited data to a wide range of tasks and environments. Existing methods rely on large, costly datasets and… ▽ More

    Submitted 21 June, 2024; originally announced June 2024.

  20. arXiv:2406.15045  [pdf, other

    cs.CL

    Harnessing Knowledge Retrieval with Large Language Models for Clinical Report Error Correction

    Authors: **ge Wu, Zhaolong Wu, Abul Hasan, Yunsoo Kim, Jason P. Y. Cheung, Teng Zhang, Honghan Wu

    Abstract: This study proposes an approach for error correction in clinical radiology reports, leveraging large language models (LLMs) and retrieval-augmented generation (RAG) techniques. The proposed framework employs internal and external retrieval mechanisms to extract relevant medical entities and relations from the report and external knowledge sources. A three-stage inference process is introduced, dec… ▽ More

    Submitted 21 June, 2024; originally announced June 2024.

  21. arXiv:2406.14864  [pdf, other

    cs.LG stat.AP stat.ML

    A review of feature selection strategies utilizing graph data structures and knowledge graphs

    Authors: Sisi Shao, Pedro Henrique Ribeiro, Christina Ramirez, Jason H. Moore

    Abstract: Feature selection in Knowledge Graphs (KGs) are increasingly utilized in diverse domains, including biomedical research, Natural Language Processing (NLP), and personalized recommendation systems. This paper delves into the methodologies for feature selection within KGs, emphasizing their roles in enhancing machine learning (ML) model efficacy, hypothesis generation, and interpretability. Through… ▽ More

    Submitted 21 June, 2024; originally announced June 2024.

  22. arXiv:2406.14769  [pdf

    cs.AI

    How critically can an AI think? A framework for evaluating the quality of thinking of generative artificial intelligence

    Authors: Luke Zaphir, Jason M. Lodge, Jacinta Lisec, Dom McGrath, Hassan Khosravi

    Abstract: Generative AI such as those with large language models have created opportunities for innovative assessment design practices. Due to recent technological developments, there is a need to know the limits and capabilities of generative AI in terms of simulating cognitive skills. Assessing student critical thinking skills has been a feature of assessment for time immemorial, but the demands of digita… ▽ More

    Submitted 20 June, 2024; originally announced June 2024.

  23. arXiv:2406.14739  [pdf, other

    cs.CL

    Learning to Retrieve Iteratively for In-Context Learning

    Authors: Yunmo Chen, Tongfei Chen, Harsh Jhamtani, Patrick Xia, Richard Shin, Jason Eisner, Benjamin Van Durme

    Abstract: We introduce iterative retrieval, a novel framework that empowers retrievers to make iterative decisions through policy optimization. Finding an optimal portfolio of retrieved items is a combinatorial optimization problem, generally considered NP-hard. This approach provides a learned approximation to such a solution, meeting specific task requirements under a given family of large language models… ▽ More

    Submitted 20 June, 2024; originally announced June 2024.

  24. Low-Step Multi-Commodity Flow Emulators

    Authors: Bernhard Haeupler, D Ellis Hershkowitz, Jason Li, Antti Roeyskoe, Thatchaphol Saranurak

    Abstract: We introduce the concept of low-step multi-commodity flow emulators for any undirected, capacitated graph. At a high level, these emulators contain approximate multi-commodity flows whose paths contain a small number of edges, shattering the infamous flow decomposition barrier for multi-commodity flow. We prove the existence of low-step multi-commodity flow emulators and develop efficient algori… ▽ More

    Submitted 20 June, 2024; originally announced June 2024.

    Comments: Appears at STOC 2024

  25. arXiv:2406.13231  [pdf, other

    cs.DS

    Tight Lower Bounds for Directed Cut Sparsification and Distributed Min-Cut

    Authors: Yu Cheng, Max Li, Honghao Lin, Zi-Yi Tai, David P. Woodruff, Jason Zhang

    Abstract: In this paper, we consider two fundamental cut approximation problems on large graphs. We prove new lower bounds for both problems that are optimal up to logarithmic factors. The first problem is to approximate cuts in balanced directed graphs. In this problem, the goal is to build a data structure that $(1 \pm ε)$-approximates cut values in graphs with $n$ vertices. For arbitrary directed graph… ▽ More

    Submitted 19 June, 2024; originally announced June 2024.

  26. arXiv:2406.13181  [pdf, other

    cs.CV

    The Impact of Auxiliary Patient Data on Automated Chest X-Ray Report Generation and How to Incorporate It

    Authors: Aaron Nicolson, Shengyao Zhuang, Jason Dowling, Bevan Koopman

    Abstract: This study investigates the integration of diverse patient data sources into multimodal language models for automated chest X-ray (CXR) report generation. Traditionally, CXR report generation relies solely on CXR images and limited radiology data, overlooking valuable information from patient health records, particularly from emergency departments. Utilising the MIMIC-CXR and MIMIC-IV-ED datasets,… ▽ More

    Submitted 18 June, 2024; originally announced June 2024.

  27. arXiv:2406.12881  [pdf, other

    physics.acc-ph cs.CL

    Towards Unlocking Insights from Logbooks Using AI

    Authors: Antonin Sulc, Alex Bien, Annika Eichler, Daniel Ratner, Florian Rehm, Frank Mayet, Gregor Hartmann, Hayden Hoschouer, Henrik Tuennermann, Jan Kaiser, Jason St. John, Jennefer Maldonado, Kyle Hazelwood, Raimund Kammering, Thorsten Hellert, Tim Wilksen, Verena Kain, Wan-Lin Hu

    Abstract: Electronic logbooks contain valuable information about activities and events concerning their associated particle accelerator facilities. However, the highly technical nature of logbook entries can hinder their usability and automation. As natural language processing (NLP) continues advancing, it offers opportunities to address various challenges that logbooks present. This work explores jointly t… ▽ More

    Submitted 25 May, 2024; originally announced June 2024.

    Comments: 5 pages, 1 figure, 15th International Particle Accelerator Conference

  28. arXiv:2406.12150  [pdf, other

    cs.LG cs.AI

    ChaosMining: A Benchmark to Evaluate Post-Hoc Local Attribution Methods in Low SNR Environments

    Authors: Ge Shi, Ziwen Kan, Jason Smucny, Ian Davidson

    Abstract: In this study, we examine the efficacy of post-hoc local attribution methods in identifying features with predictive power from irrelevant ones in domains characterized by a low signal-to-noise ratio (SNR), a common scenario in real-world machine learning applications. We developed synthetic datasets encompassing symbolic functional, image, and audio data, incorporating a benchmark on the {\it (Mo… ▽ More

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: 19 pages, 10 figures, submission to Neurips 2024

  29. arXiv:2406.12006  [pdf, other

    cs.NE

    Lexidate: Model Evaluation and Selection with Lexicase

    Authors: Jose Guadalupe Hernandez, Anil Kumar Saini, Jason H. Moore

    Abstract: Automated machine learning streamlines the task of finding effective machine learning pipelines by automating model training, evaluation, and selection. Traditional evaluation strategies, like cross-validation (CV), generate one value that averages the accuracy of a pipeline's predictions. This single value, however, may not fully describe the generalizability of the pipeline. Here, we present Lex… ▽ More

    Submitted 17 June, 2024; originally announced June 2024.

  30. arXiv:2406.11880  [pdf, other

    cs.CR cs.LG

    Knowledge Return Oriented Prompting (KROP)

    Authors: Jason Martin, Kenneth Yeung

    Abstract: Many Large Language Models (LLMs) and LLM-powered apps deployed today use some form of prompt filter or alignment to protect their integrity. However, these measures aren't foolproof. This paper introduces KROP, a prompt injection technique capable of obfuscating prompt injection attacks, rendering them virtually undetectable to most of these security measures.

    Submitted 11 June, 2024; originally announced June 2024.

  31. arXiv:2406.11779  [pdf, other

    cs.LG cs.LO

    Compact Proofs of Model Performance via Mechanistic Interpretability

    Authors: Jason Gross, Rajashree Agrawal, Thomas Kwa, Euan Ong, Chun Hei Yip, Alex Gibson, Soufiane Noubir, Lawrence Chan

    Abstract: We propose using mechanistic interpretability -- techniques for reverse engineering model weights into human-interpretable algorithms -- to derive and compactly prove formal guarantees on model performance. We prototype this approach by formally proving lower bounds on the accuracy of 151 small transformers trained on a Max-of-$K$ task. We create 102 different computer-assisted proof strategies an… ▽ More

    Submitted 7 July, 2024; v1 submitted 17 June, 2024; originally announced June 2024.

    Comments: accepted to ICML 2024 Workshop on Mechanistic Interpretability (Spotlight)

  32. arXiv:2406.11704  [pdf, other

    cs.CL cs.AI cs.LG

    Nemotron-4 340B Technical Report

    Authors: Nvidia, :, Bo Adler, Niket Agarwal, Ashwath Aithal, Dong H. Anh, Pallab Bhattacharya, Annika Brundyn, Jared Casper, Bryan Catanzaro, Sharon Clay, Jonathan Cohen, Sirshak Das, Ayush Dattagupta, Olivier Delalleau, Leon Derczynski, Yi Dong, Daniel Egert, Ellie Evans, Aleksander Ficek, Denys Fridman, Shaona Ghosh, Boris Ginsburg, Igor Gitman, Tomasz Grzegorzek , et al. (58 additional authors not shown)

    Abstract: We release the Nemotron-4 340B model family, including Nemotron-4-340B-Base, Nemotron-4-340B-Instruct, and Nemotron-4-340B-Reward. Our models are open access under the NVIDIA Open Model License Agreement, a permissive model license that allows distribution, modification, and use of the models and its outputs. These models perform competitively to open access models on a wide range of evaluation be… ▽ More

    Submitted 17 June, 2024; originally announced June 2024.

  33. arXiv:2406.10290  [pdf, other

    cs.CL cs.AI cs.LG

    MobileAIBench: Benchmarking LLMs and LMMs for On-Device Use Cases

    Authors: Rithesh Murthy, Liangwei Yang, Juntao Tan, Tulika Manoj Awalgaonkar, Yilun Zhou, Shelby Heinecke, Sachin Desai, Jason Wu, Ran Xu, Sarah Tan, Jianguo Zhang, Zhiwei Liu, Shirley Kokane, Zuxin Liu, Ming Zhu, Huan Wang, Caiming Xiong, Silvio Savarese

    Abstract: The deployment of Large Language Models (LLMs) and Large Multimodal Models (LMMs) on mobile devices has gained significant attention due to the benefits of enhanced privacy, stability, and personalization. However, the hardware constraints of mobile devices necessitate the use of models with fewer parameters and model compression techniques like quantization. Currently, there is limited understand… ▽ More

    Submitted 12 June, 2024; originally announced June 2024.

  34. arXiv:2406.10215  [pdf, other

    cs.CL cs.LG

    DevBench: A multimodal developmental benchmark for language learning

    Authors: Alvin Wei Ming Tan, Sunny Yu, Bria Long, Wan**g Anya Ma, Tonya Murray, Rebecca D. Silverman, Jason D. Yeatman, Michael C. Frank

    Abstract: How (dis)similar are the learning trajectories of vision-language models and children? Recent modeling work has attempted to understand the gap between models' and humans' data efficiency by constructing models trained on less data, especially multimodal naturalistic data. However, such models are often evaluated on adult-level benchmarks, with limited breadth in language abilities tested, and wit… ▽ More

    Submitted 14 June, 2024; originally announced June 2024.

  35. arXiv:2406.10211  [pdf, other

    cs.CV

    DiffusionBlend: Learning 3D Image Prior through Position-aware Diffusion Score Blending for 3D Computed Tomography Reconstruction

    Authors: Bowen Song, Jason Hu, Zhaoxu Luo, Jeffrey A. Fessler, Liyue Shen

    Abstract: Diffusion models face significant challenges when employed for large-scale medical image reconstruction in real practice such as 3D Computed Tomography (CT). Due to the demanding memory, time, and data requirements, it is difficult to train a diffusion model directly on the entire volume of high-dimensional data to obtain an efficient 3D diffusion prior. Existing works utilizing diffusion priors o… ▽ More

    Submitted 14 June, 2024; originally announced June 2024.

  36. arXiv:2406.09750  [pdf, other

    cs.CV cs.AI

    ControlVAR: Exploring Controllable Visual Autoregressive Modeling

    Authors: Xiang Li, Kai Qiu, Hao Chen, Jason Kuen, Zhe Lin, Rita Singh, Bhiksha Raj

    Abstract: Conditional visual generation has witnessed remarkable progress with the advent of diffusion models (DMs), especially in tasks like control-to-image generation. However, challenges such as expensive computational cost, high inference latency, and difficulties of integration with large language models (LLMs) have necessitated exploring alternatives to DMs. This paper introduces ControlVAR, a novel… ▽ More

    Submitted 14 June, 2024; originally announced June 2024.

    Comments: 24 pages, 19 figures, 4 tables

  37. arXiv:2406.09606  [pdf, other

    cs.LG cs.AI cs.AR

    Cross-Modality Program Representation Learning for Electronic Design Automation with High-Level Synthesis

    Authors: Zongyue Qin, Yunsheng Bai, Atefeh Sohrabizadeh, Zijian Ding, Ziniu Hu, Yizhou Sun, Jason Cong

    Abstract: In recent years, domain-specific accelerators (DSAs) have gained popularity for applications such as deep learning and autonomous driving. To facilitate DSA designs, programmers use high-level synthesis (HLS) to compile a high-level description written in C/C++ into a design with low-level hardware description languages that eventually synthesize DSAs on circuits. However, creating a high-quality… ▽ More

    Submitted 27 June, 2024; v1 submitted 13 June, 2024; originally announced June 2024.

    Comments: 14 pages, 8 figures. arXiv admin note: text overlap with arXiv:2305.10838

  38. arXiv:2406.09579  [pdf, other

    cs.HC

    Hovering Over the Key to Text Input in XR

    Authors: Mar Gonzalez-Franco, Diar Abdlkarim, Arpit Bhatia, Stuart Macgregor, Jason Alexander Fotso-Puepi, Eric J Gonzalez, Hasti Seifi, Massimiliano Di Luca, Karan Ahuja

    Abstract: Virtual, Mixed, and Augmented Reality (XR) technologies hold immense potential for transforming productivity beyond PC. Therefore there is a critical need for improved text input solutions for XR. However, achieving efficient text input in these environments remains a significant challenge. This paper examines the current landscape of XR text input techniques, focusing on the importance of keyboar… ▽ More

    Submitted 13 June, 2024; originally announced June 2024.

  39. arXiv:2406.09363  [pdf, ps, other

    cs.AI cs.GT cs.LG

    ElicitationGPT: Text Elicitation Mechanisms via Language Models

    Authors: Yifan Wu, Jason Hartline

    Abstract: Scoring rules evaluate probabilistic forecasts of an unknown state against the realized state and are a fundamental building block in the incentivized elicitation of information and the training of machine learning models. This paper develops mechanisms for scoring elicited text against ground truth text using domain-knowledge-free queries to a large language model (specifically ChatGPT) and empir… ▽ More

    Submitted 18 June, 2024; v1 submitted 13 June, 2024; originally announced June 2024.

  40. arXiv:2406.09103  [pdf, other

    cs.CL

    Chain-of-Though (CoT) prompting strategies for medical error detection and correction

    Authors: Zhaolong Wu, Abul Hasan, **ge Wu, Yunsoo Kim, Jason P. Y. Cheung, Teng Zhang, Honghan Wu

    Abstract: This paper describes our submission to the MEDIQA-CORR 2024 shared task for automatically detecting and correcting medical errors in clinical notes. We report results for three methods of few-shot In-Context Learning (ICL) augmented with Chain-of-Thought (CoT) and reason prompts using a large language model (LLM). In the first method, we manually analyse a subset of train and validation dataset to… ▽ More

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: accepted as NAACL workshop

  41. arXiv:2406.08466  [pdf, other

    cs.LG cs.AI math.ST stat.ML

    Scaling Laws in Linear Regression: Compute, Parameters, and Data

    Authors: Licong Lin, **gfeng Wu, Sham M. Kakade, Peter L. Bartlett, Jason D. Lee

    Abstract: Empirically, large-scale deep learning models often satisfy a neural scaling law: the test error of the trained model improves polynomially as the model size and data size grow. However, conventional wisdom suggests the test error consists of approximation, bias, and variance errors, where the variance error increases with model size. This disagrees with the general form of neural scaling laws, wh… ▽ More

    Submitted 12 June, 2024; originally announced June 2024.

  42. arXiv:2406.08205  [pdf, other

    cs.SE cs.LG

    What do we know about Hugging Face? A systematic literature review and quantitative validation of qualitative claims

    Authors: Jason Jones, Wenxin Jiang, Nicholas Synovic, George K. Thiruvathukal, James C. Davis

    Abstract: Background: Collaborative Software Package Registries (SPRs) are an integral part of the software supply chain. Much engineering work synthesizes SPR package into applications. Prior research has examined SPRs for traditional software, such as NPM (JavaScript) and PyPI (Python). Pre-Trained Model (PTM) Registries are an emerging class of SPR of increasing importance, because they support the deep… ▽ More

    Submitted 12 June, 2024; originally announced June 2024.

  43. arXiv:2406.07813  [pdf, other

    eess.IV cs.CV

    Evaluating the Impact of Sequence Combinations on Breast Tumor Segmentation in Multiparametric MRI

    Authors: Hang Min, Gorane Santamaria Hormaechea, Prabhakar Ramachandran, Jason Dowling

    Abstract: Multiparametric magnetic resonance imaging (mpMRI) is a key tool for assessing breast cancer progression. Although deep learning has been applied to automate tumor segmentation in breast MRI, the effect of sequence combinations in mpMRI remains under-investigated. This study explores the impact of different combinations of T2-weighted (T2w), dynamic contrast-enhanced MRI (DCE-MRI) and diffusion-we… ▽ More

    Submitted 11 June, 2024; originally announced June 2024.

  44. arXiv:2406.07739  [pdf, other

    cs.CL cs.HC cs.SE

    UICoder: Finetuning Large Language Models to Generate User Interface Code through Automated Feedback

    Authors: Jason Wu, Eldon Schoop, Alan Leung, Titus Barik, Jeffrey P. Bigham, Jeffrey Nichols

    Abstract: Large language models (LLMs) struggle to consistently generate UI code that compiles and produces visually relevant designs. Existing approaches to improve generation rely on expensive human feedback or distilling a proprietary model. In this paper, we explore the use of automated feedback (compilers and multi-modal models) to guide LLMs to generate high-quality UI code. Our method starts with an… ▽ More

    Submitted 11 June, 2024; originally announced June 2024.

    Comments: Accepted to NAACL 2024

  45. arXiv:2406.06893  [pdf, other

    stat.ML cs.IT cs.LG

    Transformers Provably Learn Sparse Token Selection While Fully-Connected Nets Cannot

    Authors: Zixuan Wang, Stanley Wei, Daniel Hsu, Jason D. Lee

    Abstract: The transformer architecture has prevailed in various deep learning settings due to its exceptional capabilities to select and compose structural information. Motivated by these capabilities, Sanford et al. proposed the sparse token selection task, in which transformers excel while fully-connected networks (FCNs) fail in the worst case. Building upon that, we strengthen the FCN lower bound to an a… ▽ More

    Submitted 10 June, 2024; originally announced June 2024.

  46. arXiv:2406.06796  [pdf, other

    cs.CV cs.AI cs.LG cs.RO eess.SP

    FlexLoc: Conditional Neural Networks for Zero-Shot Sensor Perspective Invariance in Object Localization with Distributed Multimodal Sensors

    Authors: Jason Wu, Ziqi Wang, Xiaomin Ouyang, Ho Lyun Jeong, Colin Samplawski, Lance Kaplan, Benjamin Marlin, Mani Srivastava

    Abstract: Localization is a critical technology for various applications ranging from navigation and surveillance to assisted living. Localization systems typically fuse information from sensors viewing the scene from different perspectives to estimate the target location while also employing multiple modalities for enhanced robustness and accuracy. Recently, such systems have employed end-to-end deep neura… ▽ More

    Submitted 10 June, 2024; originally announced June 2024.

  47. arXiv:2406.06512  [pdf, other

    cs.CV cs.AI

    Merlin: A Vision Language Foundation Model for 3D Computed Tomography

    Authors: Louis Blankemeier, Joseph Paul Cohen, Ashwin Kumar, Dave Van Veen, Syed Jamal Safdar Gardezi, Magdalini Paschali, Zhihong Chen, Jean-Benoit Delbrouck, Eduardo Reis, Cesar Truyts, Christian Bluethgen, Malte Engmann Kjeldskov Jensen, Sophie Ostmeier, Maya Varma, Jeya Maria Jose Valanarasu, Zhongnan Fang, Zepeng Huo, Zaid Nabulsi, Diego Ardila, Wei-Hung Weng, Edson Amaro Junior, Neera Ahuja, Jason Fries, Nigam H. Shah, Andrew Johnston , et al. (6 additional authors not shown)

    Abstract: Over 85 million computed tomography (CT) scans are performed annually in the US, of which approximately one quarter focus on the abdomen. Given the current radiologist shortage, there is a large impetus to use artificial intelligence to alleviate the burden of interpreting these complex imaging studies. Prior state-of-the-art approaches for automated medical image interpretation leverage vision la… ▽ More

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: 18 pages, 7 figures

  48. arXiv:2406.06433  [pdf, other

    cs.LG cs.AI

    DISCO: An End-to-End Bandit Framework for Personalised Discount Allocation

    Authors: Jason Shuo Zhang, Benjamin Howson, Panayiota Savva, Eleanor Loh

    Abstract: Personalised discount codes provide a powerful mechanism for managing customer relationships and operational spend in e-commerce. Bandits are well suited for this product area, given the partial information nature of the problem, as well as the need for adaptation to the changing business environment. Here, we introduce DISCO, an end-to-end contextual bandit framework for personalised discount cod… ▽ More

    Submitted 12 June, 2024; v1 submitted 10 June, 2024; originally announced June 2024.

    Comments: Accepted at ECML/PKDD 2024

  49. arXiv:2406.05588  [pdf, other

    cs.CL cs.AI cs.LG

    CERET: Cost-Effective Extrinsic Refinement for Text Generation

    Authors: Jason Cai, Hang Su, Monica Sunkara, Igor Shalyminov, Saab Mansour

    Abstract: Large Language Models (LLMs) are powerful models for generation tasks, but they may not generate good quality outputs in their first attempt. Apart from model fine-tuning, existing approaches to improve prediction accuracy and quality typically involve LLM self-improvement / self-reflection that incorporate feedback from models themselves. Despite their effectiveness, these methods are hindered by… ▽ More

    Submitted 8 June, 2024; originally announced June 2024.

    Comments: The source code and data samples are released at https://github.com/amazon-science/CERET-LLM-refine

  50. arXiv:2406.05490  [pdf, other

    cs.DC

    Beatnik: A Novel Global Communication Mini-Application

    Authors: Jason A. Stewart, Patrick G. Bridges

    Abstract: Beatnik is a novel open source mini-application that exercises the complex communication patterns often found in production codes but rarely found in benchmarks or mini-applications. It simulates 3D Raleigh-Taylor instabilities based on Pandya and Shkoller's Z-Model formulation using the Cabana performance portability framework. This paper presents both the high-level design and important implemen… ▽ More

    Submitted 8 June, 2024; originally announced June 2024.