
Showing 1–50 of 424 results for author: Loo, K

  1. arXiv:2406.18219  [pdf, other]

    cs.CL cs.LG

    A Closer Look into Mixture-of-Experts in Large Language Models

    Authors: Ka Man Lo, Zeyu Huang, Zihan Qiu, Zili Wang, Jie Fu

    Abstract: Mixture-of-experts (MoE) is gaining increasing attention due to its unique properties and remarkable performance, especially for language tasks. By sparsely activating a subset of parameters for each token, MoE architecture could increase the model size without sacrificing computational efficiency, achieving a better trade-off between performance and training costs. However, the underlying mechani…

    Submitted 26 June, 2024; originally announced June 2024.

  2. arXiv:2406.16746  [pdf, other]

    cs.LG cs.AI cs.CL

    The Responsible Foundation Model Development Cheatsheet: A Review of Tools & Resources

    Authors: Shayne Longpre, Stella Biderman, Alon Albalak, Hailey Schoelkopf, Daniel McDuff, Sayash Kapoor, Kevin Klyman, Kyle Lo, Gabriel Ilharco, Nay San, Maribeth Rauh, Aviya Skowron, Bertie Vidgen, Laura Weidinger, Arvind Narayanan, Victor Sanh, David Adelani, Percy Liang, Rishi Bommasani, Peter Henderson, Sasha Luccioni, Yacine Jernite, Luca Soldaini

    Abstract: Foundation model development attracts a rapidly expanding body of contributors, scientists, and applications. To help shape responsible development practices, we introduce the Foundation Model Development Cheatsheet: a growing collection of 250+ tools and resources spanning text, vision, and speech modalities. We draw on a large body of prior work to survey resources (e.g. software, documentation,…

    Submitted 25 June, 2024; v1 submitted 24 June, 2024; originally announced June 2024.

  3. arXiv:2406.16264  [pdf, other]

    cs.CL cs.AI

    One Thousand and One Pairs: A "novel" challenge for long-context language models

    Authors: Marzena Karpinska, Katherine Thai, Kyle Lo, Tanya Goyal, Mohit Iyyer

    Abstract: Synthetic long-context LLM benchmarks (e.g., "needle-in-the-haystack") test only surface-level retrieval capabilities, but how well can long-context LLMs retrieve, synthesize, and reason over information across book-length inputs? We address this question by creating NoCha, a dataset of 1,001 minimally different pairs of true and false claims about 67 recently-published English fictional books, wr…

    Submitted 23 June, 2024; originally announced June 2024.

    Comments: preprint, 29 pages

  4. arXiv:2406.11794  [pdf, other]

    cs.LG cs.CL

    DataComp-LM: In search of the next generation of training sets for language models

    Authors: Jeffrey Li, Alex Fang, Georgios Smyrnis, Maor Ivgi, Matt Jordan, Samir Gadre, Hritik Bansal, Etash Guha, Sedrick Keh, Kushal Arora, Saurabh Garg, Rui Xin, Niklas Muennighoff, Reinhard Heckel, Jean Mercat, Mayee Chen, Suchin Gururangan, Mitchell Wortsman, Alon Albalak, Yonatan Bitton, Marianna Nezhurina, Amro Abbas, Cheng-Yu Hsieh, Dhruba Ghosh, Josh Gardner, et al. (34 additional authors not shown)

    Abstract: We introduce DataComp for Language Models (DCLM), a testbed for controlled dataset experiments with the goal of improving language models. As part of DCLM, we provide a standardized corpus of 240T tokens extracted from Common Crawl, effective pretraining recipes based on the OpenLM framework, and a broad suite of 53 downstream evaluations. Participants in the DCLM benchmark can experiment with dat…

    Submitted 20 June, 2024; v1 submitted 17 June, 2024; originally announced June 2024.

    Comments: Project page: https://www.datacomp.ai/dclm/

  5. arXiv:2406.07835  [pdf, other]

    cs.CL cs.AI

    SciRIFF: A Resource to Enhance Language Model Instruction-Following over Scientific Literature

    Authors: David Wadden, Kejian Shi, Jacob Morrison, Aakanksha Naik, Shruti Singh, Nitzan Barzilay, Kyle Lo, Tom Hope, Luca Soldaini, Shannon Zejiang Shen, Doug Downey, Hannaneh Hajishirzi, Arman Cohan

    Abstract: We present SciRIFF (Scientific Resource for Instruction-Following and Finetuning), a dataset of 137K instruction-following demonstrations for 54 tasks covering five essential scientific literature understanding capabilities: information extraction, summarization, question answering, claim verification, and classification. SciRIFF demonstrations are notable for their long input contexts, detailed t…

    Submitted 18 June, 2024; v1 submitted 10 June, 2024; originally announced June 2024.

    Comments: Submitted to NeurIPS Datasets and Benchmarks 2024

  6. arXiv:2406.04398  [pdf, other]

    astro-ph.GA

    lenscat: a Public and Community-Contributed Catalog of Known Strong Gravitational Lenses

    Authors: L. Vujeva, R. K. L. Lo, J. M. Ezquiaga, J. C. L. Chan

    Abstract: We present lenscat, a public and community-contributed catalog of strong gravitational lenses found by electromagnetic surveys. The main objective of lenscat is to compile a simple, easy-to-access catalog that can be used in a variety of lensing studies, such as facilitating the search for the host galaxy of a candidate strongly lensed transient event. We also provide a python package to interact…

    Submitted 6 June, 2024; originally announced June 2024.

    Comments: 7 pages, 2 figures

  7. arXiv:2405.18954  [pdf, other]

    math.AP math.OC

    Determining state space anomalies in mean field games

    Authors: Hongyu Liu, Catharine W. K. Lo

    Abstract: In this paper, we are concerned with the inverse problem of determining anomalies in the state space associated with the stationary mean field game (MFG) system. We establish novel unique identifiability results for the intrinsic structure of these anomalies in mean field games systems, including their topological structure and parameter configurations, in several general scenarios of practical in…

    Submitted 29 May, 2024; originally announced May 2024.

    Comments: Keywords: Stationary mean field games, inverse boundary problems, anomalies in state space, singularities, uniqueness

    MSC Class: Primary 35R30; secondary 35Q89; 91A16; 35R35

  8. arXiv:2405.18943  [pdf, other]

    math.AP math.OC

    Decoding a mean field game by the Cauchy data around its unknown stationary states

    Authors: Hongyu Liu, Catharine W. K. Lo, Shen Zhang

    Abstract: In recent years, mean field games (MFGs) have garnered considerable attention and emerged as a dynamic and actively researched field across various domains, including economics, social sciences, finance, and transportation. The inverse design and decoding of MFGs offer valuable means to extract information from observed data and gain insights into the intricate underlying dynamics and strategies o…

    Submitted 29 May, 2024; originally announced May 2024.

    Comments: Keywords: Mean field games, inverse problems, Cauchy data, unique continuation principle, unique identifiability

    MSC Class: Primary 35Q89; 35R30; secondary 91A16; 35R35

  9. arXiv:2405.17792  [pdf, other]

    hep-ex hep-ph

    JUNO Sensitivity to Invisible Decay Modes of Neutrons

    Authors: JUNO Collaboration, Angel Abusleme, Thomas Adam, Kai Adamowicz, Shakeel Ahmad, Rizwan Ahmed, Sebastiano Aiello, Fengpeng An, Qi An, Giuseppe Andronico, Nikolay Anfimov, Vito Antonelli, Tatiana Antoshkina, João Pedro Athayde Marcondes de André, Didier Auguste, Weidong Bai, Nikita Balashov, Wander Baldini, Andrea Barresi, Davide Basilico, Eric Baussan, Marco Bellato, Marco Beretta, Antonio Bergnoli, Daniel Bick, et al. (635 additional authors not shown)

    Abstract: We explore the bound neutrons decay into invisible particles (e.g., $n\rightarrow 3 ν$ or $nn \rightarrow 2 ν$) in the JUNO liquid scintillator detector. The invisible decay includes two decay modes: $ n \rightarrow { inv} $ and $ nn \rightarrow { inv} $. The invisible decays of $s$-shell neutrons in $^{12}{\rm C}$ will leave a highly excited residual nucleus. Subsequently, some de-excitation mode…

    Submitted 27 May, 2024; originally announced May 2024.

    Comments: 28 pages, 7 figures, 4 tables

  10. arXiv:2405.17014  [pdf, other]

    math.AP

    On the Obstacle Problem in Fractional Generalised Orlicz Spaces

    Authors: Catharine W. K. Lo, José Francisco Rodrigues

    Abstract: We consider the one and the two obstacles problems for the nonlocal nonlinear anisotropic $g$-Laplacian $\mathcal{L}_g^s$, with $0<s<1$. We prove the strict T-monotonicity of $\mathcal{L}_g^s$ and we obtain the Lewy-Stampacchia inequalities. We consider the approximation of the solutions through semilinear problems, for which we prove a global $L^\infty$-estimate, and we extend the local Hölder re…

    Submitted 27 May, 2024; originally announced May 2024.

  11. arXiv:2405.10467  [pdf, other]

    cs.AI cs.SE

    Agent Design Pattern Catalogue: A Collection of Architectural Patterns for Foundation Model based Agents

    Authors: Yue Liu, Sin Kit Lo, Qinghua Lu, Liming Zhu, Dehai Zhao, Xiwei Xu, Stefan Harrer, Jon Whittle

    Abstract: Foundation model-enabled generative artificial intelligence facilitates the development and implementation of agents, which can leverage distinguished reasoning and language processing capabilities to take a proactive, autonomous role to pursue users' goals. Nevertheless, there is a lack of systematic knowledge to guide practitioners in designing the agents considering challenges of goal-seeking…

    Submitted 24 June, 2024; v1 submitted 16 May, 2024; originally announced May 2024.

  12. arXiv:2404.09837  [pdf, other]

    math.AP q-bio.PE

    On inverse problems in multi-population aggregation models

    Authors: Yuhan Li, Hongyu Liu, Catharine W. K. Lo

    Abstract: This paper focuses on inverse problems arising in studying multi-population aggregations. The goal is to reconstruct the diffusion coefficient, advection coefficient, and interaction kernels of the aggregation system, which characterize the dynamics of different populations. In the theoretical analysis of the physical setup, it is crucial to ensure non-negativity of solutions. To address this, we…

    Submitted 15 April, 2024; originally announced April 2024.

    Comments: 29 pages, Keywords: inverse multi-population aggregation model, positive solutions, unique identifiability, transformative asymptotic technique, high-order variation method

    MSC Class: 35R30; 35B09; 35K45; 35Q92; 92-10; 92D25; 92D50; 35B10; 35C20

  13. arXiv:2404.09290  [pdf, other]

    cs.CV eess.IV

    RoofDiffusion: Constructing Roofs from Severely Corrupted Point Data via Diffusion

    Authors: Kyle Shih-Huang Lo, Jörg Peters, Eric Spellman

    Abstract: Accurate completion and denoising of roof height maps are crucial to reconstructing high-quality 3D buildings. Repairing sparse points can enhance low-cost sensor use and reduce UAV flight overlap. RoofDiffusion is a new end-to-end self-supervised diffusion technique for robustly completing, in particular difficult, roof height maps. RoofDiffusion leverages widely-available curated footprints and…

    Submitted 14 April, 2024; originally announced April 2024.

  14. arXiv:2404.06393  [pdf, other]

    cs.SD cs.AI eess.AS

    MuPT: A Generative Symbolic Music Pretrained Transformer

    Authors: Xingwei Qu, Yuelin Bai, Yinghao Ma, Ziya Zhou, Ka Man Lo, Jiaheng Liu, Ruibin Yuan, Lejun Min, Xueling Liu, Tianyu Zhang, Xinrun Du, Shuyue Guo, Yiming Liang, Yizhi Li, Shangda Wu, Junting Zhou, Tianyu Zheng, Ziyang Ma, Fengze Han, Wei Xue, Gus Xia, Emmanouil Benetos, Xiang Yue, Chenghua Lin, Xu Tan, et al. (4 additional authors not shown)

    Abstract: In this paper, we explore the application of Large Language Models (LLMs) to the pre-training of music. While the prevalent use of MIDI in music modeling is well-established, our findings suggest that LLMs are inherently more compatible with ABC Notation, which aligns more closely with their design and strengths, thereby enhancing the model's performance in musical composition. To address the chal…

    Submitted 10 April, 2024; v1 submitted 9 April, 2024; originally announced April 2024.

  15. arXiv:2404.01261  [pdf, other]

    cs.CL cs.AI

    FABLES: Evaluating faithfulness and content selection in book-length summarization

    Authors: Yekyung Kim, Yapei Chang, Marzena Karpinska, Aparna Garimella, Varun Manjunatha, Kyle Lo, Tanya Goyal, Mohit Iyyer

    Abstract: While long-context large language models (LLMs) can technically summarize book-length documents (>100K tokens), the length and complexity of the documents have so far prohibited evaluations of input-dependent aspects like faithfulness. In this paper, we conduct the first large-scale human evaluation of faithfulness and content selection on LLM-generated summaries of fictional books. Our study miti…

    Submitted 1 April, 2024; originally announced April 2024.

    Comments: preprint - 39 pages

  16. arXiv:2403.15246  [pdf, other]

    cs.IR cs.CL cs.LG

    FollowIR: Evaluating and Teaching Information Retrieval Models to Follow Instructions

    Authors: Orion Weller, Benjamin Chang, Sean MacAvaney, Kyle Lo, Arman Cohan, Benjamin Van Durme, Dawn Lawrie, Luca Soldaini

    Abstract: Modern Language Models (LMs) are capable of following long and complex instructions that enable a large and diverse set of user requests. While Information Retrieval (IR) models use these LMs as the backbone of their architectures, virtually none of them allow users to provide detailed instructions alongside queries, thus limiting their ability to satisfy complex information needs. In this work, w…

    Submitted 7 May, 2024; v1 submitted 22 March, 2024; originally announced March 2024.

  17. arXiv:2403.14371  [pdf]

    cs.LG cs.AI cs.DC

    Loop Improvement: An Efficient Approach for Extracting Shared Features from Heterogeneous Data without Central Server

    Authors: Fei Li, Chu Kiong Loo, Wei Shiung Liew, Xiaofeng Liu

    Abstract: In federated learning, data heterogeneity significantly impacts performance. A typical solution involves segregating these parameters into shared and personalized components, a concept also relevant in multi-task learning. Addressing this, we propose "Loop Improvement" (LI), a novel method enhancing this separation and feature extraction without necessitating a central server or data interchange a…

    Submitted 21 March, 2024; originally announced March 2024.

    Comments: 11 pages, 11 figures

  18. arXiv:2403.04979  [pdf, other]

    cs.HC

    Know Your Audience: The benefits and pitfalls of generating plain language summaries beyond the "general" audience

    Authors: Tal August, Kyle Lo, Noah A. Smith, Katharina Reinecke

    Abstract: Language models (LMs) show promise as tools for communicating science to the general public by simplifying and summarizing complex language. Because models can be prompted to generate text for a specific audience (e.g., college-educated adults), LMs might be used to create multiple versions of plain language summaries for people with different familiarities of scientific topics. However, it is not…

    Submitted 7 March, 2024; originally announced March 2024.

  19. arXiv:2403.03866  [pdf, other]

    cs.CL

    KIWI: A Dataset of Knowledge-Intensive Writing Instructions for Answering Research Questions

    Authors: Fangyuan Xu, Kyle Lo, Luca Soldaini, Bailey Kuehl, Eunsol Choi, David Wadden

    Abstract: Large language models (LLMs) adapted to follow user instructions are now widely deployed as conversational agents. In this work, we examine one increasingly common instruction-following task: providing writing assistance to compose a long-form answer. To evaluate the capabilities of current LLMs on this task, we construct KIWI, a dataset of knowledge-intensive writing instructions in the scientifi…

    Submitted 6 March, 2024; originally announced March 2024.

  20. arXiv:2403.03004  [pdf, other]

    astro-ph.CO gr-qc hep-ph

    Ultralight vector dark matter search using data from the KAGRA O3GK run

    Authors: The LIGO Scientific Collaboration, the Virgo Collaboration, the KAGRA Collaboration, A. G. Abac, R. Abbott, H. Abe, I. Abouelfettouh, F. Acernese, K. Ackley, C. Adamcewicz, S. Adhicary, N. Adhikari, R. X. Adhikari, V. K. Adkins, V. B. Adya, C. Affeldt, D. Agarwal, M. Agathos, O. D. Aguiar, I. Aguilar, L. Aiello, A. Ain, P. Ajith, T. Akutsu, S. Albanesi, et al. (1778 additional authors not shown)

    Abstract: Among the various candidates for dark matter (DM), ultralight vector DM can be probed by laser interferometric gravitational wave detectors through the measurement of oscillating length changes in the arm cavities. In this context, KAGRA has a unique feature due to differing compositions of its mirrors, enhancing the signal of vector DM in the length change in the auxiliary channels. Here we prese…

    Submitted 5 March, 2024; originally announced March 2024.

    Comments: 20 pages, 5 figures

    Report number: LIGO-P2300250

  21. arXiv:2402.18106  [pdf, other]

    math.AP

    On the Stability of the $s$-Nonlocal $p$-Obstacle Problem and their Coincidence Sets and Free Boundaries

    Authors: Catharine W. K. Lo, José Francisco Rodrigues

    Abstract: We show that the solutions to the nonlocal obstacle problems for the nonlocal $-Δ_p^s$ operator, when the fractional parameter $s\toσ$ for $0<σ\leq1$, converge to the solution of the corresponding obstacle problem for $-Δ_p^σ$, being $σ=1$ the classical obstacle problem for the local $p$-Laplacian. We discuss the weak stability of the quasi-characteristic functions of coincidence sets of the solut…

    Submitted 28 February, 2024; originally announced February 2024.

  22. arXiv:2402.16918  [pdf, other]

    cs.LG cs.CV

    m2mKD: Module-to-Module Knowledge Distillation for Modular Transformers

    Authors: Ka Man Lo, Yiming Liang, Wenyu Du, Yuantao Fan, Zili Wang, Wenhao Huang, Lei Ma, Jie Fu

    Abstract: Modular neural architectures are gaining attention for their powerful generalization and efficient adaptation to new domains. However, training these models poses challenges due to optimization difficulties arising from intrinsic sparse connectivity. Leveraging knowledge from monolithic models through techniques like knowledge distillation can facilitate training and enable integration of diverse…

    Submitted 7 July, 2024; v1 submitted 25 February, 2024; originally announced February 2024.

  23. arXiv:2402.04505  [pdf, other]

    cs.CL quant-ph

    Developments in Sheaf-Theoretic Models of Natural Language Ambiguities

    Authors: Kin Ian Lo, Mehrnoosh Sadrzadeh, Shane Mansfield

    Abstract: Sheaves are mathematical objects consisting of a base which constitutes a topological space and the data associated with each open set thereof, e.g. continuous functions defined on the open sets. Sheaves have originally been used in algebraic topology and logic. Recently, they have also modelled events such as physical experiments and natural language disambiguation processes. We extend the latter…

    Submitted 6 February, 2024; originally announced February 2024.

    Comments: arXiv admin note: text overlap with arXiv:2308.16498

  24. arXiv:2402.00838  [pdf, other]

    cs.CL

    OLMo: Accelerating the Science of Language Models

    Authors: Dirk Groeneveld, Iz Beltagy, Pete Walsh, Akshita Bhagia, Rodney Kinney, Oyvind Tafjord, Ananya Harsh Jha, Hamish Ivison, Ian Magnusson, Yizhong Wang, Shane Arora, David Atkinson, Russell Authur, Khyathi Raghavi Chandu, Arman Cohan, Jennifer Dumas, Yanai Elazar, Yuling Gu, Jack Hessel, Tushar Khot, William Merrill, Jacob Morrison, Niklas Muennighoff, Aakanksha Naik, Crystal Nam, et al. (18 additional authors not shown)

    Abstract: Language models (LMs) have become ubiquitous in both NLP research and in commercial product offerings. As their commercial importance has surged, the most powerful models have become closed off, gated behind proprietary interfaces, with important details of their training data, architectures, and development undisclosed. Given the importance of these details in scientifically studying these models…

    Submitted 7 June, 2024; v1 submitted 1 February, 2024; originally announced February 2024.

  25. arXiv:2402.00159  [pdf, other]

    cs.CL

    Dolma: an Open Corpus of Three Trillion Tokens for Language Model Pretraining Research

    Authors: Luca Soldaini, Rodney Kinney, Akshita Bhagia, Dustin Schwenk, David Atkinson, Russell Authur, Ben Bogin, Khyathi Chandu, Jennifer Dumas, Yanai Elazar, Valentin Hofmann, Ananya Harsh Jha, Sachin Kumar, Li Lucy, Xinxi Lyu, Nathan Lambert, Ian Magnusson, Jacob Morrison, Niklas Muennighoff, Aakanksha Naik, Crystal Nam, Matthew E. Peters, Abhilasha Ravichander, Kyle Richardson, Zejiang Shen, et al. (11 additional authors not shown)

    Abstract: Information about pretraining corpora used to train the current best-performing language models is seldom discussed: commercial models rarely detail their data, and even open models are often released without accompanying training data or recipes to reproduce them. As a result, it is challenging to conduct and advance scientific research on language modeling, such as understanding how training dat…

    Submitted 6 June, 2024; v1 submitted 31 January, 2024; originally announced February 2024.

    Comments: Accepted at ACL 2024; Dataset: https://hf.co/datasets/allenai/dolma; Code: https://github.com/allenai/dolma

  26. arXiv:2401.16475  [pdf, other]

    cs.CL

    InfoLossQA: Characterizing and Recovering Information Loss in Text Simplification

    Authors: Jan Trienes, Sebastian Joseph, Jörg Schlötterer, Christin Seifert, Kyle Lo, Wei Xu, Byron C. Wallace, Junyi Jessy Li

    Abstract: Text simplification aims to make technical texts more accessible to laypeople but often results in deletion of information and vagueness. This work proposes InfoLossQA, a framework to characterize and recover simplification-induced information loss in form of question-and-answer (QA) pairs. Building on the theory of Question Under Discussion, the QA pairs are designed to help readers deepen their…

    Submitted 4 June, 2024; v1 submitted 29 January, 2024; originally announced January 2024.

    Comments: Accepted at ACL 2024 (main conference)

  27. arXiv:2312.10523  [pdf, other]

    cs.CL cs.AI cs.LG

    Paloma: A Benchmark for Evaluating Language Model Fit

    Authors: Ian Magnusson, Akshita Bhagia, Valentin Hofmann, Luca Soldaini, Ananya Harsh Jha, Oyvind Tafjord, Dustin Schwenk, Evan Pete Walsh, Yanai Elazar, Kyle Lo, Dirk Groeneveld, Iz Beltagy, Hannaneh Hajishirzi, Noah A. Smith, Kyle Richardson, Jesse Dodge

    Abstract: Language models (LMs) commonly report perplexity on monolithic data held out from training. Implicitly or explicitly, this data is composed of domains–varying distributions of language. Rather than assuming perplexity on one distribution extrapolates to others, Perplexity Analysis for Language Model Assessment (Paloma), measures LM fit to 585 text domains, ranging from nytimes.com…

    Submitted 16 December, 2023; originally announced December 2023.

    Comments: Project Page: https://paloma.allen.ai/

  28. On inverse problems in predator-prey models

    Authors: Yuhan Li, Hongyu Liu, Catharine W. K. Lo

    Abstract: In this paper, we consider the inverse problem of determining the coefficients of interaction terms within some Lotka-Volterra models, with support from boundary observation of its non-negative solutions. In the physical background, the solutions to the predator-prey model stand for the population densities for predator and prey and are non-negative, which is a critical challenge in our inverse pr…

    Submitted 15 December, 2023; originally announced December 2023.

    MSC Class: 35R30; 35B09; 35K51; 35Q92; 92-10; 92D25; 35K58

    Journal ref: Journal of Differential Equations Volume 397, 15 July 2024, Pages 349-376

  29. arXiv:2312.03215  [pdf]

    cond-mat.supr-con

    Evidence for the novel type of orbital Fulde-Ferrell-Larkin-Ovchinnikov state in the bulk limit of 2H-NbSe2

    Authors: Chang-woo Cho, Kwan To Lo, Cheuk Yin Ng, Timothée T. Lortz, Abdel Rahman Allan, Mahmoud Abdel-Hafiez, Jaemun Park, Beopgil Cho, Keeseong Park, Rolf Lortz

    Abstract: The Fulde-Ferrell-Larkin-Ovchinnikov (FFLO) state, an unusual superconducting state, defies high magnetic fields beyond the Pauli paramagnetic limit. It exhibits a spatial modulation of the superconducting order parameter in real space and is exceptionally rare. Recently, an even more exotic variant - the orbital FFLO state - was predicted and identified in the transition metal dichalcogenide supe…

    Submitted 19 February, 2024; v1 submitted 5 December, 2023; originally announced December 2023.

  30. arXiv:2311.14385  [pdf]

    hep-ex

    New Evidence for DM-like Anomalies in Neutron Multiplicity Spectra

    Authors: W. H. Trzaska, A. Barzilov, T. Enqvist, K. Jedrzejczak, M. Kasztelan, P. Kuusiniemi, K. K. Loo, J. Orzechowski, M. Slupecki, J. Szabelski, T. E. Ward

    Abstract: Subterrestrial neutron spectra show weak but consistent anomalies at multiplicities ~100 and above. The origin of the excess events remains ambiguous, but, in principle, it could be a signature of Dark Matter WIMP annihilation-like interaction with a massive Pb target. However, since the results of the available measurements are below the 5-sigma discovery level, and the observed anomalous structu…

    Submitted 27 February, 2024; v1 submitted 24 November, 2023; originally announced November 2023.

    Comments: 4 pages, 2 figures, proceedings of the TAUP 2023 conference

    MSC Class: 85-06

  31. arXiv:2311.13358  [pdf, other]

    hep-th

    Irregular Fibonacci Conformal Blocks

    Authors: Xia Gu, Babak Haghighat, Kevin Loo

    Abstract: This work studies Liouville conformal blocks of irregular type with the insertion of at least one level-$3$ degenerate field admitting a Fibonacci fusion rule. We algebraically derive the corresponding third-order BPZ equations for regular blocks and their modifications when a rank one irregular operator is inserted. Employing Lefschetz thimbles as integration cycles, we then successively proceed…

    Submitted 22 November, 2023; originally announced November 2023.

    Comments: 25 pages, 4 figures

  32. arXiv:2311.09765  [pdf, other]

    cs.IR cs.AI

    Back to Basics: A Simple Recipe for Improving Out-of-Domain Retrieval in Dense Encoders

    Authors: Hyunji Lee, Luca Soldaini, Arman Cohan, Minjoon Seo, Kyle Lo

    Abstract: Prevailing research practice today often relies on training dense retrievers on existing large datasets such as MSMARCO and then experimenting with ways to improve zero-shot generalization capabilities to unseen domains. While prior work has tackled this challenge through resource-intensive steps such as data augmentation, architectural modifications, increasing model size, or even further base mo…

    Submitted 16 November, 2023; originally announced November 2023.

  33. arXiv:2311.05191  [pdf, other]

    math.AP math.NA

    Determining Sources in the Bioluminescence Tomography Problem

    Authors: Ming-Hui Ding, Rongfang Gong, Hongyu Liu, Catharine W. K. Lo

    Abstract: In this paper, we revisit the bioluminescence tomography (BLT) problem, where one seeks to reconstruct bioluminescence signals (an internal light source) from external measurements of the Cauchy data. As one kind of optical imaging, the BLT has many merits such as high signal-to-noise ratio, non-destructivity and cost-effectiveness etc., and has potential applications such as cancer diagnosis, dru…

    Submitted 9 November, 2023; originally announced November 2023.

    MSC Class: Primary 35R30; secondary 78A46; 92C55; 35Q60; 78A70

  34. arXiv:2310.03193  [pdf]

    cs.DL cs.CL cs.CY physics.hist-ph physics.soc-ph

    The Rise of Open Science: Tracking the Evolution and Perceived Value of Data and Methods Link-Sharing Practices

    Authors: Hancheng Cao, Jesse Dodge, Kyle Lo, Daniel A. McFarland, Lucy Lu Wang

    Abstract: In recent years, funding agencies and journals increasingly advocate for open science practices (e.g. data and method sharing) to improve the transparency, access, and reproducibility of science. However, quantifying these practices at scale has proven difficult. In this work, we leverage a large-scale dataset of 1.1M papers from arXiv that are representative of the fields of physics, math, and co…

    Submitted 4 October, 2023; originally announced October 2023.

  35. arXiv:2310.01649  [pdf, other]

    cs.LG cs.AI

    On Training Derivative-Constrained Neural Networks

    Authors: KaiChieh Lo, Daniel Huang

    Abstract: We refer to the setting where the (partial) derivatives of a neural network's (NN's) predictions with respect to its inputs are used as additional training signal as a derivative-constrained (DC) NN. This situation is common in physics-informed settings in the natural sciences. We propose an integrated RELU (IReLU) activation function to improve training of DC NNs. We also investigate denormalizat…

    Submitted 11 October, 2023; v1 submitted 2 October, 2023; originally announced October 2023.

  36. arXiv:2310.00785  [pdf, other]

    cs.CL cs.AI cs.LG

    BooookScore: A systematic exploration of book-length summarization in the era of LLMs

    Authors: Yapei Chang, Kyle Lo, Tanya Goyal, Mohit Iyyer

    Abstract: Summarizing book-length documents (>100K tokens) that exceed the context window size of large language models (LLMs) requires first breaking the input document into smaller chunks and then prompting an LLM to merge, update, and compress chunk-level summaries. Despite the complexity and importance of this task, it has yet to be meaningfully studied due to the challenges of evaluation: existing book…

    Submitted 13 April, 2024; v1 submitted 1 October, 2023; originally announced October 2023.

    Comments: ICLR 2024 camera-ready (updated figure1 and table2; corrected minor details in the explanation of hierarchical merging)

  37. arXiv:2309.08541  [pdf, other]

    cs.IR cs.AI cs.CL

    When do Generative Query and Document Expansions Fail? A Comprehensive Study Across Methods, Retrievers, and Datasets

    Authors: Orion Weller, Kyle Lo, David Wadden, Dawn Lawrie, Benjamin Van Durme, Arman Cohan, Luca Soldaini

    Abstract: Using large language models (LMs) for query or document expansion can improve generalization in information retrieval. However, it is unknown whether these techniques are universally beneficial or only effective in specific settings, such as for particular retrieval models, dataset domains, or query types. To answer this, we conduct the first comprehensive analysis of LM-based expansion. We find t…

    Submitted 26 February, 2024; v1 submitted 15 September, 2023; originally announced September 2023.

    Comments: EACL 2024 camera ready

  38. arXiv:2309.07109  [pdf, ps, other]

    hep-ex astro-ph.HE hep-ph

    Real-time Monitoring for the Next Core-Collapse Supernova in JUNO

    Authors: Angel Abusleme, Thomas Adam, Shakeel Ahmad, Rizwan Ahmed, Sebastiano Aiello, Muhammad Akram, Abid Aleem, Fengpeng An, Qi An, Giuseppe Andronico, Nikolay Anfimov, Vito Antonelli, Tatiana Antoshkina, Burin Asavapibhop, João Pedro Athayde Marcondes de André, Didier Auguste, Weidong Bai, Nikita Balashov, Wander Baldini, Andrea Barresi, Davide Basilico, Eric Baussan, Marco Bellato, Marco Beretta, Antonio Bergnoli, et al. (606 additional authors not shown)

    Abstract: The core-collapse supernova (CCSN) is considered one of the most energetic astrophysical events in the universe. The early and prompt detection of neutrinos before (pre-SN) and during the supernova (SN) burst presents a unique opportunity for multi-messenger observations of CCSN events. In this study, we describe the monitoring concept and present the sensitivity of the system to pre-SN and SN neu…

    Submitted 4 December, 2023; v1 submitted 13 September, 2023; originally announced September 2023.

    Comments: 24 pages, 9 figures, accepted for publication in JCAP

  39. arXiv:2309.03487  [pdf, other]

    cs.LG cs.CR cs.NE

    Privacy-preserving Continual Federated Clustering via Adaptive Resonance Theory

    Authors: Naoki Masuyama, Yusuke Nojima, Yuichiro Toda, Chu Kiong Loo, Hisao Ishibuchi, Naoyuki Kubota

    Abstract: With the increasing importance of data privacy protection, various privacy-preserving machine learning methods have been proposed. In the clustering domain, various algorithms with a federated learning framework (i.e., federated clustering) have been actively studied and showed high clustering performance while preserving data privacy. However, most of the base clusterers (i.e., clustering algorit…

    Submitted 7 September, 2023; originally announced September 2023.

    Comments: This paper is currently under review. arXiv admin note: substantial text overlap with arXiv:2305.01507

  40. arXiv:2308.16498  [pdf, other]

    cs.CL cs.AI quant-ph

    Generalised Winograd Schema and its Contextuality

    Authors: Kin Ian Lo, Mehrnoosh Sadrzadeh, Shane Mansfield

    Abstract: Ambiguities in natural language give rise to probability distributions over interpretations. The distributions are often over multiple ambiguous words at a time; a multiplicity which makes them a suitable topic for sheaf-theoretic models of quantum contextuality. Previous research showed that different quantitative measures of contextuality correlate well with psycholinguistic research on lexical…

    Submitted 31 August, 2023; originally announced August 2023.

    Comments: In Proceedings QPL 2023, arXiv:2308.15489

    Journal ref: EPTCS 384, 2023, pp. 187-202

  41. arXiv:2308.13666  [pdf, other]

    astro-ph.HE

    A Joint Fermi-GBM and Swift-BAT Analysis of Gravitational-Wave Candidates from the Third Gravitational-wave Observing Run

    Authors: C. Fletcher, J. Wood, R. Hamburg, P. Veres, C. M. Hui, E. Bissaldi, M. S. Briggs, E. Burns, W. H. Cleveland, M. M. Giles, A. Goldstein, B. A. Hristov, D. Kocevski, S. Lesage, B. Mailyan, C. Malacaria, S. Poolakkil, A. von Kienlin, C. A. Wilson-Hodge, The Fermi Gamma-ray Burst Monitor Team, M. Crnogorčević, J. DeLaunay, A. Tohuvavohu, R. Caputo, S. B. Cenko, et al. (1674 additional authors not shown)

    Abstract: We present Fermi Gamma-ray Burst Monitor (Fermi-GBM) and Swift Burst Alert Telescope (Swift-BAT) searches for gamma-ray/X-ray counterparts to gravitational wave (GW) candidate events identified during the third observing run of the Advanced LIGO and Advanced Virgo detectors. Using Fermi-GBM on-board triggers and sub-threshold gamma-ray burst (GRB) candidates found in the Fermi-GBM ground analyses,…

    Submitted 25 August, 2023; originally announced August 2023.

  42. arXiv:2308.06616  [pdf, other]

    astro-ph.CO astro-ph.HE gr-qc

    Identifying strongly lensed gravitational waves through their phase consistency

    Authors: Jose María Ezquiaga, Wayne Hu, Rico K. L. Lo

    Abstract: Strongly lensed gravitational waves (GWs) from binary coalescence manifest as repeated chirps from the original merger. At the detectors, the phase of the lensed GWs and its arrival time differences will be consistent modulo a fixed constant phase shift. We develop a fast and reliable method to efficiently reject event pairs that are not-lensed copies and appropriately rank the most interesting ca…

    Submitted 23 October, 2023; v1 submitted 12 August, 2023; originally announced August 2023.

    Comments: 14+7 pages, 12+7 figures, 2 tables, code at https://github.com/ezquiaga/phazap. Matches PRD version

    Journal ref: Phys. Rev. D 108 (2023), 103520

  43. arXiv:2308.03822  [pdf, other]

    astro-ph.HE

    Search for Eccentric Black Hole Coalescences during the Third Observing Run of LIGO and Virgo

    Authors: The LIGO Scientific Collaboration, the Virgo Collaboration, the KAGRA Collaboration, A. G. Abac, R. Abbott, H. Abe, F. Acernese, K. Ackley, C. Adamcewicz, S. Adhicary, N. Adhikari, R. X. Adhikari, V. K. Adkins, V. B. Adya, C. Affeldt, D. Agarwal, M. Agathos, O. D. Aguiar, I. Aguilar, L. Aiello, A. Ain, P. Ajith, T. Akutsu, S. Albanesi, R. A. Alfaidi, et al. (1750 additional authors not shown)

    Abstract: Despite the growing number of confident binary black hole coalescences observed through gravitational waves so far, the astrophysical origin of these binaries remains uncertain. Orbital eccentricity is one of the clearest tracers of binary formation channels. Identifying binary eccentricity, however, remains challenging due to the limited availability of gravitational waveforms that include effect…

    Submitted 7 August, 2023; originally announced August 2023.

    Comments: 24 pages, 5 figures

    Report number: LIGO-P2300080

  44. arXiv:2307.09701  [pdf, other]

    cs.CL

    Efficiency Pentathlon: A Standardized Arena for Efficiency Evaluation

    Authors: Hao Peng, Qingqing Cao, Jesse Dodge, Matthew E. Peters, Jared Fernandez, Tom Sherborne, Kyle Lo, Sam Skjonsberg, Emma Strubell, Darrell Plessas, Iz Beltagy, Evan Pete Walsh, Noah A. Smith, Hannaneh Hajishirzi

    Abstract: Rising computational demands of modern natural language processing (NLP) systems have increased the barrier to entry for cutting-edge research while posing serious environmental concerns. Yet, progress on model efficiency has been impeded by practical challenges in model evaluation and comparison. For example, hardware is challenging to control due to disparate levels of accessibility across diffe…

    Submitted 18 July, 2023; originally announced July 2023.

  45. arXiv:2307.00744  [pdf, other]

    math.AP

    Strong uniqueness principle for fractional polyharmonic operators and applications to inverse problems

    Authors: Ching-Lung Lin, Hongyu Liu, Catharine W. K. Lo

    Abstract: In this work, we are concerned with inverse problems involving poly-fractional operators, where the poly-fractional operator is of the form \[P((-\Delta_g)^s)u := \sum_{i=1}^M \alpha_i(-\Delta_{g_i})^{s_i}u\] for $s=(s_1,\dots,s_M)$, $0<s_1<\cdots<s_M<\infty$, $s_M\in\mathbb{R}_+\backslash\mathbb{Z}$, $g=(g_1,\dots,g_M)$. There are three major contributions in this work that are new to the literature. First…

    Submitted 2 August, 2023; v1 submitted 3 July, 2023; originally announced July 2023.

    MSC Class: Primary 35R30; secondary 35R11; 26A33

  46. arXiv:2306.16469  [pdf, other]

    gr-qc

    Recipes for computing radiation from a Kerr black hole using Generalized Sasaki-Nakamura formalism: I. Homogeneous solutions

    Authors: Rico K. L. Lo

    Abstract: Central to black hole perturbation theory calculations is the Teukolsky equation that governs the propagation and the generation of radiation emitted by Kerr black holes. However, it is plagued by a long-ranged potential associated to the perturbation equation and hence a direct numerical integration of the equation is challenging. Sasaki and Nakamura devised a formulation that transforms the equa…

    Submitted 28 June, 2023; originally announced June 2023.

    Comments: 31 pages, 12 figures

  47. arXiv:2306.13410  [pdf, other]

    cs.LG

    Explainable Lifelong Stream Learning Based on "Glocal" Pairwise Fusion

    Authors: Chu Kiong Loo, Wei Shiung Liew, Stefan Wermter

    Abstract: Real-time on-device continual learning applications are used on mobile phones, consumer robots, and smart appliances. Such devices have limited processing and memory storage capabilities, whereas continual learning acquires data over a long period of time. By necessity, lifelong learning algorithms have to be able to operate under such constraints while delivering good performance. This study pres…

    Submitted 23 June, 2023; originally announced June 2023.

    Comments: 24 pages, 8 figures

  48. arXiv:2306.09567  [pdf, other]

    hep-ex astro-ph.HE hep-ph

    JUNO sensitivity to the annihilation of MeV dark matter in the galactic halo

    Authors: JUNO Collaboration, Angel Abusleme, Thomas Adam, Shakeel Ahmad, Rizwan Ahmed, Sebastiano Aiello, Muhammad Akram, Abid Aleem, Tsagkarakis Alexandros, Fengpeng An, Qi An, Giuseppe Andronico, Nikolay Anfimov, Vito Antonelli, Tatiana Antoshkina, Burin Asavapibhop, João Pedro Athayde Marcondes de André, Didier Auguste, Weidong Bai, Nikita Balashov, Wander Baldini, Andrea Barresi, Davide Basilico, Eric Baussan, Marco Bellato, et al. (581 additional authors not shown)

    Abstract: We discuss JUNO sensitivity to the annihilation of MeV dark matter in the galactic halo via detecting inverse beta decay reactions of electron anti-neutrinos resulting from the annihilation. We study possible backgrounds to the signature, including the reactor neutrinos, diffuse supernova neutrino background, charged- and neutral-current interactions of atmospheric neutrinos, backgrounds from muon…

    Submitted 13 September, 2023; v1 submitted 15 June, 2023; originally announced June 2023.

    Comments: 25 pages, 9 figures, matches the published version

    Journal ref: JCAP 09 (2023) 001

  49. arXiv:2306.08056  [pdf, other]

    cs.CR cs.AI cs.SE

    Distributed Trust Through the Lens of Software Architecture

    Authors: Sin Kit Lo, Yue Liu, Guangsheng Yu, Qinghua Lu, Xiwei Xu, Liming Zhu

    Abstract: Distributed trust is a nebulous concept that has evolved from different perspectives in recent years. While one can attribute its current prominence to blockchain and cryptocurrency, the distributed trust concept has been cultivating progress in federated learning, trustworthy and responsible AI in an ecosystem setting, data sharing, privacy issues across organizational boundaries, and zero trust…

    Submitted 25 May, 2023; originally announced June 2023.

  50. Follow-up Analyses to the O3 LIGO-Virgo-KAGRA Lensing Searches

    Authors: Justin Janquart, Mick Wright, Srashti Goyal, Juno C. L. Chan, Apratim Ganguly, Ángel Garrón, David Keitel, Alvin K. Y. Li, Anna Liu, Rico K. L. Lo, Anuj Mishra, Anupreeta More, Hemantakumar Phurailatpam, Prasia Pankunni, Sylvia Biscoveanu, Paolo Cremonese, Jean-René Cudell, José M. Ezquiaga, Juan Garcia-Bellido, Otto A. Hannuksela, K. Haris, Ian Harry, Martin Hendry, Sascha Husa, Shasvath Kapadia, et al. (6 additional authors not shown)

    Abstract: Along their path from source to observer, gravitational waves may be gravitationally lensed by massive objects. This results in distortions of the observed signal which can be used to extract new information about fundamental physics, astrophysics, and cosmology. Searches for these distortions amongst the observed signals from the current detector network have already been carried out, though ther…

    Submitted 15 August, 2023; v1 submitted 6 June, 2023; originally announced June 2023.

    Comments: 29 pages, 27 figures

    Journal ref: Monthly Notices of the Royal Astronomical Society, 526, 3, 2023