-
DeCE: Deceptive Cross-Entropy Loss Designed for Defending Backdoor Attacks
Authors:
Guang Yang,
Yu Zhou,
Xiang Chen,
Xiangyu Zhang,
Terry Yue Zhuo,
David Lo,
Taolue Chen
Abstract:
Code Language Models (CLMs), particularly those leveraging deep learning, have achieved significant success in the code intelligence domain. However, the issue of security, particularly backdoor attacks, is often overlooked in this process. Previous research has focused on designing backdoor attacks for CLMs, but effective defenses have not been adequately addressed. In particular, existing defense methods from natural language processing, when directly applied to CLMs, are not effective enough and lack generality, working well in some models and scenarios but failing in others, and thus fall short of consistently mitigating backdoor attacks. To bridge this gap, we first confirm the phenomenon of "early learning" as a general occurrence during the training of CLMs. This phenomenon means that a model initially focuses on the main features of the training data but may become more sensitive to backdoor triggers over time, leading to overfitting and susceptibility to backdoor attacks. We then show that overfitting to backdoor triggers results from the use of the cross-entropy loss function, where the unboundedness of cross-entropy leads the model to increasingly concentrate on the features of the poisoned data. Based on this insight, we propose a general and effective loss function, DeCE (Deceptive Cross-Entropy), which blends deceptive distributions and applies label smoothing to keep the gradient bounded; this prevents the model from overfitting to backdoor triggers and enhances the security of CLMs against backdoor attacks. To verify the effectiveness of our defense method, we select code synthesis tasks as our experimental scenarios. Our experiments across various code synthesis datasets, models, and poisoning ratios demonstrate the applicability and effectiveness of DeCE in enhancing the security of CLMs.
Submitted 11 July, 2024;
originally announced July 2024.
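The contrast between unbounded cross-entropy and a bounded, blended loss can be illustrated with a small sketch. This is not the authors' DeCE formulation, which the abstract does not spell out: the blending rule, the mixing weight `alpha`, and the smoothing factor `eps` below are all assumptions chosen only to show how mixing the prediction with a label-smoothed target keeps the loss finite.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def smooth_label(target, num_classes, eps):
    """Label-smoothed target: 1 - eps on the true class, eps spread over the rest."""
    off = eps / (num_classes - 1)
    return [1.0 - eps if i == target else off for i in range(num_classes)]

def cross_entropy(logits, target):
    """Standard cross-entropy; grows without bound as the prediction gets worse."""
    return -math.log(softmax(logits)[target])

def blended_loss(logits, target, alpha=0.9, eps=0.1):
    """Hypothetical bounded loss (assumption, not the paper's definition).

    The prediction p is mixed with the smoothed label q before the log,
    so every mixed probability is at least (1 - alpha) * min(q) > 0 and
    the loss cannot exceed -log((1 - alpha) * min(q)).
    """
    p = softmax(logits)
    q = smooth_label(target, len(logits), eps)
    mixed = [alpha * pi + (1.0 - alpha) * qi for pi, qi in zip(p, q)]
    return -sum(qi * math.log(mi) for qi, mi in zip(q, mixed))
```

On a confidently wrong prediction such as `[50.0, -50.0]` with target class 1, `cross_entropy` is about 100, while `blended_loss` stays below its fixed ceiling of `-log(0.01)`, roughly 4.6.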
-
Long-fiber Sagnac interferometers for twin field quantum key distribution networks
Authors:
Reem Mandil,
Li Qian,
Hoi-Kwong Lo
Abstract:
A Sagnac loop structure can help overcome the major difficulty in the practical implementation of a twin field quantum key distribution (TFQKD) network, namely, the need to stabilize the phase of a quantum state over many kilometers of fiber. Unfortunately, Rayleigh backscattering noise limits the signal-to-noise ratio for Sagnac systems containing long fibers and lossy photonic devices. Here, we solve this problem by sending optical pulses in long on-off bursts and using time post-selection on measurements taken with free-run single-photon avalanche detectors. We also investigate the impact of the residual phase noise uncompensated by the Sagnac structure and find that the variance of the phase noise scales as loop length to the third power, verifying an existing calculation in the literature. We measure the interference visibility in Sagnac loops of varying length without active phase or polarization stabilization and achieve > 97% visibility in 200 km ultra-low-loss fiber, which is, to our knowledge, the longest fiber Sagnac interferometer demonstrated. Our results indicate the suitability of a Sagnac system for long-distance TFQKD networks, an important step towards the practical implementation of metropolitan quantum networks.
Submitted 10 July, 2024;
originally announced July 2024.
-
Generative Image as Action Models
Authors:
Mohit Shridhar,
Yat Long Lo,
Stephen James
Abstract:
Image-generation diffusion models have been fine-tuned to unlock new capabilities such as image-editing and novel view synthesis. Can we similarly unlock image-generation models for visuomotor control? We present GENIMA, a behavior-cloning agent that fine-tunes Stable Diffusion to 'draw joint-actions' as targets on RGB images. These images are fed into a controller that maps the visual targets into a sequence of joint-positions. We study GENIMA on 25 RLBench and 9 real-world manipulation tasks. We find that, by lifting actions into image-space, internet pre-trained diffusion models can generate policies that outperform state-of-the-art visuomotor approaches, especially in robustness to scene perturbations and generalizing to novel objects. Our method is also competitive with 3D agents, despite lacking priors such as depth, keypoints, or motion-planners.
Submitted 10 July, 2024;
originally announced July 2024.
-
Measurement and analysis of the $^{246}$Cm and $^{248}$Cm neutron capture cross-sections at the EAR2 of the n_TOF facility
Authors:
V. Alcayne,
A. Kimura,
E. Mendoza,
D. Cano-Ott,
O. Aberle,
F. Álvarez-Velarde,
S. Amaducci,
J. Andrzejewski,
L. Audouin,
V. Bécares,
V. Babiano-Suarez,
M. Bacak,
M. Barbagallo,
F. Bečvář,
G. Bellia,
E. Berthoumieux,
J. Billowes,
D. Bosnar,
A. Brown,
M. Busso,
M. Caamaño,
L. Caballero-Ontanaya,
F. Calviño,
M. Calviani,
A. Casanovas
, et al. (108 additional authors not shown)
Abstract:
The $^{246}$Cm(n,$γ$) and $^{248}$Cm(n,$γ$) cross-sections have been measured at the Experimental Area 2 (EAR2) of the n_TOF facility at CERN with three C$_6$D$_6$ detectors. This measurement is part of a collective effort to improve the capture cross-section data for Minor Actinides (MAs), which are required to estimate the production and transmutation rates of these isotopes in light water reactors and innovative reactor systems. In particular, neutron capture in $^{246}$Cm and $^{248}$Cm opens the path for the formation of other Cm isotopes and heavier elements such as Bk and Cf, and knowledge of the (n,$γ$) cross-sections of these Cm isotopes plays an important role in the transport, transmutation and storage of spent nuclear fuel. The reactions $^{246}$Cm(n,$γ$) and $^{248}$Cm(n,$γ$) were the first two capture measurements analyzed at n_TOF EAR2. Until this experiment and two recent measurements performed at J-PARC, there was only one set of data on the capture cross-sections of $^{246}$Cm and $^{248}$Cm, which was obtained in 1969 in an underground nuclear explosion experiment. In the measurement at n_TOF, a total of 13 resonances of $^{246}$Cm between 4 and 400 eV and 5 of $^{248}$Cm between 7 and 100 eV have been identified and fitted. The radiative kernels obtained for $^{246}$Cm are compatible with JENDL-5, but some of them are not compatible with JENDL-4, which has been adopted by JEFF-3.3 and ENDF/B-VIII.0. The radiative kernels obtained for the first three $^{248}$Cm resonances are compatible with JENDL-5; however, the other two are not compatible with any other evaluation and are 20% and 60% larger than JENDL-5.
Submitted 8 July, 2024;
originally announced July 2024.
-
IntentionNet: Map-Lite Visual Navigation at the Kilometre Scale
Authors:
Wei Gao,
Bo Ai,
Joel Loo,
Vinay,
David Hsu
Abstract:
This work explores the challenges of creating a scalable and robust robot navigation system that can traverse both indoor and outdoor environments to reach distant goals. We propose a navigation system architecture called IntentionNet that employs a monolithic neural network as the low-level planner/controller, and uses a general interface that we call intentions to steer the controller. The paper proposes two types of intentions, Local Path and Environment (LPE) and Discretised Local Move (DLM), and shows that DLM is robust to significant metric positioning and mapping errors. The paper also presents Kilo-IntentionNet, an instance of the IntentionNet system using the DLM intention that is deployed on a Boston Dynamics Spot robot, and which successfully navigates through complex indoor and outdoor environments over distances of up to a kilometre with only noisy odometry.
Submitted 3 July, 2024;
originally announced July 2024.
-
Exploring the Capabilities of LLMs for Code Change Related Tasks
Authors:
Lishui Fan,
Jiakun Liu,
Zhongxin Liu,
David Lo,
Xin Xia,
Shanping Li
Abstract:
Developers deal with code-change-related tasks daily, e.g., reviewing code. Pre-trained code and code-change-oriented models have been adapted to help developers with such tasks. Recently, large language models (LLMs) have shown their effectiveness in code-related tasks. However, existing LLMs for code focus on general code syntax and semantics rather than the differences between two code versions. Thus, it is an open question how LLMs perform on code-change-related tasks.
To answer this question, we conduct an empirical study using LLMs with more than 1B parameters on three code-change-related tasks, i.e., code review generation, commit message generation, and just-in-time comment update, with in-context learning (ICL) and parameter-efficient fine-tuning (PEFT, including LoRA and prefix-tuning). We observe that the performance of LLMs is poor without examples and generally improves with examples, but more examples do not always lead to better performance. LLMs tuned with LoRA have comparable performance to the state-of-the-art small pre-trained models. Larger models are not always better, but the Llama 2 and Code Llama families are always the best. The best LLMs outperform small pre-trained models on the code changes that only modify comments and perform comparably on other code changes. We suggest future work should focus more on guiding LLMs to learn the knowledge specific to the changes related to code rather than comments for code-change-related tasks.
Submitted 3 July, 2024;
originally announced July 2024.
-
Subpath-Based Column Generation for the Electric Routing-Scheduling Problem
Authors:
Alexandre Jacquillat,
Sean Lo
Abstract:
Motivated by widespread electrification targets, this paper studies an electric routing-scheduling problem (ERSP) that jointly optimizes routing-scheduling and charging decisions. The ERSP is formulated as a semi-infinite set-partitioning model, where continuous charging decisions result in infinitely-many path-based variables. To solve it, we develop a column generation algorithm with a bi-level label-setting algorithm to decompose the pricing problem into (i) a first-level procedure to generate subpaths between charging stations, and (ii) a second-level procedure to combine subpaths into paths. We formalize subpath-based domination properties to establish the finite convergence and exactness of the column generation algorithm. We prove that the methodology can handle modeling extensions with heterogeneous charging costs (via dynamic re-optimization of charging decisions) and algorithm extensions to tighten the relaxation using ng-routes and limited-memory subset-row inequalities (via augmented domination criteria). Computational results show that the methodology scales to large instances, outperforming state-of-the-art column generation algorithms. From a practical standpoint, the methodology achieves significant cost reductions by jointly optimizing routing-scheduling and charging decisions and by capturing heterogeneous charging costs.
Submitted 2 July, 2024;
originally announced July 2024.
-
Open Scene Graphs for Open World Object-Goal Navigation
Authors:
Joel Loo,
Zhanxin Wu,
David Hsu
Abstract:
How can we build robots for open-world semantic navigation tasks, like searching for target objects in novel scenes? While foundation models have the rich knowledge and generalisation needed for these tasks, a suitable scene representation is needed to connect them into a complete robot system. We address this with Open Scene Graphs (OSGs), a topo-semantic representation that retains and organises open-set scene information for these models, and has a structure that can be configured for different environment types. We integrate foundation models and OSGs into the OpenSearch system for Open World Object-Goal Navigation, which is capable of searching for open-set objects specified in natural language, while generalising zero-shot across diverse environments and embodiments. Our OSGs enhance reasoning with Large Language Models (LLM), enabling robust object-goal navigation outperforming existing LLM approaches. Through simulation and real-world experiments, we validate OpenSearch's generalisation across varied environments, robots and novel instructions.
Submitted 2 July, 2024;
originally announced July 2024.
-
A systematic comparison of measures for k-anonymity in networks
Authors:
Rachel G. de Jong,
Mark P. J. van der Loo,
Frank W. Takes
Abstract:
Privacy-aware sharing of network data is a difficult task due to the interconnectedness of individuals in networks. An important part of this problem is the inherently difficult question of how in a particular situation the privacy of an individual node should be measured. To that end, in this paper we propose a set of aspects that one should consider when choosing a measure for privacy. These aspects include the type of desired privacy and attacker scenario against which the measure protects, utility of the data, the type of desired output, and the computational complexity of the chosen measure. Based on these aspects, we provide a systematic overview of existing approaches in the literature. We then focus on a set of measures that ultimately enables our objective: sharing the anonymized full network dataset with limited disclosure risk. The considered measures, each based on the concept of k-anonymity, account for the structure of the surroundings of a certain node and differ in completeness and reach of the structural information taken into account. We present a comprehensive theoretical characterization as well as comparative empirical experiments on a wide range of real-world network datasets with up to millions of edges. We find that the choice of the measure has an enormous effect on aforementioned aspects. Most interestingly, we find that the most effective measures consider a greater node vicinity, yet utilize minimal structural information and thus use minimal computational resources. This finding has important implications for researchers and practitioners, who may, based on the recommendations given in this paper, make an informed choice on how to safely share large-scale network data in a privacy-aware manner.
Submitted 2 July, 2024;
originally announced July 2024.
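As a concrete instance of a k-anonymity measure on a network, the sketch below uses node degree as the structural signature: a node is k-anonymous if at least k nodes share its degree. Degree is chosen here only as the simplest possible example and is an assumption; the abstract does not say which measures the paper compares, and the function names are hypothetical.

```python
from collections import Counter

def degree_anonymity(edges):
    """Map each node to the number of nodes (itself included) sharing its degree.

    `edges` is an iterable of undirected (u, v) pairs without duplicates.
    """
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    class_size = Counter(degree.values())
    return {node: class_size[d] for node, d in degree.items()}

def fraction_k_anonymous(edges, k):
    """Fraction of nodes whose degree equivalence class has at least k members."""
    sizes = degree_anonymity(edges)
    return sum(1 for s in sizes.values() if s >= k) / len(sizes)

# A star graph: the hub has a unique degree, so it is not 2-anonymous,
# while the three leaves are indistinguishable by degree.
star = [("hub", "a"), ("hub", "b"), ("hub", "c")]
star_share = fraction_k_anonymous(star, k=2)
```

Measures that take a larger neighbourhood into account would replace the degree with a richer signature, but the grouping-by-signature step stays the same.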
-
Large-scale, Independent and Comprehensive study of the power of LLMs for test case generation
Authors:
Wendkûuni C. Ouédraogo,
Kader Kaboré,
Haoye Tian,
Yewei Song,
Anil Koyuncu,
Jacques Klein,
David Lo,
Tegawendé F. Bissyandé
Abstract:
Unit testing, crucial for identifying bugs in code modules like classes and methods, is often neglected by developers due to time constraints. Automated test generation techniques have emerged to address this, but often lack readability and require developer intervention. Large Language Models (LLMs), like GPT and Mistral, show promise in software engineering, including in test generation. However, their effectiveness remains unclear.
This study conducts the first comprehensive investigation of LLMs, evaluating the effectiveness of four LLMs and five prompt engineering techniques, for unit test generation. We analyze 216,300 tests generated by the selected advanced instruct-tuned LLMs for 690 Java classes collected from diverse datasets. We assess correctness, understandability, coverage, and bug detection capabilities of LLM-generated tests, comparing them to EvoSuite, a popular automated testing tool. While LLMs show potential, improvements in test correctness are necessary. This study reveals the strengths and limitations of LLMs compared to traditional methods, paving the way for further research on LLMs in software engineering.
Submitted 28 June, 2024;
originally announced July 2024.
-
Sampled Datasets Risk Substantial Bias in the Identification of Political Polarization on Social Media
Authors:
Gabriele Di Bona,
Emma Fraxanet,
Björn Komander,
Andrea Lo Sasso,
Virginia Morini,
Antoine Vendeville,
Max Falkenberg,
Alessandro Galeazzi
Abstract:
Following recent policy changes by X (Twitter) and other social media platforms, user interaction data has become increasingly difficult to access. These restrictions are impeding robust research pertaining to social and political phenomena online, which is critical due to the profound impact social media platforms may have on our societies. Here, we investigate the reliability of polarization measures obtained from different samples of social media data by studying the structural polarization of the Polish political debate on Twitter over a 24-hour period. First, we show that the political discussion on Twitter is only a small subset of the wider Twitter discussion. Second, we find that large samples can be representative of the whole political discussion on a platform, but small samples consistently fail to accurately reflect the true structure of polarization online. Finally, we demonstrate that keyword-based samples can be representative if keywords are selected with great care, but that poorly selected keywords can result in substantial political bias in the sampled data. Our findings demonstrate that it is not possible to measure polarization in a reliable way with small, sampled datasets, highlighting why the current lack of research data is so problematic, and providing insight into the practical implementation of the European Union's Digital Services Act, which aims to improve researchers' access to social media data.
Submitted 28 June, 2024;
originally announced June 2024.
-
A Closer Look into Mixture-of-Experts in Large Language Models
Authors:
Ka Man Lo,
Zeyu Huang,
Zihan Qiu,
Zili Wang,
Jie Fu
Abstract:
Mixture-of-experts (MoE) is gaining increasing attention due to its unique properties and remarkable performance, especially for language tasks. By sparsely activating a subset of parameters for each token, MoE architecture could increase the model size without sacrificing computational efficiency, achieving a better trade-off between performance and training costs. However, the underlying mechanism of MoE still lacks further exploration, and its modularization degree remains questionable. In this paper, we make an initial attempt to understand the inner workings of MoE-based large language models. Concretely, we comprehensively study the parametric and behavioral features of three recent MoE-based models and reveal some intriguing observations, including (1) Neurons act like fine-grained experts. (2) The router of MoE usually selects experts with larger output norms. (3) The expert diversity increases as the layer increases, while the last layer is an outlier. Based on the observations, we also provide suggestions for a broad spectrum of MoE practitioners, such as router design and expert allocation. We hope this work could shed light on future research on the MoE framework and other modular architectures. Code is available at https://github.com/kamanphoebe/Look-into-MoEs.
Submitted 26 June, 2024;
originally announced June 2024.
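The sparse activation described in this abstract can be sketched as top-k routing: a router scores every expert for a token, only the k best experts run, and their outputs are combined with renormalised weights. This is a generic illustration, not the code of any model studied in the paper; the scalar "experts", the function names, and the choice to apply softmax over only the selected experts are assumptions.

```python
import math

def top_k_route(router_logits, k):
    """Pick the k highest-scoring experts and softmax-normalise their scores."""
    ranked = sorted(range(len(router_logits)),
                    key=lambda i: router_logits[i], reverse=True)
    top = ranked[:k]
    m = max(router_logits[i] for i in top)
    exps = {i: math.exp(router_logits[i] - m) for i in top}
    total = sum(exps.values())
    return {i: e / total for i, e in exps.items()}

def moe_forward(x, experts, router_logits, k=2):
    """Run only the selected experts on x and mix their outputs."""
    weights = top_k_route(router_logits, k)
    return sum(w * experts[i](x) for i, w in weights.items())

# Toy experts acting on a scalar "token"; the third expert is never selected
# because its router score is far lower than the other two.
experts = [lambda x: x, lambda x: 2 * x, lambda x: 10 * x]
output = moe_forward(4.0, experts, router_logits=[0.0, 0.0, -5.0], k=2)
```

Because only k experts are evaluated per token, adding more experts grows the parameter count without growing the per-token compute, which is the trade-off the abstract refers to.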
-
MDHA: Multi-Scale Deformable Transformer with Hybrid Anchors for Multi-View 3D Object Detection
Authors:
Michelle Adeline,
Junn Yong Loo,
Vishnu Monn Baskaran
Abstract:
Multi-view 3D object detection is a crucial component of autonomous driving systems. Contemporary query-based methods primarily depend either on dataset-specific initialization of 3D anchors, introducing bias, or utilize dense attention mechanisms, which are computationally inefficient and unscalable. To overcome these issues, we present MDHA, a novel sparse query-based framework, which constructs adaptive 3D output proposals using hybrid anchors from multi-view, multi-scale input. Fixed 2D anchors are combined with depth predictions to form 2.5D anchors, which are projected to obtain 3D proposals. To ensure high efficiency, our proposed Anchor Encoder performs sparse refinement and selects the top-k anchors and features. Moreover, while existing multi-view attention mechanisms rely on projecting reference points to multiple images, our novel Circular Deformable Attention mechanism only projects to a single image but allows reference points to seamlessly attend to adjacent images, improving efficiency without compromising on performance. On the nuScenes val set, it achieves 46.4% mAP and 55.0% NDS with a ResNet101 backbone. MDHA significantly outperforms the baseline, where anchor proposals are modelled as learnable embeddings.
Submitted 25 June, 2024;
originally announced June 2024.
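The step from 2.5D anchors to 3D proposals can be pictured with the standard pinhole back-projection, shown below as a minimal sketch. It assumes a single camera with known intrinsics (`fx`, `fy`, `cx`, `cy`) and returns a point in that camera's frame; how MDHA itself handles multiple cameras and coordinate frames is not stated in the abstract, so this should not be read as the paper's implementation.

```python
def backproject(u, v, depth, fx, fy, cx, cy):
    """Lift a pixel (u, v) with predicted depth to a 3D point in the camera frame.

    Pinhole model: x = (u - cx) * depth / fx, y = (v - cy) * depth / fy, z = depth.
    """
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)

def lift_anchors(anchors_2d, depths, intrinsics):
    """Combine fixed 2D anchors with per-anchor depths into 3D proposals."""
    fx, fy, cx, cy = intrinsics
    return [backproject(u, v, d, fx, fy, cx, cy)
            for (u, v), d in zip(anchors_2d, depths)]

# Hypothetical intrinsics and a tiny anchor grid, for illustration only.
proposals = lift_anchors(
    anchors_2d=[(320.0, 240.0), (820.0, 240.0)],
    depths=[10.0, 10.0],
    intrinsics=(500.0, 500.0, 320.0, 240.0),
)
```

An anchor at the principal point lands on the optical axis, and moving the anchor sideways by one focal length at the same depth shifts the 3D point sideways by exactly that depth.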
-
The Responsible Foundation Model Development Cheatsheet: A Review of Tools & Resources
Authors:
Shayne Longpre,
Stella Biderman,
Alon Albalak,
Hailey Schoelkopf,
Daniel McDuff,
Sayash Kapoor,
Kevin Klyman,
Kyle Lo,
Gabriel Ilharco,
Nay San,
Maribeth Rauh,
Aviya Skowron,
Bertie Vidgen,
Laura Weidinger,
Arvind Narayanan,
Victor Sanh,
David Adelani,
Percy Liang,
Rishi Bommasani,
Peter Henderson,
Sasha Luccioni,
Yacine Jernite,
Luca Soldaini
Abstract:
Foundation model development attracts a rapidly expanding body of contributors, scientists, and applications. To help shape responsible development practices, we introduce the Foundation Model Development Cheatsheet: a growing collection of 250+ tools and resources spanning text, vision, and speech modalities. We draw on a large body of prior work to survey resources (e.g. software, documentation, frameworks, guides, and practical tools) that support informed data selection, processing, and understanding, precise and limitation-aware artifact documentation, efficient model training, advance awareness of the environmental impact from training, careful model evaluation of capabilities, risks, and claims, as well as responsible model release, licensing and deployment practices. We hope this curated collection of resources helps guide more responsible development. The process of curating this list enabled us to review the AI development ecosystem, revealing what tools are critically missing, misused, or over-used in existing practices. We find that (i) tools for data sourcing, model evaluation, and monitoring are critically under-serving ethical and real-world needs, (ii) evaluations for model safety, capabilities, and environmental impact all lack reproducibility and transparency, (iii) text and particularly English-centric analyses continue to dominate over multilingual and multi-modal analyses, and (iv) evaluation of systems, rather than just models, is needed so that capabilities and impact are assessed in context.
Submitted 25 June, 2024; v1 submitted 24 June, 2024;
originally announced June 2024.
-
Bijective Enumeration and Sign-Imbalance for Permutation Depth and Excedances
Authors:
Sen-Peng Eu,
Tung-Shan Fu,
Yuan-Hsun Lo
Abstract:
We present a simplified variant of Biane's bijection between permutations and 3-colored Motzkin paths with weights that keep track of the inversion number, the excedance number, and a statistic called the depth of a permutation. This generalizes a result by Guay-Paquet and Petersen about a continued fraction of the generating function for depth on the permutations of n elements. In terms of weighted Motzkin paths, we establish an involution on the permutations that reverses the parities of depth and excedance numbers simultaneously, which proves that the numbers of permutations with even and odd depth (excedance numbers, respectively) are equal if n is even and differ by the tangent number if n is odd. Moreover, we present some interesting sign-imbalance results on permutations and derangements, refined with respect to depth and excedance numbers.
Submitted 24 June, 2024;
originally announced June 2024.
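The parity statement in this abstract can be checked by brute force for small n. The sketch below enumerates all permutations and sums (-1) raised to a statistic, so the result is the count with even value minus the count with odd value. The abstract does not define depth; the code assumes the usual definition, the sum of σ(i) − i over positions with σ(i) > i, and that assumption should be checked against the paper.

```python
from itertools import permutations

def excedance_number(perm):
    """Number of positions i (1-indexed) with perm(i) > i."""
    return sum(1 for i, v in enumerate(perm, start=1) if v > i)

def depth(perm):
    """Assumed definition: sum of perm(i) - i over positions with perm(i) > i."""
    return sum(v - i for i, v in enumerate(perm, start=1) if v > i)

def signed_count(n, statistic):
    """(# permutations of n with even statistic) - (# with odd statistic)."""
    return sum((-1) ** statistic(p) for p in permutations(range(1, n + 1)))

# Signed counts of the excedance number for n = 1..5.
excedance_imbalance = [signed_count(n, excedance_number) for n in range(1, 6)]
```

For the excedance number the signed counts for n = 1 to 5 are 1, 0, -2, 0, 16: zero for even n, and in absolute value the tangent numbers 1, 2, 16 for odd n, matching the statement above.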
-
One Thousand and One Pairs: A "novel" challenge for long-context language models
Authors:
Marzena Karpinska,
Katherine Thai,
Kyle Lo,
Tanya Goyal,
Mohit Iyyer
Abstract:
Synthetic long-context LLM benchmarks (e.g., "needle-in-the-haystack") test only surface-level retrieval capabilities, but how well can long-context LLMs retrieve, synthesize, and reason over information across book-length inputs? We address this question by creating NoCha, a dataset of 1,001 minimally different pairs of true and false claims about 67 recently-published English fictional books, written by human readers of those books. In contrast to existing long-context benchmarks, our annotators confirm that the largest share of pairs in NoCha require global reasoning over the entire book to verify. Our experiments show that while human readers easily perform this task, it is enormously challenging for all ten long-context LLMs that we evaluate: no open-weight model performs above random chance (despite their strong performance on synthetic benchmarks), while GPT-4o achieves the highest accuracy at 55.8%. Further analysis reveals that (1) on average, models perform much better on pairs that require only sentence-level retrieval vs. global reasoning; (2) model-generated explanations for their decisions are often inaccurate even for correctly-labeled claims; and (3) models perform substantially worse on speculative fiction books that contain extensive world-building. The methodology proposed in NoCha allows for the evolution of the benchmark dataset and the easy analysis of future models.
Submitted 23 June, 2024;
originally announced June 2024.
-
BigCodeBench: Benchmarking Code Generation with Diverse Function Calls and Complex Instructions
Authors:
Terry Yue Zhuo,
Minh Chien Vu,
Jenny Chim,
Han Hu,
Wenhao Yu,
Ratnadira Widyasari,
Imam Nur Bani Yusuf,
Haolan Zhan,
Junda He,
Indraneil Paul,
Simon Brunner,
Chen Gong,
Thong Hoang,
Armel Randy Zebaze,
Xiaoheng Hong,
Wen-Ding Li,
Jean Kaddour,
Ming Xu,
Zhihan Zhang,
Prateek Yadav,
Naman Jain,
Alex Gu,
Zhoujun Cheng,
Jiawei Liu,
Qian Liu
, et al. (8 additional authors not shown)
Abstract:
Automated software engineering has been greatly empowered by the recent advances in Large Language Models (LLMs) for programming. While current benchmarks have shown that LLMs can perform various software engineering tasks like human developers, the majority of their evaluations are limited to short and self-contained algorithmic tasks. Solving challenging and practical programming tasks requires the capability of utilizing diverse function calls as tools to efficiently implement functionalities like data analysis and web development. In addition, using multiple tools to solve a task needs compositional reasoning by accurately understanding complex instructions. Fulfilling both of these characteristics can pose a great challenge for LLMs. To assess how well LLMs can solve challenging and practical programming tasks, we introduce BigCodeBench, a benchmark that challenges LLMs to invoke multiple function calls as tools from 139 libraries and 7 domains for 1,140 fine-grained programming tasks. To evaluate LLMs rigorously, each programming task encompasses 5.6 test cases with an average branch coverage of 99%. In addition, we propose a natural-language-oriented variant of BigCodeBench, BigCodeBench-Instruct, that automatically transforms the original docstrings into short instructions only with essential information. Our extensive evaluation of 60 LLMs shows that LLMs are not yet capable of following complex instructions to use function calls precisely, with scores up to 60%, significantly lower than the human performance of 97%. The results underscore the need for further advancements in this area.
Submitted 26 June, 2024; v1 submitted 22 June, 2024;
originally announced June 2024.
-
Quantum Extreme Learning of molecular potential energy surfaces and force fields
Authors:
Gabriele Lo Monaco,
Marco Bertini,
Salvatore Lorenzo,
G. Massimo Palma
Abstract:
Quantum machine learning algorithms are expected to play a pivotal role in quantum chemistry simulations in the immediate future. One such key application is the training of a quantum neural network to learn the potential energy surface and force field of molecular systems. We address this task by using the quantum extreme learning machine paradigm. This particular supervised learning routine allows for resource-efficient training, consisting of a simple linear regression performed on a classical computer. We have tested a setup that can be used to study molecules of any dimension and is optimized for immediate use on NISQ devices with a limited number of native gates. We have applied this setup to three case studies: lithium hydride, water, and formamide, carrying out both noiseless simulations and actual implementation on IBM quantum hardware. Compared to other supervised learning routines, the proposed setup requires minimal quantum resources, making it feasible for direct implementation on quantum platforms, while still achieving a high level of predictive accuracy compared to simulations. Our encouraging results pave the way towards future application to more complex molecules, since the proposed setup is scalable.
Submitted 20 June, 2024;
originally announced June 2024.
-
The cosmological significance of boundary term in non-metricity gravity
Authors:
Hamid Shabani,
Avik De,
Tee-How Loo
Abstract:
Within the context of metric-affine gravity, we examine the significance of the boundary term in symmetric teleparallel gravity by employing the cosmological dynamical system analysis method. We focus on the novel gravity models characterized by the functions $f(Q,C)$, where $f$ is a smooth function of the non-metricity scalar $Q$ and the associated boundary term $C$. In a cosmological setting adopting three different classes of symmetric teleparallel affine connections, we investigate a model $f(Q,C)=Q^{s}+eC^{r}$, and some special cases of this model. We show that the boundary term which is added to the Einsteinian field equations (or equivalently to the $f(Q)=Q$ ones) is capable of bringing forward solutions corresponding to the early accelerated expansion. This alludes to the physics behind the boundary terms, which are usually discarded in most gravitational theories.
Submitted 18 June, 2024;
originally announced June 2024.
-
DataComp-LM: In search of the next generation of training sets for language models
Authors:
Jeffrey Li,
Alex Fang,
Georgios Smyrnis,
Maor Ivgi,
Matt Jordan,
Samir Gadre,
Hritik Bansal,
Etash Guha,
Sedrick Keh,
Kushal Arora,
Saurabh Garg,
Rui Xin,
Niklas Muennighoff,
Reinhard Heckel,
Jean Mercat,
Mayee Chen,
Suchin Gururangan,
Mitchell Wortsman,
Alon Albalak,
Yonatan Bitton,
Marianna Nezhurina,
Amro Abbas,
Cheng-Yu Hsieh,
Dhruba Ghosh,
Josh Gardner
, et al. (34 additional authors not shown)
Abstract:
We introduce DataComp for Language Models (DCLM), a testbed for controlled dataset experiments with the goal of improving language models. As part of DCLM, we provide a standardized corpus of 240T tokens extracted from Common Crawl, effective pretraining recipes based on the OpenLM framework, and a broad suite of 53 downstream evaluations. Participants in the DCLM benchmark can experiment with data curation strategies such as deduplication, filtering, and data mixing at model scales ranging from 412M to 7B parameters. As a baseline for DCLM, we conduct extensive experiments and find that model-based filtering is key to assembling a high-quality training set. The resulting dataset, DCLM-Baseline, enables training a 7B parameter language model from scratch to 64% 5-shot accuracy on MMLU with 2.6T training tokens. Compared to MAP-Neo, the previous state-of-the-art in open-data language models, DCLM-Baseline represents a 6.6 percentage point improvement on MMLU while being trained with 40% less compute. Our baseline model is also comparable to Mistral-7B-v0.3 and Llama 3 8B on MMLU (63% & 66%), and performs similarly on an average of 53 natural language understanding tasks while being trained with 6.6x less compute than Llama 3 8B. Our results highlight the importance of dataset design for training language models and offer a starting point for further research on data curation.
Submitted 20 June, 2024; v1 submitted 17 June, 2024;
originally announced June 2024.
-
Coronal energy release by MHD avalanches II. EUV line emission from a multi-threaded coronal loop
Authors:
G. Cozzo,
J. Reid,
P. Pagano,
F. Reale,
P. Testa,
A. W. Hood,
C. Argiroffi,
A. Petralia,
E. Alaimo,
F. D'Anca,
L. Sciortino,
M. Todaro,
U. Lo Cicero,
M. Barbera,
B. De Pontieu,
J. Martinez-Sykora
Abstract:
MHD kink instability can trigger the fragmentation of a twisted magnetic flux tube into small-scale current sheets that dissipate as aperiodic impulsive heating events. This instability propagates as an avalanche to nearby flux tubes and leads to a nanoflare storm. Our previous work was devoted to related 3D MHD numerical modeling with a stratified and realistic atmosphere. This work addresses predictions for the EUV imaging spectroscopy of such structure and evolution of a loop, with an average temperature of 2.5 MK in the solar corona. We set a particular focus on the forthcoming MUSE mission. From the output of the numerical simulations, we synthesized the intensities, Doppler shifts, and non-thermal line broadening in 3 EUV spectral lines in the MUSE passbands: Fe IX 171 Å, Fe XV 284 Å, and Fe XIX 108 Å, at 1 MK, 2 MK, and 10 MK, respectively, according to the MUSE expected pixel size, temporal resolution, and temperature response functions. We provide maps showing different view angles and realistic spectra. Finally, we discuss the relevant evolutionary processes from the perspective of possible observations. We find that the MUSE observations might be able to detect the fine structure determined by tube fragmentation. In particular, the Fe IX line is mostly emitted at the loop footpoints, where we track the motions that drive the magnetic stressing and detect the upward motion of evaporating plasma from the chromosphere. In Fe XV, we see the bulk of the loop with increasing intensity. The Fe XIX line is very faint within the chosen simulation parameters; thus, any transient brightening around the loop apex may possibly be emphasized by the folding of sheet-like structure. In conclusion, we show that coronal loop observations with MUSE can pinpoint some crucial features of MHD-modeled ignition processes, such as the related dynamics, helping to identify the heating processes.
Submitted 17 June, 2024;
originally announced June 2024.
-
MINT-1T: Scaling Open-Source Multimodal Data by 10x: A Multimodal Dataset with One Trillion Tokens
Authors:
Anas Awadalla,
Le Xue,
Oscar Lo,
Manli Shu,
Hannah Lee,
Etash Kumar Guha,
Matt Jordan,
Sheng Shen,
Mohamed Awadalla,
Silvio Savarese,
Caiming Xiong,
Ran Xu,
Yejin Choi,
Ludwig Schmidt
Abstract:
Multimodal interleaved datasets featuring free-form interleaved sequences of images and text are crucial for training frontier large multimodal models (LMMs). Despite the rapid progression of open-source LMMs, there remains a pronounced scarcity of large-scale, diverse open-source multimodal interleaved datasets. In response, we introduce MINT-1T, the most extensive and diverse open-source Multimodal INTerleaved dataset to date. MINT-1T comprises one trillion text tokens and three billion images, a 10x scale-up from existing open-source datasets. Additionally, we include previously untapped sources such as PDFs and ArXiv papers. As scaling multimodal interleaved datasets requires substantial engineering effort, sharing the data curation process and releasing the dataset greatly benefits the community. Our experiments show that LMMs trained on MINT-1T rival the performance of models trained on the previous leading dataset, OBELICS. Our data and code will be released at https://github.com/mlfoundations/MINT-1T.
△ Less
Submitted 17 June, 2024;
originally announced June 2024.
-
Split-Apply-Combine with Dynamic Grouping
Authors:
Mark P. J. van der Loo
Abstract:
Partitioning a data set by one or more of its attributes and computing an aggregate for each part is one of the most common operations in data analyses. There are use cases where the partitioning is determined dynamically by collapsing smaller subsets into larger ones, to ensure sufficient support for the computed aggregate. These use cases are not supported by software implementing split-apply-combine types of operations. This paper presents the \texttt{R} package \texttt{accumulate} that offers convenient interfaces for defining grouped aggregation where the grouping itself is dynamically determined, based on user-defined conditions on subsets, and a user-defined subset collapsing scheme. The formal underlying algorithm is described and analyzed as well.
Submitted 14 June, 2024;
originally announced June 2024.
-
Multiplexed Quantum Communication with Surface and Hypergraph Product Codes
Authors:
Shin Nishio,
Nicholas Connolly,
Nicolò Lo Piparo,
William John Munro,
Thomas Rowan Scruby,
Kae Nemoto
Abstract:
Connecting multiple processors via quantum interconnect technologies could help to overcome issues of scalability in single-processor quantum computers. Transmission via these interconnects can be performed more efficiently using quantum multiplexing, where information is encoded in high-dimensional photonic degrees of freedom. We explore the effects of multiplexing on logical error rates in surface codes and hypergraph product codes. We show that, although multiplexing makes loss errors more damaging, assigning qubits to photons in an intelligent manner can minimize these effects, and the ability to encode higher-distance codes in a smaller number of photons can result in overall lower logical error rates. This multiplexing technique can also be adapted to quantum communication and multimode quantum memory with high-dimensional qudit systems.
Submitted 13 June, 2024;
originally announced June 2024.
-
Effect of Cr Segregation on Grain Growth in Nanocrystalline α-Fe Alloy: A Multiscale Modelling Approach
Authors:
Sandip Guin,
Albert Linda,
Yu-Chieh Lo,
Somanth Bhowmick,
Rajdip Mukherjee
Abstract:
We present a multiscale modelling framework that integrates density functional theory (DFT) with a phase-field model (PFM) to explore the intricate dynamics of grain growth in nanocrystalline α-Fe single-phase alloy in the presence of chromium (Cr) segregation. We begin our study by validating our simulation results for equilibrium segregation in a stationary GB against the McLean isotherm. Polycrystal simulations featuring nanocrystalline grains at different temperatures reveal that the grain growth kinetics depends on the ratio of Cr diffusivity to intrinsic GB mobility. In the absence of segregation, the relationship between the square of average grain size ($d^2$) and time ($t$) demonstrates a linear correlation. We observe that the $d^2$ vs. $t$ plot exhibits a consistent linear trend up to a threshold grain size, independent of Cr segregation at GB. However, when Cr is segregated at GB, a deviation from this linear trend with a decreasing slope is evident within the temperature range of 700K to 900K beyond the threshold size. This threshold grain size decreases with increasing temperature. Notably, at 1000K, the deviation from the linear trend is observed from the initial stages of grain growth with segregation, albeit with a linear trend exhibiting a smaller slope. We also present an analytical formulation based on Cahn solute drag theory to predict grain growth behaviour in the presence of solute segregation, and our simulation results align well with this analytical formulation.
Submitted 12 June, 2024;
originally announced June 2024.
-
NIRPS first light and early science: breaking the 1 m/s RV precision barrier at infrared wavelengths
Authors:
Étienne Artigau,
François Bouchy,
René Doyon,
Frédérique Baron,
Lison Malo,
François Wildi,
Franceso Pepe,
Neil J. Cook,
Simon Thibault,
Vladimir Reshetov,
Xavier Dumusque,
Christophe Lovis,
Danuta Sosnowska,
Bruno L. Canto Martins,
Jose Renan De Medeiros,
Xavier Delfosse,
Nuno Santos,
Rafael Rebolo,
Manuel Abreu,
Guillaume Allain,
Romain Allart,
Hugues Auger,
Susana Barros,
Luc Bazinet,
Nicolas Blind
, et al. (89 additional authors not shown)
Abstract:
The Near-InfraRed Planet Searcher or NIRPS is a precision radial velocity spectrograph developed through collaborative efforts among laboratories in Switzerland, Canada, Brazil, France, Portugal and Spain. NIRPS extends to the 0.98-1.8 $μ$m domain of the pioneering HARPS instrument at the La Silla 3.6-m telescope in Chile and it has achieved unparalleled precision, measuring stellar radial velocities in the infrared with accuracy better than 1 m/s. NIRPS can be used either stand-alone or simultaneously with HARPS. Commissioned in late 2022 and early 2023, NIRPS embarked on a 5-year Guaranteed Time Observation (GTO) program in April 2023, spanning 720 observing nights. This program focuses on planetary systems around M dwarfs, encompassing both the immediate solar vicinity and transit follow-ups, alongside transit and emission spectroscopy observations. We highlight NIRPS's current performance and the insights gained during its deployment at the telescope. The lessons learned and successes achieved contribute to the ongoing advancement of precision radial velocity measurements and high spectral fidelity, further solidifying NIRPS's role at the forefront of the field of exoplanets.
Submitted 13 June, 2024; v1 submitted 12 June, 2024;
originally announced June 2024.
-
Josephson Parametric Amplifier based Quantum Noise Limited Amplifier Development for Axion Search Experiments in CAPP
Authors:
Sergey V. Uchaikin,
Jinmyeong Kim,
Caglar Kutlu,
Boris I. Ivanov,
Jinsu Kim,
Arjan F. van Loo,
Yasunobu Nakamura,
Saebyeok Ahn,
Seonjeong Oh,
Minsu Ko,
Yannis K. Semertzidis
Abstract:
This paper provides a comprehensive overview of the development of flux-driven Josephson Parametric Amplifiers (JPAs) as Quantum Noise Limited Amplifiers for axion search experiments conducted at the Center for Axion and Precision Physics Research (CAPP) of the Institute for Basic Science. It focuses on the characterization and optimization of JPAs, which are crucial for achieving the highest sensitivity in axion particle detection. We discuss various characterization techniques, methods for improving bandwidth, and the attainment of ultra-low noise temperatures. JPAs have emerged as indispensable tools in CAPP's axion search endeavors, playing a significant role in advancing our understanding of fundamental physics and unraveling the mysteries of the universe.
Submitted 12 June, 2024;
originally announced June 2024.
-
SciRIFF: A Resource to Enhance Language Model Instruction-Following over Scientific Literature
Authors:
David Wadden,
Kejian Shi,
Jacob Morrison,
Aakanksha Naik,
Shruti Singh,
Nitzan Barzilay,
Kyle Lo,
Tom Hope,
Luca Soldaini,
Shannon Zejiang Shen,
Doug Downey,
Hannaneh Hajishirzi,
Arman Cohan
Abstract:
We present SciRIFF (Scientific Resource for Instruction-Following and Finetuning), a dataset of 137K instruction-following demonstrations for 54 tasks covering five essential scientific literature understanding capabilities: information extraction, summarization, question answering, claim verification, and classification. SciRIFF demonstrations are notable for their long input contexts, detailed task specifications, and complex structured outputs. While instruction-following resources are available in specific domains such as clinical medicine and chemistry, SciRIFF is the first dataset focused on extracting and synthesizing information from research literature across a wide range of scientific fields. To demonstrate the utility of SciRIFF, we develop a sample-efficient strategy to adapt a general instruction-following model for science by performing additional finetuning on a mix of general-domain and SciRIFF demonstrations. In evaluations on nine held-out scientific tasks, our model -- called SciTulu -- improves over a strong LLM baseline by 28.1% and 6.5% at the 7B and 70B scales respectively, while maintaining general instruction-following performance within 2% of the baseline. We are optimistic that SciRIFF will facilitate the development and evaluation of LLMs to help researchers navigate the ever-growing body of scientific literature. We release our dataset, model checkpoints, and data processing and evaluation code to enable further research.
Submitted 18 June, 2024; v1 submitted 10 June, 2024;
originally announced June 2024.
-
FPN-IAIA-BL: A Multi-Scale Interpretable Deep Learning Model for Classification of Mass Margins in Digital Mammography
Authors:
Julia Yang,
Alina Jade Barnett,
Jon Donnelly,
Satvik Kishore,
Jerry Fang,
Fides Regina Schwartz,
Chaofan Chen,
Joseph Y. Lo,
Cynthia Rudin
Abstract:
Digital mammography is essential to breast cancer detection, and deep learning offers promising tools for faster and more accurate mammogram analysis. In radiology and other high-stakes environments, uninterpretable ("black box") deep learning models are unsuitable and there is a call in these fields to make interpretable models. Recent work in interpretable computer vision provides transparency to these formerly black boxes by utilizing prototypes for case-based explanations, achieving high accuracy in applications including mammography. However, these models struggle with precise feature localization, reasoning on large portions of an image when only a small part is relevant. This paper addresses this gap by proposing a novel multi-scale interpretable deep learning model for mammographic mass margin classification. Our contribution not only offers an interpretable model with reasoning aligned with radiologist practices, but also provides a general architecture for computer vision with user-configurable prototypes from coarse- to fine-grained prototypes.
Submitted 10 June, 2024;
originally announced June 2024.
-
Three super-Earths and a possible water world from TESS and ESPRESSO
Authors:
M. J. Hobson,
F. Bouchy,
B. Lavie,
C. Lovis,
V. Adibekyan,
C. Allende Prieto,
Y. Alibert,
S. C. C. Barros,
A. Castro-González,
S. Cristiani,
V. D'Odorico,
M. Damasso,
P. Di Marcantonio,
X. Dumusque,
D. Ehrenreich,
P. Figueira,
R. Génova Santos,
J. I. González Hernández,
J. Lillo-Box,
G. Lo Curto,
C. J. A. P. Martins,
A. Mehner,
G. Micela,
P. Molaro,
N. J. Nunes
, et al. (29 additional authors not shown)
Abstract:
Since 2018, the ESPRESSO spectrograph at the VLT has been hunting for planets in the Southern skies via the RV method. One of its goals is to follow up candidate planets from transit surveys such as the TESS mission, particularly small planets. We analyzed photometry from TESS and ground-based facilities, high-resolution imaging, and RVs from ESPRESSO, HARPS, and HIRES, to confirm and characterize three new planets: TOI-260 b, transiting a late K-dwarf, and TOI-286 b and c, orbiting an early K-dwarf. We also update parameters for the known super-Earth TOI-134 b, hosted by an M-dwarf. TOI-260 b has a $13.475853^{+0.000013}_{-0.000011}$ d period, $4.23 \pm1.60 \mathrm{M_\oplus}$ mass and $1.71\pm0.08\mathrm{R_\oplus}$ radius. For TOI-286 b we find a $4.5117244^{+0.0000031}_{-0.0000027}$ d period, $4.53\pm0.78\mathrm{M_\oplus}$ mass and $1.42\pm0.10\mathrm{R_\oplus}$ radius; for TOI-286 c, a $39.361826^{+0.000070}_{-0.000081}$ d period, $3.72\pm2.22\mathrm{M_\oplus}$ mass and $1.88\pm 0.12\mathrm{R_\oplus}$ radius. For TOI-134 b we obtain a $1.40152604^{+0.00000074}_{-0.00000082}$ d period, $4.07\pm0.45\mathrm{M_\oplus}$ mass, and $1.63\pm0.14\mathrm{R_\oplus}$ radius. Circular models are preferred for all, although for TOI-260 b the eccentricity is not well-constrained. We compute bulk densities and place the planets in the context of composition models. TOI-260 b lies within the radius valley, and is most likely a rocky planet. However, the uncertainty on the eccentricity and thus on the mass renders its composition hard to determine. TOI-286 b and c span the radius valley, with TOI-286 b lying below it and having a likely rocky composition, while TOI-286 c is within the valley, close to the upper border, and probably has a significant water fraction. With our updated parameters for TOI-134 b, we obtain a lower density than previous findings, giving a rocky or Earth-like composition.
Submitted 10 June, 2024;
originally announced June 2024.
-
Demystifying the Characteristics for Smart Contract Upgrades
Authors:
Ye Liu,
Shuo Li,
Xiuheng Wu,
Yi Li,
Zhiyang Chen,
David Lo
Abstract:
Upgradable smart contracts play an important role in the decentralized application ecosystem, to support routine maintenance, security patching, and feature additions. In this paper, we conduct an empirical study on proxy-based upgradable smart contracts to understand the characteristics of contract upgrading. Through our study on 57,118 open source proxy contracts, we found that 583 contracts have ever been upgraded on Ethereum, involving 973 unique implementation contract versions. The results show that developers often intend to improve usability of contracts if upgrading, where functionality addition and update are the most frequent upgrade intentions. We investigated the practical impacts of contract upgrades, e.g., breaking changes causing compatibility issues, storage collisions and initialization risks leading to security vulnerabilities. The results demonstrate that there are 4,334 ABI breaking changes due to the upgrades of 276 proxies, causing real-world broken usages within 584 transactions witnessed by the blockchain; 36 contract upgrades had storage collisions and five proxies with 59 implementation contracts are vulnerable to initialization attacks.
Submitted 9 June, 2024;
originally announced June 2024.
-
lenscat: a Public and Community-Contributed Catalog of Known Strong Gravitational Lenses
Authors:
L. Vujeva,
R. K. L. Lo,
J. M. Ezquiaga,
J. C. L. Chan
Abstract:
We present lenscat, a public and community-contributed catalog of strong gravitational lenses found by electromagnetic surveys. The main objective of lenscat is to compile a simple, easy-to-access catalog that can be used in a variety of lensing studies, such as facilitating the search for the host galaxy of a candidate strongly lensed transient event. We also provide a python package to interact with tools commonly used by the community. This allows end users both with and without lensing expertise to obtain a list of known strong lenses within a given search area, and to also rank them by their respective searched probabilities. Here, we exemplify this by crossmatching the gravitational wave joint sky localization region of an interesting pair of events GW170104-GW170814. Other examples with short gamma-ray bursts are given. Thanks to the open and simple infrastructure of lenscat, members of the lensing community can directly add newly found lenses from their own studies to help create a long-lasting catalog that is as exhaustive and accessible as possible.
Submitted 6 June, 2024;
originally announced June 2024.
-
RoboCasa: Large-Scale Simulation of Everyday Tasks for Generalist Robots
Authors:
Soroush Nasiriany,
Abhiram Maddukuri,
Lance Zhang,
Adeet Parikh,
Aaron Lo,
Abhishek Joshi,
Ajay Mandlekar,
Yuke Zhu
Abstract:
Recent advancements in Artificial Intelligence (AI) have largely been propelled by scaling. In Robotics, scaling is hindered by the lack of access to massive robot datasets. We advocate using realistic physical simulation as a means to scale environments, tasks, and datasets for robot learning methods. We present RoboCasa, a large-scale simulation framework for training generalist robots in everyday environments. RoboCasa features realistic and diverse scenes focusing on kitchen environments. We provide thousands of 3D assets across over 150 object categories and dozens of interactable furniture and appliances. We enrich the realism and diversity of our simulation with generative AI tools, such as object assets from text-to-3D models and environment textures from text-to-image models. We design a set of 100 tasks for systematic evaluation, including composite tasks generated by the guidance of large language models. To facilitate learning, we provide high-quality human demonstrations and integrate automated trajectory generation methods to substantially enlarge our datasets with minimal human burden. Our experiments show a clear scaling trend in using synthetically generated robot data for large-scale imitation learning and show great promise in harnessing simulation data in real-world tasks. Videos and open-source code are available at https://robocasa.ai/
Submitted 4 June, 2024;
originally announced June 2024.
-
Age-Gain-Dependent Random Access for Event-Driven Periodic Updating
Authors:
Yuqing Zhu,
Yiwen Zhu,
Aoyu Gong,
Yan Lin,
Yuan-Hsuan Lo,
Yijin Zhang
Abstract:
This paper is the first to consider using knowledge of age gains to reduce the network average age of information (AoI) in random access with event-driven periodic updating. Building on slotted ALOHA, we require each device to determine its age gain threshold and transmission probability in an easily implementable decentralized manner, so that the unavoidable contention is limited to devices with age gains as high as possible. For the basic case in which each device uses knowledge of only its own age gain, we provide an analytical model based on multi-layer discrete-time Markov chains (DTMCs), where an external infinite-horizon DTMC manages the jumps between the beginnings of frames and an internal finite-horizon DTMC manages the evolution during an arbitrary frame. Such modelling allows optimal access parameters to be obtained offline. For the enhanced case in which each device uses knowledge of the age gains of all devices, we require each device to adjust its access parameters to maximize the estimated network expected AoI reduction (EAR) per slot, which captures what is essential for improving the contribution of the throughput to the AoI performance. To estimate the network EAR, we require each device to use Bayes' rule to maintain an a posteriori joint probability distribution of the local age and age gain of an arbitrary device based on the channel observations. Numerical results validate our theoretical analysis and demonstrate the advantage of the proposed schemes over the existing schemes in a wide range of network configurations.
Submitted 27 June, 2024; v1 submitted 2 June, 2024;
originally announced June 2024.
-
Contingency-Aware Station-Keeping Control of Halo Orbits
Authors:
Fausto Vega,
Zachary Manchester,
Martin Lo,
Ricardo Restrepo
Abstract:
We present an algorithm to perform fuel-optimal stationkeeping for spacecraft in unstable halo orbits with additional constraints to ensure safety in the event of a control failure. We formulate a convex trajectory-optimization problem to generate impulsive spacecraft maneuvers to loosely track a halo orbit using a receding-horizon controller. Our solution also provides a safe exit strategy in the event that propulsion is lost at any point in the mission. We validate our algorithm in simulations of the three-body Earth-Moon and Saturn-Enceladus systems, demonstrating both low total delta-v and a safe contingency plan throughout the mission.
Submitted 30 May, 2024;
originally announced May 2024.
-
Can't make an Omelette without Breaking some Eggs: Plausible Action Anticipation using Large Video-Language Models
Authors:
Himangi Mittal,
Nakul Agarwal,
Shao-Yuan Lo,
Kwonjoon Lee
Abstract:
We introduce PlausiVL, a large video-language model for anticipating action sequences that are plausible in the real-world. While significant efforts have been made towards anticipating future actions, prior approaches do not take into account the aspect of plausibility in an action sequence. To address this limitation, we explore the generative capability of a large video-language model in our work and further, develop the understanding of plausibility in an action sequence by introducing two objective functions, a counterfactual-based plausible action sequence learning loss and a long-horizon action repetition loss. We utilize temporal logical constraints as well as verb-noun action pair logical constraints to create implausible/counterfactual action sequences and use them to train the model with plausible action sequence learning loss. This loss helps the model to differentiate between plausible and not plausible action sequences and also helps the model to learn implicit temporal cues crucial for the task of action anticipation. The long-horizon action repetition loss puts a higher penalty on the actions that are more prone to repetition over a longer temporal window. With this penalization, the model is able to generate diverse, plausible action sequences. We evaluate our approach on two large-scale datasets, Ego4D and EPIC-Kitchens-100, and show improvements on the task of action anticipation.
Submitted 30 May, 2024;
originally announced May 2024.
-
Wavefront Threading Enables Effective High-Level Synthesis
Authors:
Blake Pelton,
Adam Sapek,
Ken Eguro,
Daniel Lo,
Alessandro Forin,
Matt Humphrey,
Jinwen Xi,
David Cox,
Rajas Karandikar,
Johannes de Fine Licht,
Evgeny Babin,
Adrian Caulfield,
Doug Burger
Abstract:
Digital systems are growing in importance and computing hardware is growing more heterogeneous. Hardware design, however, remains laborious and expensive, in part due to the limitations of conventional hardware description languages (HDLs) like VHDL and Verilog. A longstanding research goal has been programming hardware like software, with high-level languages that can generate efficient hardware designs. This paper describes Kanagawa, a language that takes a new approach to combine the programmer productivity benefits of traditional High-Level Synthesis (HLS) approaches with the expressibility and hardware efficiency of Register-Transfer Level (RTL) design. The language's concise syntax, matched with a hardware design-friendly execution model, permits a relatively simple toolchain to map high-level code into efficient hardware implementations.
Submitted 10 June, 2024; v1 submitted 29 May, 2024;
originally announced May 2024.
-
VisTA-SR: Improving the Accuracy and Resolution of Low-Cost Thermal Imaging Cameras for Agriculture
Authors:
Heesup Yun,
Sassoum Lo,
Christine H. Diepenbrock,
Brian N. Bailey,
J. Mason Earles
Abstract:
Thermal cameras are an important tool for agricultural research because they allow for non-invasive measurement of plant temperature, which relates to important photochemical, hydraulic, and agronomic traits. Utilizing low-cost thermal cameras can lower the barrier to introducing thermal imaging in agricultural research and production. This paper presents an approach to improve the temperature accuracy and image quality of low-cost thermal imaging cameras for agricultural applications. Leveraging advancements in computer vision techniques, particularly deep learning networks, we propose a method, called VisTA-SR (Visual & Thermal Alignment and Super-Resolution Enhancement), that combines RGB and thermal images to enhance the capabilities of low-resolution thermal cameras. The research includes calibration and validation of temperature measurements, acquisition of paired image datasets, and the development of a deep learning network tailored for agricultural thermal imaging. Our study addresses the challenges of image enhancement in the agricultural domain and explores the potential of low-cost thermal cameras to replace high-resolution industrial cameras. Experimental results demonstrate the effectiveness of our approach in enhancing temperature accuracy and image sharpness, paving the way for more accessible and efficient thermal imaging solutions in agriculture.
Submitted 29 May, 2024;
originally announced May 2024.
-
Determining state space anomalies in mean field games
Authors:
Hongyu Liu,
Catharine W. K. Lo
Abstract:
In this paper, we are concerned with the inverse problem of determining anomalies in the state space associated with the stationary mean field game (MFG) system. We establish novel unique identifiability results for the intrinsic structure of these anomalies in mean field games systems, including their topological structure and parameter configurations, in several general scenarios of practical interest, including traffic flow, market economics and epidemics. To the best of our knowledge, this is the first work that considers anomalies in the state space for the nonlinear coupled MFG system.
Submitted 29 May, 2024;
originally announced May 2024.
-
Decoding a mean field game by the Cauchy data around its unknown stationary states
Authors:
Hongyu Liu,
Catharine W. K. Lo,
Shen Zhang
Abstract:
In recent years, mean field games (MFGs) have garnered considerable attention and emerged as a dynamic and actively researched field across various domains, including economics, social sciences, finance, and transportation. The inverse design and decoding of MFGs offer valuable means to extract information from observed data and gain insights into the intricate underlying dynamics and strategies of these complex physical systems. This paper presents a novel approach to the study of inverse problems in MFGs by analyzing the Cauchy data around their unknown stationary states. This study distinguishes itself from existing inverse problem investigations in three key significant aspects: Firstly, we consider MFG problems in a highly general form. Secondly, we address the technical challenge of the probability measure constraint by utilizing Cauchy data in our inverse problem study. Thirdly, we enhance existing high order linearization methods by introducing a novel approach that involves conducting linearization around non-trivial stationary states of the MFG system, which are not a-priori known. These contributions provide new insights and offer promising avenues for studying inverse problems for MFGs. By unraveling the hidden structure of MFGs, researchers and practitioners can make informed decisions, optimize system performance, and address real-world challenges more effectively.
Submitted 29 May, 2024;
originally announced May 2024.
-
JUNO Sensitivity to Invisible Decay Modes of Neutrons
Authors:
JUNO Collaboration,
Angel Abusleme,
Thomas Adam,
Kai Adamowicz,
Shakeel Ahmad,
Rizwan Ahmed,
Sebastiano Aiello,
Fengpeng An,
Qi An,
Giuseppe Andronico,
Nikolay Anfimov,
Vito Antonelli,
Tatiana Antoshkina,
João Pedro Athayde Marcondes de André,
Didier Auguste,
Weidong Bai,
Nikita Balashov,
Wander Baldini,
Andrea Barresi,
Davide Basilico,
Eric Baussan,
Marco Bellato,
Marco Beretta,
Antonio Bergnoli,
Daniel Bick
, et al. (635 additional authors not shown)
Abstract:
We explore the decay of bound neutrons into invisible particles (e.g., $n\rightarrow 3 ν$ or $nn \rightarrow 2 ν$) in the JUNO liquid scintillator detector. The invisible decay includes two decay modes: $ n \rightarrow { inv} $ and $ nn \rightarrow { inv} $. The invisible decays of $s$-shell neutrons in $^{12}{\rm C}$ will leave a highly excited residual nucleus. Subsequently, some de-excitation modes of the excited residual nuclei can produce a time- and space-correlated triple coincidence signal in the JUNO detector. Based on a full Monte Carlo simulation informed with the latest available data, we estimate all backgrounds, including inverse beta decay events of the reactor antineutrino $\barν_e$, natural radioactivity, cosmogenic isotopes and neutral current interactions of atmospheric neutrinos. Pulse shape discrimination and multivariate analysis techniques are employed to further suppress backgrounds. With two years of exposure, JUNO is expected to give an order of magnitude improvement compared to the current best limits. After 10 years of data taking, the JUNO expected sensitivities at a 90% confidence level are $τ/B( n \rightarrow { inv} ) > 5.0 \times 10^{31} \, {\rm yr}$ and $τ/B( nn \rightarrow { inv} ) > 1.4 \times 10^{32} \, {\rm yr}$.
Submitted 27 May, 2024;
originally announced May 2024.
-
HERMES: Gamma Ray Burst and Gravitational Wave counterpart hunter
Authors:
G. Ghirlanda,
L. Nava,
O. Salafia,
F. Fiore,
R. Campana,
R. Salvaterra,
A. Sanna,
W. Leone,
Y. Evangelista,
G. Dilillo,
S. Puccetti,
A. Santangelo,
M. Trenti,
A. Guzmán,
P. Hedderman,
G. Amelino-Camelia,
M. Barbera,
G. Baroni,
M. Bechini,
P. Bellutti,
G. Bertuccio,
G. Borghi,
A. Brandonisio,
L. Burderi,
C. Cabras
, et al. (45 additional authors not shown)
Abstract:
Gamma Ray Bursts (GRBs) bridge relativistic astrophysics and multi-messenger astronomy. Space-based gamma/X-ray wide field detectors have proven essential to detect and localize the highly variable GRB prompt emission, which is also a counterpart of gravitational wave events. We study the capabilities to detect long and short GRBs by the High Energy Rapid Modular Ensemble of Satellites (HERMES) Pathfinder (HP) and SpIRIT, namely a swarm of six 3U CubeSats to be launched in early 2025, and a 6U CubeSat launched on December 1st 2023. We also study the capabilities of two advanced configurations of swarms of >8 satellites with improved detector performances (HERMES Constellations). The HERMES detectors, sensitive down to ~2-3 keV, will be able to detect faint/soft GRBs which comprise X-ray flashes and high redshift bursts. By combining state-of-the-art long and short GRB population models with a description of the single module performance, we estimate that HP will detect ~195^{+22}_{-21} long GRBs (3.4^{+0.3}_{-0.8} at redshift z>6) and ~19^{+5}_{-3} short GRBs per year. The larger HERMES Constellations under study can detect between ~1300 and ~3000 long GRBs per year and between ~160 and ~400 short GRBs per year, depending on the chosen configuration, with a rate of long GRBs above z>6 between 30 and 75 per year. Finally, we explore the capabilities of HERMES to detect short GRBs as electromagnetic counterparts of binary neutron star (BNS) mergers detected as gravitational signals by current and future ground-based interferometers. Under the assumption that the GRB jets are structured, we estimate that HP can provide up to 1 (14) yr^{-1} joint detections during the fifth LIGO-Virgo-KAGRA observing run (Einstein Telescope single triangle 10 km arm configuration). These numbers become 4 (100) yr^{-1}, respectively, for the HERMES Constellation configuration.
Submitted 27 May, 2024;
originally announced May 2024.
-
Rigidity on horocycles and hypercycles
Authors:
Cheikh Lo,
Abdoul Karim Sane
Abstract:
We show that a bijection $f:\mathbb{H}^2\rightarrow\mathbb{H}^2$ of the hyperbolic plane that sends horocycles to horocycles (respectively hypercycles to hypercycles) is an isometry. This extends a previous result of J. Jeffers on geodesics to all curves with constant curvature in $\mathbb{H}^2$. We go beyond by showing that every abstract automorphism of the geodesic graph (respectively horocycles and hypercycles graphs) is induced by an earthquake map (respectively an isometry) of $\mathbb{H}^2$. This shadowed the difference between the geometry of geodesics and that of horocycles/hypercycles.
Submitted 27 May, 2024;
originally announced May 2024.
-
On anti-tempered local Arthur packets and a lemma of Arthur
Authors:
Baiying Liu,
Chi-Heng Lo,
Freydoon Shahidi
Abstract:
In this note, following Arthur's ideas, we rework the process of constructing the anti-tempered local Arthur packets for quasi-split classical groups and their pure inner forms. In particular, we present explicit examples illustrating a certain gap in a lemma of Arthur and provide modifications, based on the work of Moeglin, Waldspurger, and Xu.
Submitted 27 May, 2024;
originally announced May 2024.
-
On the Obstacle Problem in Fractional Generalised Orlicz Spaces
Authors:
Catharine W. K. Lo,
José Francisco Rodrigues
Abstract:
We consider the one and the two obstacles problems for the nonlocal nonlinear anisotropic $g$-Laplacian $\mathcal{L}_g^s$, with $0<s<1$. We prove the strict T-monotonicity of $\mathcal{L}_g^s$ and we obtain the Lewy-Stampacchia inequalities. We consider the approximation of the solutions through semilinear problems, for which we prove a global $L^\infty$-estimate, and we extend the local Hölder regularity to the solutions of the obstacle problems in the case of the fractional $p(x,y)$-Laplacian operator. We make further remarks on a few elementary properties of related capacities in the fractional generalised Orlicz framework, with a special reference to the Hilbertian nonlinear case in fractional Sobolev spaces.
Submitted 27 May, 2024;
originally announced May 2024.
-
Ecosystem of Large Language Models for Code
Authors:
Zhou Yang,
Jieke Shi,
David Lo
Abstract:
The availability of vast amounts of publicly accessible data of source code and the advances in modern language models, coupled with increasing computational resources, have led to a remarkable surge in the development of large language models for code (LLM4Code, for short). The interaction between code datasets and models gives rise to a complex ecosystem characterized by intricate dependencies that are worth studying. This paper introduces a pioneering analysis of the code model ecosystem. Utilizing Hugging Face -- the premier hub for transformer-based models -- as our primary source, we curate a list of datasets and models that are manually confirmed to be relevant to software engineering. By analyzing the ecosystem, we first identify the popular and influential datasets, models, and contributors. The popularity is quantified by various metrics, including the number of downloads, the number of likes, the number of reuses, etc. The ecosystem follows a power-law distribution, indicating that users prefer widely recognized models and datasets. Then, we manually categorize how models in the ecosystem are reused into nine categories, analyzing prevalent model reuse practices. The top 3 most popular reuse types are fine-tuning, architecture sharing, and quantization. We also explore the practices surrounding the publication of LLM4Code, specifically focusing on documentation practice and license selection. We find that the documentation in the ecosystem contains less information than that in general artificial intelligence (AI)-related repositories hosted on GitHub. Additionally, the license usage is also different from other software repositories. Models in the ecosystem adopt some AI-specific licenses, e.g., RAIL (Responsible AI Licenses) and AI model license agreement.
Submitted 26 May, 2024;
originally announced May 2024.
-
VICtoR: Learning Hierarchical Vision-Instruction Correlation Rewards for Long-horizon Manipulation
Authors:
Kuo-Han Hung,
Pang-Chi Lo,
Jia-Fong Yeh,
Han-Yuan Hsu,
Yi-Ting Chen,
Winston H. Hsu
Abstract:
We study reward models for long-horizon manipulation tasks by learning from action-free videos and language instructions, which we term the visual-instruction correlation (VIC) problem. Recent advancements in cross-modality modeling have highlighted the potential of reward modeling through visual and language correlations. However, existing VIC methods face challenges in learning rewards for long-horizon tasks due to their lack of sub-stage awareness, difficulty in modeling task complexities, and inadequate object state estimation. To address these challenges, we introduce VICtoR, a novel hierarchical VIC reward model capable of providing effective reward signals for long-horizon manipulation tasks. VICtoR precisely assesses task progress at various levels through a novel stage detector and motion progress evaluator, offering insightful guidance for agents learning the task effectively. To validate the effectiveness of VICtoR, we conducted extensive experiments in both simulated and real-world environments. The results suggest that VICtoR outperformed the best existing VIC methods, achieving a 43% improvement in success rates for long-horizon tasks.
Submitted 26 May, 2024;
originally announced May 2024.
-
Magnetic effects in the Hadron Resonance Gas
Authors:
Michał Marczenko,
Michał Szymański,
Pok Man Lo,
Bithika Karmakar,
Pasi Huovinen,
Chihiro Sasaki,
Krzysztof Redlich
Abstract:
We discuss the modeling of the hadronic phase of QCD at finite magnetic field in the framework of the hadron resonance gas (HRG). We focus on the statistical description of particle yields that include contributions from resonance decays. We demonstrate that the swift increase in the number of protons with magnetic field predicted in the HRG is due to the ill-defined description of higher-spin states. We discuss fluctuations of conserved charges and show that at present the qualitative comparison of the model predictions with the Lattice QCD data should be treated with care. We also discuss the principle of detailed balance, which allows one to study the magnetic field dependence of neutral resonances.
Submitted 24 May, 2024;
originally announced May 2024.
-
The Non-collinear Path to Topological Superconductivity
Authors:
Reiner Brüning,
Jasmin Bedow,
Roberto Lo Conte,
Kirsten von Bergmann,
Dirk K. Morr,
Roland Wiesendanger
Abstract:
Combining spin textures in ultra-thin films with conventional superconductors has emerged as a powerful and versatile platform for designing topologically non-trivial superconducting phases as well as spin-triplet Cooper pairs. As a consequence, two-dimensional magnet-superconductor hybrids (2D MSHs) are promising candidate systems to realize devices for topology-based quantum technologies and superconducting spintronics. So far, studies have focused mostly on systems hosting collinear ferromagnets or antiferromagnets. However, topologically non-trivial phases have been predicted to emerge in MSH systems with non-collinear spin textures as well. In this article, we present the experimental discovery of topological superconductivity in the MSH system Fe/Ta(110) where a magnetic spiral is realized in the Fe monolayer on the surface of the s-wave superconductor Ta. By combining low-temperature spin-polarized scanning tunneling microscopy measurements with theoretical modeling, we are able to conclude that the system is in a topological nodal-point superconducting phase with low-energy edge modes. Due to the non-collinear spin texture in our MSH system, these edge modes exhibit a magnetization direction-dependent dispersion. Furthermore, we identify direct signatures of Rashba spin-orbit coupling in the experimentally measured differential tunneling conductance. The present work realizes a non-collinear spin texture-based path to topological superconductivity.
Submitted 23 May, 2024;
originally announced May 2024.
-
DEX: Scalable Range Indexing on Disaggregated Memory [Extended Version]
Authors:
Baotong Lu,
Kaisong Huang,
Chieh-Jan Mike Liang,
Tianzheng Wang,
Eric Lo
Abstract:
Memory disaggregation can potentially allow memory-optimized range indexes such as B+-trees to scale beyond one machine while attaining high hardware utilization and low cost. Designing scalable indexes on disaggregated memory, however, is challenging due to rudimentary caching, unprincipled offloading and excessive inconsistency among servers.
This paper proposes DEX, a new scalable B+-tree for memory disaggregation. DEX includes a set of techniques to reduce remote accesses, including logical partitioning, lightweight caching and cost-aware offloading. Our evaluation shows that DEX can outperform the state-of-the-art by 1.7--56.3X, and the advantage remains under various setups, such as cache size and skewness.
Submitted 23 May, 2024;
originally announced May 2024.