-
Probing Dust and Gas Properties Using Ringed Disks
Authors:
Eve J. Lee
Abstract:
How rapidly a planet grows in mass and how far it may park from the host star depend sensitively on two non-dimensional parameters: Stokes number St and turbulent $α$. Yet, these parameters remain highly uncertain, being difficult or impossible to measure directly. Here, we demonstrate how the ringed disks can be leveraged to obtain St and $α$ separately by constructing a simple toy model that combines dust radial equation of motion under aerodynamic drag and coupling to gas motion with the measured distribution of dust masses in Class 0/I disks. Focusing on known systems with well-resolved dust rings, we find that the range of St and $α$ that are consistent with the measured properties of the rings are small: $10^{-4} \lesssim {\rm St} \lesssim 10^{-2}$ and $10^{-5} \lesssim α\lesssim 10^{-3}$. These low St and $α$ ensure the observed rings are stable against clumping. Even in one marginal case where the formation of bound clumps is possible, further mass growth by pebble accretion is inhibited. Furthermore, the derived low $α$ is consistent with the nearly inviscid regime where Type I migration can be prematurely halted. Our analysis predicts minimal planet population beyond $\sim$10s of au where we observe dust rings and significantly more vigorous planet formation inside $\sim$10 au, consistent with current exo-giant statistics. We close with discussions on the implications of our results on small planet statistics at large orbital distances.
Submitted 2 July, 2024;
originally announced July 2024.
-
A Robust Power Model Training Framework for Cloud Native Runtime Energy Metric Exporter
Authors:
Sunyanan Choochotkaew,
Chen Wang,
Huamin Chen,
Tatsuhiro Chiba,
Marcelo Amaral,
Eun Kyung Lee,
Tamar Eilam
Abstract:
Estimating power consumption in modern Cloud environments is essential for carbon quantification toward green computing. Specifically, it is important to properly account for the power consumed by each of the running applications, which are packaged as containers. This paper examines multiple challenges associated with this goal. The first challenge is that multiple customers are sharing the same hardware platform (multi-tenancy), where information on the physical servers is mostly obscured. The second challenge is the overhead in power consumption that the Cloud platform control plane induces. This paper addresses these challenges and introduces a novel pipeline framework for power model training. This allows versatile power consumption approximation of individual containers on the basis of available performance counters and other metrics. The proposed model utilizes machine learning techniques to predict the power consumed by the control plane and associated processes, and uses it for isolating the power consumed by the user containers, from the server power consumption. To determine how well the prediction results in an isolation, we introduce a metric termed isolation goodness. Applying the proposed power model does not require online power measurements, nor does it need information on the physical servers, configuration, or information on other tenants sharing the same machine. The results of cross-workload, cross-platform experiments demonstrated the higher accuracy of the proposed model when predicting power consumption of unseen containers on unknown platforms, including on virtual machines.
Submitted 9 April, 2024;
originally announced July 2024.
-
The impact of model size on catastrophic forgetting in Online Continual Learning
Authors:
Eunhae Lee
Abstract:
This study investigates the impact of model size on Online Continual Learning performance, with a focus on catastrophic forgetting. Employing ResNet architectures of varying sizes, the research examines how network depth and width affect model performance in class-incremental learning using the SplitCIFAR-10 dataset. Key findings reveal that larger models do not guarantee better Continual Learning performance; in fact, they often struggle more in adapting to new tasks, particularly in online settings. These results challenge the notion that larger models inherently mitigate catastrophic forgetting, highlighting the nuanced relationship between model size and Continual Learning efficacy. This study contributes to a deeper understanding of model scalability and its practical implications in Continual Learning scenarios.
Submitted 28 June, 2024;
originally announced July 2024.
-
Perception Stitching: Zero-Shot Perception Encoder Transfer for Visuomotor Robot Policies
Authors:
Pingcheng Jian,
Easop Lee,
Zachary Bell,
Michael M. Zavlanos,
Boyuan Chen
Abstract:
Vision-based imitation learning has shown promising capabilities of endowing robots with various motion skills given visual observation. However, current visuomotor policies fail to adapt to drastic changes in their visual observations. We present Perception Stitching that enables strong zero-shot adaptation to large visual changes by directly stitching novel combinations of visual encoders. Our key idea is to enforce modularity of visual encoders by aligning the latent visual features among different visuomotor policies. Our method disentangles the perceptual knowledge with the downstream motion skills and allows the reuse of the visual encoders by directly stitching them to a policy network trained with partially different visual conditions. We evaluate our method in various simulated and real-world manipulation tasks. While baseline methods failed at all attempts, our method could achieve zero-shot success in real-world visuomotor tasks. Our quantitative and qualitative analysis of the learned features of the policy network provides more insights into the high performance of our proposed method.
Submitted 28 June, 2024;
originally announced June 2024.
-
Latitudinal Asymmetry in the Dayside Atmosphere of WASP-43b
Authors:
Ryan C. Challener,
Zafar Rustamkulov,
Elspeth K. H. Lee,
Nikole Lewis,
David K. Sing,
Stephan M. Birkmann,
Nicolas Crouzet,
Néstor Espinoza,
Elena Manjavacas,
Natalia Oliveros-Gomez,
Jeff A. Valenti,
Jingxuan Yang
Abstract:
We present two-dimensional near-infrared temperature maps of the canonical hot Jupiter WASP-43b using a phase-curve observation with JWST NIRSpec/G395H. From the white-light planetary transit, we improve constraints on the planet's orbital parameters and measure a planet-to-star radius ratio of $0.15883^{+0.00056}_{-0.00053}$. Using the white-light phase curve, we measure a longitude of maximum brightness of $6.9^{+0^\circ.5}_{-0^\circ.5}$ east of the substellar point and a phase-curve offset of $10.0^{+0^\circ.8}_{-0^\circ.8}$. We also find an $\approx4σ$ detection of a latitudinal hotspot offset of $-13.4^{+3^\circ.2}_{-1^\circ.7}$, the first significant detection of a non-equatorial hotspot in an exoplanet atmosphere. We show that this detection is robust to variations within planetary parameter uncertainties, but only if the transit is used to improve constraints, showing the importance of transit observations to eclipse mapping. Maps retrieved from the NRS1 and NRS2 detectors are similar, with hotspot locations consistent between the two detectors at the $1σ$ level. Our JWST data show brighter (hotter) nightsides and a dimmer (colder) dayside at the shorter wavelengths relative to fits to \textit{Spitzer} 3.6 and 4.5 $μ$m phase curves. Through comparison between our phase curves and a set of general circulation models, we find evidence for clouds on the nightside and atmospheric drag or high metallicity reducing the eastward hotspot offset.
Submitted 14 June, 2024;
originally announced June 2024.
-
Projected background and sensitivity of AMoRE-II
Authors:
A. Agrawal,
V. V. Alenkov,
P. Aryal,
J. Beyer,
B. Bhandari,
R. S. Boiko,
K. Boonin,
O. Buzanov,
C. R. Byeon,
N. Chanthima,
M. K. Cheoun,
J. S. Choe,
Seonho Choi,
S. Choudhury,
J. S. Chung,
F. A. Danevich,
M. Djamal,
D. Drung,
C. Enss,
A. Fleischmann,
A. M. Gangapshev,
L. Gastaldo,
Y. M. Gavrilyuk,
A. M. Gezhaev,
O. Gileva
, et al. (81 additional authors not shown)
Abstract:
AMoRE-II aims to search for neutrinoless double beta decay with an array of 423 Li$_2$$^{100}$MoO$_4$ crystals operating in the cryogenic system as the main phase of the Advanced Molybdenum-based Rare process Experiment (AMoRE). AMoRE has been planned to operate in three phases: AMoRE-pilot, AMoRE-I, and AMoRE-II. AMoRE-II is currently being installed at the Yemi Underground Laboratory, located approximately 1000 meters deep in Jeongseon, Korea. The goal of AMoRE-II is to reach up to $T^{0νββ}_{1/2}$ $\sim$ 6 $\times$ 10$^{26}$ years, corresponding to an effective Majorana mass of 15 - 29 meV, covering all the inverted mass hierarchy regions. To achieve this, the background level of the experimental configurations and possible background sources of gamma and beta events should be well understood. We have intensively performed Monte Carlo simulations using the GEANT4 toolkit in all the experimental configurations with potential sources. We report the estimated background level that meets the 10$^{-4}$ counts/(keV$\cdot$kg$\cdot$yr) requirement for AMoRE-II in the region of interest (ROI) and show the projected half-life sensitivity based on the simulation study.
Submitted 13 June, 2024;
originally announced June 2024.
-
Phase-resolving the absorption signatures of water and carbon monoxide in the atmosphere of the ultra-hot Jupiter WASP-121b with GEMINI-S/IGRINS
Authors:
Joost P. Wardenier,
Vivien Parmentier,
Michael R. Line,
Megan Weiner Mansfield,
Xianyu Tan,
Shang-Min Tsai,
Jacob L. Bean,
Jayne L. Birkby,
Matteo Brogi,
Jean-Michel Désert,
Siddharth Gandhi,
Elspeth K. H. Lee,
Colette I. Levens,
Lorenzo Pino,
Peter C. B. Smith
Abstract:
Ultra-hot Jupiters are among the best targets for atmospheric characterization at high spectral resolution. Resolving their transmission spectra as a function of orbital phase offers a unique window into the 3D nature of these objects. In this work, we present three transits of the ultra-hot Jupiter WASP-121b observed with Gemini-S/IGRINS. For the first time, we measure the phase-dependent absorption signals of CO and H$_{\text{2}}$O in the atmosphere of an exoplanet, and we find that they are different. While the blueshift of CO increases during the transit, the absorption lines of H$_{\text{2}}$O become less blueshifted with phase, and even show a redshift in the second half of the transit. These measurements reveal the distinct spatial distributions of both molecules across the atmospheres of ultra-hot Jupiters. Also, we find that the H$_{\text{2}}$O signal is absent in the first quarter of the transit, potentially hinting at cloud formation on the evening terminator of WASP-121b. To further interpret the absorption trails of CO and H$_{\text{2}}$O, as well as the Doppler shifts of Fe previously measured with VLT/ESPRESSO, we compare the data to simulated transits of WASP-121b. To this end, we post-process the outputs of global circulation models with a 3D Monte-Carlo radiative transfer code. Our analysis shows that the atmosphere of WASP-121b is subject to atmospheric drag, as previously suggested by small hotspot offsets inferred from phase-curve observations. Our study highlights the importance of phase-resolved spectroscopy in unravelling the complex atmospheric structure of ultra-hot Jupiters and sets the stage for further investigations into their chemistry and dynamics.
Submitted 19 June, 2024; v1 submitted 13 June, 2024;
originally announced June 2024.
-
ProxyLM: Predicting Language Model Performance on Multilingual Tasks via Proxy Models
Authors:
David Anugraha,
Genta Indra Winata,
Chenyue Li,
Patrick Amadeus Irawan,
En-Shiun Annie Lee
Abstract:
Performance prediction is a method to estimate the performance of Language Models (LMs) on various Natural Language Processing (NLP) tasks, mitigating computational costs associated with model capacity and data for fine-tuning. Our paper introduces ProxyLM, a scalable framework for predicting LM performance using proxy models in multilingual tasks. These proxy models act as surrogates, approximating the performance of the LM of interest. By leveraging proxy models, ProxyLM significantly reduces computational overhead on task evaluations, achieving up to a 37.08x speedup compared to traditional methods, even with our smallest proxy models. Additionally, our methodology showcases adaptability to previously unseen languages in pre-trained LMs, outperforming the state-of-the-art performance by 1.89x as measured by root-mean-square error (RMSE). This framework streamlines model selection, enabling efficient deployment and iterative LM enhancements without extensive computational resources.
Submitted 14 June, 2024; v1 submitted 13 June, 2024;
originally announced June 2024.
-
PretVM: Predictable, Efficient Virtual Machine for Real-Time Concurrency
Authors:
Shaokai Lin,
Erling Jellum,
Mirco Theile,
Tassilo Tanneberger,
Binqi Sun,
Chadlia Jerad,
Ruomu Xu,
Guangyu Feng,
Christian Menard,
Marten Lohstroh,
Jeronimo Castrillon,
Sanjit Seshia,
Edward Lee
Abstract:
This paper introduces the Precision-Timed Virtual Machine (PretVM), an intermediate platform facilitating the execution of quasi-static schedules compiled from a subset of programs written in the Lingua Franca (LF) coordination language. The subset consists of those programs that in principle should have statically verifiable and predictable timing behavior. The PretVM provides a schedule with well-defined worst-case timing bounds. The PretVM provides a clean separation between application logic and coordination logic, yielding more analyzable program executions. Experiments compare the PretVM against the default (more dynamic) LF scheduler and show that it delivers time-accurate deterministic execution.
Submitted 25 June, 2024; v1 submitted 10 June, 2024;
originally announced June 2024.
-
IrokoBench: A New Benchmark for African Languages in the Age of Large Language Models
Authors:
David Ifeoluwa Adelani,
Jessica Ojo,
Israel Abebe Azime,
Jian Yun Zhuang,
Jesujoba O. Alabi,
Xuanli He,
Millicent Ochieng,
Sara Hooker,
Andiswa Bukula,
En-Shiun Annie Lee,
Chiamaka Chukwuneke,
Happy Buzaaba,
Blessing Sibanda,
Godson Kalipe,
Jonathan Mukiibi,
Salomon Kabongo,
Foutse Yuehgoh,
Mmasibidi Setaka,
Lolwethu Ndolela,
Nkiruka Odu,
Rooweither Mabuya,
Shamsuddeen Hassan Muhammad,
Salomey Osei,
Sokhar Samb,
Tadesse Kebede Guge
, et al. (1 additional authors not shown)
Abstract:
Despite the widespread adoption of Large language models (LLMs), their remarkable capabilities remain limited to a few high-resource languages. Additionally, many low-resource languages (e.g. African languages) are often evaluated only on basic text classification tasks due to the lack of appropriate or comprehensive benchmarks outside of high-resource languages. In this paper, we introduce IrokoBench -- a human-translated benchmark dataset for 16 typologically-diverse low-resource African languages covering three tasks: natural language inference~(AfriXNLI), mathematical reasoning~(AfriMGSM), and multi-choice knowledge-based QA~(AfriMMLU). We use IrokoBench to evaluate zero-shot, few-shot, and translate-test settings~(where test sets are translated into English) across 10 open and four proprietary LLMs. Our evaluation reveals a significant performance gap between high-resource languages~(such as English and French) and low-resource African languages. We observe a significant performance gap between open and proprietary models, with the highest performing open model, Aya-101 only at 58\% of the best-performing proprietary model GPT-4o performance. Machine translating the test set to English before evaluation helped to close the gap for larger models that are English-centric, like LLaMa 3 70B. These findings suggest that more efforts are needed to develop and adapt LLMs for African languages.
Submitted 5 June, 2024;
originally announced June 2024.
-
Weak Degeneracy of Planar Graphs
Authors:
Anton Bernshteyn,
Eugene Lee,
Evelyne Smith-Roberge
Abstract:
The weak degeneracy of a graph $G$ is a numerical parameter that was recently introduced by the first two authors with the aim of understanding the power of greedy algorithms for graph coloring. Every $d$-degenerate graph is weakly $d$-degenerate, but the converse is not true in general (for example, all connected $d$-regular graphs except cycles and cliques are weakly $(d-1)$-degenerate). If $G$ is weakly $d$-degenerate, then the list-chromatic number of $G$ is at most $d+1$, and the same upper bound holds for various other parameters such as the DP-chromatic number and the paint number. Here we rectify a mistake in a paper of the first two authors and give a correct proof that planar graphs are weakly $4$-degenerate, strengthening the famous result of Thomassen that planar graphs are $5$-list-colorable.
Submitted 4 June, 2024;
originally announced June 2024.
-
Automatic Channel Pruning for Multi-Head Attention
Authors:
Eunho Lee,
Youngbae Hwang
Abstract:
Despite the strong performance of Transformers, their quadratic computation complexity presents challenges in applying them to vision tasks. Automatic pruning is one of the effective methods for reducing computation complexity without heuristic approaches. However, directly applying it to multi-head attention is not straightforward due to channel misalignment. In this paper, we propose an automatic channel pruning method that takes into account the multi-head attention mechanism. First, we incorporate channel similarity-based weights into the pruning indicator to preserve more informative channels in each head. Then, we adjust the pruning indicator to enforce removal of channels in equal proportions across all heads, preventing the channel misalignment. We also add a reweight module to compensate for information loss resulting from channel removal, and an effective initialization step for the pruning indicator based on the difference of attention between the original structure and each channel. Our proposed method can be applied not only to the original attention but also to linear attention, which is more efficient owing to its linear complexity with respect to the number of tokens. On ImageNet-1K, applying our pruning method to the FLattenTransformer, which includes both attention mechanisms, achieves higher accuracy at several MAC budgets than previous state-of-the-art efficient models and pruning methods. Code will be available soon.
Submitted 31 May, 2024;
originally announced May 2024.
-
First results of AUP Nb3Sn quadrupole horizontal tests
Authors:
M. Baldini,
G. Ambrosio,
G. Apollinari,
J. Blowers,
R. Bossert,
R. Carcagno,
G. Chlachidze,
J. DiMarco,
S. Feher,
S. Krave,
V. Lombardo,
L. Martin,
C. Narug,
T. H. Nicol,
V. Nikolic,
A. Nobrega,
V. Marinozzi,
C. Orozco,
T. Page,
S. Stoynev,
T. Strauss,
M. Turenne,
D. Turrioni,
A. Vouris,
M. Yu
, et al. (26 additional authors not shown)
Abstract:
The Large Hadron Collider will soon undergo an upgrade to increase its luminosity by a factor of ~10 [1]. A crucial part of this upgrade will be replacement of the NbTi focusing magnets with Nb3Sn magnets that achieve a ~50% increase in the field strength. This will be the first ever large-scale implementation of Nb3Sn magnets in a particle accelerator. The High-Luminosity LHC Upgrade, HL-LHC is a CERN project with a world-wide collaboration. It is under construction and utilizes Nb3Sn Magnets (named MQXF) as key ingredients to increase tenfold the integrated luminosity delivered to the CMS and ATLAS experiments in the next decade.
The HL-LHC AUP is the US effort to contribute approximately 50% of the low-beta focusing magnets and crab cavities for the HL-LHC.
This paper will present the program to fabricate the Nb3Sn superconducting magnets. We report the status of the HL-LHC AUP project and present the results from horizontal tests of the first fully assembled cryo-assembly.
Submitted 28 May, 2024;
originally announced May 2024.
-
Automorphisms and deformations of regular semisimple Hessenberg varieties
Authors:
Patrick Brosnan,
Laura Escobar,
Jaehyun Hong,
Donggun Lee,
Eunjeong Lee,
Anton Mellit,
Eric Sommers
Abstract:
We show that regular semisimple Hessenberg varieties can have moduli. To be precise, suppose $X$ is a regular semisimple Hessenberg variety of codimension 1 in the flag variety $G/B$, where $G$ is a simple algebraic group of rank $r$ over $\mathbb{C}$ and $B$ is a Borel subgroup. We show that the space $\mathrm{H}^1(X,TX)$ of first order deformations of $X$ has dimension $r-1$ except in type $A_2$. (In type $A_2$, the Hessenberg varieties in question are all isomorphic to the permutohedral toric surface, and $\mathrm{dim}\mathrm{H}^1(X,TX)=0$.) Moreover, we show that the Kodaira-Spencer map $\mathfrak{g}\to\mathrm{H}^1(X,TX)$ is onto, that the connected component of the automorphism group of $X$ is the maximal torus of $G$, and that $\mathrm{H}^i(X,TX)=0$ for $i\geq2$. Along the way, we prove several theorems of independent interest about the cohomology of homogeneous vector bundles on $G/B$.
In type $A$, we can give an even more precise statement determining when two codimension $1$ regular semisimple Hessenberg varieties in $G/B$ are isomorphic. We also compute the automorphism groups explicitly in type $A_{n-1}$ in the terms of stabilizer subgroups of the action of the symmetric group $S_n$ on the moduli space $M_{0,n+1}$ of smooth genus $0$ curves with $n+1$ marked points. Using this, we describe the moduli stack of the regular semisimple Hessenberg varieties $X$ explicitly as a quotient stack of $M_{0,n+1}$.
We prove several analogous results for Hessenberg varieties in generalized flag varieties $G/P$, where $P$ is a parabolic subgroup of $G$. In type $A$, these results are used in the proofs of the results for $G/B$, but they are also independently interesting because the associated moduli stacks are related directly to the action of $S_n$ on $M_{0,n}$.
Submitted 18 June, 2024; v1 submitted 28 May, 2024;
originally announced May 2024.
-
Learning to Detour: Shortcut Mitigating Augmentation for Weakly Supervised Semantic Segmentation
Authors:
JuneHyoung Kwon,
Eunju Lee,
Yunsung Cho,
YoungBin Kim
Abstract:
Weakly supervised semantic segmentation (WSSS) employing weak forms of labels has been actively studied to alleviate the annotation cost of acquiring pixel-level labels. However, classifiers trained on biased datasets tend to exploit shortcut features and make predictions based on spurious correlations between certain backgrounds and objects, leading to poor generalization performance. In this paper, we propose shortcut mitigating augmentation (SMA) for WSSS, which generates synthetic representations of object-background combinations not seen in the training data to reduce the use of shortcut features. Our approach disentangles the object-relevant and background features. We then shuffle and combine the disentangled representations to create synthetic features of diverse object-background combinations. An SMA-trained classifier depends less on contexts and focuses more on the target object when making predictions. In addition, we analyzed the behavior of the classifier on shortcut usage after applying our augmentation using an attribution method-based metric. The proposed method achieved improved semantic segmentation performance on the PASCAL VOC 2012 and MS COCO 2014 datasets.
Submitted 28 May, 2024;
originally announced May 2024.
-
VADER: Visual Affordance Detection and Error Recovery for Multi Robot Human Collaboration
Authors:
Michael Ahn,
Montserrat Gonzalez Arenas,
Matthew Bennice,
Noah Brown,
Christine Chan,
Byron David,
Anthony Francis,
Gavin Gonzalez,
Rainer Hessmer,
Tomas Jackson,
Nikhil J Joshi,
Daniel Lam,
Tsang-Wei Edward Lee,
Alex Luong,
Sharath Maddineni,
Harsh Patel,
Jodilyn Peralta,
Jornell Quiambao,
Diego Reyes,
Rosario M Jauregui Ruano,
Dorsa Sadigh,
Pannag Sanketi,
Leila Takayama,
Pavel Vodenski,
Fei Xia
Abstract:
Robots today can exploit the rich world knowledge of large language models to chain simple behavioral skills into long-horizon tasks. However, robots often get interrupted during long-horizon tasks due to primitive skill failures and dynamic environments. We propose VADER, a plan, execute, detect framework with seeking help as a new skill that enables robots to recover and complete long-horizon tasks with the help of humans or other robots. VADER leverages visual question answering (VQA) modules to detect visual affordances and recognize execution errors. It then generates prompts for a language model planner (LMP) which decides when to seek help from another robot or human to recover from errors in long-horizon task execution. We show the effectiveness of VADER with two long-horizon robotic tasks. Our pilot study showed that VADER is capable of performing complex long-horizon tasks by asking for help from another robot to clear a table. Our user study showed that VADER is capable of performing complex long-horizon tasks by asking for help from a human to clear a path. We gathered feedback from people (N=19) about the performance of VADER vs. a robot that did not ask for help. https://google-vader.github.io/
Submitted 30 May, 2024; v1 submitted 24 May, 2024;
originally announced May 2024.
-
Strongly-Consistent Distributed Discrete-event Systems
Authors:
Peter Donovan,
Erling Jellum,
Byeonggil Jun,
Hokeun Kim,
Edward A. Lee,
Shaokai Lin,
Marten Lohstroh,
Anirudh Rengarajan
Abstract:
Discrete-event (DE) systems are concurrent programs where components communicate via tagged events, where tags are drawn from a totally ordered set. Reactors are an emerging model of computation based on DE and realized in the open-source coordination language Lingua Franca. Distributed DE (DDE) systems are DE systems where the components (reactors) communicate over networks. The prior art has required that for DDE systems with cycles, each cycle must contain at least one logical delay, where the tag of events is incremented. Such delays, however, are not required by the elegant fixed-point semantics of DE. The only requirement is that the program be constructive, meaning it is free of causality cycles. This paper gives a way to coordinate the execution of DDE systems that can execute any constructive program, even one with zero-delay cycles. It provides a formal model that exposes exactly the information that must be shared across networks for such execution to be possible. Furthermore, it describes a concrete implementation that is an extension of the coordination mechanisms in Lingua Franca.
Submitted 20 May, 2024;
originally announced May 2024.
-
A Reproducibility Study on Quantifying Language Similarity: The Impact of Missing Values in the URIEL Knowledge Base
Authors:
Hasti Toossi,
Guo Qing Huai,
Jinyu Liu,
Eric Khiu,
A. Seza Doğruöz,
En-Shiun Annie Lee
Abstract:
In the pursuit of supporting more languages around the world, tools that characterize properties of languages play a key role in expanding the existing multilingual NLP research. In this study, we focus on a widely used typological knowledge base, URIEL, which aggregates linguistic information into numeric vectors. Specifically, we delve into the soundness and reproducibility of the approach taken by URIEL in quantifying language similarity. Our analysis reveals URIEL's ambiguity in calculating language distances and in handling missing values. Moreover, we find that URIEL does not provide any information about typological features for 31\% of the languages it represents, undermining the reliability of the database, particularly on low-resource languages. Our literature review suggests URIEL and lang2vec are used in papers on diverse NLP tasks, which motivates us to rigorously verify the database as the effectiveness of these works depends on the reliability of the information the tool provides.
Submitted 17 May, 2024;
originally announced May 2024.
-
A warm Neptune's methane reveals core mass and vigorous atmospheric mixing
Authors:
David K. Sing,
Zafar Rustamkulov,
Daniel P. Thorngren,
Joanna K. Barstow,
Pascal Tremblin,
Catarina Alves de Oliveira,
Tracy L. Beck,
Stephan M. Birkmann,
Ryan C. Challener,
Nicolas Crouzet,
Néstor Espinoza,
Pierre Ferruit,
Giovanna Giardino,
Amélie Gressier,
Elspeth K. H. Lee,
Nikole K. Lewis,
Roberto Maiolino,
Elena Manjavacas,
Bernard J. Rauscher,
Marco Sirianni,
Jeff A. Valenti
Abstract:
Observations of transiting gas giant exoplanets have revealed a pervasive depletion of methane, which has only recently been identified atmospherically. The depletion is thought to be maintained by disequilibrium processes such as photochemistry or mixing from a hotter interior. However, the interiors are largely unconstrained along with the vertical mixing strength and only upper limits on the CH$_4$ depletion have been available. The warm Neptune WASP-107 b stands out among exoplanets with an unusually low density, reported low core mass, and temperatures amenable to CH$_4$ though previous observations have yet to find the molecule. Here we present a JWST NIRSpec transmission spectrum of WASP-107 b which shows features from both SO$_2$ and CH$_4$ along with H$_2$O, CO$_2$, and CO. We detect methane with 4.2$σ$ significance at an abundance of 1.0$\pm$0.5 ppm, which is depleted by 3 orders of magnitude relative to equilibrium expectations. Our results are highly constraining for the atmosphere and interior, which indicate the envelope has a super-solar metallicity of 43$\pm$8$\times$ solar, a hot interior with an intrinsic temperature of T$_{\rm int}$=460$\pm$40 K, and vigorous vertical mixing which depletes CH$_4$ with a diffusion coefficient of K$_{zz}$ = 10$^{11.6\pm0.1}$ cm$^2$/s. Photochemistry has a negligible effect on the CH$_4$ abundance, but is needed to account for the SO$_2$. We infer a core mass of 11.5$_{-3.6}^{+3.0}$ M$_{\oplus}$, which is much higher than previous upper limits, releasing a tension with core-accretion models.
Submitted 17 May, 2024;
originally announced May 2024.
-
Giant Outer Transiting Exoplanet Mass (GOT `EM) Survey. V. Two Giant Planets in Kepler-511 but Only One Ran Away
Authors:
Yayaati Chachan,
Paul A. Dalba,
Daniel P. Thorngren,
Stephen R. Kane,
Eve J. Lee,
Edward W. Schwieterman,
Howard Isaacson,
Andrew W. Howard,
Matthew J. Payne
Abstract:
Systems hosting multiple giant planets are important laboratories for understanding planetary formation and migration processes. We present a nearly decade-long Doppler spectroscopy campaign from the Keck-I telescope to characterize the two transiting giant planets orbiting Kepler-511 on orbits of 27 days and 297 days. The radial velocity measurements yield precise masses for both planets, which we use to infer their bulk heavy element content. Both planets contain approximately 30 Earth masses of heavy elements, but their bulk metallicities (i.e., the ratio between metal mass and total mass) are drastically different ($0.86 \pm 0.04$ and $0.22 \pm 0.04$ respectively). Envelope mass loss cannot account for this difference due to the relatively large orbital distance and mass of the inner planet. We conclude that the outer planet underwent runaway gas accretion while the inner planet did not. This bifurcation in accretion histories is likely a result of the accretion of gas with very different metallicities by the two planets or the late formation of the inner planet from a merger of sub-Neptunes. Kepler-511 uniquely demonstrates how giant planet formation can produce strikingly different outcomes even for planets in the same system.
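As a rough worked illustration of the definition quoted above (bulk metallicity $Z = M_Z / M_{\rm tot}$), and assuming the abstract's rounded heavy-element mass of $M_Z \approx 30\,M_\oplus$ for both planets, the implied total masses, in the order the metallicities are listed, would be:

```latex
M_{\rm tot} = \frac{M_Z}{Z}: \qquad
\frac{30\,M_\oplus}{0.86} \approx 35\,M_\oplus, \qquad
\frac{30\,M_\oplus}{0.22} \approx 136\,M_\oplus .
```

These are back-of-the-envelope values that only show how similar metal content can coexist with very different total masses; they are not the measured masses reported in the paper.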
Submitted 17 May, 2024;
originally announced May 2024.
-
Quantum-Inspired Genetic Algorithm for Designing Planar Multilayer Photonic Structure
Authors:
Zhihao Xu,
Wenjie Shang,
Seongmin Kim,
Alexandria Bobbitt,
Eungkyu Lee,
Tengfei Luo
Abstract:
Quantum algorithms are emerging tools in the design of functional materials due to their powerful solution space search capability. How to balance the high price of quantum computing resources and the growing computing needs has become an urgent problem to be solved. We propose a novel optimization strategy based on an active learning scheme that combines the improved Quantum Genetic Algorithm (QGA) with machine learning surrogate model regression. Using Random Forests as the surrogate model circumvents the time-consuming physical modeling or experiments, thereby improving the optimization efficiency. QGA, a genetic algorithm embedded with quantum mechanics, combines the advantages of quantum computing and genetic algorithms, enabling faster and more robust convergence to the optimum. Using the design of planar multilayer photonic structures for transparent radiative cooling as a testbed, we show superiority of our algorithm over the classical genetic algorithm (CGA). Additionally, we show the precision advantage of the RF model as a flexible surrogate model, which relaxes the constraints on the type of surrogate model that can be used in other quantum computing optimization algorithms (e.g., quantum annealing needs Ising model as a surrogate).
Submitted 7 May, 2024;
originally announced May 2024.
-
HLSTransform: Energy-Efficient Llama 2 Inference on FPGAs Via High Level Synthesis
Authors:
Andy He,
Darren Key,
Mason Bulling,
Andrew Chang,
Skyler Shapiro,
Everett Lee
Abstract:
Graphics Processing Units (GPUs) have become the leading hardware accelerator for deep learning applications and are used widely in training and inference of transformers; transformers have achieved state-of-the-art performance in many areas of machine learning and are especially used in most modern Large Language Models (LLMs). However, GPUs require large amounts of energy, which poses environmental concerns, demands high operational costs, and causes GPUs to be unsuitable for edge computing. We develop an accelerator for transformers, namely, Llama 2, an open-source state-of-the-art LLM, using high level synthesis (HLS) on Field Programmable Gate Arrays (FPGAs). HLS allows us to rapidly prototype FPGA designs without writing code at the register-transfer level (RTL). We name our method HLSTransform, and the FPGA designs we synthesize with HLS achieve up to a 12.75x reduction and 8.25x reduction in energy used per token on the Xilinx Virtex UltraScale+ VU9P FPGA compared to an Intel Xeon Broadwell E5-2686 v4 CPU and NVIDIA RTX 3090 GPU respectively, while increasing inference speeds by up to 2.46x compared to CPU and maintaining 0.53x the speed of an RTX 3090 GPU despite the GPU's 4 times higher base clock rate. With the lack of existing open-source FPGA accelerators for transformers, we open-source our code and document our steps for synthesis. We hope this work will serve as a step in democratizing the use of FPGAs in transformer inference and inspire research into energy-efficient inference methods as a whole. The code can be found on https://github.com/HLSTransform/submission.
Submitted 29 April, 2024;
originally announced May 2024.
-
Understanding the Cluster LP for Correlation Clustering
Authors:
Nairen Cao,
Vincent Cohen-Addad,
Euiwoong Lee,
Shi Li,
Alantha Newman,
Lukas Vogl
Abstract:
In the classic Correlation Clustering problem introduced by Bansal, Blum, and Chawla~(FOCS 2002), the input is a complete graph where edges are labeled either $+$ or $-$, and the goal is to find a partition of the vertices that minimizes the sum of the +edges across parts plus the sum of the -edges within parts. In recent years, Chawla, Makarychev, Schramm and Yaroslavtsev~(STOC 2015) gave a 2.06-approximation by providing a near-optimal rounding of the standard LP, and Cohen-Addad, Lee, Li, and Newman~(FOCS 2022, 2023) finally bypassed the integrality gap of 2 for this LP giving a $1.73$-approximation for the problem.
In order to create a simple and unified framework for Correlation Clustering similar to those for {\em typical} approximate optimization tasks, we propose the {\em cluster LP} as a strong linear program that might tightly capture the approximability of Correlation Clustering. It unifies all the previous relaxations for the problem.
We demonstrate the power of the cluster LP by presenting a simple rounding algorithm, and providing two analyses, one analytically proving a 1.49-approximation and the other solving a factor-revealing SDP to show a 1.437-approximation. Both proofs introduce principled methods by which to analyze the performance of the algorithm, resulting in a significantly improved approximation guarantee.
Finally, we prove an integrality gap of $4/3$ for the cluster LP, showing our 1.437-upper bound cannot be drastically improved. Our gap instance directly inspires an improved NP-hardness of approximation with a ratio $24/23 \approx 1.042$; no explicit hardness ratio was known before.
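As a minimal sketch of the objective described in this abstract (not the authors' cluster LP or rounding algorithm), the cost of a given partition can be computed by counting + edges that cross parts and − edges that fall within a part. The function name `cc_cost`, the unit edge weights, and the toy graph are assumptions made for illustration.

```python
from itertools import combinations


def cc_cost(n, plus_edges, cluster_of):
    """Count disagreements of a partition on a complete signed graph.

    n          -- number of vertices, labeled 0..n-1
    plus_edges -- set of frozenset({u, v}) pairs labeled '+';
                  every other pair is treated as '-'
    cluster_of -- list mapping each vertex to a cluster id
    """
    cost = 0
    for u, v in combinations(range(n), 2):
        is_plus = frozenset((u, v)) in plus_edges
        same_cluster = cluster_of[u] == cluster_of[v]
        if is_plus and not same_cluster:
            cost += 1  # '+' edge cut by the partition
        elif not is_plus and same_cluster:
            cost += 1  # '-' edge kept inside a part
    return cost


# Toy instance (hypothetical): a '+' triangle on {0, 1, 2} plus a '+' edge 2-3.
plus = {frozenset(p) for p in [(0, 1), (1, 2), (0, 2), (2, 3)]}
cost_a = cc_cost(4, plus, [0, 0, 0, 1])  # triangle together, vertex 3 alone
cost_b = cc_cost(4, plus, [0, 0, 0, 0])  # everything in one cluster
```

On this toy graph the first partition pays only for the cut 2-3 edge, while the single-cluster partition pays for the two − edges incident to vertex 3.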
Submitted 26 April, 2024;
originally announced April 2024.
-
Toward Robust LiDAR based 3D Object Detection via Density-Aware Adaptive Thresholding
Authors:
Eunho Lee,
Minwoo Jung,
Ayoung Kim
Abstract:
Robust 3D object detection is a core challenge for autonomous mobile systems in field robotics. To tackle this issue, many researchers have demonstrated improvements in 3D object detection performance in datasets. However, real-world urban scenarios with unstructured and dynamic situations can still lead to numerous false positives, posing a challenge for robust 3D object detection models. This paper presents a post-processing algorithm that dynamically adjusts object detection thresholds based on the distance from the ego-vehicle. 3D object detection models usually perform well in detecting nearby objects but may exhibit suboptimal performance for distant ones. While conventional perception algorithms typically employ a single threshold in post-processing, the proposed algorithm addresses this issue by employing adaptive thresholds based on the distance from the ego-vehicle, minimizing false negatives and reducing false positives in urban scenarios. The results show performance enhancements in 3D object detection models across a range of scenarios, not only in dynamic urban road conditions but also in scenarios involving adverse weather conditions.
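The idea of a distance-dependent detection threshold can be sketched as follows. This is a hedged illustration only: the linear schedule, the parameter values (`near_thr`, `far_thr`, `max_range_m`), and the detection format are assumptions, not the algorithm or settings from the paper.

```python
import math


def adaptive_threshold(distance_m, near_thr=0.5, far_thr=0.3, max_range_m=80.0):
    """Linearly interpolate a score threshold between near and far values.

    The schedule and its defaults are hypothetical placeholders.
    """
    frac = min(max(distance_m / max_range_m, 0.0), 1.0)
    return near_thr + frac * (far_thr - near_thr)


def filter_detections(detections):
    """Keep detections whose score meets the threshold at their range.

    detections -- list of dicts with 'x', 'y' (metres, ego frame) and 'score'
    """
    kept = []
    for det in detections:
        dist = math.hypot(det["x"], det["y"])  # planar distance from ego-vehicle
        if det["score"] >= adaptive_threshold(dist):
            kept.append(det)
    return kept


dets = [
    {"x": 5.0, "y": 0.0, "score": 0.45},   # nearby, below the stricter near threshold
    {"x": 60.0, "y": 0.0, "score": 0.40},  # distant, above the relaxed far threshold
]
kept = filter_detections(dets)
```

A single fixed threshold of 0.5 would discard both boxes here; the distance-dependent schedule keeps the distant one while still rejecting the weak nearby detection.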
Submitted 21 April, 2024;
originally announced April 2024.
-
Taxonomy and Analysis of Sensitive User Queries in Generative AI Search
Authors:
Hwiyeol Jo,
Taiwoo Park,
Nayoung Choi,
Changbong Kim,
Ohjoon Kwon,
Donghyeon Jeon,
Hyunwoo Lee,
Eui-Hyeon Lee,
Kyoungho Shin,
Sun Suk Lim,
Kyungmi Kim,
Jihye Lee,
Sun Kim
Abstract:
Although there has been a growing interest among industries to integrate generative LLMs into their services, limited experience and scarcity of resources act as a barrier in launching and servicing large-scale LLM-based conversational services. In this paper, we share our experiences in developing and operating generative AI models within a national-scale search engine, with a specific focus on the sensitiveness of user queries. We propose a taxonomy for sensitive search queries, outline our approaches, and present a comprehensive analysis report on sensitive queries from actual users.
Submitted 5 April, 2024;
originally announced April 2024.
-
Figuring Out Gas & Galaxies In Enzo (FOGGIE) VIII: Complex and Stochastic Metallicity Gradients at z > 2
Authors:
Ayan Acharyya,
Molly S. Peeples,
Jason Tumlinson,
Brian W. O'Shea,
Cassandra Lochhaas,
Anna C. Wright,
Raymond C. Simons,
Ramona Augustin,
Britton D. Smith,
Eugene Hyeonmin Lee
Abstract:
Gas-phase metallicity gradients are a crucial element in understanding the chemical evolution of galaxies. We use the FOGGIE simulations to study the metallicity gradients ($\nabla Z$) of six Milky Way-like galaxies throughout their evolution. FOGGIE galaxies generally exhibit steep negative gradients for most of their history, with only a few short-lived instances reaching positive slopes that appear to arise mainly from interactions with other galaxies. FOGGIE concurs with other simulation results but disagrees with the robust observational finding that flat and positive gradients are common at $z>1$. By tracking the metallicity gradient at a rapid cadence of simulation outputs ($\sim 5$--10 Myr), we find that theoretical gradients are highly stochastic: the FOGGIE galaxies spend $\sim 30-50$\% of their time far away from a smoothed trajectory inferred from analytic models or other, less high-cadence simulations. This rapid variation makes instantaneous gradients from observations more difficult to interpret in terms of physical processes. Because of these geometric and stochastic complications, we explore non-parametric methods of quantifying the evolving metallicity distribution at $z > 1$. We investigate how efficiently non-parametric measures of the 2-D metallicity distribution respond to metal production and mixing. Our results suggest that new methods of quantifying and interpreting gas-phase metallicity will be needed to relate trends in upcoming high-$z$ {\it JWST} observations with the underlying physics of gas accretion, expulsion, and recycling in early galaxies.
Submitted 9 April, 2024;
originally announced April 2024.
-
Unlocking Parameter-Efficient Fine-Tuning for Low-Resource Language Translation
Authors:
Tong Su,
Xin Peng,
Sarubi Thillainathan,
David Guzmán,
Surangika Ranathunga,
En-Shiun Annie Lee
Abstract:
Parameter-efficient fine-tuning (PEFT) methods are increasingly vital in adapting large-scale pre-trained language models for diverse tasks, offering a balance between adaptability and computational efficiency. They are important in Low-Resource Language (LRL) Neural Machine Translation (NMT) to enhance translation accuracy with minimal resources. However, their practical effectiveness varies significantly across different languages. We conducted comprehensive empirical experiments with varying LRL domains and sizes to evaluate the performance of 8 PEFT methods with a total of 15 architectures using the SacreBLEU score. We showed that 6 PEFT architectures outperform the baseline for both in-domain and out-domain tests and the Houlsby+Inversion adapter has the best performance overall, proving the effectiveness of PEFT methods.
Submitted 5 April, 2024;
originally announced April 2024.
-
Fakes of Varying Shades: How Warning Affects Human Perception and Engagement Regarding LLM Hallucinations
Authors:
Mahjabin Nahar,
Haeseung Seo,
Eun-Ju Lee,
Dongwon Lee
Abstract:
The widespread adoption and transformative effects of large language models (LLMs) have sparked concerns regarding their capacity to produce inaccurate and fictitious content, referred to as `hallucinations'. Given the potential risks associated with hallucinations, humans should be able to identify them. This research aims to understand the human perception of LLM hallucinations by systematically varying the degree of hallucination (genuine, minor hallucination, major hallucination) and examining its interaction with warning (i.e., a warning of potential inaccuracies: absent vs. present). Participants (N=419) from Prolific rated the perceived accuracy and engaged with content (e.g., like, dislike, share) in a Q/A format. Results indicate that humans rank content as truthful in the order genuine > minor hallucination > major hallucination and user engagement behaviors mirror this pattern. More importantly, we observed that warning improves hallucination detection without significantly affecting the perceived truthfulness of genuine content. We conclude by offering insights for future tools to aid human detection of hallucinations.
Submitted 4 April, 2024;
originally announced April 2024.
-
Towards Pareto Optimal Throughput in Small Language Model Serving
Authors:
Pol G. Recasens,
Yue Zhu,
Chen Wang,
Eun Kyung Lee,
Olivier Tardieu,
Alaa Youssef,
Jordi Torres,
Josep Ll. Berral
Abstract:
Large language models (LLMs) have revolutionized the state-of-the-art of many different natural language processing tasks. Although serving LLMs is computationally and memory demanding, the rise of Small Language Models (SLMs) offers new opportunities for resource-constrained users, who now are able to serve small models with cutting-edge performance. In this paper, we present a set of experiments designed to benchmark SLM inference at performance and energy levels. Our analysis provides a new perspective in serving, highlighting that the small memory footprint of SLMs allows for reaching the Pareto-optimal throughput within the resource capacity of a single accelerator. In this regard, we present an initial set of findings demonstrating how model replication can effectively improve resource utilization for serving SLMs.
Submitted 4 April, 2024;
originally announced April 2024.
-
HyperCLOVA X Technical Report
Authors:
Kang Min Yoo,
Jaegeun Han,
Sookyo In,
Heewon Jeon,
Jisu Jeong,
Jaewook Kang,
Hyunwook Kim,
Kyung-Min Kim,
Munhyong Kim,
Sungju Kim,
Donghyun Kwak,
Hanock Kwak,
Se Jung Kwon,
Bado Lee,
Dongsoo Lee,
Gichang Lee,
Jooho Lee,
Baeseong Park,
Seongjin Shin,
Joonsang Yu,
Seolki Baek,
Sumin Byeon,
Eungsup Cho,
Dooseok Choe,
Jeesung Han
, et al. (371 additional authors not shown)
Abstract:
We introduce HyperCLOVA X, a family of large language models (LLMs) tailored to the Korean language and culture, along with competitive capabilities in English, math, and coding. HyperCLOVA X was trained on a balanced mix of Korean, English, and code data, followed by instruction-tuning with high-quality human-annotated datasets while abiding by strict safety guidelines reflecting our commitment to responsible AI. The model is evaluated across various benchmarks, including comprehensive reasoning, knowledge, commonsense, factuality, coding, math, chatting, instruction-following, and harmlessness, in both Korean and English. HyperCLOVA X exhibits strong reasoning capabilities in Korean backed by a deep understanding of the language and cultural nuances. Further analysis of the inherent bilingual nature and its extension to multilingualism highlights the model's cross-lingual proficiency and strong generalization ability to untargeted languages, including machine translation between several language pairs and cross-lingual inference tasks. We believe that HyperCLOVA X can provide helpful guidance for regions or countries in developing their sovereign LLMs.
Submitted 13 April, 2024; v1 submitted 2 April, 2024;
originally announced April 2024.
-
Cosmological inference from combining Planck and ACT cluster counts
Authors:
Eunseong Lee,
Richard Battye,
Boris Bolliet
Abstract:
We have adapted the Planck cluster likelihood in such a way that it can be applied to the sample of clusters detected by the Atacama Cosmology Telescope (ACT). Applying it to the 2016 sample from Planck and the 2018 sample from ACT we find, by fixing the cosmology using CMB observations and the cluster model adopted by Planck, that the mass bias required by the two are $1-b_{\rm Planck}=0.61\pm 0.03$ and $1-b_{\rm ACT}=0.75\pm 0.06$. These are broadly in agreement but hint that the model could be adapted to reach a better agreement. By normalizing the cluster model using weak lensing observations, we find evidence for either evolution in the cluster model, quantified by the cluster modeling parameter describing redshift dependence $β=0.86 \pm 0.07$ using an updated CCCP-based normalization, or evolution in the cosmological model quantified by the dark energy equation of state parameter $w=-0.82 \pm 0.07$.
Submitted 28 March, 2024;
originally announced March 2024.
-
Surface-based parcellation and vertex-wise analysis of ultra high-resolution ex vivo 7 tesla MRI in Alzheimer's disease and related dementias
Authors:
Pulkit Khandelwal,
Michael Tran Duong,
Lisa Levorse,
Constanza Fuentes,
Amanda Denning,
Winifred Trotman,
Ranjit Ittyerah,
Alejandra Bahena,
Theresa Schuck,
Marianna Gabrielyan,
Karthik Prabhakaran,
Daniel Ohm,
Gabor Mizsei,
John Robinson,
Monica Munoz,
John Detre,
Edward Lee,
David Irwin,
Corey McMillan,
M. Dylan Tisdall,
Sandhitsu Das,
David Wolk,
Paul A. Yushkevich
Abstract:
Magnetic resonance imaging (MRI) is the standard modality to understand human brain structure and function in vivo (antemortem). Decades of research in human neuroimaging have led to the widespread development of methods and tools to provide automated volume-based segmentations and surface-based parcellations which help localize brain functions to specialized anatomical regions. Recently ex vivo (postmortem) imaging of the brain has opened up avenues to study brain structure at sub-millimeter ultra high-resolution revealing details not possible to observe with in vivo MRI. Unfortunately, there has been limited methodological development in ex vivo MRI primarily due to lack of datasets and limited centers with such imaging resources. Therefore, in this work, we present a one-of-its-kind dataset of 82 ex vivo T2w whole brain hemispheres MRI at 0.3 mm isotropic resolution spanning Alzheimer's disease and related dementias. We adapted and developed a fast and easy-to-use automated surface-based pipeline to parcellate, for the first time, ultra high-resolution ex vivo brain tissue at the native subject space resolution using the Desikan-Killiany-Tourville (DKT) brain atlas. This allows us to perform vertex-wise analysis in the template space and thereby link morphometry measures with pathology measurements derived from histology. We will open-source our dataset, Docker container, and Jupyter notebooks as a ready-to-use, out-of-the-box set of tools and command line options to advance ex vivo MRI clinical brain imaging research on the project webpage.
Submitted 2 July, 2024; v1 submitted 28 March, 2024;
originally announced March 2024.
-
The SZ effect with anisotropic distributions and high energy electrons
Authors:
Elizabeth Lee,
Jens Chluba
Abstract:
Future observations of the Sunyaev-Zeldovich (SZ) effect promise ever improving measurements in terms of both sensitivity and angular resolution. As such, it is increasingly relevant to model `higher-order' contributions to the SZ effect. This work examines the effects of high-energy non-thermal electron distributions and those of anisotropic electron and photon distributions on the SZ signals. Analytic forms of the anisotropic scattering kernels for photons and electrons have been derived and investigated. We present a method for determining the anisotropic contributions through a spherical harmonic decomposition to arbitrary angular multipoles, and discuss the behaviour of these scattering kernels. We then carry out an exploration of various simplistic models of high energy non-thermal electron distributions, and examine their anisotropic behaviour. The kinematic SZ in the relativistic regime is studied using the kernel formulation, allowing us to clarify the role of kinematic corrections to the scattering optical depth. We finally present a release of an updated and refined version of SZpack including a new integrated Python interface and new modules for the calculation of various SZ signals, including those described in this paper.
Submitted 27 March, 2024;
originally announced March 2024.
-
Enhancing Taiwanese Hokkien Dual Translation by Exploring and Standardizing of Four Writing Systems
Authors:
Bo-Han Lu,
Yi-Hsuan Lin,
En-Shiun Annie Lee,
Richard Tzong-Han Tsai
Abstract:
Machine translation focuses mainly on high-resource languages (HRLs), while low-resource languages (LRLs) like Taiwanese Hokkien are relatively under-explored. The study aims to address this gap by developing a dual translation model between Taiwanese Hokkien and both Traditional Mandarin Chinese and English. We employ a pre-trained LLaMA 2-7B model specialized in Traditional Mandarin Chinese to leverage the orthographic similarities between Taiwanese Hokkien Han and Traditional Mandarin Chinese. Our comprehensive experiments involve translation tasks across various writing systems of Taiwanese Hokkien as well as between Taiwanese Hokkien and other HRLs. We find that the use of a limited monolingual corpus still further improves the model's Taiwanese Hokkien capabilities. We then utilize our translation model to standardize all Taiwanese Hokkien writing systems into Hokkien Han, resulting in further performance improvements. Additionally, we introduce an evaluation method incorporating back-translation and GPT-4 to ensure reliable translation quality assessment even for LRLs. The study contributes to narrowing the resource gap for Taiwanese Hokkien and empirically investigates the advantages and limitations of pre-training and fine-tuning based on LLaMA 2.
Submitted 14 May, 2024; v1 submitted 18 March, 2024;
originally announced March 2024.
-
The error in the prime number theorem in short intervals
Authors:
Ethan Simpson Lee
Abstract:
Using a new smoothing argument, we establish explicit versions of the prime number theorem in short intervals and consider the limiting behaviour of this result.
Submitted 9 April, 2024; v1 submitted 18 March, 2024;
originally announced March 2024.
-
Zooming by in the CARPoolGP lane: new CAMELS-TNG simulations of zoomed-in massive halos
Authors:
Max E. Lee,
Shy Genel,
Benjamin D. Wandelt,
Benjamin Zhang,
Ana Maria Delgado,
Shivam Pandey,
Erwin T. Lau,
Christopher Carr,
Harrison Cook,
Daisuke Nagai,
Daniel Angles-Alcazar,
Francisco Villaescusa-Navarro,
Greg L. Bryan
Abstract:
Galaxy formation models within cosmological hydrodynamical simulations contain numerous parameters with non-trivial influences over the resulting properties of simulated cosmic structures and galaxy populations. It is computationally challenging to sample these high dimensional parameter spaces with simulations, particularly for halos in the high-mass end of the mass function. In this work, we develop a novel sampling and reduced variance regression method, CARPoolGP, which leverages built-in correlations between samples in different locations of high dimensional parameter spaces to provide an efficient way to explore parameter space and generate low variance emulations of summary statistics. We use this method to extend the Cosmology and Astrophysics with MachinE Learning Simulations (CAMELS) to include a set of 768 zoom-in simulations of halos in the mass range of $10^{13} - 10^{14.5} M_\odot\,h^{-1}$ that span a 28-dimensional parameter space in the IllustrisTNG model. With these simulations and the CARPoolGP emulation method, we explore parameter trends in the Compton $Y-M$, black hole mass-halo mass, and metallicity-mass relations, as well as thermodynamic profiles and quenched fractions of satellite galaxies. We use these emulations to provide a physical picture of the complex interplay between supernova and active galactic nuclei feedback. We then use emulations of the $Y-M$ relation of massive halos to perform Fisher forecasts on astrophysical parameters for future Sunyaev-Zeldovich observations and find a significant improvement in forecasted constraints. We publicly release both the simulation suite and CARPoolGP software package.
Submitted 15 March, 2024;
originally announced March 2024.
-
Approximating Small Sparse Cuts
Authors:
Aditya Anand,
Euiwoong Lee,
Jason Li,
Thatchaphol Saranurak
Abstract:
We study polynomial-time approximation algorithms for (edge/vertex) Sparsest Cut and Small Set Expansion in terms of $k$, the number of edges or vertices cut in the optimal solution. Our main results are $\mathcal{O}(\text{polylog}\, k)$-approximation algorithms for various versions in this setting.
Our techniques involve an extension of the notion of sample sets (Feige and Mahdian STOC'06), originally developed for small balanced cuts, to sparse cuts in general. We then show how to combine this notion of sample sets with two algorithms, one based on an existing framework of LP rounding and another new algorithm based on the cut-matching game, to get such approximation algorithms. Our cut-matching game algorithm can be viewed as a local version of the cut-matching game by Khandekar, Khot, Orecchia and Vishnoi and certifies an expansion of every vertex set of size $s$ in $\mathcal{O}(\log s)$ rounds. These techniques may be of independent interest.
As corollaries of our results, we also obtain an $\mathcal{O}(\log opt)$-approximation for min-max graph partitioning, where $opt$ is the min-max value of the optimal cut, and improve the bound on the size of multicut mimicking networks computable in polynomial time.
Submitted 13 March, 2024;
originally announced March 2024.
-
Friends not Foes: Strong Correlation between Inner Super-Earths and Outer Gas Giants
Authors:
Marta L. Bryan,
Eve J. Lee
Abstract:
The connection between outer gas giants and inner super-Earths reflects their formation and evolutionary histories. Past work exploring this link has suggested a tentative positive correlation between these two populations, but these studies have been limited by small sample sizes and in some cases sample biases. Here we take a new look at this connection with a sample of 184 super-Earth systems with publicly available radial velocity data and resolved outer gas giants. We calculate the frequency of outer gas giants (GG) in super-Earth (SE) systems, dividing our sample into metal-rich ([Fe/H] $>$ 0) and metal-poor ([Fe/H]$\leq$0) hosts. We find P(GG$|$SE, [Fe/H]$>$0) = 28.0$^{+4.9}_{-4.6}\%$ and P(GG$|$SE, [Fe/H]$\leq$0) = 4.5$^{+2.6}_{-1.9}\%$. Comparing these conditional occurrence rates to field giant occurrence rates from Rosenthal et al 2021, we show that there is a distinct positive correlation between inner super-Earths and outer gas giants for metal-rich host stars at the 2.7$σ$ level, but this correlation disappears for metal-poor systems. We further find that, around metal-rich stars, the GG/SE correlation enhances slightly for systems with giants that are more distant (beyond 3 AU), more eccentric ($e > 0.2$), and/or in multi-gas giant systems. Such trends disappear around metal-poor stars with the exception of systems of multiple giants in which we observe a tentative anti-correlation. Our findings highlight the critical role metallicity (disk solid budget) plays in shaping the overall planetary architecture.
Submitted 21 May, 2024; v1 submitted 13 March, 2024;
originally announced March 2024.
-
An atlas of resolved spectral features in the transmission spectrum of WASP-189 b with MAROON-X
Authors:
B. Prinoth,
H. J. Hoeijmakers,
B. M. Morris,
M. Lam,
D. Kitzmann,
E. Sedaghati,
J. V. Seidel,
E. K. H. Lee,
B. Thorsbro,
N. W. Borsato,
Y. C. Damasceno,
S. Pelletier,
A. Seifahrt
Abstract:
Exoplanets in the ultra-hot Jupiter regime provide an excellent laboratory for testing the impact of stellar irradiation on the dynamics and chemical composition of gas giant atmospheres. In this study, we observed two transits of the ultra-hot Jupiter WASP-189 b with MAROON-X/Gemini-North to probe its high-altitude atmospheric layers, using strong absorption lines. We derived posterior probability distributions for the planetary and stellar parameters by calculating the stellar spectrum behind the planet at every orbital phase during the transit. This was used to correct the Rossiter-McLaughlin imprint on the transmission spectra. Using differential transmission spectroscopy, we detect strong absorption lines of Ca+, Ba+, Na, H$α$, Mg, Fe, and Fe+, providing an unprecedented and detailed view of the atmospheric chemical composition. Ca+ absorption is particularly well suited for analysis through time-resolved narrow-band spectroscopy, owing to its transition lines formed in high-altitude layers. The spectral absorption lines show no significant blueshifts that would indicate high-altitude day-to-night winds, and further analysis is needed to investigate the implications for atmospheric dynamics. These high signal-to-noise observations provide a benchmark data set for testing high-resolution retrievals and the assumptions of atmospheric models. We also simulate observations of WASP-189 b with ANDES/ELT, and show that ANDES will be highly sensitive to the individual absorption lines of a myriad of elements and molecules, including TiO and CO.
Submitted 13 March, 2024;
originally announced March 2024.
-
Second gadolinium loading to Super-Kamiokande
Authors:
K. Abe,
C. Bronner,
Y. Hayato,
K. Hiraide,
K. Hosokawa,
K. Ieki,
M. Ikeda,
J. Kameda,
Y. Kanemura,
R. Kaneshima,
Y. Kashiwagi,
Y. Kataoka,
S. Miki,
S. Mine,
M. Miura,
S. Moriyama,
Y. Nakano,
M. Nakahata,
S. Nakayama,
Y. Noguchi,
K. Sato,
H. Sekiya,
H. Shiba,
K. Shimizu,
M. Shiozawa
, et al. (225 additional authors not shown)
Abstract:
The first loading of gadolinium (Gd) into Super-Kamiokande in 2020 was successful, and the neutron capture efficiency on Gd reached 50\%. To further increase the Gd neutron capture efficiency to 75\%, 26.1 tons of $\rm Gd_2(\rm SO_4)_3\cdot \rm 8H_2O$ was additionally loaded into Super-Kamiokande (SK) from May 31 to July 4, 2022. As the amount of loaded $\rm Gd_2(\rm SO_4)_3\cdot \rm 8H_2O$ was doubled compared to the first loading, the capacity of the powder dissolving system was doubled. We also developed new batches of gadolinium sulfate with even further reduced radioactive impurities. In addition, a more efficient screening method was devised and implemented to evaluate these new batches of $\rm Gd_2(\rm SO_4)_3\cdot \rm 8H_2O$. Following the second loading, the Gd concentration in SK was measured to be $333.5\pm2.5$ ppm via an Atomic Absorption Spectrometer (AAS). From the mean neutron capture time constant of neutrons from an Am/Be calibration source, the Gd concentration was independently measured to be 332.7 $\pm$ 6.8(sys.) $\pm$ 1.1(stat.) ppm, consistent with the AAS result. Furthermore, during the loading the Gd concentration was monitored continually using the capture time constant of each spallation neutron produced by cosmic-ray muons, and the final neutron capture efficiency was shown to become 1.5 times higher than that of the first loaded phase, as expected.
Submitted 18 June, 2024; v1 submitted 12 March, 2024;
originally announced March 2024.
-
Recent Advances, Applications, and Open Challenges in Machine Learning for Health: Reflections from Research Roundtables at ML4H 2023 Symposium
Authors:
Hyewon Jeong,
Sarah Jabbour,
Yuzhe Yang,
Rahul Thapta,
Hussein Mozannar,
William Jongwon Han,
Nikita Mehandru,
Michael Wornow,
Vladislav Lialin,
Xin Liu,
Alejandro Lozano,
Jiacheng Zhu,
Rafal Dariusz Kocielnik,
Keith Harrigian,
Haoran Zhang,
Edward Lee,
Milos Vukadinovic,
Aparna Balagopalan,
Vincent Jeanselme,
Katherine Matton,
Ilker Demirel,
Jason Fries,
Parisa Rashidi,
Brett Beaulieu-Jones,
Xuhai Orson Xu
, et al. (18 additional authors not shown)
Abstract:
The third ML4H symposium was held in person on December 10, 2023, in New Orleans, Louisiana, USA. The symposium included research roundtable sessions to foster discussions between participants and senior researchers on timely and relevant topics for the ML4H community. Encouraged by the successful virtual roundtables in the previous year, we organized eleven in-person roundtables and four virtual roundtables at ML4H 2022. The organization of the research roundtables at the conference involved 17 Senior Chairs and 19 Junior Chairs across 11 tables. Each roundtable session included invited senior chairs (with substantial experience in the field), junior chairs (responsible for facilitating the discussion), and attendees from diverse backgrounds with interest in the session's topic. Herein we detail the organization process and compile takeaways from these roundtable discussions, including recent advances, applications, and open challenges for each topic. We conclude with a summary and lessons learned across all roundtables. This document serves as a comprehensive review paper, summarizing the recent advancements in machine learning for healthcare as contributed by foremost researchers in the field.
Submitted 5 April, 2024; v1 submitted 3 March, 2024;
originally announced March 2024.
-
Max-Cut with $ε$-Accurate Predictions
Authors:
Vincent Cohen-Addad,
Tommaso d'Orsi,
Anupam Gupta,
Euiwoong Lee,
Debmalya Panigrahi
Abstract:
We study the approximability of the MaxCut problem in the presence of predictions. Specifically, we consider two models: in the noisy predictions model, for each vertex we are given its correct label in $\{-1,+1\}$ with some unknown probability $1/2 + ε$, and the other (incorrect) label otherwise. In the more-informative partial predictions model, for each vertex we are given its correct label with probability $ε$ and no label otherwise. We assume only pairwise independence between vertices in both models.
We show how these predictions can be used to improve on the worst-case approximation ratios for this problem. Specifically, we give an algorithm that achieves an $α + \widetilde{Ω}(ε^4)$-approximation for the noisy predictions model, where $α \approx 0.878$ is the MaxCut threshold. While this result also holds for the partial predictions model, we can also give a $β + Ω(ε)$-approximation, where $β \approx 0.858$ is the approximation ratio for MaxBisection given by Raghavendra and Tan. This answers a question posed by Ola Svensson in his plenary session talk at SODA'23.
Submitted 28 February, 2024;
originally announced February 2024.
-
Random Silicon Sampling: Simulating Human Sub-Population Opinion Using a Large Language Model Based on Group-Level Demographic Information
Authors:
Seungjong Sun,
Eungu Lee,
Dongyan Nan,
Xiangying Zhao,
Wonbyung Lee,
Bernard J. Jansen,
Jang Hyun Kim
Abstract:
Large language models exhibit societal biases associated with demographic information, including race, gender, and others. Endowing such language models with personalities based on demographic data can enable generating opinions that align with those of humans. Building on this idea, we propose "random silicon sampling," a method to emulate the opinions of the human population sub-group. Our study analyzed 1) a language model that generates the survey responses that correspond with a human group based solely on its demographic distribution and 2) the applicability of our methodology across various demographic subgroups and thematic questions. Through random silicon sampling and using only group-level demographic information, we discovered that language models can generate response distributions that are remarkably similar to the actual U.S. public opinion polls. Moreover, we found that the replicability of language models varies depending on the demographic group and topic of the question, and this can be attributed to inherent societal biases in the models. Our findings demonstrate the feasibility of mirroring a group's opinion using only demographic distribution and elucidate the effect of social biases in language models on such simulations.
Submitted 28 February, 2024;
originally announced February 2024.
-
Testing approximate infrared scattering radiative-transfer methods for hot Jupiter atmospheres
Authors:
Elspeth K. H. Lee
Abstract:
The calculation of internal atmospheric (longwave) fluxes is a key component of any model of exoplanet atmospheres that requires radiative-transfer (RT) calculations. For atmospheres containing a strong scattering component such as cloud particles, most 1D multiple-scattering RT methods typically involve numerically expensive matrix inversions. This computational bottleneck is exacerbated when multitudes of RT calculations are required, such as in general circulation models (GCMs) and retrieval methods. In an effort to increase the speed of RT calculations without sacrificing too much accuracy, we investigate the applicability of approximate longwave scattering methods developed for the Earth science community to hot Jupiter atmospheres. We test the absorption approximation (AA) and variational iteration method (VIM) applied to typical cloudy hot Jupiter scenarios, using 64 stream DISORT calculations as reference solutions. We find the four-stream VIM variant is a highly promising method to explore using for hot Jupiter GCM and retrieval modelling, showing excellent speed characteristics, with typical errors $\sim$1\% for outgoing fluxes and within $\sim$50\%, but with larger errors in the deep cloud layer test case, for vertical heating rates. Other methods explored in this study were found to typically produce similar error characteristics in vertical heating rates.
Submitted 27 February, 2024;
originally announced February 2024.
-
Holographic Tests for Giant Graviton Expansion
Authors:
Seunggyu Kim,
Eunwoo Lee
Abstract:
It has been proposed that the superconformal index admits a novel reformulation, called giant graviton expansion. In this paper, we investigate the properties of dual $AdS_5$ black holes using the giant graviton expansion framework. First, we compute the entropy of black holes in $AdS_5\times S^5$ with fixed charges through a large $N$ saddle point analysis on the giant graviton index and further extremize it in the wrapping number. We identify a specific regime of fugacities where our saddle point analysis is valid. It turns out that this condition ensures the absence of closed-time-like curves and the stability of dual black hole solutions with equal charges. In addition, the giant graviton expansion of the index provides insights into how small black holes in AdS can be interpreted as bound states of branes. We extend our study to include the giant graviton expansion with the insertion of a half-BPS surface defect in $\mathcal{N}=4$ SYM with a $U(N)$ gauge group. Finally, we test the giant graviton expansion in various holographic theories whose dual geometries are $AdS_5\times S^5/\mathbb{Z}_k$ and $AdS_5\times SE_5$.
Submitted 20 February, 2024;
originally announced February 2024.
-
Learning to Learn Faster from Human Feedback with Language Model Predictive Control
Authors:
Jacky Liang,
Fei Xia,
Wenhao Yu,
Andy Zeng,
Montserrat Gonzalez Arenas,
Maria Attarian,
Maria Bauza,
Matthew Bennice,
Alex Bewley,
Adil Dostmohamed,
Chuyuan Kelly Fu,
Nimrod Gileadi,
Marissa Giustina,
Keerthana Gopalakrishnan,
Leonard Hasenclever,
Jan Humplik,
Jasmine Hsu,
Nikhil Joshi,
Ben Jyenis,
Chase Kew,
Sean Kirmani,
Tsang-Wei Edward Lee,
Kuang-Huei Lee,
Assaf Hurwitz Michaely,
Joss Moore
, et al. (25 additional authors not shown)
Abstract:
Large language models (LLMs) have been shown to exhibit a wide range of capabilities, such as writing robot code from language commands -- enabling non-experts to direct robot behaviors, modify them based on feedback, or compose them to perform new tasks. However, these capabilities (driven by in-context learning) are limited to short-term interactions, where users' feedback remains relevant for only as long as it fits within the context size of the LLM, and can be forgotten over longer interactions. In this work, we investigate fine-tuning the robot code-writing LLMs, to remember their in-context interactions and improve their teachability i.e., how efficiently they adapt to human inputs (measured by average number of corrections before the user considers the task successful). Our key observation is that when human-robot interactions are viewed as a partially observable Markov decision process (in which human language inputs are observations, and robot code outputs are actions), then training an LLM to complete previous interactions is training a transition dynamics model -- that can be combined with classic robotics techniques such as model predictive control (MPC) to discover shorter paths to success. This gives rise to Language Model Predictive Control (LMPC), a framework that fine-tunes PaLM 2 to improve its teachability on 78 tasks across 5 robot embodiments -- improving non-expert teaching success rates of unseen tasks by 26.9% while reducing the average number of human corrections from 2.4 to 1.9. Experiments show that LMPC also produces strong meta-learners, improving the success rate of in-context learning new tasks on unseen robot embodiments and APIs by 31.5%. See videos, code, and demos at: https://robot-teaching.github.io/.
Submitted 31 May, 2024; v1 submitted 17 February, 2024;
originally announced February 2024.
-
PIVOT: Iterative Visual Prompting Elicits Actionable Knowledge for VLMs
Authors:
Soroush Nasiriany,
Fei Xia,
Wenhao Yu,
Ted Xiao,
Jacky Liang,
Ishita Dasgupta,
Annie Xie,
Danny Driess,
Ayzaan Wahid,
Zhuo Xu,
Quan Vuong,
Tingnan Zhang,
Tsang-Wei Edward Lee,
Kuang-Huei Lee,
Peng Xu,
Sean Kirmani,
Yuke Zhu,
Andy Zeng,
Karol Hausman,
Nicolas Heess,
Chelsea Finn,
Sergey Levine,
Brian Ichter
Abstract:
Vision language models (VLMs) have shown impressive capabilities across a variety of tasks, from logical reasoning to visual understanding. This opens the door to richer interaction with the world, for example robotic control. However, VLMs produce only textual outputs, while robotic control and other spatial tasks require outputting continuous coordinates, actions, or trajectories. How can we enable VLMs to handle such settings without fine-tuning on task-specific data?
In this paper, we propose a novel visual prompting approach for VLMs that we call Prompting with Iterative Visual Optimization (PIVOT), which casts tasks as iterative visual question answering. In each iteration, the image is annotated with a visual representation of proposals that the VLM can refer to (e.g., candidate robot actions, localizations, or trajectories). The VLM then selects the best ones for the task. These proposals are iteratively refined, allowing the VLM to eventually zero in on the best available answer. We investigate PIVOT on real-world robotic navigation, real-world manipulation from images, instruction following in simulation, and additional spatial inference tasks such as localization. We find, perhaps surprisingly, that our approach enables zero-shot control of robotic systems without any robot training data, navigation in a variety of environments, and other capabilities. Although current performance is far from perfect, our work highlights potentials and limitations of this new regime and shows a promising approach for Internet-Scale VLMs in robotic and spatial reasoning domains. Website: pivot-prompt.github.io and HuggingFace: https://huggingface.co/spaces/pivot-prompt/pivot-prompt-demo.
Submitted 12 February, 2024;
originally announced February 2024.
-
GPTs Are Multilingual Annotators for Sequence Generation Tasks
Authors:
Juhwan Choi,
Eunju Lee,
Kyohoon **,
YoungBin Kim
Abstract:
Data annotation is an essential step for constructing new datasets. However, the conventional approach of data annotation through crowdsourcing is both time-consuming and expensive. In addition, the complexity of this process increases when dealing with low-resource languages owing to the difference in the language pool of crowdworkers. To address these issues, this study proposes an autonomous annotation method by utilizing large language models, which have been recently demonstrated to exhibit remarkable performance. Through our experiments, we demonstrate that the proposed method is not just cost-efficient but also applicable for low-resource language annotation. Additionally, we constructed an image captioning dataset using our approach and are committed to open this dataset for future study. We have opened our source code for further study and reproducibility.
Submitted 8 February, 2024;
originally announced February 2024.
-
Modeling Atmospheric Lines By the Exoplanet Community (MALBEC) version 1.0: A CUISINES radiative transfer intercomparison project
Authors:
Geronimo L. Villanueva,
Thomas J. Fauchez,
Vincent Kofman,
Eleonora Alei,
Elspeth K. H. Lee,
Estelle Janin,
Michael D. Himes,
Jeremy Leconte,
Michaela Leung,
Sara Faggi,
Mei Ting Mak,
Denis E. Sergeev,
Thea Kozakis,
James Manners,
Nathan Mayne,
Edward W. Schwieterman,
Alex R. Howe,
Natasha Batalha
Abstract:
Radiative transfer (RT) models are critical in the interpretation of exoplanetary spectra, in simulating exoplanet climates and when designing the specifications of future flagship observatories. However, most models differ in methodologies and input data, which can lead to significantly different spectra. In this paper, we present the experimental protocol of the MALBEC (Modeling Atmospheric Lines By the Exoplanet Community) project. MALBEC is an exoplanet model intercomparison project (exoMIP) that belongs to the CUISINES (Climates Using Interactive Suites of Intercomparisons Nested for Exoplanet Studies) framework which aims to provide the exoplanet community with a large and diverse set of comparison and validation of models. The proposed protocol tests include a large set of initial participating RT models, a broad range of atmospheres (from Hot Jupiters to temperate terrestrials) and several observation geometries, which would allow us to quantify and compare the differences between different RT models used by the exoplanetary community. Two types of tests are proposed: transit spectroscopy and direct imaging modeling, with results from the proposed tests to be published in dedicated follow-up papers. To encourage the community to join this comparison effort and as an example, we present simulation results for one specific transit case (GJ-1214 b), in which we find notable differences in how the various codes handle the discretization of the atmospheres (e.g., sub-layering), the treatment of molecular opacities (e.g., correlated-k, line-by-line) and the default spectroscopic repositories generally used by each model (e.g., HITRAN, HITEMP, ExoMol).
Submitted 6 February, 2024;
originally announced February 2024.
-
Predicting Machine Translation Performance on Low-Resource Languages: The Role of Domain Similarity
Authors:
Eric Khiu,
Hasti Toossi,
David Anugraha,
**yu Liu,
Jiaxu Li,
Juan Armando Parra Flores,
Leandro Acros Roman,
A. Seza Doğruöz,
En-Shiun Annie Lee
Abstract:
Fine-tuning and testing a multilingual large language model is expensive and challenging for low-resource languages (LRLs). While previous studies have predicted the performance of natural language processing (NLP) tasks using machine learning methods, they primarily focus on high-resource languages, overlooking LRLs and shifts across domains. Focusing on LRLs, we investigate three factors: the size of the fine-tuning corpus, the domain similarity between fine-tuning and testing corpora, and the language similarity between source and target languages. We employ classical regression models to assess how these factors impact the model's performance. Our results indicate that domain similarity has the most critical impact on predicting the performance of Machine Translation models.
Submitted 4 February, 2024;
originally announced February 2024.