Constant Modulus Waveform Design with Interference Exploitation for DFRC Systems: A Block-Level Optimization Approach

Authors: Byunghyun Lee, Anindya Bijoy Das, David Love, Christopher Brinton, James Krogmeier

Abstract: Dual-function radar-communication (DFRC) is a key enabler of location-based services for next-generation communication systems. In this paper, we investigate the problem of designing constant modulus waveforms for DFRC systems. For high-precision radar sensing, we consider joint optimization of the correlation properties and spatial beam pattern. For communication, we employ constructive interference-based block-level precoding (CI-BLP) to leverage distortion induced by multiuser multiple-input multiple-output (MU-MIMO) and radar transmission on a block level. We propose two solution algorithms based on the alternating direction method of multipliers (ADMM) and majorization-minimization (MM) principles, which are effective for small and large block sizes, respectively. The proposed ADMM-based solution decomposes the nonconvex formulated problem into multiple tractable subproblems, each of which admits a closed-form solution. To accelerate convergence of the MM-based solution, we propose an improved majorizing function that leverages a novel diagonal matrix structure. After majorization, we decompose the approximated problem into independent subproblems for parallelization, mitigating the complexity that increases with block size. We then evaluate the performance of the proposed algorithms through a series of numerical experiments. Simulation results demonstrate that the proposed methods can substantially enhance spatial/temporal sidelobe suppression through block-level optimization.

Submitted 27 June, 2024; originally announced June 2024.

Comments: arXiv admin note: text overlap with arXiv:2310.10804

arXiv:2406.16042 [pdf, other]

Pose-Diversified Augmentation with Diffusion Model for Person Re-Identification

Authors: Inès Hyeonsu Kim, JoungBin Lee, Soowon Son, Woojeong Jin, Kyusun Cho, Junyoung Seo, Min-Seop Kwak, Seokju Cho, JeongYeol Baek, Byeongwon Lee, Seungryong Kim

Abstract: Person re-identification (Re-ID) often faces challenges due to variations in human poses and camera viewpoints, which significantly affect the appearance of individuals across images. Existing datasets frequently lack diversity and scalability in these aspects, hindering the generalization of Re-ID models to new camera systems. Previous methods have attempted to address these issues through data augmentation; however, they rely on human poses already present in the training dataset, failing to effectively reduce the human pose bias in the dataset. We propose Diff-ID, a novel data augmentation approach that incorporates sparse and underrepresented human pose and camera viewpoint examples into the training data, addressing the limited diversity in the original training data distribution. Our objective is to augment a training dataset that enables existing Re-ID models to learn features unbiased by human pose and camera viewpoint variations. To achieve this, we leverage the knowledge of pre-trained large-scale diffusion models. Using the SMPL model, we simultaneously capture both the desired human poses and camera viewpoints, enabling realistic human rendering. The depth information provided by the SMPL model indirectly conveys the camera viewpoints. By conditioning the diffusion model on both the human pose and camera viewpoint concurrently through the SMPL model, we generate realistic images with diverse human poses and camera viewpoints. Qualitative results demonstrate the effectiveness of our method in addressing human pose bias and enhancing the generalizability of Re-ID models compared to other data augmentation-based Re-ID approaches. The performance gains achieved by training Re-ID models on our offline augmented dataset highlight the potential of our proposed framework in improving the scalability and generalizability of person Re-ID models.

Submitted 23 June, 2024; originally announced June 2024.

Comments: The project page is available at https://ku-cvlab.github.io/Diff-ID/

arXiv:2406.12246 [pdf, other]

TroL: Traversal of Layers for Large Language and Vision Models

Authors: Byung-Kwan Lee, Sangyun Chung, Chae Won Kim, Beomchan Park, Yong Man Ro

Abstract: Large language and vision models (LLVMs) have been driven by the generalization power of large language models (LLMs) and the advent of visual instruction tuning. Along with scaling them up directly, these models enable LLVMs to showcase powerful vision language (VL) performances by covering diverse tasks via natural language instructions. However, existing open-source LLVMs that perform comparably to closed-source LLVMs such as GPT-4V are often considered too large (e.g., 26B, 34B, and 110B parameters), having a larger number of layers. These large models demand costly, high-end resources for both training and inference. To address this issue, we present a new efficient LLVM family with 1.8B, 3.8B, and 7B LLM model sizes, Traversal of Layers (TroL), which enables the reuse of layers in a token-wise manner. This layer traversing technique simulates the effect of looking back and retracing the answering stream while increasing the number of forward propagation layers without physically adding more layers. We demonstrate that TroL employs a simple layer traversing approach yet efficiently outperforms the open-source LLVMs with larger model sizes and rivals the performances of the closed-source LLVMs with substantial sizes.

Submitted 19 June, 2024; v1 submitted 17 June, 2024; originally announced June 2024.

Comments: Code is available in https://github.com/ByungKwanLee/TroL

arXiv:2406.09698 [pdf, other]

Projected background and sensitivity of AMoRE-II

Authors: A. Agrawal, V. V. Alenkov, P. Aryal, J. Beyer, B. Bhandari, R. S. Boiko, K. Boonin, O. Buzanov, C. R. Byeon, N. Chanthima, M. K. Cheoun, J. S. Choe, Seonho Choi, S. Choudhury, J. S. Chung, F. A. Danevich, M. Djamal, D. Drung, C. Enss, A. Fleischmann, A. M. Gangapshev, L. Gastaldo, Y. M. Gavrilyuk, A. M. Gezhaev, O. Gileva, et al. (81 additional authors not shown)

Abstract: AMoRE-II aims to search for neutrinoless double beta decay with an array of 423 Li$_2$$^{100}$MoO$_4$ crystals operating in the cryogenic system as the main phase of the Advanced Molybdenum-based Rare process Experiment (AMoRE). AMoRE has been planned to operate in three phases: AMoRE-pilot, AMoRE-I, and AMoRE-II. AMoRE-II is currently being installed at the Yemi Underground Laboratory, located approximately 1000 meters deep in Jeongseon, Korea. The goal of AMoRE-II is to reach up to $T^{0νββ}_{1/2}$ $\sim$ 6 $\times$ 10$^{26}$ years, corresponding to an effective Majorana mass of 15 - 29 meV, covering all the inverted mass hierarchy regions. To achieve this, the background level of the experimental configurations and possible background sources of gamma and beta events should be well understood. We have intensively performed Monte Carlo simulations using the GEANT4 toolkit in all the experimental configurations with potential sources. We report the estimated background level that meets the 10$^{-4}$ counts/(keV$\cdot$kg$\cdot$yr) requirement for AMoRE-II in the region of interest (ROI) and show the projected half-life sensitivity based on the simulation study.

Submitted 13 June, 2024; originally announced June 2024.

arXiv:2406.09460 [pdf, other]

doi 10.1002/advs.202401348

Origin of Distinct Insulating Domains in the Layered Charge Density Wave Material 1T-TaS2

Authors: Hyungryul Yang, Byeongin Lee, Junho Bang, Sunghun Kim, Dirk Wulferding, Sung-Hoon Lee, Doohee Cho

Abstract: Vertical charge order shapes the electronic properties in layered charge density wave (CDW) materials. Various stacking orders inevitably create nanoscale domains with distinct electronic structures inaccessible to bulk probes. Here, the stacking characteristics of bulk 1$T$-TaS$_2$ are analyzed using scanning tunneling spectroscopy (STS) and density functional theory (DFT) calculations. It is observed that Mott-insulating domains undergo a transition to band-insulating domains restoring vertical dimerization of the CDWs. Furthermore, STS measurements covering a wide terrace reveal two distinct band insulating domains differentiated by band edge broadening. These DFT calculations reveal that the Mott insulating layers preferably reside on the subsurface, forming broader band edges in the neighboring band insulating layers. Ultimately, buried Mott insulating layers believed to harbor the quantum spin liquid phase are identified. These results resolve persistent issues regarding vertical charge order in 1$T$-TaS$_2$, providing a new perspective for investigating emergent quantum phenomena in layered CDW materials.

Submitted 12 June, 2024; originally announced June 2024.

Comments: 26 pages and 13 figures

arXiv:2406.08719 [pdf, other]

TikTag: Breaking ARM's Memory Tagging Extension with Speculative Execution

Authors: Juhee Kim, Jinbum Park, Sihyeon Roh, Jaeyoung Chung, Youngjoo Lee, Taesoo Kim, Byoungyoung Lee

Abstract: ARM Memory Tagging Extension (MTE) is a new hardware feature introduced in ARMv8.5-A architecture, aiming to detect memory corruption vulnerabilities. The low overhead of MTE makes it an attractive solution to mitigate memory corruption attacks in modern software systems and is considered the most promising path forward for improving C/C++ software security. This paper explores the potential security risks posed by speculative execution attacks against MTE. Specifically, this paper identifies new TikTag gadgets capable of leaking the MTE tags from arbitrary memory addresses through speculative execution. With TikTag gadgets, attackers can bypass the probabilistic defense of MTE, increasing the attack success rate by close to 100%. We demonstrate that TikTag gadgets can be used to bypass MTE-based mitigations in real-world systems, Google Chrome and the Linux kernel. Experimental results show that TikTag gadgets can successfully leak an MTE tag with a success rate higher than 95% in less than 4 seconds. We further propose new defense mechanisms to mitigate the security risks posed by TikTag gadgets.

Submitted 12 June, 2024; originally announced June 2024.

arXiv:2406.08301 [pdf, other]

Jet modification via $π^0$-hadron correlations in Au$+$Au collisions at $\sqrt{s_{_{NN}}}=200$ GeV

Authors: PHENIX Collaboration, N. J. Abdulameer, U. Acharya, A. Adare, S. Afanasiev, C. Aidala, N. N. Ajitanand, Y. Akiba, H. Al-Bataineh, J. Alexander, M. Alfred, K. Aoki, N. Apadula, L. Aphecetche, J. Asai, H. Asano, E. T. Atomssa, R. Averbeck, T. C. Awes, B. Azmoun, V. Babintsev, M. Bai, G. Baksay, L. Baksay, A. Baldisseri, et al. (510 additional authors not shown)

Abstract: High-momentum two-particle correlations are a useful tool for studying jet-quenching effects in the quark-gluon plasma. Angular correlations between neutral-pion triggers and charged hadrons with transverse momenta in the range 4--12~GeV/$c$ and 0.5--7~GeV/$c$, respectively, have been measured by the PHENIX experiment in 2014 for Au$+$Au collisions at $\sqrt{s_{_{NN}}}=200$~GeV. Suppression is observed in the yield of high-momentum jet fragments opposite the trigger particle, which indicates jet suppression stemming from in-medium partonic energy loss, while enhancement is observed for low-momentum particles. The ratio and differences between the yield in Au$+$Au collisions and $p$$+$$p$ collisions, $I_{AA}$ and $Δ_{AA}$, as a function of the trigger-hadron azimuthal separation, $Δφ$, are measured for the first time at the Relativistic Heavy Ion Collider. These results better quantify how the yield of low-$p_T$ associated hadrons is enhanced at wide angle, which is crucial for studying energy loss as well as medium-response effects.

Submitted 12 June, 2024; originally announced June 2024.

Comments: 534 authors from 83 institutions, 12 pages, 7 figures. v1 is version submitted to Physical Review C. HEPdata tables for the points plotted in figures for this and previous PHENIX publications are (or will be) publicly available at http://www.phenix.bnl.gov/papers.html

arXiv:2406.07960 [pdf, other]

doi 10.1103/PhysRevB.109.195170

Charge ordered phases in the hole-doped triangular Mott insulator 4Hb-TaS2

Authors: Junho Bang, Byeongin Lee, Hyungryul Yang, Sunghun Kim, Dirk Wulferding, Doohee Cho

Abstract: 4Hb-TaS2 has been proposed to possess unconventional superconductivity with broken time-reversal symmetry due to its distinctive layered structure, featuring a heterojunction between a 2D triangular Mott insulator and a charge density wave metal. However, since a frustrated spin state in the correlated insulating layer is susceptible to charge ordering with carrier doping, it is necessary to investigate the charge distribution driven by inter-layer charge transfer to understand its superconductivity. Here, we use scanning tunneling microscopy and spectroscopy (STM/S) to investigate the charge ordered phases of 1T-TaS2 layers within 4Hb-TaS2, explicitly focusing on the non-half-filled regime. Our STS results show an energy gap which exhibits an out-of-phase relation with the charge density. We ascribe the charge-ordered insulating phase of a doped triangular Mott insulator to the competition between on-site and nonlocal Coulomb repulsion. In addition, we discuss the role of the insulating layer in the enhanced superconductivity of 4Hb-TaS2.

Submitted 17 June, 2024; v1 submitted 12 June, 2024; originally announced June 2024.

Comments: 18 pages, 6 figures

Journal ref: Phys. Rev. B 109, 195170 (2024)

arXiv:2406.06913 [pdf]

Frustrated phonon with charge density wave in vanadium Kagome metal

Authors: Seung-Phil Heo, Choongjae Won, Heemin Lee, Hanbyul Kim, Eunyoung Park, Sung Yun Lee, Junha Hwang, Hyeongi Choi, Sang-Youn Park, Byungjune Lee, Woo-Suk Noh, Hoyoung Jang, Jae-Hoon Park, Dongbin Shin, Changyong Song

Abstract: Crystals with unique ionic arrangements and strong electronic correlations serve as a fertile ground for the emergence of exotic phases, as evidenced by the coexistence of charge density wave (CDW) and superconductivity in vanadium Kagome metals, specifically AV3Sb5 (where A represents K, Rb, or Cs). The formation of a star-of-David CDW superstructure, resulting from the coordinated displacements of vanadium ions on a corner-sharing triangular lattice, has garnered significant attention in efforts to comprehend the influence of electron-phonon interaction within this geometrically intricate lattice. However, understanding of the underlying mechanism behind CDW formation, coupled with symmetry-protected lattice vibrations, remains elusive. In this study, we employed time-resolved X-ray scattering experiments utilising an X-ray free-electron laser. Our findings reveal that the phonon mode associated with the out-of-plane motion of Cs ions becomes frustrated in the CDW phase. Furthermore, we observed the photoinduced emergence of a metastable CDW phase, facilitated by the alleviation of frustration through nonadiabatic changes in free energy. By elucidating the longstanding puzzle surrounding the intervention of phonons in CDW ordering, this research offers fresh insights into the competition between phonons and periodic lattice distortions, a phenomenon widespread in other correlated quantum materials including layered high-Tc superconductors.

Submitted 10 June, 2024; originally announced June 2024.

Comments: Manuscript: 20 pages, 4 figures, SI: 14 pages, 8 figures

arXiv:2406.06316 [pdf, other]

Tx-LLM: A Large Language Model for Therapeutics

Authors: Juan Manuel Zambrano Chaves, Eric Wang, Tao Tu, Eeshit Dhaval Vaishnav, Byron Lee, S. Sara Mahdavi, Christopher Semturs, David Fleet, Vivek Natarajan, Shekoofeh Azizi

Abstract: Developing therapeutics is a lengthy and expensive process that requires the satisfaction of many different criteria, and AI models capable of expediting the process would be invaluable. However, the majority of current AI approaches address only a narrowly defined set of tasks, often circumscribed within a particular domain. To bridge this gap, we introduce Tx-LLM, a generalist large language model (LLM) fine-tuned from PaLM-2 which encodes knowledge about diverse therapeutic modalities. Tx-LLM is trained using a collection of 709 datasets that target 66 tasks spanning various stages of the drug discovery pipeline. Using a single set of weights, Tx-LLM simultaneously processes a wide variety of chemical or biological entities (small molecules, proteins, nucleic acids, cell lines, diseases) interleaved with free-text, allowing it to predict a broad range of associated properties, achieving performance competitive with the state of the art (SOTA) on 43 out of 66 tasks and exceeding SOTA on 22. Among these, Tx-LLM is particularly powerful and exceeds best-in-class performance on average for tasks combining molecular SMILES representations with text such as cell line names or disease names, likely due to context learned during pretraining. We observe evidence of positive transfer between tasks with diverse drug types (e.g., tasks involving small molecules and tasks involving proteins), and we study the impact of model size, domain finetuning, and prompting strategies on performance. We believe Tx-LLM represents an important step towards LLMs encoding biochemical knowledge and could have a future role as an end-to-end tool across the drug discovery development pipeline.

Submitted 10 June, 2024; originally announced June 2024.

arXiv:2406.06072 [pdf, other]

Adapting Pretrained ViTs with Convolution Injector for Visuo-Motor Control

Authors: Dongyoon Hwang, Byungkun Lee, Hojoon Lee, Hyunseung Kim, Jaegul Choo

Abstract: Vision Transformers (ViT), when paired with large-scale pretraining, have shown remarkable performance across various computer vision tasks, primarily due to their weak inductive bias. However, while such weak inductive bias aids in pretraining scalability, this may hinder the effective adaptation of ViTs for visuo-motor control tasks as a result of the absence of control-centric inductive biases. Such absent inductive biases include spatial locality and translation equivariance bias which convolutions naturally offer. To this end, we introduce Convolution Injector (CoIn), an add-on module that injects convolutions which are rich in locality and equivariance biases into a pretrained ViT for effective adaptation in visuo-motor control. We evaluate CoIn with three distinct types of pretrained ViTs (CLIP, MVP, VC-1) across 12 varied control tasks within three separate domains (Adroit, MetaWorld, DMC), and demonstrate that CoIn consistently enhances control task performance across all experimented environments and models, validating the effectiveness of providing pretrained ViTs with control-centric biases.

Submitted 10 June, 2024; originally announced June 2024.

Comments: accepted to ICML 2024

arXiv:2406.05431 [pdf]

MaTableGPT: GPT-based Table Data Extractor from Materials Science Literature

Authors: Gyeong Hoon Yi, Jiwoo Choi, Hyeongyun Song, Olivia Miano, Jaewoong Choi, Kihoon Bang, Byungju Lee, Seok Su Sohn, David Buttler, Anna Hiszpanski, Sang Soo Han, Donghun Kim

Abstract: Efficiently extracting data from tables in the scientific literature is pivotal for building large-scale databases. However, the tables reported in materials science papers exist in highly diverse forms; thus, rule-based extractions are an ineffective approach. To overcome this challenge, we present MaTableGPT, which is a GPT-based table data extractor from the materials science literature. MaTableGPT features key strategies of table data representation and table splitting for better GPT comprehension and filtering hallucinated information through follow-up questions. When applied to a vast volume of water splitting catalysis literature, MaTableGPT achieved an extraction accuracy (total F1 score) of up to 96.8%. Through comprehensive evaluations of the GPT usage cost, labeling cost, and extraction accuracy for the learning methods of zero-shot, few-shot and fine-tuning, we present a Pareto-front mapping where the few-shot learning method was found to be the most balanced solution owing to both its high extraction accuracy (total F1 score > 95%) and low cost (GPT usage cost of 5.97 US dollars and labeling cost of 10 I/O paired examples). The statistical analyses conducted on the database generated by MaTableGPT revealed valuable insights into the distribution of the overpotential and elemental utilization across the reported catalysts in the water splitting literature.

Submitted 8 June, 2024; originally announced June 2024.

arXiv:2406.04147 [pdf, other]

Direct optimization of neoclassical ion transport in stellarator reactors

Authors: B. F. Lee, S. A. Lazerson, H. M. Smith, C. D. Beidler, N. A. Pablant

Abstract: We directly optimize stellarator neoclassical ion transport while holding neoclassical electron transport at a moderate level, creating a scenario favorable for impurity expulsion and retaining good ion confinement. Traditional neoclassical stellarator optimization has focused on minimizing $ε_\mathrm{eff}$, the geometric factor that characterizes the amount of radial transport due to particles in the $1/ν$ regime. Under expected reactor-relevant conditions, core electrons will be in the $1/ν$ regime and core fuel ions will be in the $\sqrt{ν}$ regime. Traditional optimizations thus minimize electron transport and rely on the radial electric field $\left(E_r\right)$ that develops to confine the ions. This often results in an inward-pointing $E_r$ that drives high-$Z$ impurities into the core, which may be troublesome in future reactors. In our optimizations, we increase the ratio of the thermal transport coefficients $L_{1 1}^{e}/L_{1 1}^{i}$, which previous work has shown can create an outward-pointing $E_r$. This effect is very beneficial for impurity expulsion. We obtain self-consistent density, temperature, and $E_r$ profiles at reactor-relevant conditions for optimized equilibria. These equilibria are expected to enjoy significantly improved impurity transport properties. We conclude by providing several directions of future research that may help further improve the presented optimization algorithm.

Submitted 6 June, 2024; originally announced June 2024.

arXiv:2406.03867 [pdf, other]

A Comprehensive Study of Quantum Arithmetic Circuits

Authors: Siyi Wang, Xiufan Li, Wei Jie Bryan Lee, Suman Deb, Eugene Lim, Anupam Chattopadhyay

Abstract: In recent decades, the field of quantum computing has experienced remarkable progress. This progress is marked by the superior performance of many quantum algorithms compared to their classical counterparts, with Shor's algorithm serving as a prominent illustration. Quantum arithmetic circuits, which are the fundamental building blocks in numerous quantum algorithms, have attracted much attention. Despite extensive exploration of various designs in the existing literature, researchers remain keen on developing novel designs and improving existing ones. In this review article, we aim to provide a systematically organized and easily comprehensible overview of the current state-of-the-art in quantum arithmetic circuits. Specifically, this study covers fundamental operations such as addition, subtraction, multiplication, division and modular exponentiation. We delve into the detailed quantum implementations of these prominent designs and evaluate their efficiency considering various objectives. We also discuss potential applications of presented arithmetic circuits and suggest future research directions.

Submitted 6 June, 2024; originally announced June 2024.

Comments: Under review at the Royal Society's Philosophical Transactions A

arXiv:2406.02562 [pdf, other]

Gated Low-rank Adaptation for personalized Code-Switching Automatic Speech Recognition on the low-spec devices

Authors: Gwantae Kim, Bokyeung Lee, Donghyeon Kim, Hanseok Ko

Abstract: In recent times, there has been a growing interest in utilizing personalized large models on low-spec devices, such as mobile and CPU-only devices. However, running a personalized large model on-device is inefficient, and sometimes infeasible, due to computational cost. To tackle the problem, this paper presents a weights separation method to minimize on-device model weights using parameter-efficient fine-tuning methods. Moreover, some people speak multiple languages within an utterance, known as code-switching, and a personalized ASR model is necessary to address such cases. However, current multilingual speech recognition models are limited to recognizing a single language within each utterance. To tackle this problem, we propose code-switching speech recognition models that incorporate fine-tuned monolingual and multilingual speech recognition models. Additionally, we introduce a gated low-rank adaptation (GLoRA) for parameter-efficient fine-tuning with minimal performance degradation. Our experiments, conducted on Korean-English code-switching datasets, demonstrate that fine-tuning speech recognition models for code-switching surpasses the performance of traditional code-switching speech recognition models trained from scratch. Furthermore, GLoRA enhances parameter-efficient fine-tuning performance compared to conventional LoRA.

Submitted 23 April, 2024; originally announced June 2024.

Comments: Table 2 is revised

Journal ref: ICASSP 2024 Workshop(HSCMA 2024) paper

arXiv:2406.01570 [pdf, ps, other]

Single Trajectory Conformal Prediction

Authors: Brian Lee, Nikolai Matni

Abstract: We study the performance of risk-controlling prediction sets (RCPS), an empirical risk minimization-based formulation of conformal prediction, with a single trajectory of temporally correlated data from an unknown stochastic dynamical system. First, we use the blocking technique to show that RCPS attains performance guarantees similar to those enjoyed in the iid setting whenever data is generated by asymptotically stationary and contractive dynamics. Next, we use the decoupling technique to characterize the graceful degradation in RCPS guarantees when the data generating process deviates from stationarity and contractivity. We conclude by discussing how these tools could be used toward a unified analysis of online and offline conformal prediction algorithms, which are currently treated with very different tools.

Submitted 3 June, 2024; originally announced June 2024.

Comments: 16 pages

arXiv:2406.00730 [pdf, other]

Assessing survival models by interval testing

Authors: Ben Lee

Abstract: When considering many survival models, decisions become more challenging in health economic evaluation. In this paper, we present a set of methods to assist with selecting the most appropriate survival models. The methods highlight areas of particularly poor fit. Furthermore, plots and overall p-values provide guidance on whether a survival model should be rejected or not.

Submitted 2 June, 2024; originally announced June 2024.

Comments: Main: 11 pages. Total: 14 pages

arXiv:2406.00324 [pdf, other]

Do's and Don'ts: Learning Desirable Skills with Instruction Videos

Authors: Hyunseung Kim, Byungkun Lee, Hojoon Lee, Dongyoon Hwang, Donghu Kim, Jaegul Choo

Abstract: Unsupervised skill discovery is a learning paradigm that aims to acquire diverse behaviors without explicit rewards. However, it faces challenges in learning complex behaviors and often leads to learning unsafe or undesirable behaviors. For instance, in various continuous control tasks, current unsupervised skill discovery methods succeed in learning basic locomotions like standing but struggle with learning more complex movements such as walking and running. Moreover, they may acquire unsafe behaviors like tripping and rolling or navigate to undesirable locations such as pitfalls or hazardous areas. In response, we present DoDont (Do's and Don'ts), an instruction-based skill discovery algorithm composed of two stages. First, in an instruction learning stage, DoDont leverages action-free instruction videos to train an instruction network to distinguish desirable transitions from undesirable ones. Then, in the skill learning stage, the instruction network adjusts the reward function of the skill discovery algorithm to weight the desired behaviors. Specifically, we integrate the instruction network into a distance-maximizing skill discovery algorithm, where the instruction network serves as the distance function. Empirically, with less than 8 instruction videos, DoDont effectively learns desirable behaviors and avoids undesirable ones across complex continuous control tasks. Code and videos are available at https://mynsng.github.io/dodont/

Submitted 1 June, 2024; originally announced June 2024.

arXiv:2405.17918 [pdf, other]

Cost-Sensitive Multi-Fidelity Bayesian Optimization with Transfer of Learning Curve Extrapolation

Authors: Dong Bok Lee, Aoxuan Silvia Zhang, Byungjoo Kim, Junhyeon Park, Juho Lee, Sung Ju Hwang, Hae Beom Lee

Abstract: In this paper, we address the problem of cost-sensitive multi-fidelity Bayesian Optimization (BO) for efficient hyperparameter optimization (HPO). Specifically, we assume a scenario where users want to early-stop the BO when the performance improvement is not satisfactory with respect to the required computational cost. Motivated by this scenario, we introduce utility, which is a function predefined by each user and describes the trade-off between cost and performance of BO. This utility function, combined with our novel acquisition function and stopping criterion, allows us to dynamically choose for each BO step the best configuration that we expect to maximally improve the utility in the future, and also automatically stop the BO around the maximum utility. Further, we improve the sample efficiency of existing learning curve (LC) extrapolation methods with transfer learning, while successfully capturing the correlations between different configurations to develop a sensible surrogate function for multi-fidelity BO. We validate our algorithm on various LC datasets and find that it outperforms all the previous multi-fidelity BO and transfer-BO baselines we consider, achieving a significantly better trade-off between cost and performance of BO.

Submitted 28 May, 2024; originally announced May 2024.

arXiv:2405.16855 [pdf, ps, other]

Maximal operators given by Fourier multipliers with dilation of fractional dimensions

Authors: ** Bong Lee, **sol Seo

Abstract: In this paper, we investigate $L^p$ bounds of maximal Fourier multiplier operators with dilation of fractional dimensions. For the Fourier multipliers, we suggest a criterion related to dimensions of dilation sets which guarantees $L^p$ bounds of the maximal operators for each $p$. Our criterion covers Mikhlin-type multipliers, multipliers with limited decay, and multipliers with slow decay.

Submitted 27 May, 2024; originally announced May 2024.

Comments: 15 pages

MSC Class: 42B25; 42B15; 42B35; 42B37

arXiv:2405.15998 [pdf, other]

Gauss-Bonnet Cosmology: large-temperature behaviour and bounds from Gravitational Waves

Authors: Anirban Biswas, Arpan Kar, Bum-Hoon Lee, Hocheol Lee, Wonwoo Lee, Stefano Scopel, Liliana Velasco-Sevilla, Lu Yin

Abstract: We provide a transparent discussion of the high temperature asymptotic behaviour of Cosmology in a dilaton-Einstein-Gauss-Bonnet (dEGB) scenario of modified gravity with vanishing scalar potential. In particular, we show that it has a clear interpretation in terms of only three attractors (stable critical points) of a set of autonomous differential equations: $w=-\frac{1}{3}$, $w=1$ and $1<w<\frac{7}{3}$, where $w\equiv p/ρ$ is the equation of state, defined as the ratio of the total pressure and the total energy density. All the possible different high-temperature evolution histories of the model are exhausted by only eight paths in the flow of the set of the autonomous differential equations. Our discussion clearly explains why five out of them are characterized by a swift transition of the system toward the attractor, while the remaining three show a more convoluted evolution, where the system follows a meta-stable equation of state at intermediate temperatures before eventually jumping to the real attractor at higher temperatures. Compared to standard Cosmology, the regions of the dEGB parameter space with $w=-\frac{1}{3}$ show a strong enhancement of the expected Gravitational Wave stochastic background produced by the primordial plasma of relativistic particles of the Standard Model. This is due to the very peculiar fact that dEGB allows to have an epoch when the energy density $ρ_{\rm rad}$ of the relativistic plasma dominates the energy of the Universe while at the same time the rate of dilution with $T$ of the total energy density is slower than what usually expected during radiation dominance. This allows to use the bound from BBN to put in dEGB a constraint $T_{\rm RH}\lesssim 10^8 - 10^9$ GeV on the reheating temperature of the Universe $T_{\rm RH}$. Such BBN bound is complementary to late-time constraints from compact binary mergers.

Submitted 24 May, 2024; originally announced May 2024.

Comments: 38 pages, 15 figures and one table

Report number: CQUeST-2024-0735

arXiv:2405.15574 [pdf, other]

Meteor: Mamba-based Traversal of Rationale for Large Language and Vision Models

Authors: Byung-Kwan Lee, Chae Won Kim, Beomchan Park, Yong Man Ro

Abstract: The rapid development of large language and vision models (LLVMs) has been driven by advances in visual instruction tuning. Recently, open-source LLVMs have curated high-quality visual instruction tuning datasets and utilized additional vision encoders or multiple computer vision models in order to narrow the performance gap with powerful closed-source LLVMs. These advancements are attributed to multifaceted information required for diverse capabilities, including fundamental image understanding, real-world knowledge about common-sense and non-object concepts (e.g., charts, diagrams, symbols, signs, and math problems), and step-by-step procedures for solving complex questions. Drawing from the multifaceted information, we present a new efficient LLVM, Mamba-based traversal of rationales (Meteor), which leverages multifaceted rationale to enhance understanding and answering capabilities. To embed lengthy rationales containing abundant information, we employ the Mamba architecture, capable of processing sequential data with linear time complexity. We introduce a new concept of traversal of rationale that facilitates efficient embedding of rationale. Subsequently, the backbone multimodal language model (MLM) is trained to generate answers with the aid of rationale. Through these steps, Meteor achieves significant improvements in vision language performances across multiple evaluation benchmarks requiring diverse capabilities, without scaling up the model size or employing additional vision encoders and computer vision models.

Submitted 27 May, 2024; v1 submitted 24 May, 2024; originally announced May 2024.

Comments: Code is available in https://github.com/ByungKwanLee/Meteor

arXiv:2405.13858 [pdf, other]

Carbon Connect: An Ecosystem for Sustainable Computing

Authors: Benjamin C. Lee, David Brooks, Arthur van Benthem, Udit Gupta, Gage Hills, Vincent Liu, Benjamin Pierce, Christopher Stewart, Emma Strubell, Gu-Yeon Wei, Adam Wierman, Yuan Yao, Minlan Yu

Abstract: Computing is at a moment of profound opportunity. Emerging applications -- such as capable artificial intelligence, immersive virtual realities, and pervasive sensor systems -- drive unprecedented demand for computer. Despite recent advances toward net zero carbon emissions, the computing industry's gross energy usage continues to rise at an alarming rate, outpacing the growth of new energy installations and renewable energy deployments. A shift towards sustainability is needed to spark a transformation in how computer systems are manufactured, allocated, and consumed. Carbon Connect envisions coordinated research thrusts that produce design and management strategies for sustainable, next-generation computer systems. These strategies must flatten and then reverse growth trajectories for computing power and carbon for society's most rapidly growing applications such as artificial intelligence and virtual spaces. We will require accurate models for carbon accounting in computing technology. For embodied carbon, we must re-think conventional design strategies -- over-provisioned monolithic servers, frequent hardware refresh cycles, custom silicon -- and adopt life-cycle design strategies that more effectively reduce, reuse and recycle hardware at scale. For operational carbon, we must not only embrace renewable energy but also design systems to use that energy more efficiently. Finally, new hardware design and management strategies must be cognizant of economic policy and regulatory landscape, aligning private initiatives with societal goals. Many of these broader goals will require computer scientists to develop deep, enduring collaborations with researchers in economics, law, and industrial ecology to spark change in broader practice.

Submitted 22 May, 2024; originally announced May 2024.

arXiv:2405.10013 [pdf, ps, other]

Charged rotating wormholes: charge without charge

Authors: Hyeong-Chan Kim, Sung-Won Kim, Bum-Hoon Lee, Wonwoo Lee

Abstract: We present a family of charged rotating wormhole solutions to the Einstein-Maxwell equations, supported by anisotropic matter fields. We first revisit the charged static cases and analyze the conditions for the solution to represent a wormhole geometry. The rotating geometry is obtained by applying the Newman-Janis algorithm to the static geometry. We show the solutions to Maxwell equations in detail. We believe that our wormhole geometry offers a geometric realization corresponding to the concept of 'charge without charge'.

Submitted 16 May, 2024; originally announced May 2024.

Comments: 24 pages, 3 figures

arXiv:2405.03212 [pdf, other]

Using magnetic dynamics to measure the spin gap in a candidate Kitaev material

Authors: Xinyi Jiang, Qingzheng Qiu, Cheng Peng, Hoyoung Jang, Wenjie Chen, Xianghong **, Li Yue, Byungjune Lee, Sang-Youn Park, Minseok Kim, Hyeong-Do Kim, Xinqiang Cai, Qizhi Li, Tao Dong, Nanlin Wang, Joshua J. Turner, Yuan Li, Yao Wang, Yingying Peng

Abstract: Materials potentially hosting Kitaev spin-liquid states are considered crucial for realizing topological quantum computing. However, the intricate nature of spin interactions within these materials complicates the precise measurement of low-energy spin excitations indicative of fractionalized excitations. Using Na$_{2}$Co$_2$TeO$_{6}$ as an example, we study these low-energy spin excitations using the time-resolved resonant elastic x-ray scattering (tr-REXS). Our observations unveil remarkably slow spin dynamics at the magnetic peak, whose recovery timescale is several nanoseconds. This timescale aligns with the extrapolated spin gap of $\sim$ 1 $μ$eV, obtained by density matrix renormalization group (DMRG) simulations in the thermodynamic limit. The consistency demonstrates the efficacy of tr-REXS in discerning low-energy spin gaps inaccessible to conventional spectroscopic techniques.

Submitted 6 May, 2024; originally announced May 2024.

Comments: 9 pages, 6 figures

arXiv:2405.00260 [pdf, other]

CREPE: Coordinate-Aware End-to-End Document Parser

Authors: Yamato Okamoto, Youngmin Baek, Geewook Kim, Ryota Nakao, DongHyun Kim, Moon Bin Yim, Seunghyun Park, Bado Lee

Abstract: In this study, we formulate an OCR-free sequence generation model for visual document understanding (VDU). Our model not only parses text from document images but also extracts the spatial coordinates of the text based on the multi-head architecture. Named Coordinate-aware End-to-end Document Parser (CREPE), our method uniquely integrates these capabilities by introducing a special token for OCR text, and token-triggered coordinate decoding. We also propose a weakly-supervised framework for cost-efficient training, requiring only parsing annotations without high-cost coordinate annotations. Our experimental evaluations demonstrate CREPE's state-of-the-art performances on document parsing tasks. Beyond that, CREPE's adaptability is further highlighted by its successful usage in other document understanding tasks such as layout analysis, document visual question answering, and so on. CREPE's abilities including OCR and semantic parsing not only mitigate error propagation issues in existing OCR-dependent methods, but also significantly enhance the functionality of sequence generation models, ushering in a new era for document understanding studies.

Submitted 30 April, 2024; originally announced May 2024.

Comments: Accepted at the International Conference on Document Analysis and Recognition (ICDAR 2024) main conference

arXiv:2404.16489 [pdf, other]

doi 10.1145/3626183.3659964

Cost-Driven Data Replication with Predictions

Authors: Tianyu Zuo, Xueyan Tang, Bu Sung Lee

Abstract: This paper studies an online replication problem for distributed data access. The goal is to dynamically create and delete data copies in a multi-server system as time passes to minimize the total storage and network cost of serving access requests. We study the problem in the emergent learning-augmented setting, assuming simple binary predictions about inter-request times at individual servers. We develop an online algorithm and prove that it is ($\frac{5+α}{3}$)-consistent (competitiveness under perfect predictions) and ($1 + \frac{1}α$)-robust (competitiveness under terrible predictions), where $α\in (0, 1]$ is a hyper-parameter representing the level of distrust in the predictions. We also study the impact of mispredictions on the competitive ratio of the proposed algorithm and adapt it to achieve a bounded robustness while retaining its consistency. We further establish a lower bound of $\frac{3}{2}$ on the consistency of any deterministic learning-augmented algorithm. Experimental evaluations are carried out to evaluate our algorithms using real data access traces.

Submitted 25 April, 2024; originally announced April 2024.

Comments: The formal version of this draft will appear in ACM SPAA'24 conference

arXiv:2404.14079 [pdf, other]

Dynamical scaling and Planckian dissipation due to heavy-fermion quantum criticality

Authors: Andreas Gleis, Seung-Sup B. Lee, Gabriel Kotliar, Jan von Delft

Abstract: We study dynamical scaling associated with a Kondo-breakdown quantum critical point (KB-QCP) of the periodic Anderson model, treated by two-site cellular dynamical mean-field theory (2CDMFT). In the quantum critical region, the staggered spin exhibits SYK-like slow dynamics and its dynamical susceptibility shows $ω/T$ scaling. We propose a scaling Ansatz that describes this behavior. It also implies Planckian dissipation for the longest-lived excitations. The current susceptibility follows the same scaling ansatz, leading to strange-metal scaling. This demonstrates that the KB-QCP described by 2CDMFT is an intrinsic (i.e., disorder-free) strange-metal fixed point. Surprisingly, the SYK-like dynamics and scaling are driven by strong vertex contributions to the susceptibilities. Our results for the optical conductivity match experimental observations on YbRh${}_2$Si${}_2$ and CeCoIn${}_5$.

Submitted 22 April, 2024; originally announced April 2024.

Comments: 16 pages, 13 figures, comments are welcome!

arXiv:2404.11017 [pdf, other]

doi 10.1117/12.2567224

SPHEREx: NASA's Near-Infrared Spectrophotometric All-Sky Survey

Authors: Brendan P. Crill, Michael Werner, Rachel Akeson, Matthew Ashby, Lindsey Bleem, James J. Bock, Sean Bryan, Jill Burnham, Joyce Byunh, Tzu-Ching Chang, Yi-Kuan Chiang, Walter Cook, Asantha Cooray, Andrew Davis, Olivier Doré, C. Darren Dowell, Gregory Dubois-Felsmann, Tim Eifler, Andreas Faisst, Salman Habib, Chen Heinrich, Katrin Heitmann, Grigory Heaton, Christopher Hirata, Viktor Hristov, et al. (29 additional authors not shown)

Abstract: SPHEREx, the Spectro-Photometer for the History of the Universe, Epoch of Reionization, and ices Explorer, is a NASA MIDEX mission planned for launch in 2024. SPHEREx will carry out the first all-sky spectral survey at wavelengths between 0.75 micron and 5 micron with spectral resolving power ~40 between 0.75 and 3.8 micron and ~120 between 3.8 and 5 micron. At the end of its two-year mission, SPHEREx will provide 0.75-to-5 micron spectra of each 6.2"x6.2" pixel on the sky - 14 billion spectra in all. This paper updates an earlier description of SPHEREx presenting changes made during the mission's Preliminary Design Phase, including a discussion of instrument integration and test and a summary of the data processing, analysis, and distribution plans.

Submitted 16 April, 2024; originally announced April 2024.

Journal ref: Proceedings Volume 11443, Space Telescopes and Instrumentation 2020: Optical, Infrared, and Millimeter Wave; 114430I (2020)

arXiv:2404.09030 [pdf, other]

Active Learning for Control-Oriented Identification of Nonlinear Systems

Authors: Bruce D. Lee, Ingvar Ziemann, George J. Pappas, Nikolai Matni

Abstract: Model-based reinforcement learning is an effective approach for controlling an unknown system. It is based on a longstanding pipeline familiar to the control community in which one performs experiments on the environment to collect a dataset, uses the resulting dataset to identify a model of the system, and finally performs control synthesis using the identified model. As interacting with the system may be costly and time consuming, targeted exploration is crucial for developing an effective control-oriented model with minimal experimentation. Motivated by this challenge, recent work has begun to study finite sample data requirements and sample efficient algorithms for the problem of optimal exploration in model-based reinforcement learning. However, existing theory and algorithms are limited to model classes which are linear in the parameters. Our work instead focuses on models with nonlinear parameter dependencies, and presents the first finite sample analysis of an active learning algorithm suitable for a general class of nonlinear dynamics. In certain settings, the excess control cost of our algorithm achieves the optimal rate, up to logarithmic factors. We validate our approach in simulation, showcasing the advantage of active, control-oriented exploration for controlling nonlinear systems.

Submitted 13 April, 2024; originally announced April 2024.

arXiv:2404.05541 [pdf, other]

Overcomplete intermediate representation of two-particle Green's functions and its relation to partial spectral functions

Authors: Selina Dirnböck, Seung-Sup B. Lee, Fabian B. Kugler, Sebastian Huber, Jan von Delft, Karsten Held, Markus Wallerberger

Abstract: Two-particle response functions are a centerpiece of both experimental and theoretical quantum many-body physics. Yet, due to their size and discontinuity structure, they are challenging to handle numerically. Recently, two advances were made to tackle this problem: first, the overcomplete intermediate representation (OIR), which provides a highly efficient compression of Green's functions in imaginary frequency, and second, partial spectral functions (PSFs), which allow for an efficient evaluation in real frequency. We show that there is a two-to-one correspondence between PSFs and OIR coefficients and exploit this fact to construct the OIR for three-or-more-particle propagators. We then use OIR to fit and compress imaginary-frequency data obtained from the numerical renormalization group (NRG), reaching a compression ratio of more than 400. Finally, we attempt to match the OIR data to partial Green's functions from NRG. Due to the overcompleteness, we achieve only qualitative agreement.

Submitted 8 April, 2024; originally announced April 2024.

Comments: 9 pages, 4 figures

arXiv:2404.05026 [pdf, ps, other]

On two-coloring bipartite uniform hypergraphs

Authors: Boyoon Lee, Theodore Molla, Brendan Nagle

Abstract: Of a given bipartite graph $G = (V, E)$, it is elementary to construct a bipartition in time $O(|E|)$. For a given $k$-graph $H = H^{(k)}$ with $k \geq 3$ fixed, Lovász proved that deciding whether $H$ is bipartite is NP-complete. Let $\mathcal{B}_n$ denote the collection of all $[n]$-vertex bipartite $k$-graphs. We construct, of a given $H \in \mathcal{B}_n$, a bipartition in time averaging $O(n^k)$ over the class $\mathcal{B}_n$. (When $k=3$, our result expedites one of Person and Schacht.) We also consider an application to the class of all $[n]$-vertex 3-graphs $H$ forbidding the Fano plane as a subhypergraph.

Submitted 7 April, 2024; originally announced April 2024.

Comments: 11 pages

arXiv:2404.01954 [pdf, other]

HyperCLOVA X Technical Report

Authors: Kang Min Yoo, Jaegeun Han, Sookyo In, Heewon Jeon, Jisu Jeong, Jaewook Kang, Hyunwook Kim, Kyung-Min Kim, Munhyong Kim, Sungju Kim, Donghyun Kwak, Hanock Kwak, Se Jung Kwon, Bado Lee, Dongsoo Lee, Gichang Lee, Jooho Lee, Baeseong Park, Seong** Shin, Joonsang Yu, Seolki Baek, Sumin Byeon, Eungsup Cho, Dooseok Choe, Jeesung Han , et al. (371 additional authors not shown)

Abstract: We introduce HyperCLOVA X, a family of large language models (LLMs) tailored to the Korean language and culture, along with competitive capabilities in English, math, and coding. HyperCLOVA X was trained on a balanced mix of Korean, English, and code data, followed by instruction-tuning with high-quality human-annotated datasets while abiding by strict safety guidelines reflecting our commitment to responsible AI. The model is evaluated across various benchmarks, including comprehensive reasoning, knowledge, commonsense, factuality, coding, math, chatting, instruction-following, and harmlessness, in both Korean and English. HyperCLOVA X exhibits strong reasoning capabilities in Korean backed by a deep understanding of the language and cultural nuances. Further analysis of the inherent bilingual nature and its extension to multilingualism highlights the model's cross-lingual proficiency and strong generalization ability to untargeted languages, including machine translation between several language pairs and cross-lingual inference tasks. We believe that HyperCLOVA X can provide helpful guidance for regions or countries in developing their sovereign LLMs.

Submitted 13 April, 2024; v1 submitted 2 April, 2024; originally announced April 2024.

Comments: 44 pages; updated authors list and fixed author names

arXiv:2404.01636 [pdf, other]

Learning to Control Camera Exposure via Reinforcement Learning

Authors: Kyunghyun Lee, Ukcheol Shin, Byeong-Uk Lee

Abstract: Adjusting camera exposure in arbitrary lighting conditions is the first step to ensure the functionality of computer vision applications. Poorly adjusted camera exposure often leads to critical failure and performance degradation. Traditional camera exposure control methods require multiple convergence steps and time-consuming processes, making them unsuitable for dynamic lighting conditions. In this paper, we propose a new camera exposure control framework that rapidly controls camera exposure while performing real-time processing by exploiting deep reinforcement learning. The proposed framework consists of four contributions: 1) a simplified training ground to simulate the real world's diverse and dynamic lighting changes, 2) flickering and image attribute-aware reward design, along with lightweight state design for real-time processing, 3) a static-to-dynamic lighting curriculum to gradually improve the agent's exposure-adjusting capability, and 4) domain randomization techniques to alleviate the limitation of the training ground and achieve seamless generalization in the wild. As a result, our proposed method rapidly reaches a desired exposure level within five steps with real-time processing (1 ms). Also, the acquired images are well-exposed and show superiority in various computer vision tasks, such as feature extraction and object detection.

Submitted 2 April, 2024; originally announced April 2024.

Comments: Accepted at CVPR 2024, *First two authors contributed equally to this work. Project page link: https://sites.google.com/view/drl-ae

arXiv:2403.19985 [pdf, other]

Stable Surface Regularization for Fast Few-Shot NeRF

Authors: Byeongin Joung, Byeong-Uk Lee, Jaesung Choe, Ukcheol Shin, Minjun Kang, Taeyeop Lee, In So Kweon, Kuk-Jin Yoon

Abstract: This paper proposes an algorithm for synthesizing novel views under a few-shot setup. The main concept is to develop a stable surface regularization technique called Annealing Signed Distance Function (ASDF), which anneals the surface in a coarse-to-fine manner to accelerate convergence. We observe that the Eikonal loss - which is a widely known geometric regularization - requires a dense training signal to shape different level-sets of the SDF, leading to low-fidelity results under few-shot training. In contrast, the proposed surface regularization successfully reconstructs scenes and produces high-fidelity geometry with stable training. Our method is further accelerated by utilizing grid representation and monocular geometric priors. Finally, the proposed approach is up to 45 times faster than existing few-shot novel view synthesis methods, and it produces comparable results on the ScanNet dataset and NeRF-Real dataset.

Submitted 29 March, 2024; originally announced March 2024.

Comments: 3DV 2024

arXiv:2403.18707 [pdf, other]

Connections between Reachability and Time Optimality

Authors: Juho Bae, Ji Hoon Bai, Byung-Yoon Lee, Jun-Yong Lee, Chang-Hun Lee

Abstract: This paper presents the concept of an equivalence relation between the set of optimal control problems. By leveraging this concept, we show that the boundary of the reachability set can be constructed by the solutions of time optimal problems. A more general equivalence theorem is also presented. The findings facilitate the use of solution structures from a certain class of optimal control problems to address problems in corresponding equivalent classes. As a byproduct, we state and prove the construction methods of the reachability sets of three-dimensional curves with prescribed curvature bound. The findings are twofold: Firstly, we prove that any boundary point of the reachability set, with the terminal direction taken into account, can be accessed via curves of H, CSC, CCC, or their respective subsegments, where H denotes a helicoidal arc, C a circular arc with maximum curvature, and S a straight segment. Secondly, we show that any boundary point of the reachability set, without considering the terminal direction, can be accessed by curves of CC, CS, or their respective subsegments. These findings extend the developments presented in the literature regarding planar curves, or Dubins car dynamics, to spatial curves in $\mathbb{R}^3$. For higher dimensions, we confirm that the problem of identifying the reachability set of curvature bounded paths subsumes the well-known Markov-Dubins problem.
These advancements in understanding the reachability of curvature bounded paths in $\mathbb{R}^3$ hold significant practical implications, particularly in the contexts of mission planning problems and time optimal guidance.

Submitted 27 March, 2024; originally announced March 2024.

Comments: Submitted to Automatica

arXiv:2403.18222 [pdf, other]

Uncertainty-Aware Deployment of Pre-trained Language-Conditioned Imitation Learning Policies

Authors: Bo Wu, Bruce D. Lee, Kostas Daniilidis, Bernadette Bucher, Nikolai Matni

Abstract: Large-scale robotic policies trained on data from diverse tasks and robotic platforms hold great promise for enabling general-purpose robots; however, reliable generalization to new environment conditions remains a major challenge. Toward addressing this challenge, we propose a novel approach for uncertainty-aware deployment of pre-trained language-conditioned imitation learning agents. Specifically, we use temperature scaling to calibrate these models and exploit the calibrated model to make uncertainty-aware decisions by aggregating the local information of candidate actions. We implement our approach in simulation using three such pre-trained models, and showcase its potential to significantly enhance task completion rates. The accompanying code is accessible at the link: https://github.com/BobWu1998/uncertainty_quant_all.git

Submitted 26 March, 2024; originally announced March 2024.

Comments: 8 pages, 7 figures

arXiv:2403.15692 [pdf, other]

Block Orthogonal Sparse Superposition Codes for $ \sf{L}^3 $ Communications: Low Error Rate, Low Latency, and Low Power Consumption

Authors: Donghwa Han, Bowhyung Lee, Min Jang, Donghun Lee, Seho Myung, Namyoon Lee

Abstract: Block orthogonal sparse superposition (BOSS) code is a class of joint coded modulation methods, which can closely achieve the finite-blocklength capacity with a low-complexity decoder at a few coding rates under Gaussian channels. However, for fading channels, the code performance degrades considerably because coded symbols experience different channel fading effects. In this paper, we put forth novel joint demodulation and decoding methods for BOSS codes under fading channels. For a fast fading channel, we present a minimum mean square error approximate maximum a posteriori (MMSE-A-MAP) algorithm for the joint demodulation and decoding when channel state information is available at the receiver (CSIR). We also propose a joint demodulation and decoding method without using CSIR for a block fading channel scenario. We refer to this as the non-coherent sphere decoding (NSD) algorithm. Simulation results demonstrate that BOSS codes with MMSE-A-MAP decoding outperform CRC-aided polar codes, while NSD decoding achieves comparable performance to quasi-maximum likelihood decoding with significantly reduced complexity. Both decoding algorithms are suitable for parallelization, satisfying low-latency constraints. Additionally, real-time simulations on a software-defined radio testbed validate the feasibility of using BOSS codes for low-power transmission.

Submitted 22 March, 2024; originally announced March 2024.

arXiv:2403.15321 [pdf, other]

doi 10.1111/cgf.15105

Visual Highlighting for Situated Brushing and Linking

Authors: Nina Doerr, Benjamin Lee, Katarina Baricova, Dieter Schmalstieg, Michael Sedlmair

Abstract: Brushing and linking is widely used for visual analytics in desktop environments. However, using this approach to link many data items between situated (e.g., a virtual screen with data) and embedded views (e.g., highlighted objects in the physical environment) is largely unexplored. To this end, we study the effectiveness of visual highlighting techniques in helping users identify and link physical referents to brushed data marks in a situated scatterplot. In an exploratory virtual reality user study (N=20), we evaluated four highlighting techniques under different physical layouts and tasks. We discuss the effectiveness of these techniques, as well as implications for the design of brushing and linking operations in situated analytics.

Submitted 11 June, 2024; v1 submitted 22 March, 2024; originally announced March 2024.

Comments: published at EuroVis 2024

arXiv:2403.15266 [pdf]

Graph neural network coarse-grain force field for the molecular crystal RDX

Authors: Brian H. Lee, James P. Larentzos, John K. Brennan, Alejandro Strachan

Abstract: Condensed-phase molecular systems organize in a wide range of distinct molecular configurations, including amorphous melt and glass as well as crystals often exhibiting polymorphism, that originate from their intricate intra- and intermolecular forces. While accurate coarse-grain (CG) models for these materials are critical to understand phenomena beyond the reach of all-atom simulations, current models cannot capture the diversity of molecular structures. We introduce a generally applicable approach to develop CG force fields for molecular crystals combining graph neural networks (GNN) and data from all-atom simulations and apply it to the high-energy density material RDX. We address the challenge of expanding the training data with relevant configurations via an iterative procedure that performs CG molecular dynamics of processes of interest and reconstructs the atomistic configurations using a pre-trained neural network decoder. The multi-site CG model uses a GNN architecture constructed to satisfy translational invariance and rotational covariance for forces. The resulting model captures both crystalline and amorphous states for a wide range of temperatures and densities.

Submitted 22 March, 2024; originally announced March 2024.

arXiv:2403.13517 [pdf, other]

doi 10.1145/3652920.3653043

Putting Our Minds Together: Iterative Exploration for Collaborative Mind Mapping

Authors: Ying Yang, Tim Dwyer, Zachari Swiecki, Benjamin Lee, Michael Wybrow, Maxime Cordeil, Teresa Wulandari, Bruce H. Thomas, Mark Billinghurst

Abstract: We delineate the development of a mind-mapping system designed concurrently for both VR and desktop platforms. Employing an iterative methodology with groups of users, we systematically examined and improved various facets of our system, including interactions, communication mechanisms and gamification elements, to streamline the mind-mapping process while augmenting situational awareness and promoting active engagement among collaborators. We also report our observational findings on these facets from this iterative design process.

Submitted 23 March, 2024; v1 submitted 20 March, 2024; originally announced March 2024.

Comments: Accepted at AHs 2024

arXiv:2403.12341 [pdf, ps, other]

Diophantine approximation by rational numbers of certain parity types

Authors: Dong Han Kim, Seul Bee Lee, Lingmin Liao

Abstract: For a given irrational number, we consider the properties of best rational approximations of given parities. There are three different kinds of rational numbers according to the parity of the numerator and denominator, say odd/odd, even/odd and odd/even rational numbers. We study algorithms to find best approximations by rational numbers of given parities and compare these algorithms with continued fraction expansions.

Submitted 18 March, 2024; originally announced March 2024.

Comments: 22 pages, 6 figures

MSC Class: 11J04; 11J70

arXiv:2403.07508 [pdf, other]

MoAI: Mixture of All Intelligence for Large Language and Vision Models

Authors: Byung-Kwan Lee, Beomchan Park, Chae Won Kim, Yong Man Ro

Abstract: The rise of large language models (LLMs) and instruction tuning has led to the current trend of instruction-tuned large language and vision models (LLVMs). This trend involves either meticulously curating numerous instruction tuning datasets tailored to specific objectives or enlarging LLVMs to manage vast amounts of vision language (VL) data. However, current LLVMs have disregarded the detailed and comprehensive real-world scene understanding available from specialized computer vision (CV) models in visual perception tasks such as segmentation, detection, scene graph generation (SGG), and optical character recognition (OCR). Instead, the existing LLVMs rely mainly on the large capacity and emergent capabilities of their LLM backbones. Therefore, we present a new LLVM, Mixture of All Intelligence (MoAI), which leverages auxiliary visual information obtained from the outputs of external segmentation, detection, SGG, and OCR models. MoAI operates through two newly introduced modules: MoAI-Compressor and MoAI-Mixer. After verbalizing the outputs of the external CV models, the MoAI-Compressor aligns and condenses them to efficiently use relevant auxiliary visual information for VL tasks. MoAI-Mixer then blends three types of intelligence, (1) visual features, (2) auxiliary features from the external CV models, and (3) language features, by utilizing the concept of Mixture of Experts.
Through this integration, MoAI significantly outperforms both open-source and closed-source LLVMs in numerous zero-shot VL tasks, particularly those related to real-world scene understanding such as object existence, positions, relations, and OCR, without enlarging the model size or curating extra visual instruction tuning datasets.

Submitted 12 March, 2024; originally announced March 2024.

Comments: Code available: https://github.com/ByungKwanLee/MoAI

arXiv:2403.05530 [pdf, other]

Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

Authors: Gemini Team, Petko Georgiev, Ving Ian Lei, Ryan Burnell, Libin Bai, Anmol Gulati, Garrett Tanzer, Damien Vincent, Zhufeng Pan, Shibo Wang, Soroosh Mariooryad, Yifan Ding, Xinyang Geng, Fred Alcober, Roy Frostig, Mark Omernick, Lexi Walker, Cosmin Paduraru, Christina Sorokin, Andrea Tacchetti, Colin Gaffney, Samira Daruki, Olcan Sercinoglu, Zach Gleicher, Juliette Love , et al. (1092 additional authors not shown)

Abstract: In this report, we introduce the Gemini 1.5 family of models, representing the next generation of highly compute-efficient multimodal models capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio. The family includes two new models: (1) an updated Gemini 1.5 Pro, which exceeds the February version on the great majority of capabilities and benchmarks; (2) Gemini 1.5 Flash, a more lightweight variant designed for efficiency with minimal regression in quality. Gemini 1.5 models achieve near-perfect recall on long-context retrieval tasks across modalities, improve the state-of-the-art in long-document QA, long-video QA and long-context ASR, and match or surpass Gemini 1.0 Ultra's state-of-the-art performance across a broad set of benchmarks. Studying the limits of Gemini 1.5's long-context ability, we find continued improvement in next-token prediction and near-perfect retrieval (>99%) up to at least 10M tokens, a generational leap over existing models such as Claude 3.0 (200k) and GPT-4 Turbo (128k).
Finally, we highlight real-world use cases, such as Gemini 1.5 collaborating with professionals on completing their tasks achieving 26 to 75% time savings across 10 different job categories, as well as surprising new capabilities of large language models at the frontier; when given a grammar manual for Kalamang, a language with fewer than 200 speakers worldwide, the model learns to translate English to Kalamang at a similar level to a person who learned from the same content.

Submitted 14 June, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

arXiv:2403.02568 [pdf, other]

Designing Born-Accessible Courses in Data Science and Visualization: Challenges and Opportunities of a Remote Curriculum Taught by Blind Instructors to Blind Students

Authors: JooYoung Seo, Sile O'Modhrain, Yilin Xia, Sanchita Kamath, Bongshin Lee, James M. Coughlan

Abstract: While recent years have seen a growing interest in accessible visualization tools and techniques for blind people, little attention is paid to the learning opportunities and teaching strategies of data science and visualization tailored for blind individuals. Whereas the former focuses on the accessibility issues of data visualization tools, the latter is concerned with the learnability of concepts and skills for data science and visualization. In this paper, we present novel approaches to teaching data science and visualization to blind students in an online setting. Taught by blind instructors, nine blind learners having a wide range of professional backgrounds participated in a two-week summer course. We describe the course design, teaching strategies, and learning outcomes. We also discuss the challenges and opportunities of teaching data science and visualization to blind students. Our work contributes to the growing body of knowledge on accessible data science and visualization education, and provides insights into the design of online courses for blind students.

Submitted 22 May, 2024; v1 submitted 4 March, 2024; originally announced March 2024.

arXiv:2403.01827 [pdf]

Analysis and Fully Memristor-based Reservoir Computing for Temporal Data Classification

Authors: Ankur Singh, Sanghyeon Choi, Gunuk Wang, Maryaradhiya Daimari, Byung-Geun Lee

Abstract: Reservoir computing (RC) offers a neuromorphic framework that is particularly effective for processing spatiotemporal signals. Known for its temporal processing prowess, RC significantly lowers training costs compared to conventional recurrent neural networks. A key component in its hardware deployment is the ability to generate dynamic reservoir states. Our research introduces a novel dual-memory RC system, integrating a short-term memory via a WOx-based memristor, capable of achieving 16 distinct states encoded over 4 bits, and a long-term memory component using a TiOx-based memristor within the readout layer. We thoroughly examine both memristor types and leverage the RC system to process temporal data sets. The performance of the proposed RC system is validated through two benchmark tasks: isolated spoken digit recognition with incomplete inputs and Mackey-Glass time series prediction. The system delivered an impressive 98.84% accuracy in digit recognition and sustained a low normalized root mean square error (NRMSE) of 0.036 in the time series prediction task, underscoring its capability. This study illuminates the adeptness of memristor-based RC systems in managing intricate temporal challenges, laying the groundwork for further innovations in neuromorphic computing.

Submitted 16 March, 2024; v1 submitted 4 March, 2024; originally announced March 2024.

Comments: 22 pages, 20 figures, Journal, Typo corrected and updated reference

arXiv:2403.00717 [pdf, other]

doi 10.1145/3613904.3642730

MAIDR: Making Statistical Visualizations Accessible with Multimodal Data Representation

Authors: JooYoung Seo, Yilin Xia, Bongshin Lee, Sean McCurry, Yu Jun Yam

Abstract: This paper investigates new data exploration experiences that enable blind users to interact with statistical data visualizations (bar plots, heat maps, box plots, and scatter plots), leveraging multimodal data representations. In addition to sonification and textual descriptions that are commonly employed by existing accessible visualizations, our MAIDR (multimodal access and interactive data representation) system incorporates two additional modalities (braille and review) that offer complementary benefits. It also provides blind users with the autonomy and control to interactively access and understand data visualizations. In a user study involving 11 blind participants, we found the MAIDR system facilitated the accurate interpretation of statistical visualizations. Participants exhibited a range of strategies in combining multiple modalities, influenced by their past interactions and experiences with data visualizations. This work accentuates the overlooked potential of combining refreshable tactile representation with other modalities and elevates the discussion on the importance of user autonomy when designing accessible data visualizations.

Submitted 1 March, 2024; originally announced March 2024.

Comments: Accepted to CHI 2024. Source code is available at https://github.com/xability/maidr

arXiv:2402.15705 [pdf, other]

A Variational Approach for Modeling High-dimensional Spatial Generalized Linear Mixed Models

Authors: Jin Hyung Lee, Ben Seiyon Lee

Abstract: Gaussian and discrete non-Gaussian spatial datasets are prevalent across many fields such as public health, ecology, geosciences, and social sciences. Bayesian spatial generalized linear mixed models (SGLMMs) are a flexible class of models designed for these data, but SGLMMs do not scale well, even to moderately large datasets. State-of-the-art scalable SGLMMs (i.e., basis representations or sparse covariance/precision matrices) require posterior sampling via Markov chain Monte Carlo (MCMC), which can be prohibitive for large datasets. While variational Bayes (VB) methods have been extended to SGLMMs, their focus has primarily been on smaller spatial datasets. In this study, we propose two computationally efficient VB approaches for modeling moderate-sized and massive (millions of locations) Gaussian and discrete non-Gaussian spatial data. Our scalable VB method embeds semi-parametric approximations for the latent spatial random processes and parallel computing offered by modern high-performance computing systems. Our approaches deliver nearly identical inferential and predictive performance compared to 'gold standard' methods but achieve computational speedups of up to 1000x. We demonstrate our approaches through a comparative numerical study as well as applications to two real-world datasets. Our proposed VB methodology enables practitioners to model millions of non-Gaussian spatial observations using a standard laptop within a short timeframe.

Submitted 17 March, 2024; v1 submitted 23 February, 2024; originally announced February 2024.

Comments: 34 Pages for the main paper, 72 pages for the supplemental information, 4 tables, 5 figures

arXiv:2402.12139 [pdf]

Thermal Radiation at the Nanoscale and Applications

Authors: Pierre-Olivier Chapuis, Bong Jae Lee, Alejandro Rodriguez

Abstract: There has been a paradigm shift from the well-known laws of thermal radiation derived over a century ago, valid only when the length scales involved are much larger than the thermal wavelength (around 10 $\mu$m at room temperature), to a general framework known as fluctuational electrodynamics that allows calculations of radiative heat transfer for arbitrary sizes and length scales. Near-field radiative heat transfer and thermal emission in systems of sub-wavelength size can exhibit super-Planckian behaviour, i.e. flux rates several orders of magnitude larger than that predicted by the Stefan-Boltzmann (or blackbody) limit. These effects can be combined with novel materials, e.g. low-dimensional or topological systems, to yield even larger modifications and spectral and/or directional selectivity. We introduce briefly the context and the main steps that have led to the current boom of ideas and applications. We then discuss the original and impactful works gathered in the associated Special Topic collection, which provides an overview of the flourishing field of nanoscale thermal radiation.

Submitted 29 November, 2023; originally announced February 2024.

Journal ref: Applied Physics Letters, 2023, 123

arXiv:2402.11776 [pdf, other]

Early feasibility of an embedded bi-directional brain-computer interface for ambulation

Authors: Jeffrey Lim, Po T. Wang, Wonjoon Sohn, Claudia Serrano-Amenos, Mina Ibrahim, Derrick Lin, Shravan Thaploo, Susan J. Shaw, Michelle Armacost, Hui Gong, Brian Lee, Darrin Lee, Richard A. Andersen, Payam Heydari, Charles Y. Liu, Zoran Nenadic, An H. Do

Abstract: Current treatments for paraplegia induced by spinal cord injury (SCI) are often limited by the severity of the injury. The accompanying loss of sensory and motor functions often results in reliance on wheelchairs, which in turn causes reduced quality of life and increased risk of co-morbidities. While brain-computer interfaces (BCIs) for ambulation have shown promise in restoring or replacing lower extremity motor functions, none so far have simultaneously implemented sensory feedback functions. Additionally, many existing BCIs for ambulation rely on bulky external hardware that makes them ill-suited for non-research settings. Here, we present an embedded bi-directional BCI (BDBCI) that restores motor function by enabling neural control over a robotic gait exoskeleton (RGE) and delivers sensory feedback via direct cortical electrical stimulation (DCES) in response to RGE leg swing. A first demonstration with this system was performed with a single subject implanted with electrocorticography electrodes, achieving an average lag-optimized cross-correlation of 0.80$\pm$0.08 between cues and decoded states over 5 runs.

Submitted 18 February, 2024; originally announced February 2024.

Comments: 5 pages, 6 figures, two tables, also submitted to IEEE EMBC 2024 conference

MSC Class: 92C55

Showing 1–50 of 1,234 results for author: Lee, B