Search | arXiv e-print repository

From Pixels to Torques with Linear Feedback

Authors: Jeong Hun Lee, Sam Schoedel, Aditya Bhardwaj, Zachary Manchester

Abstract: We demonstrate the effectiveness of simple observer-based linear feedback policies for "pixels-to-torques" control of robotic systems using only a robot-facing camera. Specifically, we show that the matrices of an image-based Luenberger observer (linear state estimator) for a "student" output-feedback policy can be learned from demonstration data provided by a "teacher" state-feedback policy via simple linear-least-squares regression. The resulting linear output-feedback controller maps directly from high-dimensional raw images to torques while being amenable to the rich set of analytical tools from linear systems theory, allowing us to enforce closed-loop stability constraints in the learning problem. We also investigate a nonlinear extension of the method via the Koopman embedding. Finally, we demonstrate the surprising effectiveness of linear pixels-to-torques policies on a cartpole system, both in simulation and on real-world hardware. The policy successfully executes both stabilizing and swing-up trajectory tracking tasks using only camera feedback while subject to model mismatch, process and sensor noise, perturbations, and occlusions.

Submitted 7 July, 2024; v1 submitted 26 June, 2024; originally announced June 2024.

Comments: Submitted to Workshop on Algorithmic Foundations of Robotics (WAFR) 2024

arXiv:2406.18505 [pdf, other]

Mental Modeling of Reinforcement Learning Agents by Language Models

Authors: Wenhao Lu, Xufeng Zhao, Josua Spisak, Jae Hee Lee, Stefan Wermter

Abstract: Can emergent language models faithfully model the intelligence of decision-making agents? Though modern language models already exhibit some reasoning ability, and theoretically can potentially express any probable distribution over tokens, it remains underexplored how the world knowledge these pretrained models have memorized can be utilized to comprehend an agent's behaviour in the physical world. This study empirically examines, for the first time, how well large language models (LLMs) can build a mental model of agents, termed agent mental modelling, by reasoning about an agent's behaviour and its effect on states from agent interaction history. This research may unveil the potential of leveraging LLMs for elucidating RL agent behaviour, addressing a key challenge in eXplainable reinforcement learning (XRL). To this end, we propose specific evaluation metrics and test them on selected RL task datasets of varying complexity, reporting findings on agent mental model establishment. Our results disclose that LLMs are not yet capable of fully mental modelling agents through inference alone without further innovations. This work thus provides new insights into the capabilities and limitations of modern LLMs.

Submitted 26 June, 2024; originally announced June 2024.

Comments: https://lukaswill.github.io/

arXiv:2406.17221 [pdf, other]

IR physics from the holographic RG flow

Authors: Chanyong Park, Jung Hun Lee

Abstract: Applying the holographic method, we investigate an RG flow and IR physics holographically when a two-dimensional conformal field theory is deformed by a relevant scalar operator. To do so, we first assume an RG flow from a UV to a new IR CFT. On the dual gravity side, such an RG flow can be described by the rolling down of a bulk scalar field from an unstable to a stable equilibrium point. After considering a simple scalar potential allowing several local extrema, we study the change of a ground state along the RG flow. We show that the entanglement entropy at an IR fixed point leads to a logarithmic divergence due to the restoration of the conformal symmetry. We study how the change of the ground state affects two-point functions. In the probe limit, we numerically calculate the change of a conformal dimension caused by the modification of the ground state. We further study the analytic form of the IR conformal dimension which is perfectly matched to the numerical result.

Submitted 8 July, 2024; v1 submitted 24 June, 2024; originally announced June 2024.

Comments: 15 pages with appendices; typos corrected, references added

arXiv:2406.13924 [pdf, other]

Impact of Internal Dust Correction on the Stellar Populations of Galaxies Estimated Using the Full Spectrum Fitting

Authors: Joon Hyeop Lee, Hyun** Jeong, Jiwon Chung, Mina Pak, Sree Oh

Abstract: Full spectrum fitting is a powerful tool for estimating the stellar populations of galaxies, but the fitting results are often significantly influenced by internal dust attenuation. To understand how the choice of the internal dust correction method affects the detailed stellar populations estimated from the full spectrum fitting, we analyze the Sydney-Australian Astronomical Observatory Multi-object Integral field spectrograph (SAMI) galaxy survey data using the Penalized PiXel-Fitting (PPXF) package. Three choices are compared: (Choice-1) using the PPXF reddening option, (Choice-2) using the multiplicative Legendre polynomial, and (Choice-3) using none of them (no dust correction). In any case, the total mean stellar populations show reasonable mass-age and mass-metallicity relations (MTR and MZR), although the correlations appear to be strongest for Choice-1 (MTR) and Choice-2 (MZR). When we compare the age-divided mean stellar populations, the MZR of young (< 10^9.5 yr ~ 3.2 Gyr) stellar components in Choice-2 is consistent with the gas-phase MZR, whereas those in the other two choices hardly are. On the other hand, the MTR of old (>= 10^9.5 yr) stellar components in Choice-1 seems to be more reasonable than that in Choice-2, because the old stellar components in low-mass galaxies tend to be relatively younger than those in massive galaxies. Based on the results, we provide empirical guidelines for choosing the optimal options for dust correction.

Submitted 19 June, 2024; originally announced June 2024.

Comments: 10 pages, 8 figures, accepted for publication in Journal of the Korean Astronomical Society

arXiv:2406.13309 [pdf, other]

The reducing sphere complexes for the $3$-sphere are connected: a proof of the Powell Conjecture

Authors: Sangbum Cho, Yuya Koda, Jung Hoon Lee, Nozomu Sekino

Abstract: The genus-$g$ Goeritz group is the group of isotopy classes of orientation-preserving self-homeomorphisms of the $3$-sphere that preserve the genus-$g$ Heegaard splitting of the $3$-sphere. In 1933, Goeritz first found a finite generating set of the genus-$2$ Goeritz group. The Powell Conjecture offers four specific elements that suffice to generate the higher genus Goeritz groups. We show that the reducing sphere complexes for the higher genus splittings of the $3$-sphere are all connected, and consequently the conjecture is true.

Submitted 6 July, 2024; v1 submitted 19 June, 2024; originally announced June 2024.

Comments: Version 2: this is an updated version, in fact an extension or generalization of the previous version, which proved that the Powell Conjecture is true in genus $4$. In this version, the conjecture is proved to be true in any genus greater than or equal to $3$. One more author has joined. The title has changed. 10 pages, 3 figures

MSC Class: 57K30; 57K20; 20F05

arXiv:2406.09988 [pdf, other]

Details Make a Difference: Object State-Sensitive Neurorobotic Task Planning

Authors: Xiaowen Sun, Xufeng Zhao, Jae Hee Lee, Wenhao Lu, Matthias Kerzel, Stefan Wermter

Abstract: The state of an object reflects its current status or condition and is important for a robot's task planning and manipulation. However, detecting an object's state and generating a state-sensitive plan for robots is challenging. Recently, pre-trained Large Language Models (LLMs) and Vision-Language Models (VLMs) have shown impressive capabilities in generating plans. However, to the best of our knowledge, there is hardly any investigation on whether LLMs or VLMs can also generate object state-sensitive plans. To study this, we introduce an Object State-Sensitive Agent (OSSA), a task-planning agent empowered by pre-trained neural networks. We propose two methods for OSSA: (i) a modular model consisting of a pre-trained vision processing module (dense captioning model, DCM) and a natural language processing model (LLM), and (ii) a monolithic model consisting only of a VLM. To quantitatively evaluate the performances of the two methods, we use tabletop scenarios where the task is to clear the table. We contribute a multimodal benchmark dataset that takes object states into consideration. Our results show that both methods can be used for object state-sensitive tasks, but the monolithic approach outperforms the modular approach. The code for OSSA is available at https://github.com/Xiao-wen-Sun/OSSA

Submitted 14 June, 2024; originally announced June 2024.

arXiv:2406.07783 [pdf, other]

One-sided H alpha Excess before the First Pericentre Passage in Galaxy Pairs

Authors: Jiwon Chung, Joon Hyeop Lee, Hyun** Jeong

Abstract: We present novel insights into the interplay between tidal forces and star formation in interacting galaxies before their first pericentre passage. We investigate seven close pair galaxies devoid of visible tidal disturbances, such as tails, bridges, and shells. Using integral field spectroscopy (IFS) data of extended Calar Alto Legacy Integral Field Area (eCALIFA), we unveil a previously unreported phenomenon: H alpha emission, a proxy for recent star formation, exhibits a significant enhancement in regions facing the companion galaxy, reaching up to 1.9 times higher flux compared to opposite directions. Notably, fainter companions within pairs display a more pronounced one-sided H alpha excess, exceeding the typical range observed in isolated galaxies at the 2 sigma confidence level. Furthermore, the observed H alpha excess in fainter companion galaxies exhibits a heightened prominence at the outer galactic regions. These findings suggest that tidal forces generated before the first pericentre passage exert a stronger influence on fainter galaxies due to their shallower potential wells by their brighter companions. This unveils a more intricate interplay between gravitational interactions and star formation history within interacting galaxies than previously understood, highlighting the need to further explore the early stages of interaction in galaxy evolution.

Submitted 11 June, 2024; originally announced June 2024.

Comments: 7 pages, 4 figures, Accepted for publication in MNRAS Letters

arXiv:2406.02989 [pdf, other]

Learning Semantic Traversability with Egocentric Video and Automated Annotation Strategy

Authors: Yunho Kim, Jeong Hyun Lee, Choongin Lee, Juhyeok Mun, Donghoon Youm, Jeongsoo Park, Jemin Hwangbo

Abstract: For reliable autonomous robot navigation in urban settings, the robot must have the ability to identify semantically traversable terrains in the image based on the semantic understanding of the scene. This reasoning ability is based on semantic traversability, which is frequently achieved using semantic segmentation models fine-tuned on the testing domain. This fine-tuning process often involves manual data collection with the target robot and annotation by human labelers which is prohibitively expensive and unscalable. In this work, we present an effective methodology for training a semantic traversability estimator using egocentric videos and an automated annotation process. Egocentric videos are collected from a camera mounted on a pedestrian's chest. The dataset for training the semantic traversability estimator is then automatically generated by extracting semantically traversable regions in each video frame using a recent foundation model in image segmentation and its prompting technique. Extensive experiments with videos taken across several countries and cities, covering diverse urban scenarios, demonstrate the high scalability and generalizability of the proposed annotation method. Furthermore, performance analysis and real-world deployment for autonomous robot navigation showcase that the trained semantic traversability estimator is highly accurate, able to handle diverse camera viewpoints, computationally light, and real-world applicable. The summary video is available at https://youtu.be/EUVoH-wA-lA.

Submitted 5 June, 2024; originally announced June 2024.

Comments: Submitted to IEEE Robotics and Automation Letters (RA-L), First two authors contributed equally

arXiv:2406.01007 [pdf, other]

Measurement of Electron Antineutrino Oscillation Amplitude and Frequency via Neutron Capture on Hydrogen at Daya Bay

Authors: Daya Bay collaboration, F. P. An, W. D. Bai, A. B. Balantekin, M. Bishai, S. Blyth, G. F. Cao, J. Cao, J. F. Chang, Y. Chang, H. S. Chen, H. Y. Chen, S. M. Chen, Y. Chen, Y. X. Chen, Z. Y. Chen, J. Cheng, J. Cheng, Y. -C. Cheng, Z. K. Cheng, J. J. Cherwinka, M. C. Chu, J. P. Cummings, O. Dalager, F. S. Deng, et al. (177 additional authors not shown)

Abstract: This Letter reports the first measurement of the oscillation amplitude and frequency of reactor antineutrinos at Daya Bay via neutron capture on hydrogen using 1958 days of data. With over 3.6 million signal candidates, an optimized candidate selection, improved treatment of backgrounds and efficiencies, refined energy calibration, and an energy response model for the capture-on-hydrogen sensitive region, the relative $\overline{ν}_{e}$ rates and energy spectra variation among the near and far detectors gives $\mathrm{sin}^22θ_{13} = 0.0759_{-0.0049}^{+0.0050}$ and $Δm^2_{32} = (2.72^{+0.14}_{-0.15})\times10^{-3}$ eV$^2$ assuming the normal neutrino mass ordering, and $Δm^2_{32} = (-2.83^{+0.15}_{-0.14})\times10^{-3}$ eV$^2$ for the inverted neutrino mass ordering. This estimate of $\sin^2 2θ_{13}$ is consistent with and essentially independent from the one obtained using the capture-on-gadolinium sample at Daya Bay. The combination of these two results yields $\mathrm{sin}^22θ_{13}= 0.0833\pm0.0022$, which represents an 8% relative improvement in precision regarding the Daya Bay full 3158-day capture-on-gadolinium result.

Submitted 3 June, 2024; originally announced June 2024.

arXiv:2406.00669 [pdf]

Multi-technology co-optimization approach for sustainable hydrogen and electricity supply chains considering variability and demand scale

Authors: Sunwoo Kim, Joungho Park, Jay H. Lee

Abstract: In the pursuit of a carbon-neutral future, hydrogen emerges as a pivotal element, serving as a carbon-free energy carrier and feedstock. As efforts to decarbonize sectors such as heating and transportation intensify, understanding and navigating through the dynamics of hydrogen demand expansion becomes critical. Transitioning to a hydrogen economy is complicated by varying regional scales and types of hydrogen demand, with forecasts indicating a rise in variable demand that calls for diverse production technologies. Currently, steam methane reforming is prevalent, but its significant carbon emissions make a shift to cleaner alternatives like blue and green hydrogen imperative. Each production method possesses distinct characteristics, necessitating a thorough exploration and co-optimization with electricity supply chains as well as carbon capture, utilization, and storage systems. Our study fills existing research gaps by introducing a superstructure optimization framework that accommodates various demand scenarios and technologies. Through case studies in California, we underscore the critical role of demand profiles in shaping the optimal configurations and economics of supply chains and emphasize the need for diversified portfolios and co-optimization to facilitate sustainable energy transitions.

Submitted 2 June, 2024; originally announced June 2024.

arXiv:2406.00665

Integrating solid direct air capture systems with green hydrogen production: Economic synergy of sector coupling

Authors: Sunwoo Kim, Joungho Park, Jay H. Lee

Abstract: In the global pursuit of sustainable energy solutions, mitigating carbon dioxide (CO2) emissions stands as a pivotal challenge. With escalating atmospheric CO2 levels, the imperative of direct air capture (DAC) systems becomes evident. Simultaneously, green hydrogen (GH) emerges as a pivotal medium for renewable energy. Nevertheless, the substantial expenses associated with these technologies impede widespread adoption, primarily due to significant installation costs and underutilized operational advantages when deployed independently. Integration through sector coupling enhances system efficiency and sustainability, while shared power sources and energy storage devices offer additional economic benefits. In this study, we assess the economic viability of polymer electrolyte membrane electrolyzers versus alkaline electrolyzers within the context of sector coupling. Our findings indicate that combining GH production with solid DAC systems yields significant economic advantages, with approximately a 10% improvement for PEM electrolyzers and a 20% enhancement for alkaline electrolyzers. These results highlight a substantial opportunity to improve the efficiency and economic viability of renewable energy and green hydrogen initiatives, thereby facilitating the broader adoption of cleaner technologies.

Submitted 28 June, 2024; v1 submitted 2 June, 2024; originally announced June 2024.

Comments: Some of the results of our previous preprint paper are flawed, and we are withdrawing them to prevent the spread of incorrect knowledge

arXiv:2405.20605 [pdf, other]

Searching for internal symbols underlying deep learning

Authors: Jung H. Lee, Sujith Vijayan

Abstract: Deep learning (DL) enables deep neural networks (DNNs) to automatically learn complex tasks or rules from given examples without instructions or guiding principles. As we do not engineer DNNs' functions, it is extremely difficult to diagnose their decisions, and multiple lines of study have sought to explain the principles of DNN/DL operations. Notably, one line of study suggests that DNNs may learn concepts, that is, high-level features recognizable to humans. Thus, we hypothesized that DNNs develop abstract codes, not necessarily recognizable to humans, which can be used to augment DNNs' decision-making. To address this hypothesis, we combined foundation segmentation models and unsupervised learning to extract internal codes and identify potential use of abstract codes to make DL's decision-making more reliable and safer.

Submitted 30 May, 2024; originally announced May 2024.

Comments: 10 pages, 7 figures, 3 tables and Appendix

arXiv:2405.11563 [pdf, other]

User-Centric Association and Feedback Bit Allocation for FDD Cell-Free Massive MIMO

Authors: Kwangjae Lee, Jung Hoon Lee, Wan Choi

Abstract: In this paper, we introduce a novel approach to user-centric association and feedback bit allocation for the downlink of a cell-free massive MIMO (CF-mMIMO) system, operating under limited feedback constraints. In CF-mMIMO systems employing frequency division duplexing, each access point (AP) relies on channel information provided by its associated user equipments (UEs) for beamforming design. Since the uplink control channel is typically shared among UEs, we take account of each AP's total feedback budget, which is distributed among its associated UEs. By employing the Saleh-Valenzuela multi-resolvable path channel model with different average path gains, we first identify necessary feedback information for each UE, along with an appropriate codebook structure. This structure facilitates adaptive quantization of multiple paths based on their dominance. We then formulate a joint optimization problem addressing user-centric UE-AP association and feedback bit allocation. To address this challenge, we analyze the impact of feedback bit allocation and derive our proposed scheme from the solution of an alternative optimization problem aimed at devising long-term policies, explicitly considering the effects of feedback bit allocation. Numerical results show that our proposed scheme effectively enhances the performance of conventional approaches in CF-mMIMO systems.

Submitted 19 May, 2024; originally announced May 2024.

arXiv:2405.03931 [pdf, ps, other]

Incorporating changeable attitudes toward vaccination into an SIR infectious disease model

Authors: Yi Jiang, Kristin M. Kurianski, Jane H. Lee, Yan** Ma, Daniel Cicala, Glenn Ledder

Abstract: We develop a mechanistic model that classifies individuals both in terms of epidemiological status (SIR) and vaccination attitude (willing or unwilling), with the goal of discovering how disease spread is influenced by changing opinions about vaccination. Analysis of the model identifies existence and stability criteria for both disease-free and endemic disease equilibria. The analytical results, supported by numerical simulations, show that attitude changes induced by disease prevalence can destabilize endemic disease equilibria, resulting in limit cycles.

Submitted 6 May, 2024; originally announced May 2024.

Comments: 30 pages, 3 tables, 10 figures

MSC Class: 37N25 (Primary) 92D30 (Secondary)

arXiv:2404.01954 [pdf, other]

HyperCLOVA X Technical Report

Authors: Kang Min Yoo, Jaegeun Han, Sookyo In, Heewon Jeon, Jisu Jeong, Jaewook Kang, Hyunwook Kim, Kyung-Min Kim, Munhyong Kim, Sungju Kim, Donghyun Kwak, Hanock Kwak, Se Jung Kwon, Bado Lee, Dongsoo Lee, Gichang Lee, Jooho Lee, Baeseong Park, Seong** Shin, Joonsang Yu, Seolki Baek, Sumin Byeon, Eungsup Cho, Dooseok Choe, Jeesung Han , et al. (371 additional authors not shown)

Abstract: We introduce HyperCLOVA X, a family of large language models (LLMs) tailored to the Korean language and culture, along with competitive capabilities in English, math, and coding. HyperCLOVA X was trained on a balanced mix of Korean, English, and code data, followed by instruction-tuning with high-quality human-annotated datasets while abiding by strict safety guidelines reflecting our commitment to responsible AI. The model is evaluated across various benchmarks, including comprehensive reasoning, knowledge, commonsense, factuality, coding, math, chatting, instruction-following, and harmlessness, in both Korean and English. HyperCLOVA X exhibits strong reasoning capabilities in Korean backed by a deep understanding of the language and cultural nuances. Further analysis of the inherent bilingual nature and its extension to multilingualism highlights the model's cross-lingual proficiency and strong generalization ability to untargeted languages, including machine translation between several language pairs and cross-lingual inference tasks. We believe that HyperCLOVA X can provide helpful guidance for regions or countries in developing their sovereign LLMs.

Submitted 13 April, 2024; v1 submitted 2 April, 2024; originally announced April 2024.

Comments: 44 pages; updated authors list and fixed author names

arXiv:2404.01687 [pdf, other]

Search for a sub-eV sterile neutrino using Daya Bay's full dataset

Authors: F. P. An, W. D. Bai, A. B. Balantekin, M. Bishai, S. Blyth, G. F. Cao, J. Cao, J. F. Chang, Y. Chang, H. S. Chen, H. Y. Chen, S. M. Chen, Y. Chen, Y. X. Chen, Z. Y. Chen, J. Cheng, Y. C. Cheng, Z. K. Cheng, J. J. Cherwinka, M. C. Chu, J. P. Cummings, O. Dalager, F. S. Deng, X. Y. Ding, Y. Y. Ding, et al. (176 additional authors not shown)

Abstract: This Letter presents results of a search for the mixing of a sub-eV sterile neutrino with three active neutrinos based on the full data sample of the Daya Bay Reactor Neutrino Experiment, collected during 3158 days of detector operation, which contains $5.55 \times 10^{6}$ reactor $\overline{ν}_{e}$ candidates identified as inverse beta-decay interactions followed by neutron-capture on gadolinium. The analysis benefits from a doubling of the statistics of our previous result and from improvements of several important systematic uncertainties. No significant oscillation due to mixing of a sub-eV sterile neutrino with active neutrinos was found. Exclusion limits are set by both Feldman-Cousins and CLs methods. Light sterile neutrino mixing with $\sin^2 2θ_{14} \gtrsim 0.01$ can be excluded at 95% confidence level in the region of $0.01$ eV$^2 \lesssim |Δm^{2}_{41}| \lesssim 0.1$ eV$^2$. This result represents the world-leading constraints in the region of $2 \times 10^{-4}$ eV$^2 \lesssim |Δm^{2}_{41}| \lesssim 0.2$ eV$^2$.

Submitted 15 April, 2024; v1 submitted 2 April, 2024; originally announced April 2024.

Comments: 7 pages, 4 figures, 1 table

arXiv:2403.16011 [pdf, other]

Uncovering the Ghostly Remains of an Extremely Diffuse Satellite in the Remote Halo of NGC 253

Authors: Sakurako Okamoto, Annette M. N. Ferguson, Nobuo Arimoto, Itsuki Ogami, Rokas Zemaitis, Masashi Chiba, Mike J. Irwin, In Sung Jang, ** Koda, Yutaka Komiyama, Myung Gyoon Lee, Jeong Hwan Lee, Michael Rich, Masayuki Tanaka, Mikito Tanaka

Abstract: We present the discovery of NGC253-SNFC-dw1, a new satellite galaxy in the remote stellar halo of the Sculptor Group spiral, NGC 253. The system was revealed using deep resolved star photometry obtained as part of the Subaru Near-Field Cosmology Survey that uses the Hyper Suprime-Cam on the Subaru Telescope. Although rather luminous ($\rm{M_{V}} = -11.7 \pm 0.2$) and massive ($M_* \sim 1.25\times 10^7~\rm{M}_{\odot}$), the system is one of the most diffuse satellites yet known, with a half-light radius of $\rm{R_{h}} = 3.37 \pm 0.36$ kpc and an average surface brightness of $\sim 30.1$ mag arcmin$^{-2}$ within the $\rm{R_{h}}$. The colour-magnitude diagram shows a dominant old ($\sim 10$ Gyr) and metal-poor ($\rm{[M/H]}=-1.5 \pm 0.1$ dex) stellar population, as well as several candidate thermally-pulsing asymptotic giant branch stars. The distribution of red giant branch stars is asymmetrical and displays two elongated tidal extensions pointing towards NGC 253, suggestive of a highly disrupted system being observed at apocenter. NGC253-SNFC-dw1 has a size comparable to that of the puzzling Local Group dwarfs Andromeda XIX and Antlia 2 but is two magnitudes brighter. While unambiguous evidence of tidal disruption in these systems has not yet been demonstrated, the morphology of NGC253-SNFC-dw1 clearly shows that this is a natural path to produce such diffuse and extended galaxies. The surprising discovery of this system in a previously well-searched region of the sky emphasizes the importance of surface brightness limiting depth in satellite searches.

Submitted 26 April, 2024; v1 submitted 24 March, 2024; originally announced March 2024.

Comments: 10 pages, 4 figures, 1 table. Accepted for publication in ApJL

arXiv:2403.08244 [pdf]

Evaluating the Efficiency and Cost-effectiveness of RPB-based CO2 Capture: A Comprehensive Approach to Simultaneous Design and Operating Condition Optimization

Authors: Howoun Jung, Noh** Park, Jay H. Lee

Abstract: Despite ongoing global initiatives to reduce CO2 emissions, implementing large-scale CO2 capture using amine solvents is fraught with economic uncertainties and technical hurdles. The Rotating Packed Bed (RPB) presents a promising alternative to traditional packed towers, offering compact design and adaptability. Nonetheless, scaling RPB processes to an industrial level is challenging due to the nascent nature of its application. The complexity of designing RPB units, setting operating conditions, and evaluating process performance adds layers of difficulty to the adoption of RPB-based systems in industries. This study introduces an optimization-driven design and evaluation for CO2 capture processes utilizing RPB columns. By employing detailed process simulation, we aim to concurrently optimize unit design and operating parameters, underscoring its advantage over conventional sequential approaches. Our process design method integrates heuristic design recommendations as constraints, resulting in 9.4% to 12.7% cost savings compared to conventional sequential design methods. Furthermore, our comprehensive process-level analysis reveals that using concentrated MEA solvent can yield total cost savings of 13.4% to 25.0% compared to the standard 30wt% MEA solvent. Additionally, the RPB unit can deliver an 8.5 to 23.6 times reduction in packing volume. While the commercial-scale feasibility of RPB technology has been established, the advancement of this field hinges on acquiring a broader and more robust dataset from commercial-scale implementations. Employing strategic methods like modularization could significantly reduce the entry barriers for CO2 capture projects, facilitating their broader adoption and implementation.

Submitted 13 March, 2024; originally announced March 2024.

Comments: 44 pages, 11 figures, 6 tables

arXiv:2403.05136 [pdf, other]

DeRO: Dead Reckoning Based on Radar Odometry With Accelerometers Aided for Robot Localization

Authors: Hoang Viet Do, Yong Hun Kim, Joo Han Lee, Min Ho Lee, ** Woo Song

Abstract: In this paper, we propose a radar odometry structure that directly utilizes radar velocity measurements for dead reckoning while maintaining its ability to update estimations within the Kalman filter framework. Specifically, we employ the Doppler velocity obtained by a 4D Frequency Modulated Continuous Wave (FMCW) radar in conjunction with gyroscope data to calculate poses. This approach helps mitigate high drift resulting from accelerometer biases and double integration. Instead, tilt angles measured by gravitational force are utilized alongside relative distance measurements from radar scan matching for the filter's measurement update. Additionally, to further enhance the system's accuracy, we estimate and compensate for the radar velocity scale factor. The performance of the proposed method is verified through five real-world open-source datasets. The results demonstrate that our approach reduces position error by 47% and rotation error by 52% on average compared to the state-of-the-art radar-inertial fusion method in terms of absolute trajectory error.

Submitted 8 March, 2024; originally announced March 2024.

Comments: 9 pages, 5 figures, 1 table, conference

ACM Class: I.2.9

arXiv:2403.01415 [pdf]

Phonon-pair-driven Ferroelectricity Causes Costless Domain-walls and Bulk-boundary Duality

Authors: Hyun-Jae Lee, Kyoung-June Go, Pawan Kumar, Chang Hoon Kim, Yungyeom Kim, Kyoungjun Lee, Takao Shimizu, Seung Chul Chae, Hosub **, Minseong Lee, Umesh Waghmare, Si-Young Choi, Jun Hee Lee

Abstract: Ferroelectric domain walls, recognized as distinct from the bulk in terms of symmetry, structure, and electronic properties, host exotic phenomena including conductive walls, ferroelectric vortices, novel topologies, and negative capacitance. Contrary to conventional understanding, our study reveals that the structure of domain walls in HfO2 closely resembles its bulk. First, our first-principles simulations unveil that the robust ferroelectricity is supported by bosonic pairing of all the anionic phonons in bulk HfO2. Strikingly, the paired phonons strongly bond with each other and successfully reach the center of the domain wall without losing their integrity and produce bulk-like domain walls. We then confirmed preservation of the bulk phonon displacements and consequently full revival of the bulk structure at domain walls via aberration-corrected STEM. The newly found duality between the bulk and the domain wall sheds light on previously enigmatic properties such as zero-energy domain walls, perfect Ising-type polar ordering, and exceptionally robust ferroelectricity at the sub-nm scales. The phonon-pairing discovered here is robust against physical boundaries such as domain walls and enables zero momentum and zero-energy cost local ferroelectric switching. This phenomenon demonstrated in Si-compatible ferroelectrics provides a novel technological platform where data storage on domain walls is as feasible as that within the domains, thereby expanding the potential for high-density data storage and advanced ferroelectric applications.

Submitted 3 March, 2024; originally announced March 2024.

Comments: 24 pages, 4 figures

arXiv:2402.18361 [pdf, other]

Why Do Animals Need Shaping? A Theory of Task Composition and Curriculum Learning

Authors: ** Hwa Lee, Stefano Sarao Mannelli, Andrew Saxe

Abstract: Diverse studies in systems neuroscience begin with extended periods of curriculum training known as `shaping' procedures. These involve progressively studying component parts of more complex tasks, and can make the difference between learning a task quickly, slowly or not at all. Despite the importance of shaping to the acquisition of complex tasks, there is as yet no theory that can help guide the design of shaping procedures, or more fundamentally, provide insight into its key role in learning. Modern deep reinforcement learning systems might implicitly learn compositional primitives within their multilayer policy networks. Inspired by these models, we propose and analyse a model of deep policy gradient learning of simple compositional reinforcement learning tasks. Using the tools of statistical physics, we solve for exact learning dynamics and characterise different learning strategies including primitives pre-training, in which task primitives are studied individually before learning compositional tasks. We find a complex interplay between task complexity and the efficacy of shaping strategies. Overall, our theory provides an analytical understanding of the benefits of shaping in a class of compositional tasks and a quantitative account of how training protocols can disclose useful task primitives, ultimately yielding faster and more robust learning.

Submitted 12 June, 2024; v1 submitted 28 February, 2024; originally announced February 2024.

Comments: Accepted to ICML 2024. 5 figures, 9 pages and Appendix

arXiv:2402.17517 [pdf, other]

Label-Noise Robust Diffusion Models

Authors: Byeonghu Na, Yeongmin Kim, HeeSun Bae, Jung Hyun Lee, Se Jung Kwon, Wanmo Kang, Il-Chul Moon

Abstract: Conditional diffusion models have shown remarkable performance in various generative tasks, but training them requires large-scale datasets that often contain noise in conditional inputs, a.k.a. noisy labels. This noise leads to condition mismatch and quality degradation of generated data. This paper proposes Transition-aware weighted Denoising Score Matching (TDSM) for training conditional diffusion models with noisy labels, which is the first study in the line of diffusion models. The TDSM objective contains a weighted sum of score networks, incorporating instance-wise and time-dependent label transition probabilities. We introduce a transition-aware weight estimator, which leverages a time-dependent noisy-label classifier distinctively customized to the diffusion process. Through experiments across various datasets and noisy label settings, TDSM improves the quality of generated samples aligned with given conditions. Furthermore, our method improves generation performance even on prevalent benchmark datasets, which implies the potential noisy labels and their risk of generative model learning. Finally, we show the improved performance of TDSM on top of conventional noisy label corrections, which empirically proves its contribution as a part of label-noise robust generative models. Our code is available at: https://github.com/byeonghu-na/tdsm.

Submitted 27 February, 2024; originally announced February 2024.

Comments: Accepted at ICLR 2024

arXiv:2402.15760 [pdf]

Tunable incommensurability and spontaneous symmetry breaking in the reconstructed moiré-of-moiré lattices

Authors: Daesung Park, Changwon Park, Eunjung Ko, Kunihiro Yananose, Rebecca Engelke, Xi Zhang, Konstantin Davydov, Matthew Green, Sang Hwa Park, Jae Heon Lee, Kenji Watanabe, Takashi Taniguchi, Sang Mo Yang, Ke Wang, Philip Kim, Young-Woo Son, Hyobin Yoo

Abstract: Imposing incommensurable periodicity on the periodic atomic lattice can lead to complex structural phases consisting of locally periodic structure bounded by topological defects. Twisted trilayer graphene (TTG) is an ideal material platform to study the interplay between different atomic periodicities, which can be tuned by twist angles between the layers, leading to moiré-of-moiré lattices. Interlayer and intralayer interactions between two interfaces in TTG transform this moiré-of-moiré lattice into an intricate network of domain structures at small twist angles, which can harbor exotic electronic behaviors. Here we report a complete structural phase diagram of TTG with atomic scale lattice reconstruction. Using transmission electron microscopy combined with a new interatomic potential simulation, we show that a cornucopia of large-scale moiré lattices, ranging from triangular, kagome, and a corner-shared hexagram-shaped domain pattern, are present. For small twist angles below 0.1°, all domains are bounded by a network of two-dimensional domain wall lattices. In particular, in the limit of small twist angles, the competition between interlayer stacking energy and the formation of discommensurate domain walls leads to unique spontaneous symmetry breaking structures with nematic orders, suggesting the pivotal role of long-range interactions across entire layers. The diverse tessellation of distinct domains, whose topological network can be tuned by the adjustment of the twist angles, establishes TTG as a platform for exploring the interplay between emerging quantum properties and controllable nontrivial lattices.

Submitted 24 February, 2024; originally announced February 2024.

arXiv:2402.15705 [pdf, other]

A Variational Approach for Modeling High-dimensional Spatial Generalized Linear Mixed Models

Authors: ** Hyung Lee, Ben Seiyon Lee

Abstract: Gaussian and discrete non-Gaussian spatial datasets are prevalent across many fields such as public health, ecology, geosciences, and social sciences. Bayesian spatial generalized linear mixed models (SGLMMs) are a flexible class of models designed for these data, but SGLMMs do not scale well, even to moderately large datasets. State-of-the-art scalable SGLMMs (i.e., basis representations or sparse covariance/precision matrices) require posterior sampling via Markov chain Monte Carlo (MCMC), which can be prohibitive for large datasets. While variational Bayes (VB) methods have been extended to SGLMMs, their focus has primarily been on smaller spatial datasets. In this study, we propose two computationally efficient VB approaches for modeling moderate-sized and massive (millions of locations) Gaussian and discrete non-Gaussian spatial data. Our scalable VB method embeds semi-parametric approximations for the latent spatial random processes and parallel computing offered by modern high-performance computing systems. Our approaches deliver nearly identical inferential and predictive performance compared to 'gold standard' methods but achieve computational speedups of up to 1000x. We demonstrate our approaches through a comparative numerical study as well as applications to two real-world datasets. Our proposed VB methodology enables practitioners to model millions of non-Gaussian spatial observations using a standard laptop within a short timeframe.

Submitted 17 March, 2024; v1 submitted 23 February, 2024; originally announced February 2024.

Comments: 34 Pages for the main paper, 72 pages for the supplemental information, 4 tables, 5 figures

arXiv:2402.07438 [pdf, ps, other]

The Powell Conjecture for the genus-three Heegaard splitting of the $3$-sphere

Authors: Sangbum Cho, Yuya Koda, Jung Hoon Lee

Abstract: The Powell Conjecture states that four specific elements suffice to generate the Goeritz group of the Heegaard splitting of the $3$-sphere. We present an alternative proof of the Powell Conjecture when the genus of the splitting is $3$, and suggest a strategy for the case of higher genera.

Submitted 12 February, 2024; originally announced February 2024.

Comments: 9 pages, 3 figures

MSC Class: 57K30

arXiv:2402.06185 [pdf, other]

Development and validation of an artificial intelligence model to accurately predict spinopelvic parameters

Authors: Edward S. Harake, Joseph R. Linzey, Cheng Jiang, Rushikesh S. Joshi, Mark M. Zaki, Jaes C. Jones, Siri S. Khalsa, John H. Lee, Zachary Wilseck, Jacob R. Joseph, Todd C. Hollon, Paul Park

Abstract: Objective. Achieving appropriate spinopelvic alignment has been shown to be associated with improved clinical symptoms. However, measurement of spinopelvic radiographic parameters is time-intensive and interobserver reliability is a concern. Automated measurement tools have the promise of rapid and consistent measurements, but existing tools are still limited by some degree of manual user-entry requirements. This study presents a novel artificial intelligence (AI) tool called SpinePose that automatically predicts spinopelvic parameters with high accuracy without the need for manual entry. Methods. SpinePose was trained and validated on 761 sagittal whole-spine X-rays to predict sagittal vertical axis (SVA), pelvic tilt (PT), pelvic incidence (PI), sacral slope (SS), lumbar lordosis (LL), T1-pelvic angle (T1PA), and L1-pelvic angle (L1PA). A separate test set of 40 X-rays was labeled by 4 reviewers, including fellowship-trained spine surgeons and a fellowship-trained radiologist with neuroradiology subspecialty certification. Median errors relative to the most senior reviewer were calculated to determine model accuracy on test images. Intraclass correlation coefficients (ICC) were used to assess inter-rater reliability. Results. SpinePose exhibited the following median (interquartile range) parameter errors: SVA: 2.2(2.3)mm, p=0.93; PT: 1.3(1.2)°, p=0.48; SS: 1.7(2.2)°, p=0.64; PI: 2.2(2.1)°, p=0.24; LL: 2.6(4.0)°, p=0.89; T1PA: 1.1(0.9)°, p=0.42; and L1PA: 1.4(1.6)°, p=0.49. Model predictions also exhibited excellent reliability at all parameters (ICC: 0.91-1.0). Conclusions. SpinePose accurately predicted spinopelvic parameters with excellent reliability comparable to fellowship-trained spine surgeons and neuroradiologists. Utilization of predictive AI tools in spinal imaging can substantially aid in patient selection and surgical planning.

Submitted 8 February, 2024; originally announced February 2024.

Comments: 10 pages, 5 figures, to appear in Journal of Neurosurgery: Spine

arXiv:2402.05383 [pdf, other]

First measurement of the yield of $^8$He isotopes produced in liquid scintillator by cosmic-ray muons at Daya Bay

Authors: Daya Bay Collaboration, F. P. An, W. D. Bai, A. B. Balantekin, M. Bishai, S. Blyth, G. F. Cao, J. Cao, J. F. Chang, Y. Chang, H. S. Chen, H. Y. Chen, S. M. Chen, Y. Chen, Y. X. Chen, Z. Y. Chen, J. Cheng, Y. C. Cheng, Z. K. Cheng, J. J. Cherwinka, M. C. Chu, J. P. Cummings, O. Dalager, F. S. Deng, X. Y. Ding, et al. (177 additional authors not shown)

Abstract: Daya Bay presents the first measurement of cosmogenic $^8$He isotope production in liquid scintillator, using an innovative method for identifying cascade decays of $^8$He and its child isotope, $^8$Li. We also measure the production yield of $^9$Li isotopes using well-established methodology. The results, in units of 10$^{-8}μ^{-1}$g$^{-1}$cm$^{2}$, are 0.307$\pm$0.042, 0.341$\pm$0.040, and 0.546$\pm$0.076 for $^8$He, and 6.73$\pm$0.73, 6.75$\pm$0.70, and 13.74$\pm$0.82 for $^9$Li at average muon energies of 63.9~GeV, 64.7~GeV, and 143.0~GeV, respectively. The measured production rate of $^8$He isotopes is more than an order of magnitude lower than any other measurement of cosmogenic isotope production. It replaces the results of previous attempts to determine the ratio of $^8$He to $^9$Li production that yielded a wide range of limits from 0 to 30\%. The results provide future liquid-scintillator-based experiments with improved ability to predict cosmogenic backgrounds.

Submitted 7 February, 2024; originally announced February 2024.

arXiv:2401.16716 [pdf, ps, other]

A parameter-free approach for solving SOS-convex semi-algebraic fractional programs

Authors: Chengmiao Yang, Liguo Jiao, Jae Hyoung Lee

Abstract: In this paper, we study a class of nonsmooth fractional programs {\rm (FP, for short)} with SOS-convex semi-algebraic functions. Under suitable assumptions, we derive a strong duality result between the problem (FP) and its semidefinite programming (SDP) relaxations. Remarkably, we extract an optimal solution of the problem (FP) by solving one and only one associated SDP problem. Numerical examples are also given.

Submitted 29 January, 2024; originally announced January 2024.

Comments: 22 pages

MSC Class: 90C32; 90C22; 90C23

arXiv:2401.02901 [pdf, other]

Charged-current non-standard neutrino interactions at Daya Bay

Authors: Daya Bay collaboration, F. P. An, W. D. Bai, A. B. Balantekin, M. Bishai, S. Blyth, G. F. Cao, J. Cao, J. F. Chang, Y. Chang, H. S. Chen, H. Y. Chen, S. M. Chen, Y. Chen, Y. X. Chen, Z. Y. Chen, J. Cheng, Y. C. Cheng, Z. K. Cheng, J. J. Cherwinka, M. C. Chu, J. P. Cummings, O. Dalager, F. S. Deng, X. Y. Ding, et al. (177 additional authors not shown)

Abstract: The full data set of the Daya Bay reactor neutrino experiment is used to probe the effect of the charged current non-standard interactions (CC-NSI) on neutrino oscillation experiments. Two different approaches are applied and constraints on the corresponding CC-NSI parameters are obtained with the neutrino flux taken from the Huber-Mueller model with a $5\%$ uncertainty. For the quantum mechanics-based approach (QM-NSI), the constraints on the CC-NSI parameters $ε_{eα}$ and $ε_{eα}^{s}$ are extracted with and without the assumption that the effects of the new physics are the same in the production and detection processes, respectively. The approach based on the weak effective field theory (WEFT-NSI) deals with four types of CC-NSI represented by the parameters $[\varepsilon_{X}]_{eα}$. For both approaches, the results for the CC-NSI parameters are shown for cases with various fixed values of the CC-NSI and the Dirac CP-violating phases, and when they are allowed to vary freely. We find that constraints on the QM-NSI parameters $ε_{eα}$ and $ε_{eα}^{s}$ from the Daya Bay experiment alone can reach the order $\mathcal{O}(0.01)$ for the former and $\mathcal{O}(0.1)$ for the latter, while for WEFT-NSI parameters $[\varepsilon_{X}]_{eα}$, we obtain $\mathcal{O}(0.1)$ for both cases.

Submitted 19 March, 2024; v1 submitted 5 January, 2024; originally announced January 2024.

Comments: 25 pages, 16 figures, 6 tables; 36 pages, format changed, references added

arXiv:2401.02184 [pdf, ps, other]

The primitive curve complex for a handlebody

Authors: Sangbum Cho, Jung Hoon Lee

Abstract: A simple closed curve in the boundary surface of a handlebody is called primitive if there exists an essential disk in the handlebody whose boundary circle intersects the curve transversely in a single point. The primitive curve complex is then defined to be the full subcomplex of the curve complex for the boundary surface, spanned by the vertices of primitive curves. Given any two primitive curves, we construct a sequence of primitive curves from one to the other one satisfying a certain property. As a consequence, we prove that the primitive curve complex for the handlebody is connected.

Submitted 4 January, 2024; originally announced January 2024.

Comments: 16 pages, 22 figures

MSC Class: 57K30

arXiv:2401.00265 [pdf, ps, other]

An unconventional platform for two-dimensional Kagome flat bands on semiconductor surfaces

Authors: Jae Hyuck Lee, GwanWoo Kim, Inkyung Song, Ye** Kim, Yeonjae Lee, Sung Jong Yoo, Deok-Yong Cho, Jun-Won Rhim, Jongkeun Jung, Gunn Kim, Changyoung Kim

Abstract: In condensed matter physics, the Kagome lattice and its inherent flat bands have attracted considerable attention for their potential to host a variety of exotic physical phenomena. Despite extensive efforts to fabricate thin films of Kagome materials aimed at modulating the flat bands through electrostatic gating or strain manipulation, progress has been limited. Here, we report the observation of a novel $d$-orbital hybridized Kagome-derived flat band in Ag/Si(111) $\sqrt{3}\times\sqrt{3}$ as revealed by angle-resolved photoemission spectroscopy. Our findings indicate that silver atoms on a silicon substrate form a Kagome-like structure, where a delicate balance in the hopping parameters of the in-plane $d$-orbitals leads to destructive interference, resulting in a flat band. These results not only introduce a new platform for Kagome physics but also illuminate the potential for integrating metal-semiconductor interfaces into Kagome-related research, thereby opening a new avenue for exploring ideal two-dimensional Kagome systems.

Submitted 30 December, 2023; originally announced January 2024.

Comments: 7 pages, 4 figures

arXiv:2401.00104 [pdf, other]

Causal State Distillation for Explainable Reinforcement Learning

Authors: Wenhao Lu, Xufeng Zhao, Thilo Fryen, Jae Hee Lee, Mengdi Li, Sven Magg, Stefan Wermter

Abstract: Reinforcement learning (RL) is a powerful technique for training intelligent agents, but understanding why these agents make specific decisions can be quite challenging. This lack of transparency in RL models has been a long-standing problem, making it difficult for users to grasp the reasons behind an agent's behaviour. Various approaches have been explored to address this problem, with one promising avenue being reward decomposition (RD). RD is appealing as it sidesteps some of the concerns associated with other methods that attempt to rationalize an agent's behaviour in a post-hoc manner. RD works by exposing various facets of the rewards that contribute to the agent's objectives during training. However, RD alone has limitations as it primarily offers insights based on sub-rewards and does not delve into the intricate cause-and-effect relationships that occur within an RL agent's neural model. In this paper, we present an extension of RD that goes beyond sub-rewards to provide more informative explanations. Our approach is centred on a causal learning framework that leverages information-theoretic measures for explanation objectives that encourage three crucial properties of causal factors: causal sufficiency, sparseness, and orthogonality. These properties help us distill the cause-and-effect relationships between the agent's states and actions or rewards, allowing for a deeper understanding of its decision-making processes. Our framework is designed to generate local explanations and can be applied to a wide range of RL tasks with multiple reward channels. Through a series of experiments, we demonstrate that our approach offers more meaningful and insightful explanations for the agent's action selections.

Submitted 1 April, 2024; v1 submitted 29 December, 2023; originally announced January 2024.

Comments: https://lukaswill.github.io/; Accepted as oral by CLeaR 2024

arXiv:2401.00066 [pdf, other]

A Quantum $H^*(T)$-module via Quasimap Invariants

Authors: Jae Hwang Lee

Abstract: For $X$ a smooth projective variety, the quantum cohomology ring $QH^*(X)$ is a deformation of the usual cohomology ring $H^*(X)$, where the product structure is modified to incorporate quantum corrections. These correction terms are defined using Gromov-Witten invariants. When $X$ is toric with the geometric quotient description $V /\!/ T$, the cohomology ring $H^*(V /\!/T)$ also has the structure of a quantum $H^*(T)$-module. In this paper, we give a new deformation using quasimap invariants with a light point. This defines $H^*(T)$-module structure on $H^*(X)$ through a modified version of the WDVV equations. Using the Atiyah-Bott localization theorem, we explicitly compute this structure for the Hirzebruch surface of type 2. We conjecture that this new quantum module structure is isomorphic to the natural module structure of the Batyrev ring for a semipositive toric variety.

Submitted 29 December, 2023; originally announced January 2024.

MSC Class: 14N35; 53D45

arXiv:2312.09567 [pdf, other]

Discovery of a large-scale H I plume in the NGC 7194 Group

Authors: Mina Pak, Junhyun Baek, Joon Hyeop Lee, Aeree Chung, Matt Owers, Hyun** Jeong, Eon-Chang Sung, Yun-Kyeong Sheen

Abstract: We present the discovery of a new H I structure in the NGC 7194 group from observations using the Karl G. Jansky Very Large Array. The NGC 7194 group is a nearby (z ~ 0.027) small galaxy group with five quiescent members. The observations reveal a 200 kpc-long H I plume that spans the entire group with a total mass of M$_{HI}$ = 3.4 x 10$^{10}$ M$_{\odot}$. The line-of-sight velocity of the H I gas gradually increases from south (7200 km s$^{-1}$) to north (8200 km s$^{-1}$), and the local velocity dispersion is up to 70 km s$^{-1}$. The structure is not spatially coincident with any member galaxies but it shows close associations with a number of blue star-forming knots. Intragroup H I gas is not rare, but this particular structure is still one of the unusual cases in the sense that it does not show any clear connection with sizable galaxies in the group. We discuss the potential origins of this large-scale H I gas in the NGC 7194 group and its relation with the intergalactic star-forming knots. We propose that this H I feature could have originated from tidal interactions among group members or the infall of a late-type galaxy into the group. Alternatively, it might be leftover gas from flyby intruders.

Submitted 15 December, 2023; originally announced December 2023.

Comments: 9 pages, 3 figures

arXiv:2312.08888 [pdf, other]

Read Between the Layers: Leveraging Multi-Layer Representations for Rehearsal-Free Continual Learning with Pre-Trained Models

Authors: Kyra Ahrens, Hans Hergen Lehmann, Jae Hee Lee, Stefan Wermter

Abstract: We address the Continual Learning (CL) problem, wherein a model must learn a sequence of tasks from non-stationary distributions while preserving prior knowledge upon encountering new experiences. With the advancement of foundation models, CL research has pivoted from the initial learning-from-scratch paradigm towards utilizing generic features from large-scale pre-training. However, existing approaches to CL with pre-trained models primarily focus on separating class-specific features from the final representation layer and neglect the potential of intermediate representations to capture low- and mid-level features, which are more invariant to domain shifts. In this work, we propose LayUP, a new prototype-based approach to CL that leverages second-order feature statistics from multiple intermediate layers of a pre-trained network. Our method is conceptually simple, does not require access to prior data, and works out of the box with any foundation model. LayUP surpasses the state of the art in four of the seven class-incremental learning benchmarks, all three domain-incremental learning benchmarks and in six of the seven online continual learning benchmarks, while significantly reducing memory and computational requirements compared to existing baselines. Our results demonstrate that fully exhausting the representational capacities of pre-trained models in CL goes well beyond their final embeddings.

Submitted 5 July, 2024; v1 submitted 13 December, 2023; originally announced December 2023.

Comments: Accepted for publication in Transactions of Machine Learning Research (TMLR) journal

arXiv:2312.04899 [pdf, other]

Morphology of Galaxies in JWST Fields: Initial Distribution and Evolution of Galaxy Morphology

Authors: Jeong Hwan Lee, Changbom Park, Ho Seong Hwang, Minseong Kwon

Abstract: A recent study from the Horizon Run (HR5) cosmological simulation has predicted that galaxies with ${\rm log}~M_{\ast}/M_{\odot}\lesssim 10$ in the cosmic morning ($10\gtrsim z\gtrsim 4$) dominantly have disk-like morphology in the $Λ$CDM universe, which is driven by the tidal torque in the initial matter fluctuations. For a direct comparison with observation, we identify a total of about $19,000$ James Webb Space Telescope (JWST) galaxies with ${\rm log}~M_{\ast}/M_{\odot}>9$ at $z=0.6-8.0$ utilizing deep JWST/NIRCam images of publicly released fields, including NEP-TDF, NGDEEP, CEERS, COSMOS, UDS, and SMACS J0723$-$7327. We estimate their stellar masses and photometric redshifts with the redshift dispersion of $σ_{\rm NMAD}=0.009$ and outlier fraction of only about $6\%$. We classify galaxies into three morphological types, `disks', `spheroids', and `irregulars', applying the same criteria used in the HR5 study. The morphological distribution of the JWST galaxies shows that disk galaxies account for $60-70\%$ at all redshift ranges. However, in the high-mass regime (${\rm log}~M_{\ast}/M_{\odot}\gtrsim11$), spheroidal morphology becomes the dominant type. This implies that mass growth of galaxies is accompanied with morphological transition from disks to spheroids. The fraction of irregulars is about 20\% or less at all mass and redshifts. All the trends in the morphology distribution are consistently found in the six JWST fields. These results are in close agreement with the results from the HR5 simulation, particularly confirming the prevalence of disk galaxies at small masses in the cosmic morning and noon.

Submitted 13 March, 2024; v1 submitted 8 December, 2023; originally announced December 2023.

Comments: Accepted for publication in ApJ, 30 pages, 14 figures, 5 tables, 3 appendices

arXiv:2312.02100 [pdf, ps, other]

Quantum Steenrod operations of symplectic resolutions

Authors: Jae Hee Lee

Abstract: We study the mod $p$ equivariant quantum cohomology of conical symplectic resolutions. Using symplectic genus zero enumerative geometry, Fukaya and Wilkins defined operations on mod $p$ quantum cohomology deforming the classical Steenrod operations on mod $p$ cohomology. We conjecture that these quantum Steenrod operations on divisor classes agree with the $p$-curvature of the mod $p$ equivariant quantum connection, and verify this in the case of the Springer resolution. The key ingredient is a new compatibility relation between the quantum Steenrod operations and the shift operators.

Submitted 10 March, 2024; v1 submitted 4 December, 2023; originally announced December 2023.

Comments: 35 pages, comments welcome! v2: added references, fixed minor typos

MSC Class: 53D45

arXiv:2312.00242 [pdf, ps, other]

Trace relations and open string vacua

Authors: Ji Hoon Lee

Abstract: We study to what extent, and in what form, the notion of gauge-string duality may persist at finite $N$. It is shown, in the half-BPS sector, that the states of D3 giant graviton branes in $\mathrm{AdS}_5 \times S^5$ are holographically dual to certain auxiliary ghosts that compensate for finite $N$ trace relations in $U(N)$ $\mathcal{N}=4$ super Yang-Mills. The complex formed from spaces of states of bulk D3 giants is observed to furnish a BRST-like resolution of the half-BPS Hilbert space of $U(N)$ $\mathcal{N}=4$ SYM at finite $N$. We argue that the identification between the states of certain bulk D-branes and the auxiliary ghosts in the boundary holds rather generally at vanishing 't Hooft coupling $λ= 0$. We propose that a complex, which furnishes a BRST-like resolution of the finite $N$ Hilbert space of a boundary $U(N)$ gauge theory at $λ= 0$, should be identified as the space of states of the dual string theory in the $α' \to \infty$ limit. The Lefschetz trace formula provides the holographic map in this regime, where bulk observables are computed by taking the alternating sum of the expectation values in an ensemble of states built on each open string vacuum. The giant graviton expansion is recovered and generalized in a limit of the resolution.

Submitted 22 December, 2023; v1 submitted 30 November, 2023; originally announced December 2023.

Comments: 45 pages; v2: minor edits, references added

arXiv:2311.15356 [pdf, other]

Having Second Thoughts? Let's hear it

Authors: Jung H. Lee, Sujith Vijayan

Abstract: Deep learning models loosely mimic bottom-up signal pathways from low-order sensory areas to high-order cognitive areas. After training, DL models can outperform humans on some domain-specific tasks, but their decision-making process has been known to be easily disrupted. Since the human brain consists of multiple functional areas highly connected to one another and relies on intricate interplays between bottom-up and top-down (from high-order to low-order areas) processing, we hypothesize that incorporating top-down signal processing may make DL models more robust. To address this hypothesis, we propose a certification process mimicking selective attention and test if it could make DL models more robust. Our empirical evaluations suggest that this newly proposed certification can improve DL models' accuracy and help us build safety measures to alleviate their vulnerabilities with both artificial and natural adversarial examples.

Submitted 31 May, 2024; v1 submitted 26 November, 2023; originally announced November 2023.

Comments: 10 pages, 6 figures, 3 table and Append/Supplementary materials. Section 3 has been substantially revised

arXiv:2311.13229 [pdf, other]

doi 10.1021/acs.nanolett.4c00574

Heat dissipation mechanisms in hybrid superconductor-semiconductor devices revealed by Joule spectroscopy

Authors: Angel Ibabe, Gorm O. Steffensen, Ignacio Casal, Mario Gomez, Thomas Kanne, Jesper Nygard, Alfredo Levy Yeyati, Eduardo J. H. Lee

Abstract: Understanding heating and cooling mechanisms in mesoscopic superconductor-semiconductor hybrid devices is crucial for their application in quantum technologies. Owing to the poor thermal conductivity of typical devices, heating effects can drive superconducting-to-normal phase transitions even at low applied bias, observed as sharp conductance dips through the loss of Andreev excess currents. Tracking such dips across magnetic field, cryostat temperature, and applied microwave power, which constitutes Joule spectroscopy, allows one to uncover the underlying cooling bottlenecks in different parts of a device. By applying this technique, we analyze heat dissipation in devices based on full-shell InAs-Al nanowires and reveal that superconducting islands are strongly susceptible to heating as their cooling is limited by the rather inefficient electron-phonon coupling, as opposed to grounded superconductors that primarily cool by quasiparticle diffusion. Our measurements indicate that powers as low as 50-150 pW are able to fully suppress the superconductivity of an island. Finally, we show that applied microwaves lead to similar heating effects as DC signals, and explore the interplay of the microwave frequency and the effective electron-phonon relaxation time.

Submitted 22 November, 2023; originally announced November 2023.

Comments: 9 pages, 4 figures

Journal ref: Nano Lett. 24, 6488, 2024

arXiv:2311.11527 [pdf]

doi 10.1038/s41467-023-44448-9

Thermal Hall effects due to topological spin fluctuations in YMnO$_3$

Authors: Ha-Leem Kim, Takuma Saito, Heejun Yang, Hiroaki Ishizuka, Matthew John Coak, Jun Han Lee, Hasung Sim, Yoon Seok Oh, Naoto Nagaosa, Je-Geun Park

Abstract: The thermal Hall effect in magnetic insulators has been considered a powerful method for examining the topological nature of charge-neutral quasiparticles such as magnons. Yet, unlike the kagome system, the triangular lattice has received less attention for studying the thermal Hall effect because the scalar spin chirality cancels out between adjacent triangles. However, such cancellation cannot be perfect if the triangular lattice is distorted, which could open the possibility of a non-zero thermal Hall effect. Here, we report that the trimerized triangular lattice of multiferroic hexagonal manganite YMnO$_3$ produces a highly unusual thermal Hall effect due to topological spin fluctuations with the additional intricacy of a Dzyaloshinskii-Moriya interaction under an applied magnetic field. We conclude the thermal Hall conductivity arises from the system's topological nature of spin fluctuations. Our theoretical calculations demonstrate that the thermal Hall conductivity is also related in this material to the splitting of the otherwise degenerate two chiralities, left and right, of its 120$^{\circ}$ magnetic structure. Our result is one of the most unusual cases of topological physics due to this broken $Z_2$ symmetry of the chirality in the supposedly paramagnetic state of YMnO$_3$, with strong topological spin fluctuations. These new mechanisms in this important class of materials are crucial in exploring new thermal Hall physics and exotic excitations.

Submitted 19 November, 2023; originally announced November 2023.

Comments: 14 pages, 3 figures; accepted for publication in Nature Communications

arXiv:2311.10792 [pdf]

Enhancing Data Efficiency and Feature Identification for Lithium-Ion Battery Lifespan Prediction by Deciphering Interpretation of Temporal Patterns and Cyclic Variability Using Attention-Based Models

Authors: Jaewook Lee, Seongmin Heo, Jay H. Lee

Abstract: Accurately predicting the lifespan of lithium-ion batteries is crucial for optimizing operational strategies and mitigating risks. While numerous studies have aimed at predicting battery lifespan, few have examined the interpretability of their models or how such insights could improve predictions. Addressing this gap, we introduce three innovative models that integrate shallow attention layers into a foundational model from our previous work, which combined elements of recurrent and convolutional neural networks. Utilizing a well-known public dataset, we showcase our methodology's effectiveness. Temporal attention is applied to identify critical timesteps and highlight differences among test cell batches, particularly underscoring the significance of the "rest" phase. Furthermore, by applying cyclic attention via self-attention to context vectors, our approach effectively identifies key cycles, enabling us to strategically decrease the input size for quicker predictions. Employing both single- and multi-head attention mechanisms, we have systematically minimized the required input from 100 to 50 and then to 30 cycles, refining this process based on cyclic attention scores. Our refined model exhibits strong regression capabilities, accurately forecasting the initiation of rapid capacity fade with an average deviation of only 58 cycles by analyzing just the initial 30 cycles of easily accessible input data.

Submitted 11 April, 2024; v1 submitted 16 November, 2023; originally announced November 2023.

arXiv:2311.08022 [pdf, ps, other]

Two-Stage Predict+Optimize for Mixed Integer Linear Programs with Unknown Parameters in Constraints

Authors: Xinyi Hu, Jasper C. H. Lee, Jimmy H. M. Lee

Abstract: Consider the setting of constrained optimization, with some parameters unknown at solving time and requiring prediction from relevant features. Predict+Optimize is a recent framework for end-to-end training supervised learning models for such predictions, incorporating information about the optimization problem in the training process in order to yield better predictions in terms of the quality of the predicted solution under the true parameters. Almost all prior works have focused on the special case where the unknowns appear only in the optimization objective and not the constraints. Hu et al.~proposed the first adaptation of Predict+Optimize to handle unknowns appearing in constraints, but the framework has somewhat ad-hoc elements, and they provided a training algorithm only for covering and packing linear programs. In this work, we give a new \emph{simpler} and \emph{more powerful} framework called \emph{Two-Stage Predict+Optimize}, which we believe should be the canonical framework for the Predict+Optimize setting. We also give a training algorithm usable for all mixed integer linear programs, vastly generalizing the applicability of the framework. Experimental results demonstrate the superior prediction performance of our training framework over all classical and state-of-the-art methods.

Submitted 14 November, 2023; originally announced November 2023.

arXiv:2311.06798 [pdf, other]

doi 10.1609/aaai.v38i12.29212

MetaMix: Meta-state Precision Searcher for Mixed-precision Activation Quantization

Authors: Han-Byul Kim, Joo Hyung Lee, Sungjoo Yoo, Hong-Seok Kim

Abstract: Mixed-precision quantization of efficient networks often suffers from activation instability encountered in the exploration of bit selections. To address this problem, we propose a novel method called MetaMix which consists of bit selection and weight training phases. The bit selection phase iterates two steps, (1) the mixed-precision-aware weight update, and (2) the bit-search training with the fixed mixed-precision-aware weights, both of which combined reduce activation instability in mixed-precision quantization and contribute to fast and high-quality bit selection. The weight training phase exploits the weights and step sizes trained in the bit selection phase and fine-tunes them thereby offering fast training. Our experiments with efficient and hard-to-quantize networks, i.e., MobileNet v2 and v3, and ResNet-18 on ImageNet show that our proposed method pushes the boundary of mixed-precision quantization, in terms of accuracy vs. operations, by outperforming both mixed- and single-precision SOTA methods.

Submitted 9 April, 2024; v1 submitted 12 November, 2023; originally announced November 2023.

Comments: Proc. The 38th Annual AAAI Conference on Artificial Intelligence (AAAI)

arXiv:2311.05373 [pdf]

What is prompt literacy? An exploratory study of language learners' development of new literacy skill using generative AI

Authors: Yohan Hwang, Jang Ho Lee, Dongkwang Shin

Abstract: In the current study, we propose that, in the era of generative AI, there is now a new form of literacy called "prompt literacy," which refers to the ability to generate precise prompts as input for AI systems, interpret the outputs, and iteratively refine prompts to achieve desired results. To explore the emergence and development of this literacy skill, the current study examined 30 EFL students' engagement in an AI-powered image creation project, through which they created artworks representing the socio-cultural meanings of English words by iteratively drafting and refining prompts in generative AI tools. By examining AI-generated images and the participants' drafting and revision of their prompts, this study demonstrated the emergence of learners' prompt literacy skills. The survey data further showed the participants' perceived improvement in their vocabulary learning strategies as a result of engaging in the target AI-powered project. In addition, the participants' post-project reflection revealed three benefits of developing prompt literacy: enjoyment from manifesting imagined outcomes; recognition of its importance for communication, problem-solving and career development; and the enhanced understanding of the collaborative nature of human-AI interaction. These findings suggest that prompt literacy is an increasingly crucial literacy for the AI era.

Submitted 9 November, 2023; originally announced November 2023.

Comments: 22 pages

arXiv:2311.01724 [pdf, other]

Holography Transformer

Authors: Chanyong Park, Se** Kim, Jung Hun Lee

Abstract: We have constructed a generative artificial intelligence model to predict dual gravity solutions when provided with the input of holographic entanglement entropy. The model utilized in our study is based on the transformer algorithm, widely used for various natural language tasks including text generation, summarization, and translation. This algorithm possesses the ability to understand the meanings of input and output sequences by utilizing multi-head attention layers. In the training procedure, we generated pairs of examples consisting of holographic entanglement entropy data and their corresponding metric solutions. Once the model has completed the training process, it demonstrates the ability to generate predictions regarding a dual geometry that corresponds to the given holographic entanglement entropy. Subsequently, we proceed to validate the dual geometry to confirm its correspondence with the holographic entanglement entropy data.

Submitted 15 January, 2024; v1 submitted 3 November, 2023; originally announced November 2023.

Comments: 14 pages, 11 figures, add references (version 2), add some comment (version 3)

arXiv:2310.15571 [pdf, other]

doi 10.18653/v1/2023.findings-emnlp.469

Visually Grounded Continual Language Learning with Selective Specialization

Authors: Kyra Ahrens, Lennart Bengtson, Jae Hee Lee, Stefan Wermter

Abstract: A desirable trait of an artificial agent acting in the visual world is to continually learn a sequence of language-informed tasks while striking a balance between sufficiently specializing in each task and building a generalized knowledge for transfer. Selective specialization, i.e., a careful selection of model components to specialize in each task, is a strategy to provide control over this trade-off. However, the design of selection strategies requires insights on the role of each model component in learning rather specialized or generalizable representations, which poses a gap in current research. Thus, our aim with this work is to provide an extensive analysis of selection strategies for visually grounded continual language learning. Due to the lack of suitable benchmarks for this purpose, we introduce two novel diagnostic datasets that provide enough control and flexibility for a thorough model analysis. We assess various heuristics for module specialization strategies as well as quantifiable measures for two different types of model architectures. Finally, we design conceptually simple approaches based on our analysis that outperform common continual learning baselines. Our results demonstrate the need for further efforts towards better aligning continual learning algorithms with the learning behaviors of individual model parts.

Submitted 24 October, 2023; originally announced October 2023.

Comments: Accepted to EMNLP 2023 Findings

Journal ref: Findings of the Association for Computational Linguistics: EMNLP 2023

arXiv:2310.14119 [pdf]

Accelerating Aquatic Soft Robots with Elastic Instability Effects

Authors: Zechen Xiong, Jeong Hun Lee, Hod Lipson

Abstract: Sinusoidal undulation has long been considered the most successful swimming pattern for fish and bionic aquatic robots [1]. However, a swimming pattern generated by the hair clip mechanism (HCM, part iii, Figure 1A) [2]~[5] may challenge this knowledge. HCM is an in-plane prestressed bi-stable mechanism that stores elastic energy and releases the stored energy quickly via its snap-through buckling. When used for fish robots, the HCM functions as the fish body and creates unique swimming patterns that we term HCM undulation. With the same energy consumption [3], HCM fish outperforms the traditionally designed soft fish with a two-fold increase in cruising speed. We reproduce this phenomenon in a single-link simulation with Aquarium [6]. HCM undulation generates an average propulsion of 16.7 N/m, 2-3 times larger than the reference undulation (6.78 N/m), sine pattern (5.34 N/m/s), and cambering sine pattern (6.36 N/m), and achieves an efficiency close to the sine pattern. These results can aid in developing fish robots and faster swimming patterns.

Submitted 11 July, 2024; v1 submitted 21 October, 2023; originally announced October 2023.

arXiv:2310.12674 [pdf, other]

Observation of the Antimatter Hypernucleus $^4_{\barΛ}\overline{\hbox{H}}$

Authors: STAR Collaboration, M. I. Abdulhamid, B. E. Aboona, J. Adam, L. Adamczyk, J. R. Adams, I. Aggarwal, M. M. Aggarwal, Z. Ahammed, E. C. Aschenauer, S. Aslam, J. Atchison, V. Bairathi, J. G. Ball Cap, K. Barish, R. Bellwied, P. Bhagat, A. Bhasin, S. Bhatta, S. R. Bhosale, J. Bielcik, J. Bielcikova, J. D. Brandenburg, C. Broodo, X. Z. Cai , et al. (342 additional authors not shown)

Abstract: At the origin of the Universe, asymmetry between the amount of created matter and antimatter led to the matter-dominated Universe as we know today. The origins of this asymmetry remain not completely understood yet. High-energy nuclear collisions create conditions similar to the Universe microseconds after the Big Bang, with comparable amounts of matter and antimatter. Much of the created antimatter escapes the rapidly expanding fireball without annihilating, making such collisions an effective experimental tool to create heavy antimatter nuclear objects and study their properties, hoping to shed some light on existing questions on the asymmetry between matter and antimatter. Here we report the first observation of the antimatter hypernucleus \hbox{$^4_{\barΛ}\overline{\hbox{H}}$}, composed of a $\barΛ$, an antiproton and two antineutrons. The discovery was made through its two-body decay after production in ultrarelativistic heavy-ion collisions by the STAR experiment at the Relativistic Heavy Ion Collider. In total, 15.6 candidate \hbox{$^4_{\barΛ}\overline{\hbox{H}}$} antimatter hypernuclei are obtained with an estimated background count of 6.4. The lifetimes of the antihypernuclei \hbox{$^3_{\barΛ}\overline{\hbox{H}}$} and \hbox{$^4_{\barΛ}\overline{\hbox{H}}$} are measured and compared with the lifetimes of their corresponding hypernuclei, testing the symmetry between matter and antimatter. Various production yield ratios among (anti)hypernuclei and (anti)nuclei are also measured and compared with theoretical model predictions, shedding light on their production mechanisms.

Submitted 8 June, 2024; v1 submitted 19 October, 2023; originally announced October 2023.

Comments: 28 pages, 5 figures in the main paper; 16 pages, 5 figures in the methods part

arXiv:2310.11884 [pdf, other]

From Neural Activations to Concepts: A Survey on Explaining Concepts in Neural Networks

Authors: Jae Hee Lee, Sergio Lanza, Stefan Wermter

Abstract: In this paper, we review recent approaches for explaining concepts in neural networks. Concepts can act as a natural link between learning and reasoning: once the concepts are identified that a neural learning system uses, one can integrate those concepts with a reasoning system for inference or use a reasoning system to act upon them to improve or enhance the learning system. On the other hand, knowledge can not only be extracted from neural networks but concept knowledge can also be inserted into neural network architectures. Since integrating learning and reasoning is at the core of neuro-symbolic AI, the insights gained from this survey can serve as an important step towards realizing neuro-symbolic AI based on explainable concepts.

Submitted 3 May, 2024; v1 submitted 18 October, 2023; originally announced October 2023.

Comments: Accepted in Neurosymbolic Artificial Intelligence

Showing 1–50 of 617 results for author: Lee, J H