-
Probing the connection between IceCube neutrinos and MOJAVE AGN
Authors:
R. Abbasi,
M. Ackermann,
J. Adams,
S. K. Agarwalla,
J. A. Aguilar,
M. Ahlers,
J. M. Alameddine,
N. M. Amin,
K. Andeen,
C. Argüelles,
Y. Ashida,
S. Athanasiadou,
L. Ausborm,
S. N. Axani,
X. Bai,
A. Balagopal V.,
M. Baricevic,
S. W. Barwick,
S. Bash,
V. Basu,
R. Bay,
J. J. Beatty,
J. Becker Tjus,
J. Beise,
C. Bellenghi
, et al. (399 additional authors not shown)
Abstract:
Active Galactic Nuclei (AGN) are prime candidate sources of the high-energy, astrophysical neutrinos detected by IceCube. This is demonstrated by the real-time multi-messenger detection of the blazar TXS 0506+056 and the recent evidence of neutrino emission from NGC 1068 from a separate time-averaged study. However, the production mechanism of the astrophysical neutrinos in AGN is not well established, a question that can be addressed via correlation studies with photon observations. For neutrinos produced due to photohadronic interactions in AGN, in addition to a correlation of neutrinos with high-energy photons, there would also be a correlation of neutrinos with photons emitted at radio wavelengths. In this work, we perform an in-depth stacking study of the correlation between 15 GHz radio observations of AGN reported in the MOJAVE XV catalog, and ten years of neutrino data from IceCube. We also use a time-dependent approach which improves the statistical power of the stacking analysis. No significant correlation was found in either analysis, and upper limits are reported. When compared to the IceCube diffuse flux, at 100 TeV and for a spectral index of 2.5, the upper limits derived are $\sim3\%$ and $\sim9\%$ for the time-averaged and time-dependent case, respectively.
Submitted 1 July, 2024;
originally announced July 2024.
-
Search for a light sterile neutrino with 7.5 years of IceCube DeepCore data
Authors:
R. Abbasi,
M. Ackermann,
J. Adams,
S. K. Agarwalla,
J. A. Aguilar,
M. Ahlers,
J. M. Alameddine,
N. M. Amin,
K. Andeen,
C. Argüelles,
Y. Ashida,
S. Athanasiadou,
L. Ausborm,
S. N. Axani,
X. Bai,
A. Balagopal V.,
M. Baricevic,
S. W. Barwick,
S. Bash,
V. Basu,
R. Bay,
J. J. Beatty,
J. Becker Tjus,
J. Beise,
C. Bellenghi
, et al. (399 additional authors not shown)
Abstract:
We present a search for an eV-scale sterile neutrino using 7.5 years of data from the IceCube DeepCore detector. The analysis uses a sample of 21,914 events with energies between 5 and 150 GeV to search for sterile neutrinos through atmospheric muon neutrino disappearance. Improvements in event selection and treatment of systematic uncertainties provide greater statistical power compared to previous DeepCore sterile neutrino searches. Our results are compatible with the absence of mixing between active and sterile neutrino states, and we place constraints on the mixing matrix elements $|U_{μ4}|^2 < 0.0534$ and $|U_{τ4}|^2 < 0.0574$ at 90% CL under the assumption that $Δm^2_{41}\geq 1\;\mathrm{eV^2}$. These null results add to the growing tension between anomalous appearance results and constraints from disappearance searches in the 3+1 sterile neutrino landscape.
Submitted 1 July, 2024;
originally announced July 2024.
-
Deep Learning Models for Flapping Fin Unmanned Underwater Vehicle Control System Gait Optimization
Authors:
Brian Zhou,
Kamal Viswanath,
Jason Geder,
Alisha Sharma,
Julian Lee
Abstract:
The last few decades have led to the rise of research focused on propulsion and control systems for bio-inspired unmanned underwater vehicles (UUVs), which provide more maneuverable alternatives to traditional UUVs in underwater missions. Recent work has explored the use of time-series neural network surrogate models to predict thrust and power from vehicle design and fin kinematics. We develop a search-based inverse model that leverages kinematics-to-thrust and kinematics-to-power neural network models for control system design. Our inverse model finds a set of fin kinematics with the multi-objective goal of reaching a target thrust under power constraints while creating a smooth kinematics transition between flapping cycles. We demonstrate how a control system integrating this inverse model can make online, cycle-to-cycle adjustments to prioritize different system objectives, with improvements in increasing thrust generation or reducing power consumption of any given movement upwards of 0.5 N and 3.0 W in a range of 2.2 N and 9.0 W. As propulsive efficiency is of utmost importance for flapping-fin UUVs in order to extend their range and endurance for essential operations but lacks prior research, we develop a non-dimensional figure of merit (FOM), derived from measures of propulsive efficiency, that is able to evaluate different fin designs and kinematics, and allow for comparison with other bio-inspired platforms. We use the developed FOM to analyze optimal gaits and compare the performance between different fin materials, providing a better understanding of how fin materials affect thrust generation and propulsive efficiency and allowing us to inform control systems and weight for efficiency on the developed inverse gait-selector model.
Submitted 1 July, 2024;
originally announced July 2024.
-
Measurement of the integrated luminosity of data samples collected during 2019-2022 by the Belle II experiment
Authors:
The Belle II Collaboration,
I. Adachi,
L. Aggarwal,
H. Ahmed,
J. K. Ahn,
H. Aihara,
N. Akopov,
A. Aloisio,
N. Althubiti,
N. Anh Ky,
D. M. Asner,
H. Atmacan,
T. Aushev,
V. Aushev,
M. Aversano,
R. Ayad,
V. Babu,
H. Bae,
S. Bahinipati,
P. Bambade,
Sw. Banerjee,
M. Barrett,
J. Baudot,
A. Baur,
A. Beaubien
, et al. (382 additional authors not shown)
Abstract:
A series of data samples was collected with the Belle II detector at the SuperKEKB collider from March 2019 to June 2022. We determine the integrated luminosities of these data samples using three distinct methodologies involving Bhabha ($e^+e^- \to e^+e^-(nγ)$), digamma ($e^+e^- \to γγ(nγ)$), and dimuon ($e^+e^- \to μ^+ μ^- (nγ)$) events. The total integrated luminosity obtained with Bhabha, digamma, and dimuon events is (426.52 $\pm$ 0.03 $\pm$ 2.48)~fb$^{-1}$, (427.32 $\pm$ 0.03 $\pm$ 2.56)~fb$^{-1}$, and (424.84 $\pm$ 0.04 $\pm$ 3.88)~fb$^{-1}$, where the first uncertainties are statistical and the second are systematic. The resulting total integrated luminosity obtained from the combination of the three methods is (426.88 $\pm$ 1.93)~fb$^{-1}$.
Submitted 1 July, 2024;
originally announced July 2024.
-
Active Healing of Microtubule-Motor Networks
Authors:
Fan Yang,
Shichen Liu,
Heun Jin Lee,
Rob Phillips,
Matt Thomson
Abstract:
Cytoskeletal networks have a self-healing property where networks can repair defects to maintain structural integrity. However, both the mechanisms and dynamics of healing remain largely unknown. Here we report an unexplored healing mechanism in microtubule-motor networks by active crosslinking. We directly generate network cracks using a light-controlled microtubule-motor system, and observe that the cracks can self-heal. Combining theory and experiment, we find that the networks must overcome internal elastic resistance in order to heal cracks, giving rise to a bifurcation of dynamics dependent on the initial opening angle of the crack: the crack heals below a critical angle and opens up at larger angles. Simulation of a continuum model reproduces the bifurcation dynamics, revealing the importance of a boundary layer where free motors and microtubules can actively crosslink and thereby heal the crack. We also formulate a simple elastic-rod model that can qualitatively predict the critical angle, which is found to be tunable by two dimensionless geometric parameters, the ratio of the boundary layer and network width, and the aspect ratio of the network. Our results provide a new framework for understanding healing in cytoskeletal networks and designing self-healable biomaterials.
Submitted 30 June, 2024;
originally announced July 2024.
-
Locate&Edit: Energy-based Text Editing for Efficient, Flexible, and Faithful Controlled Text Generation
Authors:
Hye Ryung Son,
Jay-Yoon Lee
Abstract:
Recent approaches to controlled text generation (CTG) often involve manipulating the weights or logits of base language models (LMs) at decoding time. However, these methods are inapplicable to the latest black-box LMs and ineffective at preserving the core semantics of the base LM's original generations. In this work, we propose Locate&Edit (L&E), an efficient and flexible energy-based approach to CTG, which edits text outputs from a base LM using off-the-shelf energy models. Given text outputs from the base LM, L&E first locates spans that are most relevant to constraints (e.g., toxicity) utilizing energy models, and then edits these spans by replacing them with more suitable alternatives. Importantly, our method is compatible with black-box LMs, as it requires only the text outputs. Also, since L&E doesn't mandate a specific architecture for its component models, it can work with a diverse combination of available off-the-shelf models. Moreover, L&E preserves the base LM's original generations by selectively modifying constraint-related aspects of the texts and leaving others unchanged. These targeted edits also ensure that L&E operates efficiently. Our experiments confirm that L&E achieves superior semantic preservation of the base LM generations and speed, while simultaneously obtaining competitive or improved constraint satisfaction. Furthermore, we analyze how the granularity of energy distribution impacts CTG performance and find that fine-grained, regression-based energy models improve constraint satisfaction, compared to conventional binary classifier energy models.
Submitted 30 June, 2024;
originally announced July 2024.
-
A Perspective on Quantum Sensors from Basic Research to Commercial Applications
Authors:
Eun Oh,
Maxwell D. Gregoire,
Adam T. Black,
K. Jeramy Hughes,
Paul D. Kunz,
Michael Larsen,
Jean Lautier-Gaud,
Jongmin Lee,
Peter D. D. Schwindt,
Sara L. Mouradian,
Frank A. Narducci,
Charles A. Sackett
Abstract:
Quantum sensors represent a new generation of sensors with improved precision, accuracy, stability, and robustness to environmental effects compared to their classical predecessors. After decades of laboratory development, several types of quantum sensors are now commercially available or are part-way through the commercialization process. This article provides a brief description of the operation of a selection of quantum sensors that employ the principles of atom-light interactions and discusses progress toward packaging those sensors into products. This article covers quantum inertial and gravitational sensors, including gyroscopes, accelerometers, gravimeters, and gravity gradiometers that employ atom interferometry, nuclear magnetic resonance gyroscopes, atomic and spin-defect magnetometers, and Rydberg electric field sensors.
Submitted 30 June, 2024;
originally announced July 2024.
-
How to Train Your Fact Verifier: Knowledge Transfer with Multimodal Open Models
Authors:
Jaeyoung Lee,
Ximing Lu,
Jack Hessel,
Faeze Brahman,
Youngjae Yu,
Yonatan Bisk,
Yejin Choi,
Saadia Gabriel
Abstract:
Given the growing influx of misinformation across news and social media, there is a critical need for systems that can provide effective real-time verification of news claims. Large language or multimodal model based verification has been proposed to scale up online policing mechanisms for mitigating spread of false and harmful content. While these can potentially reduce burden on human fact-checkers, such efforts may be hampered by foundation model training data becoming outdated. In this work, we test the limits of improving foundation model performance without continual updating through an initial study of knowledge transfer using either existing intra- and inter- domain benchmarks or explanations generated from large language models (LLMs). We evaluate on 12 public benchmarks for fact-checking and misinformation detection as well as two other tasks relevant to content moderation -- toxicity and stance detection. Our results on two recent multi-modal fact-checking benchmarks, Mocheg and Fakeddit, indicate that knowledge transfer strategies can improve Fakeddit performance over the state-of-the-art by up to 1.7% and Mocheg performance by up to 2.9%.
Submitted 29 June, 2024;
originally announced July 2024.
-
Dark Superabsorbers with Dirac-delta-like superdirective radiation
Authors:
Jeng Yi Lee,
Irving Rondon,
Andrey E. Miroshnichenko,
Pai-Yen Chen
Abstract:
We theoretically and numerically reveal that under a given level of extinction cross section and with definite angular momentum channels dominant, there exists a physical limitation for absorption cross section being maximum and scattering cross section being minimum. In addition, any scattering system operated at this condition would be accompanied by a needle-like, Dirac-delta-like far-field radiation pattern, reducing the perturbation of the background field except in the forward direction. We therefore refer to this outcome as dark superabsorbers. Moreover, by considering the mathematical Gibbs phenomenon, we find that a completely equivalent Dirac-delta far-field radiation is excluded even if we could properly design the scatterers operated at such conditions. We believe this finding has potential applications in the design of dark energy harvesting, lower-visibility receivers, superdirective light-matter interaction, and Fresnel diffractive imaging.
Submitted 28 June, 2024;
originally announced July 2024.
-
LLaRA: Supercharging Robot Learning Data for Vision-Language Policy
Authors:
Xiang Li,
Cristina Mata,
Jongwoo Park,
Kumara Kahatapitiya,
Yoo Sung Jang,
Jinghuan Shang,
Kanchana Ranasinghe,
Ryan Burgert,
Mu Cai,
Yong Jae Lee,
Michael S. Ryoo
Abstract:
Large Language Models (LLMs) equipped with extensive world knowledge and strong reasoning skills can tackle diverse tasks across domains, often by posing them as conversation-style instruction-response pairs. In this paper, we propose LLaRA: Large Language and Robotics Assistant, a framework which formulates robot action policy as conversations, and provides improved responses when trained with auxiliary data that complements policy learning. LLMs with visual inputs, i.e., Vision Language Models (VLMs), have the capacity to process state information as visual-textual prompts and generate optimal policy decisions in text. To train such action policy VLMs, we first introduce an automated pipeline to generate diverse high-quality robotics instruction data from existing behavior cloning data. A VLM finetuned with the resulting collection of datasets based on a conversation-style formulation tailored for robotics tasks, can generate meaningful robot action policy decisions. Our experiments across multiple simulated and real-world environments demonstrate the state-of-the-art performance of the proposed LLaRA framework. The code, datasets, and pretrained models are available at https://github.com/LostXine/LLaRA.
Submitted 28 June, 2024;
originally announced June 2024.
-
InfiniGen: Efficient Generative Inference of Large Language Models with Dynamic KV Cache Management
Authors:
Wonbeom Lee,
Jungi Lee,
Junghwan Seo,
Jaewoong Sim
Abstract:
Transformer-based large language models (LLMs) demonstrate impressive performance across various natural language processing tasks. Serving LLM inference for generating long contents, however, poses a challenge due to the enormous memory footprint of the transient state, known as the key-value (KV) cache, which scales with the sequence length and batch size. In this paper, we present InfiniGen, a novel KV cache management framework tailored for long-text generation, which synergistically works with modern offloading-based inference systems. InfiniGen leverages the key insight that a few important tokens that are essential for computing the subsequent attention layer in the Transformer can be speculated by performing a minimal rehearsal with the inputs of the current layer and part of the query weight and key cache of the subsequent layer. This allows us to prefetch only the essential KV cache entries (without fetching them all), thereby mitigating the fetch overhead from the host memory in offloading-based LLM serving systems. Our evaluation on several representative LLMs shows that InfiniGen improves the overall performance of a modern offloading-based system by up to 3.00x compared to prior KV cache management methods while offering substantially better model accuracy.
Submitted 28 June, 2024;
originally announced June 2024.
-
Stochastic Zeroth-Order Optimization under Strongly Convexity and Lipschitz Hessian: Minimax Sample Complexity
Authors:
Qian Yu,
Yining Wang,
Baihe Huang,
Qi Lei,
Jason D. Lee
Abstract:
Optimization of convex functions under stochastic zeroth-order feedback has been a major and challenging question in online learning. In this work, we consider the problem of optimizing second-order smooth and strongly convex functions where the algorithm only has access to noisy evaluations of the objective function it queries. We provide the first tight characterization for the rate of the minimax simple regret by developing matching upper and lower bounds. We propose an algorithm that features a combination of a bootstrapping stage and a mirror-descent stage. Our main technical innovation consists of a sharp characterization for the spherical-sampling gradient estimator under higher-order smoothness conditions, which allows the algorithm to optimally balance the bias-variance tradeoff, and a new iterative method for the bootstrapping stage, which maintains the performance for unbounded Hessian.
Submitted 27 June, 2024;
originally announced June 2024.
-
Turbojet Module Sizing for Integration with Turbine-Based Combined Cycle Engine
Authors:
S. Rajashankar,
N. Ananthkrishnan,
A. Sharma,
J. Lee,
H. J. Namkoung
Abstract:
A turbine-based combined cycle (TBCC) vehicle is studied that relies on a scramjet engine for high-speed flight but requires a turbojet module to accelerate it to a high supersonic handover Mach number. The challenge is to scale a given turbojet engine (TJE) core (compressor, burner, turbine) to a particular value of the air mass flow rate such that the desired thrust at the handover point is achieved. To this end, a model for the engine core is integrated with a supersonic intake model that is designed to supply the required mass flow rate, and a nozzle model that is expected to deliver the desired thrust. Both the TJE intake and nozzle are constrained by the design choices made for the DMSJ module, and the TJE core is itself constrained by the volume available from the TBCC vehicle sizing for hypersonic flight. The TJE module is sized by scaling the engine core with matching intake and nozzle designs in an iterative manner until the process converges to a solution with acceptable thrust satisfying all the system constraints. The task turns out to be non-trivial due to the scarcity of steady operating points for the engine core at high speeds, due to possible mismatch between the mass flow rate demanded by the compressor and that delivered by the supersonic intake, and due to the difficulty in adapting a DMSJ-style single-expansion ramp nozzle (SERN) to adequately expand the turbojet exhaust flow.
Submitted 27 June, 2024;
originally announced June 2024.
-
The Belle II Detector Upgrades Framework Conceptual Design Report
Authors:
H. Aihara,
A. Aloisio,
D. P. Auguste,
M. Aversano,
M. Babeluk,
S. Bahinipati,
Sw. Banerjee,
M. Barbero,
J. Baudot,
A. Beaubien,
F. Becherer,
T. Bergauer,
F. U. Bernlochner,
V. Bertacchi,
G. Bertolone,
C. Bespin,
M. Bessner,
S. Bettarini,
A. J. Bevan,
B. Bhuyan,
M. Bona,
J. F. Bonis,
J. Borah,
F. Bosi,
R. Boudagga
, et al. (183 additional authors not shown)
Abstract:
We describe the planned near-term and potential longer-term upgrades of the Belle II detector at the SuperKEKB electron-positron collider operating at the KEK laboratory in Tsukuba, Japan. These upgrades will allow increasingly sensitive searches for possible new physics beyond the Standard Model in flavor, tau, electroweak and dark sector physics that are both complementary to and competitive with the LHC and other experiments.
Submitted 26 June, 2024;
originally announced June 2024.
-
From Pixels to Torques with Linear Feedback
Authors:
Jeong Hun Lee,
Sam Schoedel,
Aditya Bhardwaj,
Zachary Manchester
Abstract:
We demonstrate the effectiveness of simple observer-based linear feedback policies for "pixels-to-torques" control of robotic systems using only a robot-facing camera. Specifically, we show that the matrices of an image-based Luenberger observer (linear state estimator) for a "student" output-feedback policy can be learned from demonstration data provided by a "teacher" state-feedback policy via simple linear-least-squares regression. The resulting linear output-feedback controller maps directly from high-dimensional raw images to torques while being amenable to the rich set of analytical tools from linear systems theory, allowing us to enforce closed-loop stability constraints in the learning problem. We also investigate a nonlinear extension of the method via the Koopman embedding. Finally, we demonstrate the surprising effectiveness of linear pixels-to-torques policies on a cartpole system, both in simulation and on real-world hardware. The policy successfully executes both stabilizing and swing-up trajectory tracking tasks using only camera feedback while subject to model mismatch, process and sensor noise, perturbations, and occlusions.
Submitted 26 June, 2024;
originally announced June 2024.
-
Mental Modeling of Reinforcement Learning Agents by Language Models
Authors:
Wenhao Lu,
Xufeng Zhao,
Josua Spisak,
Jae Hee Lee,
Stefan Wermter
Abstract:
Can emergent language models faithfully model the intelligence of decision-making agents? Though modern language models already exhibit some reasoning ability, and can in theory express any probable distribution over tokens, it remains underexplored how the world knowledge these pretrained models have memorized can be utilized to comprehend an agent's behaviour in the physical world. This study empirically examines, for the first time, how well large language models (LLMs) can build a mental model of agents, termed agent mental modelling, by reasoning about an agent's behaviour and its effect on states from agent interaction history. This research may unveil the potential of leveraging LLMs for elucidating RL agent behaviour, addressing a key challenge in eXplainable reinforcement learning (XRL). To this end, we propose specific evaluation metrics and test them on selected RL task datasets of varying complexity, reporting findings on agent mental model establishment. Our results disclose that LLMs are not yet capable of fully mentally modelling agents through inference alone without further innovations. This work thus provides new insights into the capabilities and limitations of modern LLMs.
Submitted 26 June, 2024;
originally announced June 2024.
-
The impact of constrained interacting dark energy on the bound-zone velocity profile
Authors:
Jounghun Lee,
Marco Baldi
Abstract:
We numerically study the effects of constrained interacting dark energy (CIDER) on the bound-zone velocity profiles around massive dark matter halos. Analyzing the CIDER simulations performed by Baldi (2023) for three different cases of dark sector coupling ($β=0.03$, $0.05$ and $0.08$) as well as for the standard $Λ$CDM cosmology ($β=0$), we determine the mean peculiar velocity profiles in the bound zones around the friends-of-friends halos with masses larger than $M_{\rm cut}=3\times 10^{13}\,h^{-1}M_{\odot}$ at three redshifts, $z=0$, $0.5$ and $1$. It is found that the universal power-law formula proposed by Falco et al. (2024) originally for the $Λ$CDM case still describes well the bound-zone velocity profiles, $V(r)$, even in the CIDER models. The slope of $V(r)$ turns out to be significantly affected by the CIDER, progressively decreasing as $β$ increases. Meanwhile, the amplitude of $V(r)$ exhibits little dependence on $β$, which is ascribed to the identical Hubble parameters shared by the $Λ$CDM and CIDER models in the entire redshift range. Our results imply that the bound-zone velocity slope can break a degeneracy even between the $Λ$CDM and CIDER models with $β\le 0.03$, which the standard cosmological diagnostics fail to distinguish. We devise a simple analytic formula for the bound-zone slope as a function of $β$, and prove its validity at all three redshifts. It is concluded that the slope of the mean bound-zone peculiar velocity profile should in principle be a powerful probe of dark sector interaction.
Submitted 26 June, 2024;
originally announced June 2024.
-
B-TMS: Bayesian Traversable Terrain Modeling and Segmentation Across 3D LiDAR Scans and Maps for Enhanced Off-Road Navigation
Authors:
Minho Oh,
Gunhee Shin,
Seoyeon Jang,
Seungjae Lee,
Dongkyu Lee,
Wonho Song,
Byeongho Yu,
Hyungtae Lim,
Jaeyoung Lee,
Hyun Myung
Abstract:
Recognizing traversable terrain from 3D point cloud data is critical, as it directly impacts the performance of autonomous navigation in off-road environments. However, existing segmentation algorithms often struggle with challenges related to changes in data distribution, environmental specificity, and sensor variations. Moreover, when encountering sunken areas, their performance is frequently compromised, and they may even fail to recognize them. To address these challenges, we introduce B-TMS, a novel approach that performs map-wise terrain modeling and segmentation by utilizing Bayesian generalized kernel (BGK) within the graph structure known as the tri-grid field (TGF). Our experiments encompass various data distributions, ranging from single scans to partial maps, utilizing both public datasets representing urban scenes and off-road environments, and our own dataset acquired from extremely bumpy terrains. Our results demonstrate notable contributions, particularly in terms of robustness to data distribution variations, adaptability to diverse environmental conditions, and resilience against the challenges associated with parameter changes.
Submitted 26 June, 2024;
originally announced June 2024.
-
Decoding with Limited Teacher Supervision Requires Understanding When to Trust the Teacher
Authors:
Hyunjong Ok,
Jegwang Ryu,
Jaeho Lee
Abstract:
How can sLLMs efficiently utilize the supervision of LLMs to improve their generative quality? This question has been well studied in scenarios where there is no restriction on the number of LLM supervisions one can use, giving birth to many decoding algorithms that utilize supervision without further training. However, it is still unclear what is an effective strategy under the limited supervision scenario, where we assume that no more than a few tokens can be generated by LLMs. To this end, we develop an algorithm to effectively aggregate the sLLM and LLM predictions on initial tokens so that the generated tokens can more accurately condition the subsequent token generation by sLLM only. Critically, we find that it is essential to adaptively overtrust or disregard the LLM prediction based on the confidence of the sLLM. Through our experiments on a wide range of models and datasets, we demonstrate that our method provides a consistent improvement over conventional decoding strategies.
Submitted 25 June, 2024;
originally announced June 2024.
-
Discrete-time thermodynamic speed limit
Authors:
Sangyun Lee,
Jae Sung Lee,
Jong-Min Park
Abstract:
As a fundamental thermodynamic principle, speed limits reveal the lower bound of entropy production (EP) required for a system to transition from a given initial state to a final state. While various speed limits have been developed for continuous-time Markov processes, their application to discrete-time Markov chains remains unexplored. In this study, we investigate the speed limits in discrete-time Markov chains, focusing on two types of EP commonly used to measure the irreversibility of a discrete-time process: time-reversed EP and time-backward EP. We find that time-reversed EP satisfies the speed limit for the continuous-time Markov processes, whereas time-backward EP does not. Additionally, for time-reversed EP, we derive practical speed limits applicable to systems driven by cyclic protocols or with unidirectional transitions, where conventional speed limits become meaningless or invalid. We show that these relations also hold for continuous-time Markov processes by taking the time-continuum limit of our results. Finally, we validate our findings through several examples.
Submitted 25 June, 2024;
originally announced June 2024.
-
Modeling and simulations of high-density two-phase flows using projection-based Cahn-Hilliard Navier-Stokes equations
Authors:
Ali Rabeh,
Makrand A. Khanwale,
Jonghyun Lee,
Baskar Ganapathysubramanian
Abstract:
Accurately modeling the dynamics of high-density ratio ($\mathcal{O}(10^5)$) two-phase flows is important for many applications in material science and manufacturing. In this work, we consider numerical simulations of molten metal undergoing microgravity oscillations. Accurate simulation of the oscillation dynamics allows us to characterize the interplay between the two fluids' surface tension and density ratio, which is an important consideration for terrestrial manufacturing applications. We present a projection-based computational framework for solving thermodynamically consistent Cahn-Hilliard Navier-Stokes equations for two-phase flows under these large density ratios. A modified version of the pressure-decoupled solver based on the Helmholtz-Hodge decomposition presented in Khanwale et al. [$\textit{A fully-coupled framework for solving Cahn-Hilliard Navier-Stokes equations: Second-order, energy-stable numerical methods on adaptive octree based meshes.}$, Journal of Computational Physics 475 (2023): 111874] is used. We present a comprehensive convergence study to investigate the effect of mesh resolution, time-step, and interfacial thickness on droplet-shape oscillations. We deploy our framework to predict the oscillation behavior of three physical systems exhibiting very large density ratios ($10^4-10^5:1$), simulations that have not previously been performed.
Submitted 25 June, 2024;
originally announced June 2024.
-
Burst Image Super-Resolution with Base Frame Selection
Authors:
Sanghyun Kim,
Min Jung Lee,
Woohyeok Kim,
Deunsol Jung,
Jaesung Rim,
Sunghyun Cho,
Minsu Cho
Abstract:
Burst image super-resolution has been a topic of active research in recent years due to its ability to obtain a high-resolution image by using complementary information between multiple frames in the burst. In this work, we explore using burst shots with non-uniform exposures to confront real-world practical scenarios by introducing a new benchmark dataset, dubbed Non-uniformly Exposed Burst Image (NEBI), that includes the burst frames at varying exposure times to obtain a broader range of irradiance and motion characteristics within a scene. As burst shots with non-uniform exposures exhibit varying levels of degradation, fusing information of the burst shots into the first frame as a base frame may not result in optimal image quality. To address this limitation, we propose a Frame Selection Network (FSN) for non-uniform scenarios. This network seamlessly integrates into existing super-resolution methods in a plug-and-play manner with low computational costs. The comparative analysis reveals the effectiveness of the nonuniform setting for the practical scenario and our FSN on synthetic-/real- NEBI datasets.
Submitted 25 June, 2024;
originally announced June 2024.
-
Towards Federated Low-Rank Adaptation with Rank-Heterogeneous Communication
Authors:
Yuji Byun,
Jaeho Lee
Abstract:
Low-rank adaptation (LoRA) is an attractive alternative to adapting full weights for the federated fine-tuning of large pretrained models, as it can significantly reduce the memory and communication burden. In principle, federated LoRA can provide an effective means to allocate different resources to each client by tuning the rank for each client, which can be useful in achieving a better communication-performance tradeoff. We find, however, that the empirical performance of LoRA is highly unstable with respect to such rank-heterogeneity, severely limiting its applicability to scenarios where it is desirable or even required to allocate nonuniform communication bandwidth to each client due to constrained total bandwidth. Our investigation reveals that the root cause of this instability is the zero-padding-based aggregation strategy adopted in conventional federated LoRA frameworks, which causes the information from high-rank clients to get diluted during the aggregation process. To address this issue, we propose a new replication-based padding strategy, which allows us to better leverage the information from clients with high-quality datasets. This method ensures that valuable information from high-rank clients is retained during the aggregation process, accelerating the convergence speed and enhancing the overall prediction quality of the global model.
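The dilution effect attributed to zero-padding can be illustrated with a small sketch. This is not the authors' implementation: it assumes that each client sends one LoRA factor as a list of rows (one row per rank component), that the server pads lower-rank factors with zero rows up to the highest rank, and that it then takes a plain average. The function names `zero_pad_rows` and `average_zero_padded` are hypothetical.

```python
def zero_pad_rows(matrix, target_rank):
    """Append all-zero rows until the matrix has target_rank rows."""
    width = len(matrix[0])
    padding = [[0.0] * width for _ in range(target_rank - len(matrix))]
    return [list(row) for row in matrix] + padding


def average_zero_padded(client_matrices):
    """Average client factors after zero-padding them to the largest rank.

    Assumption: a plain (unweighted) mean over clients, for illustration only.
    """
    target_rank = max(len(m) for m in client_matrices)
    padded = [zero_pad_rows(m, target_rank) for m in client_matrices]
    n_clients = len(padded)
    width = len(padded[0][0])
    return [
        [sum(p[r][c] for p in padded) / n_clients for c in range(width)]
        for r in range(target_rank)
    ]


# Two rank-1 clients and one rank-2 client (hypothetical values).
clients = [
    [[3.0, 3.0]],
    [[3.0, 3.0]],
    [[3.0, 3.0], [6.0, 6.0]],
]
aggregated = average_zero_padded(clients)
# The second row exists only on the rank-2 client, yet it is divided by all
# three clients, so its magnitude shrinks from 6.0 to 2.0.
print(aggregated)
```

In this toy setting the row that only the high-rank client contributes is scaled down by the number of clients, which is one way to read the "diluted" information described in the abstract.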
Submitted 25 June, 2024;
originally announced June 2024.
-
High Fidelity Text-to-Speech Via Discrete Tokens Using Token Transducer and Group Masked Language Model
Authors:
Joun Yeop Lee,
Myeonghun Jeong,
Minchan Kim,
Ji-Hyun Lee,
Hoon-Young Cho,
Nam Soo Kim
Abstract:
We propose a novel two-stage text-to-speech (TTS) framework with two types of discrete tokens, i.e., semantic and acoustic tokens, for high-fidelity speech synthesis. It features two core components: the Interpreting module, which processes text and a speech prompt into semantic tokens focusing on linguistic contents and alignment, and the Speaking module, which captures the timbre of the target voice to generate acoustic tokens from semantic tokens, enriching speech reconstruction. The Interpreting stage employs a transducer for its robustness in aligning text to speech. In contrast, the Speaking stage utilizes a Conformer-based architecture integrated with a Grouped Masked Language Model (G-MLM) to boost computational efficiency. Our experiments verify that this innovative structure surpasses the conventional models in the zero-shot scenario in terms of speech quality and speaker similarity.
Submitted 25 June, 2024;
originally announced June 2024.
-
IR physics from the holographic RG flow
Authors:
Chanyong Park,
Jung Hun Lee
Abstract:
Applying the holographic method, we investigate an RG flow and the resulting IR physics when a two-dimensional conformal field theory is deformed by a relevant scalar operator. To do so, we first assume an RG flow from a UV CFT to a new IR CFT. On the dual gravity side, such an RG flow can be described by a bulk scalar field rolling down from an unstable to a stable equilibrium point. After considering a simple scalar potential allowing several local extrema, we study the change of the ground state along the RG flow. We show that the entanglement entropy at an IR fixed point leads to a logarithmic divergence due to the restoration of conformal symmetry. We study how the change of the ground state affects two-point functions. In the probe limit, we numerically calculate the change of a conformal dimension caused by the modification of the ground state. We further derive the analytic form of the IR conformal dimension, which matches the numerical result.
Submitted 24 June, 2024;
originally announced June 2024.
-
Bayesian temporal biclustering with applications to multi-subject neuroscience studies
Authors:
Federica Zoe Ricci,
Erik B. Sudderth,
Jaylen Lee,
Megan A. K. Peters,
Marina Vannucci,
Michele Guindani
Abstract:
We consider the problem of analyzing multivariate time series collected on multiple subjects, with the goal of identifying groups of subjects exhibiting similar trends in their recorded measurements over time as well as time-varying groups of associated measurements. To this end, we propose a Bayesian model for temporal biclustering featuring nested partitions, where a time-invariant partition of subjects induces a time-varying partition of measurements. Our approach allows for data-driven determination of the number of subject and measurement clusters as well as estimation of the number and location of changepoints in measurement partitions. To efficiently perform model fitting and posterior estimation with Markov Chain Monte Carlo, we derive a blocked update of measurements' cluster-assignment sequences. We illustrate the performance of our model in two applications to functional magnetic resonance imaging data and to an electroencephalogram dataset. The results indicate that the proposed model can combine information from potentially many subjects to discover a set of interpretable, dynamic patterns. Experiments on simulated data compare the estimation performance of the proposed model against ground-truth values and other statistical methods, showing that it performs well at identifying ground-truth subject and measurement clusters even when no subject or time dependence is present.
Submitted 24 June, 2024;
originally announced June 2024.
-
InstructPatentGPT: Training patent language models to follow instructions with human feedback
Authors:
Jieh-Sheng Lee
Abstract:
In this research, patent prosecution is conceptualized as a system of reinforcement learning from human feedback. The objective of the system is to increase the likelihood that a language model generates patent claims that have a higher chance of being granted. To showcase the controllability of the language model, the system learns from granted patents and pre-grant applications with different rewards. The statuses "granted" and "pre-grant" are treated as implicit labeled human feedback. In addition, specific to patent drafting, the experiments in this research demonstrate the model's capability to learn from adjusting claim length and the inclusion of limiting terms for narrowing claim scope. As a proof of concept, the experiments focus on first claims only, and the training data originates from a patent dataset tailored specifically for artificial intelligence. Although the available human feedback in patent prosecution is limited and the quality of generated patent text requires improvement, the experiments following the 3-stage reinforcement learning from human feedback have demonstrated that generative language models are capable of reflecting the human feedback or intent in patent prosecution. To enhance the usability of language models, the implementation in this research utilizes modern techniques that enable execution on a single consumer-grade GPU. The demonstrated proof of concept, which reduces hardware requirements, will prove valuable in the future as more human feedback in patent prosecution becomes available for broader use, either within patent offices or in the public domain.
Submitted 25 May, 2024;
originally announced June 2024.
-
Carrot and Stick: Inducing Self-Motivation with Positive & Negative Feedback
Authors:
Jimin Sohn,
Jeihee Cho,
Junyong Lee,
Songmu Heo,
Ji-Eun Han,
David R. Mortensen
Abstract:
Positive thinking is thought to be an important component of self-motivation in various practical fields such as education and the workplace. Previous work, including sentiment transfer and positive reframing, has focused on the positive side of language. However, self-motivation that drives people to reach their goals has not yet been studied from a computational perspective. Moreover, negative feedback has not yet been explored, even though positive and negative feedback are both necessary to grow self-motivation. To facilitate self-motivation, we propose the CArrot and STICk (CASTIC) dataset, consisting of 12,590 sentences with 5 different strategies for enhancing self-motivation. Our data and code are publicly available here.
Submitted 24 June, 2024;
originally announced June 2024.
-
PlagBench: Exploring the Duality of Large Language Models in Plagiarism Generation and Detection
Authors:
Jooyoung Lee,
Toshini Agrawal,
Adaku Uchendu,
Thai Le,
**ghui Chen,
Dongwon Lee
Abstract:
Recent literature has highlighted potential risks to academic integrity associated with large language models (LLMs), as they can memorize parts of training instances and reproduce them in the generated texts without proper attribution. In addition, given their capabilities in generating high-quality texts, plagiarists can exploit LLMs to generate realistic paraphrases or summaries indistinguishable from original work. In response to possible malicious use of LLMs in plagiarism, we introduce PlagBench, a comprehensive dataset consisting of 46.5K synthetic plagiarism cases generated using three instruction-tuned LLMs across three writing domains. The quality of PlagBench is ensured through fine-grained automatic evaluation for each type of plagiarism, complemented by human annotation. We then leverage our proposed dataset to evaluate the plagiarism detection performance of five modern LLMs and three specialized plagiarism checkers. Our findings reveal that GPT-3.5 tends to generate paraphrases and summaries of higher quality compared to Llama2 and GPT-4. Despite LLMs' weak performance in summary plagiarism identification, they can surpass current commercial plagiarism detectors. Overall, our results highlight the potential of LLMs to serve as robust plagiarism detection tools.
Submitted 23 June, 2024;
originally announced June 2024.
-
Improved Modularity and New Features in ipie: Towards Even Larger AFQMC Calculations on CPUs and GPUs at Zero and Finite Temperatures
Authors:
Tong Jiang,
Moritz K. A. Baumgarten,
Pierre-François Loos,
Ankit Mahajan,
Anthony Scemama,
Shu Fay Ung,
**ghong Zhang,
Fionn D Malone,
Joonho Lee
Abstract:
ipie is a Python-based auxiliary-field quantum Monte Carlo (AFQMC) package that has undergone substantial improvements since its initial release [J. Chem. Theory Comput., 2022, 19(1): 109-121]. This paper outlines the improved modularity and new capabilities implemented in ipie. We highlight the ease of incorporating different trial and walker types and the seamless integration of ipie with external libraries. We enable distributed Hamiltonian simulations, allowing for multi-GPU simulations of large systems. This development enabled us to compute the interaction energy of a benzene dimer with 84 electrons and 1512 orbitals, which otherwise would not have fit on a single GPU. We also support GPU-accelerated multi-Slater determinant trial wavefunctions [arXiv:2406.08314] to enable efficient and highly accurate simulations of large-scale systems. This allows for near-exact ground state energies of multi-reference clusters, [Cu$_2$O$_2$]$^{2+}$ and [Fe$_2$S$_2$(SCH$_3$)]$^{2-}$. We also describe implementations of free projection AFQMC, finite temperature AFQMC, AFQMC for electron-phonon systems, and automatic differentiation in AFQMC for calculating physical properties. These advancements position ipie as a leading platform for AFQMC research in quantum chemistry, facilitating the development and application of more complex and ambitious computational methods.
Submitted 25 June, 2024; v1 submitted 23 June, 2024;
originally announced June 2024.
-
Pose-Diversified Augmentation with Diffusion Model for Person Re-Identification
Authors:
Inès Hyeonsu Kim,
JoungBin Lee,
Soowon Son,
Woojeong **,
Kyusun Cho,
Junyoung Seo,
Min-Seop Kwak,
Seokju Cho,
JeongYeol Baek,
Byeongwon Lee,
Seungryong Kim
Abstract:
Person re-identification (Re-ID) often faces challenges due to variations in human poses and camera viewpoints, which significantly affect the appearance of individuals across images. Existing datasets frequently lack diversity and scalability in these aspects, hindering the generalization of Re-ID models to new camera systems. Previous methods have attempted to address these issues through data augmentation; however, they rely on human poses already present in the training dataset, failing to effectively reduce the human pose bias in the dataset. We propose Diff-ID, a novel data augmentation approach that incorporates sparse and underrepresented human pose and camera viewpoint examples into the training data, addressing the limited diversity in the original training data distribution. Our objective is to augment a training dataset that enables existing Re-ID models to learn features unbiased by human pose and camera viewpoint variations. To achieve this, we leverage the knowledge of pre-trained large-scale diffusion models. Using the SMPL model, we simultaneously capture both the desired human poses and camera viewpoints, enabling realistic human rendering. The depth information provided by the SMPL model indirectly conveys the camera viewpoints. By conditioning the diffusion model on both the human pose and camera viewpoint concurrently through the SMPL model, we generate realistic images with diverse human poses and camera viewpoints. Qualitative results demonstrate the effectiveness of our method in addressing human pose bias and enhancing the generalizability of Re-ID models compared to other data augmentation-based Re-ID approaches. The performance gains achieved by training Re-ID models on our offline augmented dataset highlight the potential of our proposed framework in improving the scalability and generalizability of person Re-ID models.
Submitted 23 June, 2024;
originally announced June 2024.
-
Deep Learning Segmentation of Ascites on Abdominal CT Scans for Automatic Volume Quantification
Authors:
Benjamin Hou,
Sung-Won Lee,
Jung-Min Lee,
Christopher Koh,
**g Xiao,
Perry J. Pickhardt,
Ronald M. Summers
Abstract:
Purpose: To evaluate the performance of an automated deep learning method in detecting ascites and subsequently quantifying its volume in patients with liver cirrhosis and ovarian cancer.
Materials and Methods: This retrospective study included contrast-enhanced and non-contrast abdominal-pelvic CT scans of patients with cirrhotic ascites and patients with ovarian cancer from two institutions, National Institutes of Health (NIH) and University of Wisconsin (UofW). The model, trained on The Cancer Genome Atlas Ovarian Cancer dataset (mean age, 60 years +/- 11 [s.d.]; 143 female), was tested on two internal (NIH-LC and NIH-OV) and one external dataset (UofW-LC). Its performance was measured by the Dice coefficient, standard deviations, and 95% confidence intervals, focusing on ascites volume in the peritoneal cavity.
Results: On NIH-LC (25 patients; mean age, 59 years +/- 14 [s.d.]; 14 male) and NIH-OV (166 patients; mean age, 65 years +/- 9 [s.d.]; all female), the model achieved Dice scores of 0.855 +/- 0.061 (CI: 0.831-0.878) and 0.826 +/- 0.153 (CI: 0.764-0.887), with median volume estimation errors of 19.6% (IQR: 13.2-29.0) and 5.3% (IQR: 2.4-9.7) respectively. On UofW-LC (124 patients; mean age, 46 years +/- 12 [s.d.]; 73 female), the model had a Dice score of 0.830 +/- 0.107 (CI: 0.798-0.863) and median volume estimation error of 9.7% (IQR: 4.5-15.1). The model showed strong agreement with expert assessments, with r^2 values of 0.79, 0.98, and 0.97 across the test sets.
Conclusion: The proposed deep learning method performed well in segmenting and quantifying the volume of ascites in concordance with expert radiologist assessments.
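The two metrics named in the abstract can be sketched as follows. This is a minimal illustration, not the study's evaluation code: it assumes binary masks flattened to equal-length sequences of 0/1 voxels, and it assumes the volume estimation error is the absolute volume difference relative to the reference volume (the abstract does not spell out its definition). The names `dice_coefficient`, `volume_error_percent` and `voxel_volume_ml` are hypothetical.

```python
def dice_coefficient(pred, truth):
    """Dice = 2 * |P and T| / (|P| + |T|) for two equal-length binary masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Two empty masks agree perfectly; define Dice as 1.0 in that case.
    return 1.0 if total == 0 else 2.0 * intersection / total


def volume_error_percent(pred, truth, voxel_volume_ml=1.0):
    """Absolute volume difference as a percentage of the reference volume.

    Assumption: this relative-error definition is chosen for illustration.
    """
    pred_volume = sum(1 for p in pred if p) * voxel_volume_ml
    true_volume = sum(1 for t in truth if t) * voxel_volume_ml
    if true_volume == 0:
        raise ValueError("reference volume is zero; relative error undefined")
    return 100.0 * abs(pred_volume - true_volume) / true_volume


# Toy masks: the prediction covers 2 voxels, the reference covers 1 of them.
predicted = [1, 1, 0, 0]
reference = [1, 0, 0, 0]
print(dice_coefficient(predicted, reference))
print(volume_error_percent(predicted, reference))
```

A Dice score compares voxel-wise overlap, while the volume error compares only totals, so a prediction can have a small volume error and still a modest Dice score if the segmented region is misplaced.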
Submitted 22 June, 2024;
originally announced June 2024.
-
Data Issues in Industrial AI System: A Meta-Review and Research Strategy
Authors:
Xuejiao Li,
Cheng Yang,
Charles Møller,
Jay Lee
Abstract:
In the era of Industry 4.0, artificial intelligence (AI) is assuming an increasingly pivotal role within industrial systems. Despite the recent trend within various industries to adopt AI, the actual adoption of AI is not as developed as perceived. A significant factor contributing to this lag is the data issues in AI implementation. How to address these data issues stands as a significant concern confronting both industry and academia. To address data issues, the first step involves mapping out these issues. Therefore, this study conducts a meta-review to explore data issues and methods within the implementation of industrial AI. Seventy-two data issues are identified and categorized into various stages of the data lifecycle, including data source and collection, data access and storage, data integration and interoperation, data pre-processing, data processing, data security and privacy, and AI technology adoption. Subsequently, the study analyzes the data requirements of various AI algorithms. Building on the aforementioned analyses, it proposes a data management framework, addressing how data issues can be systematically resolved at every stage of the data lifecycle. Finally, the study highlights future research directions. In doing so, this study enriches the existing body of knowledge and provides guidelines for professionals navigating the complex landscape of achieving data usability and usefulness in industrial AI.
Submitted 22 June, 2024;
originally announced June 2024.
-
I Experienced More than 10 DeFi Scams: On DeFi Users' Perception of Security Breaches and Countermeasures
Authors:
Mingyi Liu,
Jun Ho Huh,
HyungSeok Han,
Jaehyuk Lee,
Jihae Ahn,
Frank Li,
Hyoungshick Kim,
Taesoo Kim
Abstract:
Decentralized Finance (DeFi) offers a whole new investment experience and has quickly emerged as an enticing alternative to Centralized Finance (CeFi). Rapidly growing market size and active users, however, have also made DeFi a lucrative target for scams and hacks, with 1.95 billion USD lost in 2023. Unfortunately, no prior research thoroughly investigates DeFi users' security risk awareness levels and the adequacy of their risk mitigation strategies.
Based on a semi-structured interview study (N = 14) and a follow-up survey (N = 493), this paper investigates DeFi users' security perceptions and commonly adopted practices, and how those affected by previous scams or hacks (DeFi victims) respond and try to recover their losses. Our analysis shows that users often prefer DeFi over CeFi due to its decentralized nature and strong profitability. Despite being aware that DeFi, compared to CeFi, is prone to more severe attacks, users are willing to take those risks to explore new investment opportunities. Worryingly, most victims do not learn from previous experiences; unlike victims studied in traditional systems, DeFi victims tend to find new services, without revising their security practices, to recover their losses quickly. The abundance of various DeFi services and opportunities allows victims to continuously explore new financial opportunities, and this reality seems to cloud their security priorities. Indeed, our results indicate that DeFi users' strong financial motivations outweigh their security concerns - much like those who are addicted to gambling. Our observations about victims' post-incident behaviors suggest that stronger control in the form of industry regulations would be necessary to protect DeFi users from future breaches.
Submitted 21 June, 2024;
originally announced June 2024.
-
DataFreeShield: Defending Adversarial Attacks without Training Data
Authors:
Hyeyoon Lee,
Kanghyun Choi,
Dain Kwon,
Sunjong Park,
Mayoore Selvarasa Jaiswal,
Noseong Park,
Jonghyun Choi,
Jinho Lee
Abstract:
Recent advances in adversarial robustness rely on an abundant set of training data, where using external or additional datasets has become a common setting. However, in real life, the training data is often kept private for security and privacy issues, while only the pretrained weight is available to the public. In such scenarios, existing methods that assume accessibility to the original data become inapplicable. Thus we investigate the pivotal problem of data-free adversarial robustness, where we try to achieve adversarial robustness without accessing any real data. Through a preliminary study, we highlight the severity of the problem by showing that robustness without the original dataset is difficult to achieve, even with similar domain datasets. To address this issue, we propose DataFreeShield, which tackles the problem from two perspectives: surrogate dataset generation and adversarial training using the generated data. Through extensive validation, we show that DataFreeShield outperforms baselines, demonstrating that the proposed method sets the first entirely data-free solution for the adversarial robustness problem.
Submitted 21 June, 2024;
originally announced June 2024.
-
Rethinking Pruning Large Language Models: Benefits and Pitfalls of Reconstruction Error Minimization
Authors:
Sungbin Shin,
Wonpyo Park,
Jaeho Lee,
Namhoon Lee
Abstract:
This work suggests fundamentally rethinking the current practice of pruning large language models (LLMs). The way it is done is by divide and conquer: split the model into submodels, sequentially prune them, and reconstruct predictions of the dense counterparts on small calibration data one at a time; the final model is obtained simply by putting the resulting sparse submodels together. While this approach enables pruning under memory constraints, it generates high reconstruction errors. In this work, we first present an array of reconstruction techniques that can significantly reduce this error by more than $90\%$. Unwittingly, however, we discover that minimizing reconstruction error is not always ideal and can overfit the given calibration data, resulting in rather increased language perplexity and poor performance at downstream tasks. We find out that a strategy of self-generating calibration data can mitigate this trade-off between reconstruction and generalization, suggesting new directions in the presence of both benefits and pitfalls of reconstruction for pruning LLMs.
Submitted 21 June, 2024;
originally announced June 2024.
-
Minimal grid diagrams of the prime alternating knots with 13 crossings
Authors:
Hwa Jeong Lee,
Alexander Stoimenow,
Gyo Taek Jin
Abstract:
A knot is a closed loop in space without self-intersection. Two knots are equivalent if there is a self homeomorphism of space bringing one onto the other. An arc presentation is an embedding of a knot in the union of finitely many half planes with a common boundary line such that each half plane contains a simple arc of the knot. The minimal number of such half planes among all arc presentations of a given knot is called the arc index of the knot. A knot is usually presented as a planar diagram with finitely many crossings of two strands where one of the strands goes over the other. A grid diagram is a planar diagram which is a non-simple rectilinear polygon such that vertical edges always cross over horizontal edges at all crossings. It is easily seen that an arc presentation gives rise to a grid diagram and vice versa. It is known that the arc index of an alternating knot is two plus its minimal crossing number. There are 4878 prime alternating knots with minimal crossing number 13. We obtained minimal arc presentations of them in the form of grid diagrams having 15 vertical segments. This is a continuation of the works on prime alternating knots of 11 crossings and 12 crossings.
Submitted 31 March, 2024;
originally announced June 2024.
-
Cognitive Map for Language Models: Optimal Planning via Verbally Representing the World Model
Authors:
Doyoung Kim,
Jongwon Lee,
Jinho Park,
Minjoon Seo
Abstract:
Language models have demonstrated impressive capabilities across various natural language processing tasks, yet they struggle with planning tasks requiring multi-step simulations. Inspired by human cognitive processes, this paper investigates the optimal planning power of language models that can construct a cognitive map of a given environment. Our experiments demonstrate that a cognitive map significantly enhances both optimal and reachable planning generation ability in the Gridworld path planning task. We observe that our method showcases two key characteristics similar to human cognition: generalization of its planning ability to extrapolated environments and rapid adaptation with limited training data. We hope our findings in the Gridworld task provide insights into modeling human cognitive processes in language models, potentially leading to the development of more advanced and robust systems that better resemble human cognition.
Submitted 21 June, 2024;
originally announced June 2024.
-
Ring-LWE based encrypted controller with unlimited number of recursive multiplications and effect of error growth
Authors:
Yeongjun Jang,
Joowon Lee,
Seonhong Min,
Hyesun Kwak,
Junsoo Kim,
Yongsoo Song
Abstract:
In this paper, we propose a method to encrypt linear dynamic controllers that enables an unlimited number of recursive homomorphic multiplications on a Ring Learning With Errors (Ring-LWE) based cryptosystem without bootstrapping. Unlike LWE based schemes, where a scalar error is injected during encryption for security, Ring-LWE based schemes are based on polynomial rings and inject error as a polynomial having multiple error coefficients. Such errors accumulate under recursive homomorphic operations, and it has been shown that their effect can be suppressed by the closed-loop stability when dynamic controllers are encrypted using LWE based schemes. We show that this also holds for the proposed controller encrypted using a Ring-LWE based scheme. Specifically, only the constant terms of the error polynomials affect the control performance, and their effect can be arbitrarily bounded even when the noneffective terms diverge. Furthermore, a novel packing algorithm is applied, resulting in reduced computation time and enhanced memory efficiency. Simulation results demonstrate the effectiveness of the proposed method.
Submitted 20 June, 2024;
originally announced June 2024.
-
CityNav: Language-Goal Aerial Navigation Dataset with Geographic Information
Authors:
Jungdae Lee,
Taiki Miyanishi,
Shuhei Kurita,
Koya Sakamoto,
Daichi Azuma,
Yutaka Matsuo,
Nakamasa Inoue
Abstract:
Vision-and-language navigation (VLN) aims to guide autonomous agents through real-world environments by integrating visual and linguistic cues. While substantial progress has been made in understanding these interactive modalities in ground-level navigation, aerial navigation remains largely underexplored. This is primarily due to the scarcity of resources suitable for real-world, city-scale aerial navigation studies. To bridge this gap, we introduce CityNav, a new dataset for language-goal aerial navigation using a 3D point cloud representation from real-world cities. CityNav includes 32,637 natural language descriptions paired with human demonstration trajectories, collected from participants via a new web-based 3D simulator developed for this research. Each description specifies a navigation goal, leveraging the names and locations of landmarks within real-world cities. We also provide baseline models of navigation agents that incorporate an internal 2D spatial map representing landmarks referenced in the descriptions. We benchmark the latest aerial navigation baselines and our proposed model on the CityNav dataset. The results using this dataset reveal the following key findings: (i) Our aerial agent models trained on human demonstration trajectories outperform those trained on shortest path trajectories, highlighting the importance of human-driven navigation strategies; (ii) The integration of a 2D spatial map significantly enhances navigation efficiency at city scale. Our dataset and code are available at https://water-cookie.github.io/city-nav-proj/
Submitted 20 June, 2024;
originally announced June 2024.
-
CONMOD: Controllable Neural Frame-based Modulation Effects
Authors:
Gyubin Lee,
Hounsu Kim,
Junwon Lee,
Juhan Nam
Abstract:
Deep learning models have seen widespread use in modelling LFO-driven audio effects, such as phaser and flanger. Although existing neural architectures exhibit high-quality emulation of individual effects, they do not possess the capability to manipulate the output via control parameters. To address this issue, we introduce Controllable Neural Frame-based Modulation Effects (CONMOD), a single black-box model which emulates various LFO-driven effects in a frame-wise manner, offering control over LFO frequency and feedback parameters. Additionally, the model is capable of learning the continuous embedding space of two distinct phaser effects, enabling us to steer between effects and achieve creative outputs. Our model outperforms previous work while possessing both controllability and universality, presenting opportunities to enhance creativity in modern LFO-driven audio effects.
Submitted 19 June, 2024;
originally announced June 2024.
-
Impact of Internal Dust Correction on the Stellar Populations of Galaxies Estimated Using the Full Spectrum Fitting
Authors:
Joon Hyeop Lee,
Hyunjin Jeong,
Jiwon Chung,
Mina Pak,
Sree Oh
Abstract:
Full spectrum fitting is a powerful tool for estimating the stellar populations of galaxies, but the fitting results are often significantly influenced by internal dust attenuation. For understanding how the choice of the internal dust correction method affects the detailed stellar populations estimated from the full spectrum fitting, we analyze the Sydney-Australian Astronomical Observatory Multi-object Integral field spectrograph (SAMI) galaxy survey data using the Penalized PiXel-Fitting (PPXF) package. Three choices are compared: (Choice-1) using the PPXF reddening option, (Choice-2) using the multiplicative Legendre polynomial, and (Choice-3) using none of them (no dust correction). In any case, the total mean stellar populations show reasonable mass-age and mass-metallicity relations (MTR and MZR), although the correlations appear to be strongest for Choice-1 (MTR) and Choice-2 (MZR). When we compare the age-divided mean stellar populations, the MZR of young (< 10^9.5 yr ~ 3.2 Gyr) stellar components in Choice-2 is consistent with the gas-phase MZR, whereas those in the other two choices hardly are. On the other hand, the MTR of old (>= 10^9.5 yr) stellar components in Choice-1 seems to be more reasonable than that in Choice-2, because the old stellar components in low-mass galaxies tend to be relatively younger than those in massive galaxies. Based on the results, we provide empirical guidelines for choosing the optimal options for dust correction.
Submitted 19 June, 2024;
originally announced June 2024.
-
The Powell Conjecture in genus four
Authors:
Sangbum Cho,
Yuya Koda,
Jung Hoon Lee
Abstract:
The Powell Conjecture states that four specific elements suffice to generate the Goeritz group of the Heegaard splitting of the $3$-sphere. We show that this conjecture is true when the genus of the splitting is four.
Submitted 19 June, 2024;
originally announced June 2024.
-
Can Long-Context Language Models Subsume Retrieval, RAG, SQL, and More?
Authors:
Jinhyuk Lee,
Anthony Chen,
Zhuyun Dai,
Dheeru Dua,
Devendra Singh Sachan,
Michael Boratko,
Yi Luan,
Sébastien M. R. Arnold,
Vincent Perot,
Siddharth Dalmia,
Hexiang Hu,
Xudong Lin,
Panupong Pasupat,
Aida Amini,
Jeremy R. Cole,
Sebastian Riedel,
Iftekhar Naim,
Ming-Wei Chang,
Kelvin Guu
Abstract:
Long-context language models (LCLMs) have the potential to revolutionize our approach to tasks traditionally reliant on external tools like retrieval systems or databases. Leveraging LCLMs' ability to natively ingest and process entire corpora of information offers numerous advantages. It enhances user-friendliness by eliminating the need for specialized knowledge of tools, provides robust end-to-end modeling that minimizes cascading errors in complex pipelines, and allows for the application of sophisticated prompting techniques across the entire system. To assess this paradigm shift, we introduce LOFT, a benchmark of real-world tasks requiring context up to millions of tokens designed to evaluate LCLMs' performance on in-context retrieval and reasoning. Our findings reveal LCLMs' surprising ability to rival state-of-the-art retrieval and RAG systems, despite never having been explicitly trained for these tasks. However, LCLMs still face challenges in areas like compositional reasoning that are required in SQL-like tasks. Notably, prompting strategies significantly influence performance, emphasizing the need for continued research as context lengths grow. Overall, LOFT provides a rigorous testing ground for LCLMs, showcasing their potential to supplant existing paradigms and tackle novel tasks as model capabilities scale.
Submitted 18 June, 2024;
originally announced June 2024.
-
Tender: Accelerating Large Language Models via Tensor Decomposition and Runtime Requantization
Authors:
Jungi Lee,
Wonbeom Lee,
Jaewoong Sim
Abstract:
Large language models (LLMs) demonstrate outstanding performance in various tasks in machine learning and have thus become one of the most important workloads in today's computing landscape. However, deploying LLM inference poses challenges due to the high compute and memory requirements stemming from the enormous model size and the difficulty of running it in the integer pipelines. In this paper, we present Tender, an algorithm-hardware co-design solution that enables efficient deployment of LLM inference at low precision. Based on our analysis of outlier values in LLMs, we propose a decomposed quantization technique in which the scale factors of decomposed matrices are powers of two apart. The proposed scheme allows us to avoid explicit requantization (i.e., dequantization/quantization) when accumulating the partial sums from the decomposed matrices, with a minimal extension to the commodity tensor compute hardware. Our evaluation shows that Tender achieves higher accuracy and inference performance compared to the state-of-the-art methods while also being significantly less intrusive to the existing accelerators.
Submitted 16 June, 2024;
originally announced June 2024.
-
Meent: Differentiable Electromagnetic Simulator for Machine Learning
Authors:
Yongha Kim,
Anthony W. Jung,
Sanmun Kim,
Kevin Octavian,
Doyoung Heo,
Chaejin Park,
Jeongmin Shin,
Sunghyun Nam,
Chanhyung Park,
Juho Park,
Sangjun Han,
Jinmyoung Lee,
Seolho Kim,
Min Seok Jang,
Chan Y. Park
Abstract:
Electromagnetic (EM) simulation plays a crucial role in analyzing and designing devices with sub-wavelength scale structures such as solar cells, semiconductor devices, image sensors, future displays and integrated photonic devices. Specifically, optics problems such as estimating semiconductor device structures and designing nanophotonic devices provide intriguing research topics with far-reaching real-world impact. Traditional algorithms for such tasks require iteratively refining parameters through simulations, which often yield sub-optimal results due to the high computational cost of both the algorithms and EM simulations. Machine learning (ML) emerged as a promising candidate to mitigate these challenges, and the optics research community has increasingly adopted ML algorithms to obtain results surpassing classical methods across various tasks. To foster a synergistic collaboration between the optics and ML communities, it is essential to have EM simulation software that is user-friendly for both research communities. To this end, we present Meent, an EM simulation software package that employs rigorous coupled-wave analysis (RCWA). Developed in Python and equipped with automatic differentiation (AD) capabilities, Meent serves as a versatile platform for integrating ML into optics research and vice versa. To demonstrate its utility as a research platform, we present three applications of Meent: 1) generating a dataset for training a neural operator, 2) serving as an environment for the reinforcement learning of nanophotonic device optimization, and 3) providing a solution for inverse problems with gradient-based optimizers. These applications highlight Meent's potential to advance both EM simulation and ML methodologies. The code is available at https://github.com/kc-ml2/meent with the MIT license to promote the cross-pollination of ideas among academic researchers and industry practitioners.
Submitted 11 June, 2024;
originally announced June 2024.
-
Obstructing two-torsion in the rational knot concordance group
Authors:
Jaewon Lee
Abstract:
It is well known that there are many 2-torsion elements in the classical knot concordance group. On the other hand, it is not known if there is any torsion element in the rational knot concordance group $\mathcal{C}_\mathbb{Q}$. Cha defined the algebraic rational concordance group $\mathcal{AC}_\mathbb{Q}$, an analogue of the classical algebraic concordance group, and showed that $\mathcal{AC}_\mathbb{Q}\cong\mathbb{Z}^\infty\oplus\mathbb{Z}_2^\infty\oplus\mathbb{Z}_4^\infty$. The knots that represent 2-torsions in $\mathcal{AC}_\mathbb{Q}$ potentially have order $2$ in $\mathcal{C}_\mathbb{Q}$. In this paper, we provide an obstruction for knots of order $2$ in $\mathcal{AC}_\mathbb{Q}$ from being of finite order in $\mathcal{C}_\mathbb{Q}$. Moreover, we give a family consisting of such knots that generates an infinite rank subgroup of $\mathcal{C}_\mathbb{Q}$. We also note that Cha proved that in higher dimensions, the algebraic rational concordance order is the same as the rational knot concordance order. Our obstruction is based on the localized von Neumann $ρ$-invariant.
Submitted 18 June, 2024;
originally announced June 2024.
-
Advancing Cross-Domain Generalizability in Face Anti-Spoofing: Insights, Design, and Metrics
Authors:
Hyojin Kim,
Jiyoon Lee,
Yonghyun Jeong,
Haneol Jang,
YoungJoon Yoo
Abstract:
This paper presents a novel perspective for enhancing anti-spoofing performance in zero-shot data domain generalization. Unlike traditional image classification tasks, face anti-spoofing datasets display unique generalization characteristics, necessitating novel zero-shot data domain generalization. One step forward to the previous frame-wise spoofing prediction, we introduce a nuanced metric calculation that aggregates frame-level probabilities for a video-wise prediction, to tackle the gap between the reported frame-wise accuracy and instability in real-world use-case. This approach enables the quantification of bias and variance in model predictions, offering a more refined analysis of model generalization. Our investigation reveals that simply scaling up the backbone of models does not inherently improve the mentioned instability, leading us to propose an ensembled backbone method from a Bayesian perspective. The probabilistically ensembled backbone both improves model robustness measured from the proposed metric and spoofing accuracy, and also leverages the advantages of measuring uncertainty, allowing for enhanced sampling during training that contributes to model generalization across new datasets. We evaluate the proposed method from the benchmark OMIC dataset and also the public CelebA-Spoof and SiW-Mv2. Our final model outperforms existing state-of-the-art methods across the datasets, showcasing advancements in Bias, Variance, HTER, and AUC metrics.
Submitted 18 June, 2024;
originally announced June 2024.
-
Fast Global Localization on Neural Radiance Field
Authors:
Mangyu Kong,
Seongwon Lee,
Jaewon Lee,
Euntai Kim
Abstract:
Neural Radiance Fields (NeRF) presented a novel way to represent scenes, allowing for high-quality 3D reconstruction from 2D images. Following its remarkable achievements, global localization within NeRF maps is an essential task for enabling a wide range of applications. Recently, Loc-NeRF demonstrated a localization approach that combines traditional Monte Carlo Localization with NeRF, showing promising results for using NeRF as an environment map. However, despite its advancements, Loc-NeRF encounters the challenge of a time-intensive ray rendering process, which can be a significant limitation in practical applications. To address this issue, we introduce Fast Loc-NeRF, which leverages a coarse-to-fine approach to enable more efficient and accurate NeRF map-based global localization. Specifically, Fast Loc-NeRF matches rendered pixels and observed images at multiple resolutions, from low to high. As a result, it speeds up the costly particle update process while maintaining precise localization results. Additionally, to reject the abnormal particles, we propose particle rejection weighting, which estimates the uncertainty of particles by exploiting NeRF's characteristics and considers them in the particle weighting process. Our Fast Loc-NeRF sets new state-of-the-art localization performance on several benchmarks, demonstrating its accuracy and efficiency.
Submitted 17 June, 2024;
originally announced June 2024.
-
Prefixing Attention Sinks can Mitigate Activation Outliers for Large Language Model Quantization
Authors:
Seungwoo Son,
Wonpyo Park,
Woohyun Han,
Kyuyeun Kim,
Jaeho Lee
Abstract:
Despite recent advances in LLM quantization, activation quantization remains challenging due to activation outliers. Conventional remedies, e.g., mixing precisions for different channels, introduce extra overhead and reduce the speedup. In this work, we develop a simple yet effective strategy to facilitate per-tensor activation quantization by preventing the generation of problematic tokens. Precisely, we propose a method to find a key-value cache, coined CushionCache, which mitigates outliers in subsequent tokens when inserted as a prefix. CushionCache works in two steps: First, we greedily search for a prompt token sequence that minimizes the maximum activation values in subsequent tokens. Then, we further tune the token cache to regularize the activations of subsequent tokens to be more quantization-friendly. The proposed method successfully addresses activation outliers of LLMs, providing a substantial performance boost for per-tensor activation quantization methods. We thoroughly evaluate our method over a wide range of models and benchmarks and find that it significantly surpasses the established baseline of per-tensor W8A8 quantization and can be seamlessly integrated with the recent activation quantization method.
Submitted 17 June, 2024;
originally announced June 2024.