Search | arXiv e-print repository

doi 10.1021/acsenergylett.3c02296

Highly-Sensitive Resonance-Enhanced Organic Photodetectors for Shortwave Infrared Sensing

Authors: Hoang Mai Luong, Chokchai Kaiyasuan, Ahra Yi, Sangmin Chae, Brian Minki Kim, Patchareepond Panoy, Hyo Jung Kim, Vinich Promarak, Yasuo Miyata, Hidenori Nakayama, Thuc-Quyen Nguyen

Abstract: Shortwave infrared (SWIR) has various applications, including night vision, remote sensing, and medical imaging. SWIR organic photodetectors (OPDs) offer advantages such as flexibility, cost-effectiveness, and tunable properties; however, lower sensitivity and limited spectral coverage compared to inorganic counterparts are major drawbacks. Here, we propose a simple yet effective and widely applicable strategy to extend the wavelength detection range of OPDs to longer wavelengths, using a resonant optical microcavity. We demonstrate a proof-of-concept in the PTB7-Th:COTIC-4F blend system, achieving external quantum efficiency (EQE) > 50 % over a broad spectrum 450 - 1100 nm with a peak specific detectivity (D*) of 1.1E13 Jones at 1100 nm, while cut-off bandwidth, speed, and linearity are preserved. By employing a novel small-molecule acceptor IR6, a record high EQE = 35 % and D* = 4.1E12 Jones are obtained at 1150 nm. This research emphasizes the importance of optical design in optoelectronic devices, presenting a considerably simpler method to expand the photodetection range compared to a traditional approach that involves developing absorbers with narrow optical gaps.

Submitted 13 September, 2023; originally announced September 2023.
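
The EQE figures quoted in the abstract above can be related to responsivity through the standard relation R = EQE · q · λ / (h · c). The following is a minimal sketch of that arithmetic, not the authors' analysis code. The function names are hypothetical, and the shot-noise-limited detectivity helper assumes dark-current shot noise is the only noise source, which the abstract does not state.

```python
# Physical constants (exact SI values).
PLANCK_H = 6.62607015e-34   # Planck constant, J*s
LIGHT_C = 2.99792458e8      # speed of light, m/s
ELEM_Q = 1.602176634e-19    # elementary charge, C


def responsivity_a_per_w(eqe: float, wavelength_nm: float) -> float:
    """Responsivity R = EQE * q * lambda / (h * c), in A/W.

    eqe is a fraction between 0 and 1, not a percentage.
    """
    wavelength_m = wavelength_nm * 1e-9
    return eqe * ELEM_Q * wavelength_m / (PLANCK_H * LIGHT_C)


def shot_noise_detectivity(responsivity: float, dark_current_density: float) -> float:
    """Shot-noise-limited D* = R / sqrt(2 * q * J_d), in Jones.

    Assumption: noise is dominated by dark-current shot noise.
    dark_current_density (J_d) must be given in A/cm^2.
    """
    return responsivity / (2.0 * ELEM_Q * dark_current_density) ** 0.5


# EQE values taken from the abstract (50 % is quoted there as a lower bound).
r_1100 = responsivity_a_per_w(0.50, 1100.0)
r_1150 = responsivity_a_per_w(0.35, 1150.0)
print(f"R(1100 nm) = {r_1100:.3f} A/W, R(1150 nm) = {r_1150:.3f} A/W")
```

The shot-noise limit is an upper bound on D*, so values measured from an actual noise spectrum can come out lower.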

arXiv:2309.03406 [pdf, other]

Distribution-Aware Prompt Tuning for Vision-Language Models

Authors: Eulrang Cho, Jooyeon Kim, Hyunwoo J. Kim

Abstract: Pre-trained vision-language models (VLMs) have shown impressive performance on various downstream tasks by utilizing knowledge learned from large data. In general, the performance of VLMs on target tasks can be further improved by prompt tuning, which adds context to the input image or text. By leveraging data from target tasks, various prompt-tuning methods have been studied in the literature. A key to prompt tuning is the feature space alignment between two modalities via learnable vectors with model parameters fixed. We observed that the alignment becomes more effective when embeddings of each modality are "well-arranged" in the latent space. Inspired by this observation, we proposed distribution-aware prompt tuning (DAPT) for vision-language models, which is simple yet effective. Specifically, the prompts are learned by maximizing inter-dispersion, the distance between classes, as well as minimizing the intra-dispersion measured by the distance between embeddings from the same class. Our extensive experiments on 11 benchmark datasets demonstrate that our method significantly improves generalizability. The code is available at https://github.com/mlvlab/DAPT.

Submitted 6 September, 2023; originally announced September 2023.

Comments: Accepted to ICCV2023
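
The two dispersion terms named in the DAPT abstract can be illustrated on a toy set of labelled embeddings. This is a minimal sketch under stated assumptions, not the released DAPT code: it uses Euclidean distance, measures inter-dispersion between class centroids and intra-dispersion from each embedding to its own centroid, and combines them as `intra - inter`. The function names and the exact form of the loss are hypothetical.

```python
import math
from collections import defaultdict


def _mean(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def _dist(a, b):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dispersion_terms(embeddings, labels):
    """Return (inter, intra) for labelled embeddings.

    inter: mean pairwise distance between class centroids.
    intra: mean distance from each embedding to its own class centroid.
    """
    by_class = defaultdict(list)
    for vec, label in zip(embeddings, labels):
        by_class[label].append(vec)
    if len(by_class) < 2:
        raise ValueError("need at least two classes to measure inter-dispersion")

    centroids = {c: _mean(vs) for c, vs in by_class.items()}
    classes = sorted(centroids)
    pairs = [(a, b) for i, a in enumerate(classes) for b in classes[i + 1:]]

    inter = sum(_dist(centroids[a], centroids[b]) for a, b in pairs) / len(pairs)
    intra = sum(_dist(v, centroids[l]) for v, l in zip(embeddings, labels)) / len(embeddings)
    return inter, intra


def dispersion_loss(embeddings, labels):
    """Toy objective: lower is better (tight classes, far-apart centroids)."""
    inter, intra = dispersion_terms(embeddings, labels)
    return intra - inter


toy_embeddings = [[0.0, 0.0], [0.0, 2.0], [4.0, 0.0], [4.0, 2.0]]
toy_labels = [0, 0, 1, 1]
inter, intra = dispersion_terms(toy_embeddings, toy_labels)
```

In a prompt-tuning setting the embeddings would come from the frozen encoders and only the prompt vectors would receive gradients. That part is omitted here.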

arXiv:2308.14971 [pdf, other]

doi 10.1007/s12555-022-0555-0

Distributed multi-agent target search and tracking with Gaussian process and reinforcement learning

Authors: Jigang Kim, Dohyun Jang, H. ** Kim

Abstract: Deploying multiple robots for target search and tracking has many practical applications, yet the challenge of planning over unknown or partially known targets remains difficult to address. With recent advances in deep learning, intelligent control techniques such as reinforcement learning have enabled agents to learn autonomously from environment interactions with little to no prior knowledge. Such methods can address the exploration-exploitation tradeoff of planning over unknown targets in a data-driven manner, eliminating the reliance on heuristics typical of traditional approaches and streamlining the decision-making pipeline with end-to-end training. In this paper, we propose a multi-agent reinforcement learning technique with target map building based on distributed Gaussian process. We leverage the distributed Gaussian process to encode belief over the target locations and efficiently plan over unknown targets. We evaluate the performance and transferability of the trained policy in simulation and demonstrate the method on a swarm of micro unmanned aerial vehicles with hardware experiments.

Submitted 28 August, 2023; originally announced August 2023.

Comments: 10 pages, 6 figures; preprint submitted to IJCAS; first two authors contributed equally

Journal ref: International Journal of Control, Automation, and Systems 2023 21(9): 3057-3067

arXiv:2308.14960 [pdf, other]

Read-only Prompt Optimization for Vision-Language Few-shot Learning

Authors: Dongjun Lee, Seokwon Song, Jihee Suh, Joonmyung Choi, Sanghyeok Lee, Hyunwoo J. Kim

Abstract: In recent years, prompt tuning has proven effective in adapting pre-trained vision-language models to downstream tasks. These methods aim to adapt the pre-trained models by introducing learnable prompts while keeping pre-trained weights frozen. However, learnable prompts can affect the internal representation within the self-attention module, which may negatively impact performance variance and generalization, especially in data-deficient settings. To address these issues, we propose a novel approach, Read-only Prompt Optimization (RPO). RPO leverages masked attention to prevent the internal representation shift in the pre-trained model. Further, to facilitate the optimization of RPO, the read-only prompts are initialized based on special tokens of the pre-trained model. Our extensive experiments demonstrate that RPO outperforms CLIP and CoCoOp in base-to-new generalization and domain generalization while displaying better robustness. Also, the proposed method achieves better generalization on extremely data-deficient settings, while improving parameter efficiency and computational overhead. Code is available at https://github.com/mlvlab/RPO.

Submitted 9 November, 2023; v1 submitted 28 August, 2023; originally announced August 2023.

Comments: Accepted at ICCV2023
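
The masked-attention idea in the RPO abstract can be sketched as a boolean attention mask in which the original tokens never attend to the appended prompts, so their representations are unchanged, while the prompts may read from every position. This is a minimal illustration, not the released RPO implementation. The sequence layout (original tokens first, prompts last), the choice to let prompts attend to each other, and the function names are assumptions.

```python
import math


def read_only_mask(num_tokens, num_prompts):
    """Build mask[q][k]: True if query position q may attend to key position k.

    Positions 0..num_tokens-1 are the original tokens; the remaining
    positions are the appended read-only prompts (assumed layout).
    """
    total = num_tokens + num_prompts
    mask = []
    for q in range(total):
        if q < num_tokens:
            # Original tokens see only original tokens, never the prompts.
            row = [k < num_tokens for k in range(total)]
        else:
            # Prompts may read from all positions.
            row = [True] * total
        mask.append(row)
    return mask


def masked_softmax(scores, allowed):
    """Softmax over scores, with disallowed positions given zero weight."""
    masked = [s if ok else -math.inf for s, ok in zip(scores, allowed)]
    peak = max(masked)  # finite, since every row allows at least one key
    exps = [math.exp(s - peak) for s in masked]
    total = sum(exps)
    return [e / total for e in exps]


mask = read_only_mask(num_tokens=3, num_prompts=2)
# Attention weights for original token 0: the large prompt scores are ignored.
weights = masked_softmax([1.0, 2.0, 3.0, 9.0, 9.0], mask[0])
```

Because the prompt columns are masked out for every original-token row, the outputs at the original positions match those of the unmodified model, whatever values the prompts take.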

arXiv:2308.13561 [pdf, other]

Project Aria: A New Tool for Egocentric Multi-Modal AI Research

Authors: Jakob Engel, Kiran Somasundaram, Michael Goesele, Albert Sun, Alexander Gamino, Andrew Turner, Arjang Talattof, Arnie Yuan, Bilal Souti, Brighid Meredith, Cheng Peng, Chris Sweeney, Cole Wilson, Dan Barnes, Daniel DeTone, David Caruso, Derek Valleroy, Dinesh Ginjupalli, Duncan Frost, Edward Miller, Elias Mueggler, Evgeniy Oleinik, Fan Zhang, Guruprasad Somasundaram, Gustavo Solaira, et al. (49 additional authors not shown)

Abstract: Egocentric, multi-modal data as available on future augmented reality (AR) devices provides unique challenges and opportunities for machine perception. These future devices will need to be all-day wearable in a socially acceptable form-factor to support always available, context-aware and personalized AI applications. Our team at Meta Reality Labs Research built the Aria device, an egocentric, multi-modal data recording and streaming device with the goal to foster and accelerate research in this area. In this paper, we describe the Aria device hardware including its sensor configuration and the corresponding software tools that enable recording and processing of such data.

Submitted 1 October, 2023; v1 submitted 24 August, 2023; originally announced August 2023.

arXiv:2308.11920 [pdf, other]

Concept Bottleneck with Visual Concept Filtering for Explainable Medical Image Classification

Authors: Injae Kim, Jongha Kim, Joonmyung Choi, Hyunwoo J. Kim

Abstract: Interpretability is a crucial factor in building reliable models for various medical applications. Concept Bottleneck Models (CBMs) enable interpretable image classification by utilizing human-understandable concepts as intermediate targets. Unlike conventional methods that require extensive human labor to construct the concept set, recent works leveraging Large Language Models (LLMs) for generating concepts made automatic concept generation possible. However, those methods do not consider whether a concept is visually relevant or not, which is an important factor in computing meaningful concept scores. Therefore, we propose a visual activation score that measures whether the concept contains visual cues or not, which can be easily computed with unlabeled image data. Computed visual activation scores are then used to filter out the less visible concepts, thus resulting in a final concept set with visually meaningful concepts. Our experimental results show that adopting the proposed visual activation score for concept filtering consistently boosts performance compared to the baseline. Moreover, qualitative analyses also validate that visually relevant concepts are successfully selected with the visual activation score.

Submitted 23 August, 2023; originally announced August 2023.

Comments: Accepted to MedAGI Workshop at MICCAI 2023 (Oral Presentation)

arXiv:2308.11916 [pdf, other]

Semantic-Aware Implicit Template Learning via Part Deformation Consistency

Authors: Sihyeon Kim, Minseok Joo, Jaewon Lee, Juyeon Ko, Juhan Cha, Hyunwoo J. Kim

Abstract: Learning implicit templates as neural fields has recently shown impressive performance in unsupervised shape correspondence. Despite the success, we observe current approaches, which solely rely on geometric information, often learn suboptimal deformation across generic object shapes, which have high structural variability. In this paper, we highlight the importance of part deformation consistency and propose a semantic-aware implicit template learning framework to enable semantically plausible deformation. By leveraging semantic prior from a self-supervised feature extractor, we suggest local conditioning with novel semantic-aware deformation code and deformation consistency regularizations regarding part deformation, global deformation, and global scaling. Our extensive experiments demonstrate the superiority of the proposed method over baselines in various tasks: keypoint transfer, part label transfer, and texture transfer. More interestingly, our framework shows a larger performance gain under more challenging settings. We also provide qualitative analyses to validate the effectiveness of semantic-aware deformation. The code is available at https://github.com/mlvlab/PDC.

Submitted 23 August, 2023; originally announced August 2023.

Comments: ICCV camera-ready version

arXiv:2308.09363 [pdf, other]

Open-vocabulary Video Question Answering: A New Benchmark for Evaluating the Generalizability of Video Question Answering Models

Authors: Dohwan Ko, Ji Soo Lee, Miso Choi, Jaewon Chu, Jihwan Park, Hyunwoo J. Kim

Abstract: Video Question Answering (VideoQA) is a challenging task that entails complex multi-modal reasoning. In contrast to multiple-choice VideoQA which aims to predict the answer given several options, the goal of open-ended VideoQA is to answer questions without restricting candidate answers. However, the majority of previous VideoQA models formulate open-ended VideoQA as a classification task to classify the video-question pairs into a fixed answer set, i.e., closed-vocabulary, which contains only frequent answers (e.g., top-1000 answers). This leads the model to be biased toward only frequent answers and fail to generalize on out-of-vocabulary answers. We hence propose a new benchmark, Open-vocabulary Video Question Answering (OVQA), to measure the generalizability of VideoQA models by considering rare and unseen answers. In addition, in order to improve the model's generalization power, we introduce a novel GNN-based soft verbalizer that enhances the prediction on rare and unseen answers by aggregating the information from their similar words. For evaluation, we introduce new baselines by modifying the existing (closed-vocabulary) open-ended VideoQA models and improve their performances by further taking into account rare and unseen answers. Our ablation studies and qualitative analyses demonstrate that our GNN-based soft verbalizer further improves the model performance, especially on rare and unseen answers. We hope that our benchmark OVQA can serve as a guide for evaluating the generalizability of VideoQA models and inspire future research. Code is available at https://github.com/mlvlab/OVQA.

Submitted 18 August, 2023; originally announced August 2023.

Comments: Accepted paper at ICCV 2023

arXiv:2308.05334 [pdf, other]

Visibility-Constrained Control of Multirotor via Reference Governor

Authors: Dabin Kim, Matthias Pezzutto, Luca Schenato, H. ** Kim

Abstract: For safe vision-based control applications, perception-related constraints have to be satisfied in addition to other state constraints. In this paper, we deal with the problem where a multirotor equipped with a camera needs to maintain the visibility of a point of interest while tracking a reference given by a high-level planner. We devise a method based on reference governor that, differently from existing solutions, is able to enforce control-level visibility constraints with theoretically assured feasibility. To this end, we design a new type of reference governor for linear systems with polynomial constraints which is capable of handling time-varying references. The proposed solution is implemented online for the real-time multirotor control with visibility constraints and validated with simulations and an actual hardware experiment.

Submitted 10 August, 2023; originally announced August 2023.

Comments: 8 pages, 6 figures, Accepted to 62nd IEEE Conference on Decision and Control (CDC 2023)

arXiv:2308.04087 [pdf, other]

Safety-Critical Control under Multiple State and Input Constraints and Application to Fixed-Wing UAV

Authors: Donggeon David Oh, Dongjae Lee, H. ** Kim

Abstract: This study presents a framework to guarantee safety for a class of second-order nonlinear systems under multiple state and input constraints. To facilitate real-world applications, a safety-critical controller must consider multiple constraints simultaneously, while being able to impose general forms of constraints designed for various tasks (e.g., obstacle avoidance). With this in mind, we first devise a zeroing control barrier function (ZCBF) using a newly proposed nominal evading maneuver. By designing the nominal evading maneuver to 1) be continuously differentiable, 2) satisfy input constraints, and 3) be capable of handling other state constraints, we deduce an ultimate invariant set, a subset of the safe set that can be rendered forward invariant with admissible control inputs. Thanks to the development of the ultimate invariant set, we then propose a safety-critical controller, which is a computationally tractable one-step model predictive controller (MPC) with guaranteed recursive feasibility. We validate the proposed framework in simulation, where a fixed-wing UAV tracks a circular trajectory while satisfying multiple safety constraints including collision avoidance, bounds on flight speed and flight path angle, and input constraints.

Submitted 8 August, 2023; originally announced August 2023.

Comments: Accepted for the 2023 62nd IEEE Conference on Decision and Control (CDC)

arXiv:2307.13871 [pdf, other]

Emulating Expert Insight: A Robust Strategy for Optimal Experimental Design

Authors: Matthew R. Carbone, Hyeong ** Kim, Chandima Fernando, Shinjae Yoo, Daniel Olds, Howie Joress, Brian DeCost, Bruce Ravel, Yugang Zhang, Phillip M. Maffettone

Abstract: The challenge of optimal design of experiments (DOE) pervades materials science, physics, chemistry, and biology. Bayesian optimization has been used to address this challenge in vast sample spaces, although it requires framing experimental campaigns through the lens of maximizing some observable. This framing is insufficient for epistemic research goals that seek to comprehensively analyze a sample space, without an explicit scalar objective (e.g., the characterization of a wafer or sample library). In this work, we propose a flexible formulation of scientific value that recasts a dataset of input conditions and higher-dimensional observable data into a continuous, scalar metric. Intuitively, the scientific value function measures where observables change significantly, emulating the perspective of experts driving an experiment, and can be used in collaborative analysis tools or as an objective for optimization techniques. We demonstrate this technique by exploring simulated phase boundaries from different observables, autonomously driving a variable temperature measurement of a ferroelectric material, and providing feedback from a nanoparticle synthesis campaign. The method is seamlessly compatible with existing optimization tools, can be extended to multi-modal and multi-fidelity experiments, and can integrate existing models of an experimental system. Because of its flexibility, it can be deployed in a range of experimental settings for autonomous or accelerated experiments.

Submitted 25 July, 2023; originally announced July 2023.

arXiv:2307.09814 [pdf, other]

doi 10.1103/PhysRevD.108.092006

Search for inelastic WIMP-iodine scattering with COSINE-100

Authors: G. Adhikari, N. Carlin, J. J. Choi, S. Choi, A. C. Ezeribe, L. E. Franca, C. Ha, I. S. Hahn, S. J. Hollick, E. J. Jeon, J. H. Jo, H. W. Joo, W. G. Kang, M. Kauer, B. H. Kim, H. J. Kim, J. Kim, K. W. Kim, S. H. Kim, S. K. Kim, W. K. Kim, Y. D. Kim, Y. H. Kim, Y. J. Ko, D. H. Lee, et al. (34 additional authors not shown)

Abstract: We report the results of a search for inelastic scattering of weakly interacting massive particles (WIMPs) off $^{127}$I nuclei using NaI(Tl) crystals with a data exposure of 97.7 kg$\cdot$years from the COSINE-100 experiment. The signature of inelastic WIMP-$^{127}$I scattering is a nuclear recoil accompanied by a 57.6 keV $\gamma$-ray from the prompt deexcitation, producing a more energetic signal compared to the typical WIMP nuclear recoil signal. We found no evidence for this inelastic scattering signature and set a 90 $\%$ confidence level upper limit on the WIMP-proton spin-dependent, inelastic scattering cross section of $1.2 \times 10^{-37} {\rm cm^{2}}$ at the WIMP mass 500 ${\rm GeV/c^{2}}$.

Submitted 30 October, 2023; v1 submitted 19 July, 2023; originally announced July 2023.

Comments: 8 pages, 5 figures. arXiv admin note: text overlap with arXiv:2104.03537

Journal ref: Phys. Rev. D 108, 092006 (2023)

arXiv:2307.06216 [pdf]

An Open-Source Multi-functional Testing Platform for Optical Phase Change Materials

Authors: Cosmin-Constantin Popescu, Khoi Phuong Dao, Luigi Ranno, Brian Mills, Louis Martin, Yifei Zhang, David Bono, Brian Neltner, Tian Gu, Juejun Hu, Kiumars Aryana, William M. Humphreys, Hyun Jung Kim, Steven Vitale, Paul Miller, Christopher Roberts, Sarah Geiger, Dennis Callahan, Michael Moebius, Myungkoo Kang, Kathleen Richardson, Carlos A. Ríos Ocampo

Abstract: Owing to their unique tunable optical properties, chalcogenide phase change materials are increasingly being investigated for optics and photonics applications. However, in situ characterization of their phase transition characteristics is a capability that remains inaccessible to many researchers. In this article, we introduce a multi-functional silicon microheater platform capable of in situ measurement of structural, kinetic, optical, and thermal properties of these materials. The platform can be fabricated leveraging industry-standard silicon foundry manufacturing processes. We fully open-sourced this platform, including complete hardware design and associated software codes.

Submitted 12 July, 2023; originally announced July 2023.

Comments: 13 pages main text, 5 figures +1 (supplementary)

arXiv:2306.14425 [pdf, other]

Minimally actuated tiltrotor for perching and normal force exertion

Authors: Dongjae Lee, Sunwoo Hwang, Changhyeon Kim, Seung Jae Lee, H. ** Kim

Abstract: This study presents a new hardware design and control of a minimally actuated 5 control degrees of freedom (CDoF) quadrotor-based tiltrotor. The proposed tiltrotor possesses several characteristics distinct from those found in existing works, including: 1) minimal number of actuators for 5 CDoF, 2) large margin to generate interaction force during aerial physical interaction (APhI), and 3) no mechanical obstruction in thrust direction rotation. Thanks to these properties, the proposed tiltrotor is suitable for perching-enabled APhI since it can hover parallel to an arbitrarily oriented surface and can freely adjust its thrust direction. To fully control the 5-CDoF of the designed tiltrotor, we construct an asymptotically stabilizing controller with stability analysis. The proposed tiltrotor design and controller are validated in experiments where the first two experiments of $x,y$ position tracking and pitch tracking show controllability of the added CDoF compared to a conventional quadrotor. Finally, the last experiment of perching and cart pushing demonstrates the proposed tiltrotor's applicability to perching-enabled APhI.

Submitted 26 June, 2023; originally announced June 2023.

Comments: 7 pages, 10 figures, 2023 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) accepted

arXiv:2306.00322 [pdf, other]

doi 10.1103/PhysRevLett.131.201802

Search for Boosted Dark Matter in COSINE-100

Authors: G. Adhikari, N. Carlin, J. J. Choi, S. Choi, A. C. Ezeribe, L. E. Franca, C. Ha, I. S. Hahn, S. J. Hollick, E. J. Jeon, J. H. Jo, H. W. Joo, W. G. Kang, M. Kauer, B. H. Kim, H. J. Kim, J. Kim, K. W. Kim, S. H. Kim, S. K. Kim, W. K. Kim, Y. D. Kim, Y. H. Kim, Y. J. Ko, D. H. Lee, et al. (34 additional authors not shown)

Abstract: We search for energetic electron recoil signals induced by boosted dark matter (BDM) from the galactic center using the COSINE-100 array of NaI(Tl) crystal detectors at the Yangyang Underground Laboratory. The signal would be an excess of events with energies above 4 MeV over the well-understood background. Because no excess of events is observed in a 97.7 kg$\cdot$years exposure, we set limits on BDM interactions under a variety of hypotheses. Notably, we explored the dark photon parameter space, leading to competitive limits compared to direct dark photon search experiments, particularly for dark photon masses below 4 MeV and considering the invisible decay mode. Furthermore, by comparing our results with a previous BDM search conducted by the Super-Kamiokande experiment, we found that the COSINE-100 detector has advantages in searching for low-mass dark matter. This analysis demonstrates the potential of the COSINE-100 detector to search for MeV electron recoil signals produced by the dark sector particle interactions.

Submitted 30 October, 2023; v1 submitted 31 May, 2023; originally announced June 2023.

Comments: 7 pages, 4 figures

Journal ref: Phys. Rev. Lett. 131, 201802 (2023)

arXiv:2305.16512 [pdf, other]

doi 10.1016/j.jcis.2023.07.095

Two-dimensional assembly of gold nanoparticles grafted with charged-end-group polymers

Authors: Hyeong ** Kim, Binay P. Nayak, Honghu Zhang, Benjamin M. Ocko, Alex Travesset, David Vaknin, Surya K. Mallapragada, Wenjie Wang

Abstract: Hypothesis: Introducing charged terminal groups to polymers that graft nanoparticles enables Coulombic control over their assembly by tuning the pH and salinity of aqueous suspensions. Experiments: Gold nanoparticles (AuNPs) are grafted with poly(ethylene glycol) (PEG) terminated with CH3 (charge-neutral), COOH (negatively charged), or NH2 (positively charged) groups. The nanoparticles are characterized using dynamic light scattering, zeta-potential, and thermal gravimetric analysis. Liquid surface X-ray reflectivity (XR) and grazing incidence small-angle X-ray scattering (GISAXS) techniques are employed to determine the density profile and in-plane structure of the AuNP assembly across and on the aqueous surface. Findings: The assembly of PEG-AuNPs at the liquid/vapor interface can be tuned by adjusting pH or salinity, particularly for COOH terminals. However, the effect is less pronounced for NH2 terminals. These distinct assembly behaviors are attributed to the overall charge of PEG-AuNPs and the conformation of PEG. The COOH-PEG corona is the most compact, resulting in smaller superlattice constants. The net charge per particle depends not only on the PEG terminal groups but also on the cation sequestration of PEG and the intrinsic negative charge of the AuNP surface. NH2-PEG, due to its closeness to overall charge neutrality and the presence of hydrogen bonding, enables the assembly of NH2-PEG-AuNPs more readily.

Submitted 7 December, 2023; v1 submitted 25 May, 2023; originally announced May 2023.

Comments: Published in the Journal of Colloid and Interface Science, DOI - 10.1016/j.jcis.2023.07.095

Journal ref: J. Colloid Interface Sci. Volume 650, Part B (2023); Pages 1941-1948

arXiv:2305.14145 [pdf, other]

doi 10.1002/smll.202304145

Toward accurate thermal modeling of phase change material based photonic devices

Authors: Kiumars Aryana, Hyun Jung Kim, Cosmin-Constantin Popescu, Steven Vitale, Hyung Bin Bae, Taewoo Lee, Tian Gu, Juejun Hu

Abstract: Reconfigurable or programmable photonic devices are rapidly growing and have become an integral part of many optical systems. The ability to selectively modulate electromagnetic waves through electrical stimuli is crucial in the advancement of a variety of applications from data communication and computing devices to environmental science and space explorations. Chalcogenide-based phase change materials (PCMs) are one of the most promising material candidates for reconfigurable photonics due to their large optical contrast between their different solid-state structural phases. Although significant efforts have been devoted to accurate simulation of PCM-based devices, in this paper, we highlight three important aspects which have often evaded prior models yet have significant impacts on the thermal and phase transition behavior of these devices: the enthalpy of fusion, the heat capacity change upon glass transition, as well as the thermal conductivity of liquid-phase PCMs. We further investigated the important topic of switching energy scaling in PCM devices, which also helps explain why the three above-mentioned effects have long been overlooked in electronic PCM memories but only become important in photonics. Our findings offer insight to facilitate accurate modeling of PCM-based photonic devices and can inform the development of more efficient reconfigurable optics.

Submitted 23 May, 2023; originally announced May 2023.

arXiv:2305.09943 [pdf, other]

Demonstration-free Autonomous Reinforcement Learning via Implicit and Bidirectional Curriculum

Authors: Jigang Kim, Daesol Cho, H. Jin Kim

Abstract: While reinforcement learning (RL) has achieved great success in acquiring complex skills solely from environmental interactions, it assumes that resets to the initial state are readily available at the end of each episode. Such an assumption hinders the autonomous learning of embodied agents due to the time-consuming and cumbersome workarounds for resetting in the physical world. Hence, there has been a growing interest in autonomous RL (ARL) methods that are capable of learning from non-episodic interactions. However, existing works on ARL are limited by their reliance on prior data and are unable to learn in environments where task-relevant interactions are sparse. In contrast, we propose a demonstration-free ARL algorithm via Implicit and Bi-directional Curriculum (IBC). With an auxiliary agent that is conditionally activated upon learning progress and a bidirectional goal curriculum based on optimal transport, our method outperforms previous methods, even the ones that leverage demonstrations.

Submitted 8 June, 2023; v1 submitted 17 May, 2023; originally announced May 2023.

Comments: ICML 2023, first two authors contributed equally

arXiv:2305.07857 [pdf, other]

AURA : Automatic Mask Generator using Randomized Input Sampling for Object Removal

Authors: Changsuk Oh, Dongseok Shim, H. Jin Kim

Abstract: The objective of the image inpainting task is to fill missing regions of an image in a visually plausible way. Recently, deep-learning-based image inpainting networks have generated outstanding results, and some utilize their models as object removers by masking unwanted objects in an image. However, while trying to better remove objects using their networks, the previous works pay less attention to the importance of the input mask. In this paper, we focus on generating the input mask to better remove objects using the off-the-shelf image inpainting network. We propose an automatic mask generator inspired by the explainable AI (XAI) method, whose output can better remove objects than a semantic segmentation mask. The proposed method generates an importance map using randomly sampled input masks and quantitatively estimated scores of the completed images obtained from the random masks. The output mask is selected by a judge module among the candidate masks which are generated from the importance map. We design the judge module to quantitatively estimate the quality of the object removal results. In addition, we empirically find that the evaluation methods used in the previous works reporting object removal results are not appropriate for estimating the performance of an object remover. Therefore, we propose new evaluation metrics (FID$^*$ and U-IDS$^*$) to properly evaluate the quality of object removers. Experiments confirm that our method shows better performance in removing target class objects than the masks generated from the semantic segmentation maps, and the two proposed metrics make judgments consistent with humans.

Submitted 13 May, 2023; originally announced May 2023.

arXiv:2305.04759 [pdf, other]

Search for lepton-flavor-violating $τ^- \to \ell^-φ$ decays in 2019-2021 Belle II data

Authors: Belle II Collaboration, F. Abudinén, I. Adachi, K. Adamczyk, L. Aggarwal, P. Ahlburg, H. Ahmed, J. K. Ahn, H. Aihara, N. Akopov, A. Aloisio, L. Andricek, N. Anh Ky, D. M. Asner, H. Atmacan, V. Aulchenko, T. Aushev, V. Aushev, M. Aversano, V. Babu, S. Bacher, H. Bae, S. Bahinipati, A. M. Bakich, P. Bambade , et al. (555 additional authors not shown)

Abstract: We report a search for lepton-flavor-violating decays $τ^- \to \ell^- φ$ ($\ell^- =e^-,μ^-$) at the Belle II experiment, using a sample of electron-positron data produced at the SuperKEKB collider in 2019-2021 and corresponding to an integrated luminosity of 190 fb$^{-1}$. We use a new untagged selection for $e^+e^- \to τ^+τ^-$ events, where the signal $τ$ is searched for as a neutrinoless final state of a single charged lepton and a $φ$ meson and the other $τ$ is not reconstructed in any specific decay mode, in contrast to previous measurements by the BaBar and Belle experiments. We find no evidence for $τ^- \to \ell^- φ$ decays and obtain upper limits on the branching fractions at 90% confidence level of 23 $\times 10^{-8}$ and 9.7$\times 10^{-8}$ for $τ^- \rightarrow e^-φ$ and $τ^- \rightarrow μ^-φ$, respectively.

Submitted 16 May, 2023; v1 submitted 8 May, 2023; originally announced May 2023.

Report number: BELLE2-CONF-2023-004

arXiv:2305.01321 [pdf, other]

Observation of ${B\to D^{(*)} K^- K^{0}_S}$ decays using the 2019-2022 Belle II data sample

Authors: Belle II Collaboration, F. Abudinén, I. Adachi, K. Adamczyk, L. Aggarwal, P. Ahlburg, H. Ahmed, J. K. Ahn, H. Aihara, N. Akopov, A. Aloisio, L. Andricek, N. Anh Ky, D. M. Asner, H. Atmacan, V. Aulchenko, T. Aushev, V. Aushev, M. Aversano, V. Babu, S. Bacher, H. Bae, S. Bahinipati, A. M. Bakich, P. Bambade , et al. (555 additional authors not shown)

Abstract: We present a measurement of the branching fractions of four $B^{0,-}\to D^{(*)+,0} K^- K^{0}_S$ decay modes. The measurement is based on data from SuperKEKB electron-positron collisions at the $Υ(4S)$ resonance collected with the Belle II detector and corresponding to an integrated luminosity of ${362~\text{fb}^{-1}}$. The event yields are extracted from fits to the distributions of the difference between expected and observed $B$ meson energy to separate signal and background, and are efficiency-corrected as a function of the invariant mass of the $K^-K_S^0$ system. We find the branching fractions to be: \[ \text{B}(B^-\to D^0K^-K_S^0)=(1.89\pm 0.16\pm 0.10)\times 10^{-4}, \] \[ \text{B}(\overline B{}^0\to D^+K^-K_S^0)=(0.85\pm 0.11\pm 0.05)\times 10^{-4},\] \[ \text{B}(B^-\to D^{*0}K^-K_S^0)=(1.57\pm 0.27\pm 0.12)\times 10^{-4}, \] \[ \text{B}(\overline B{}^0\to D^{*+}K^-K_S^0)=(0.96\pm 0.18\pm 0.06)\times 10^{-4},\] where the first uncertainty is statistical and the second systematic. These results include the first observation of $\overline B{}^0\to D^+K^-K_S^0$, $B^-\to D^{*0}K^-K_S^0$, and $\overline B{}^0\to D^{*+}K^-K_S^0$ decays and a significant improvement in the precision of $\text{B}(B^-\to D^0K^-K_S^0)$ compared to previous measurements.

Submitted 2 May, 2023; originally announced May 2023.

Report number: BELLE2-CONF-2023-003

arXiv:2304.12621 [pdf, other]

Evolution of Resonant Self-interacting Dark Matter Halos

Authors: Ayuki Kamada, Hee Jung Kim

Abstract: Recent analysis on the stellar kinematics of ultra-faint dwarf (UFD) galaxies has put a stringent upper limit on the self-scattering cross section of dark matter, i.e., $σ/m<{\cal O}(0.1)\,{\rm cm^2/g}$ at the scattering velocity of ${\cal O}(10)\,{\rm km/s}$. Resonant self-interacting dark matter (rSIDM) is one possibility that can be consistent with the UFDs and explain the low central densities of rotation-supported galaxies; the cross section is resonantly enhanced to be $σ/m = {\cal O}(1)\,{\rm cm^2/g}$ around the scattering velocity of ${\cal O}(100)\,{\rm km/s}$ while being suppressed at lower velocities. To further assess this possibility, since the inferred dark matter distribution of halos from astrophysical observations is usually compared to that in constant-cross section SIDM (cSIDM), whether the structures of rSIDM halos can be approximated by the cSIDM halo profiles needs to be clarified. In this work, we employ the gravothermal fluid method to investigate the structural evolution of rSIDM halos in a wide mass range. We find that except for halos in a specific mass range, the present structures of rSIDM halos are virtually indistinguishable from those of the cSIDM halos. For halos in the specific mass range, the resonant self-scattering renders a break in their density profile. We demonstrate how such a density-profile break appears in astrophysical observations, e.g., rotation curves and line-of-sight velocity dispersion profiles. We show that for halos above the specific mass range, the density-profile break thermalizes to disappear before the present. We demonstrate that such distinctive thermalization dynamics can leave imprints on the orbital classes of stars with similar ages and metallicities.

Submitted 25 April, 2023; originally announced April 2023.

Comments: 18 pages, 9 figures

Report number: CTPU-PTC-22-19

arXiv:2304.01460 [pdf, other]

doi 10.1103/PhysRevD.108.L041301

Search for bosonic super-weakly interacting massive particles at COSINE-100

Authors: G. Adhikari, N. Carlin, J. J. Choi, S. Choi, A. C. Ezeribe, L. E. Franca, C. Ha, I. S. Hahn, S. J. Hollick, E. J. Jeon, J. H. Jo, H. W. Joo, W. G. Kang, M. Kauer, B. H. Kim, H. J. Kim, J. Kim, K. W. Kim, S. H. Kim, S. K. Kim, W. K. Kim, Y. D. Kim, Y. H. Kim, Y. J. Ko, D. H. Lee , et al. (34 additional authors not shown)

Abstract: We present results of a search for bosonic super-weakly interacting massive particles (BSW) as keV scale dark matter candidates that is based on an exposure of 97.7 kg$\cdot$year from the COSINE experiment. In this search, we employ, for the first time, Compton-like as well as absorption processes for pseudoscalar and vector BSWs. No evidence for BSWs is found in the mass range from 10 $\mathrm{keV/c}^2$ to 1 $\mathrm{MeV/c}^2$, and we present the exclusion limits on the dimensionless coupling constants to electrons $g_{ae}$ for pseudoscalar and $κ$ for vector BSWs at 90% confidence level. Our results show that these limits are improved by including the Compton-like process in masses of BSW, above $\mathcal{O}(100\,\mathrm{keV/c}^2)$.

Submitted 27 August, 2023; v1 submitted 3 April, 2023; originally announced April 2023.

Journal ref: Phys. Rev. D 108 (2023) L041301

arXiv:2303.16450 [pdf, other]

Self-positioning Point-based Transformer for Point Cloud Understanding

Authors: Jinyoung Park, Sanghyeok Lee, Sihyeon Kim, Yunyang Xiong, Hyunwoo J. Kim

Abstract: Transformers have shown superior performance on various computer vision tasks with their capabilities to capture long-range dependencies. Despite the success, it is challenging to directly apply Transformers on point clouds due to their quadratic cost in the number of points. In this paper, we present a Self-Positioning point-based Transformer (SPoTr), which is designed to capture both local and global shape contexts with reduced complexity. Specifically, this architecture consists of local self-attention and self-positioning point-based global cross-attention. The self-positioning points, adaptively located based on the input shape, consider both spatial and semantic information with disentangled attention to improve expressive power. With the self-positioning points, we propose a novel global cross-attention mechanism for point clouds, which improves the scalability of global self-attention by allowing the attention module to compute attention weights with only a small set of self-positioning points. Experiments show the effectiveness of SPoTr on three point cloud tasks such as shape classification, part segmentation, and scene segmentation. In particular, our proposed model achieves an accuracy gain of 2.6% over the previous best models on shape classification with ScanObjectNN. We also provide qualitative analyses to demonstrate the interpretability of self-positioning points. The code of SPoTr is available at https://github.com/mlvlab/SPoTr.

Submitted 29 March, 2023; originally announced March 2023.

Comments: Accepted paper at CVPR 2023

arXiv:2303.13009 [pdf, other]

MELTR: Meta Loss Transformer for Learning to Fine-tune Video Foundation Models

Authors: Dohwan Ko, Joonmyung Choi, Hyeong Kyu Choi, Kyoung-Woon On, Byungseok Roh, Hyunwoo J. Kim

Abstract: Foundation models have shown outstanding performance and generalization capabilities across domains. Since most studies on foundation models mainly focus on the pretraining phase, a naive strategy to minimize a single task-specific loss is adopted for fine-tuning. However, such fine-tuning methods do not fully leverage other losses that are potentially beneficial for the target task. Therefore, we propose MEta Loss TRansformer (MELTR), a plug-in module that automatically and non-linearly combines various loss functions to aid learning the target task via auxiliary learning. We formulate the auxiliary learning as a bi-level optimization problem and present an efficient optimization algorithm based on Approximate Implicit Differentiation (AID). For evaluation, we apply our framework to various video foundation models (UniVL, Violet and All-in-one), and show significant performance gain on all four downstream tasks: text-to-video retrieval, video question answering, video captioning, and multi-modal sentiment analysis. Our qualitative analyses demonstrate that MELTR adequately `transforms' individual loss functions and `melts' them into an effective unified loss. Code is available at https://github.com/mlvlab/MELTR.

Submitted 22 March, 2023; originally announced March 2023.

Comments: Accepted paper at CVPR 2023

arXiv:2303.10824 [pdf, other]

doi 10.1007/978-3-031-19803-8_39

k-SALSA: k-anonymous synthetic averaging of retinal images via local style alignment

Authors: Minkyu Jeon, Hyeonjin Park, Hyunwoo J. Kim, Michael Morley, Hyunghoon Cho

Abstract: The application of modern machine learning to retinal image analyses offers valuable insights into a broad range of human health conditions beyond ophthalmic diseases. Additionally, data sharing is key to fully realizing the potential of machine learning models by providing a rich and diverse collection of training data. However, the personally-identifying nature of retinal images, encompassing the unique vascular structure of each individual, often prevents this data from being shared openly. While prior works have explored image de-identification strategies based on synthetic averaging of images in other domains (e.g. facial images), existing techniques face difficulty in preserving both privacy and clinical utility in retinal images, as we demonstrate in our work. We therefore introduce k-SALSA, a generative adversarial network (GAN)-based framework for synthesizing retinal fundus images that summarize a given private dataset while satisfying the privacy notion of k-anonymity. k-SALSA brings together state-of-the-art techniques for training and inverting GANs to achieve practical performance on retinal images. Furthermore, k-SALSA leverages a new technique, called local style alignment, to generate a synthetic average that maximizes the retention of fine-grain visual patterns in the source images, thus improving the clinical utility of the generated images. On two benchmark datasets of diabetic retinopathy (EyePACS and APTOS), we demonstrate our improvement upon existing methods with respect to image fidelity, classification performance, and mitigation of membership inference attacks. Our work represents a step toward broader sharing of retinal images for scientific collaboration. Code is available at https://github.com/hcholab/k-salsa.

Submitted 19 March, 2023; originally announced March 2023.

Comments: European Conference on Computer Vision (ECCV), 2022

arXiv:2303.07872 [pdf, other]

Object-based SLAM utilizing unambiguous pose parameters considering general symmetry types

Authors: Taekbeom Lee, Youngseok Jang, H. Jin Kim

Abstract: Existence of symmetric objects, whose observation at different viewpoints can be identical, can deteriorate the performance of simultaneous localization and mapping (SLAM). This work proposes a system for robustly optimizing the pose of cameras and objects even in the presence of symmetric objects. We classify objects into three categories depending on their symmetry characteristics, which is efficient and effective in that it allows to deal with general objects and the objects in the same category can be associated with the same type of ambiguity. Then we extract only the unambiguous parameters corresponding to each category and use them in data association and joint optimization of the camera and object pose. The proposed approach provides significant robustness to the SLAM performance by removing the ambiguous parameters and utilizing as much useful geometric information as possible. Comparison with baseline algorithms confirms the superior performance of the proposed system in terms of object tracking and pose estimation, even in challenging scenarios where the baseline fails.

Submitted 12 March, 2023; originally announced March 2023.

Comments: This paper has been accepted to ICRA 2023

arXiv:2303.03966 [pdf, other]

Semantic-aware Occlusion Filtering Neural Radiance Fields in the Wild

Authors: Jaewon Lee, Injae Kim, Hwan Heo, Hyunwoo J. Kim

Abstract: We present a learning framework for reconstructing neural scene representations from a small number of unconstrained tourist photos. Since each image contains transient occluders, decomposing the static and transient components is necessary to construct radiance fields with such in-the-wild photographs where existing methods require a lot of training data. We introduce SF-NeRF, aiming to disentangle those two components with only a few images given, which exploits semantic information without any supervision. The proposed method contains an occlusion filtering module that predicts the transient color and its opacity for each pixel, which enables the NeRF model to solely learn the static scene representation. This filtering module learns the transient phenomena guided by pixel-wise semantic features obtained by a trainable image encoder that can be trained across multiple scenes to learn the prior of transient objects. Furthermore, we present two techniques to prevent ambiguous decomposition and noisy results of the filtering module. We demonstrate that our method outperforms state-of-the-art novel view synthesis methods on Phototourism dataset in a few-shot setting.

Submitted 5 March, 2023; originally announced March 2023.

Comments: 11 pages, 5 figures

arXiv:2302.14273 [pdf, other]

QP Chaser: Polynomial Trajectory Generation for Autonomous Aerial Tracking

Authors: Yunwoo Lee, Jungwon Park, Seungwoo Jung, Boseong Jeon, Dahyun Oh, H. Jin Kim

Abstract: Maintaining the visibility of the targets is one of the major objectives of aerial tracking applications. This paper proposes QP Chaser, a trajectory planning pipeline that can enhance the visibility of single- and dual-target in both static and dynamic environments. As the name suggests, the proposed planner generates a target-visible trajectory via quadratic programming problems. First, the predictor forecasts the reachable sets of moving objects with a sample-and-check strategy considering obstacles. Subsequently, the trajectory planner reinforces the visibility of targets with consideration of 1) path topology and 2) reachable sets of targets and obstacles. We define a target-visible region (TVR) with topology analysis of not only static obstacles but also dynamic obstacles, and it reflects reachable sets of moving targets and obstacles to maintain the whole body of the target within the camera image robustly and ceaselessly. The online performance of the proposed planner is validated in multiple scenarios, including high-fidelity simulations and real-world experiments.

Submitted 27 February, 2023; originally announced February 2023.

Comments: 15 pages, 13 figures

arXiv:2302.10267 [pdf, other]

doi 10.1103/PhysRevD.107.122004

Search for solar bosonic dark matter annual modulation with COSINE-100

Authors: G. Adhikari, N. Carlin, J. J. Choi, S. Choi, A. C. Ezeribe, L. E. França, C. Ha, I. S. Hahn, S. J. Hollick, E. J. Jeon, J. H. Jo, H. W. Joo, W. G. Kang, M. Kauer, B. H. Kim, H. J. Kim, J. Kim, K. W. Kim, S. H. Kim, S. K. Kim, W. K. Kim, Y. D. Kim, Y. H. Kim, Y. J. Ko, D. H. Lee , et al. (34 additional authors not shown)

Abstract: We present results from a search for solar bosonic dark matter using the annual modulation method with the COSINE-100 experiment. The results were interpreted considering three dark sector bosons models: solar dark photon; DFSZ and KSVZ solar axion; and Kaluza-Klein solar axion. No modulation signal that is compatible with the expected from the models was found from a data-set of 2.82 yr, using 61.3 kg of NaI(Tl) crystals. Therefore, we set a 90$\%$ confidence level upper limits for each of the three models studied. For the solar dark photon model, the most stringent mixing parameter upper limit is $1.61 \times 10^{-14}$ for dark photons with a mass of 215 eV. For the DFSZ and KSVZ solar axion, and the Kaluza-Klein axion models, the upper limits exclude axion-electron couplings, $g_{ae}$, above $1.61 \times 10^{-11}$ for axion mass below 0.2 keV; and axion-photon couplings, $g_{aγγ}$, above $1.83 \times 10^{-11}$ GeV$^{-1}$ for an axion number density of $4.07 \times 10^{13}$ cm$^{-3}$. This is the first experimental search for solar dark photons and DFSZ and KSVZ solar axions using the annual modulation method. The lower background, higher light yield and reduced threshold of NaI(Tl) crystals of the future COSINE-200 experiment are expected to enhance the sensitivity of the analysis shown in this paper. We show the sensitivities for the three models studied, considering the same search method with COSINE-200.

Submitted 20 February, 2023; originally announced February 2023.

Comments: 13 pages, 16 figures

arXiv:2302.01571 [pdf, other]

Robust Camera Pose Refinement for Multi-Resolution Hash Encoding

Authors: Hwan Heo, Taekyung Kim, Jiyoung Lee, Jaewon Lee, Soohyun Kim, Hyunwoo J. Kim, Jin-Hwa Kim

Abstract: Multi-resolution hash encoding has recently been proposed to reduce the computational cost of neural renderings, such as NeRF. This method requires accurate camera poses for the neural renderings of given scenes. However, contrary to previous methods jointly optimizing camera poses and 3D scenes, the naive gradient-based camera pose refinement method using multi-resolution hash encoding severely deteriorates performance. We propose a joint optimization algorithm to calibrate the camera pose and learn a geometric representation using efficient multi-resolution hash encoding. Showing that the oscillating gradient flows of hash encoding interfere with the registration of camera poses, our method addresses the issue by utilizing smooth interpolation weighting to stabilize the gradient oscillation for the ray samplings across hash grids. Moreover, the curriculum training procedure helps to learn the level-wise hash encoding, further increasing the pose refinement. Experiments on the novel-view synthesis datasets validate that our learning frameworks achieve state-of-the-art performance and rapid convergence of neural rendering, even when initial camera poses are unknown.

Submitted 3 February, 2023; originally announced February 2023.

arXiv:2302.00980 [pdf, other]

Domain Generalization Emerges from Dreaming

Authors: Hwan Heo, Youngjin Oh, Jaewon Lee, Hyunwoo J. Kim

Abstract: Recent studies have proven that DNNs, unlike human vision, tend to exploit texture information rather than shape. Such texture bias is one of the factors for the poor generalization performance of DNNs. We observe that the texture bias negatively affects not only in-domain generalization but also out-of-distribution generalization, i.e., Domain Generalization. Motivated by the observation, we propose a new framework to reduce the texture bias of a model by a novel optimization-based data augmentation, dubbed Stylized Dream. Our framework utilizes adaptive instance normalization (AdaIN) to augment the style of an original image yet preserve the content. We then adopt a regularization loss to predict consistent outputs between Stylized Dream and original images, which encourages the model to learn shape-based representations. Extensive experiments show that the proposed method achieves state-of-the-art performance in out-of-distribution settings on public benchmark datasets: PACS, VLCS, OfficeHome, TerraIncognita, and DomainNet.

Submitted 2 February, 2023; originally announced February 2023.

Comments: 23 pages, 4 figures

arXiv:2301.11741 [pdf, other]

Outcome-directed Reinforcement Learning by Uncertainty & Temporal Distance-Aware Curriculum Goal Generation

Authors: Daesol Cho, Seungjae Lee, H. Jin Kim

Abstract: Current reinforcement learning (RL) often suffers when solving a challenging exploration problem where the desired outcomes or high rewards are rarely observed. Even though curriculum RL, a framework that solves complex tasks by proposing a sequence of surrogate tasks, shows reasonable results, most of the previous works still have difficulty in proposing curriculum due to the absence of a mechanism for obtaining calibrated guidance to the desired outcome state without any prior domain knowledge. To alleviate it, we propose an uncertainty & temporal distance-aware curriculum goal generation method for the outcome-directed RL via solving a bipartite matching problem. It could not only provide precisely calibrated guidance of the curriculum to the desired outcome states but also bring much better sample efficiency and geometry-agnostic curriculum goal proposal capability compared to previous curriculum RL methods. We demonstrate that our algorithm significantly outperforms these prior methods in a variety of challenging navigation tasks and robotic manipulation tasks in a quantitative and qualitative way.

Submitted 20 February, 2023; v1 submitted 27 January, 2023; originally announced January 2023.

Comments: ICLR 2023 Spotlight. First two authors contributed equally

arXiv:2301.11660 [pdf, other]

Probing Out-of-Distribution Robustness of Language Models with Parameter-Efficient Transfer Learning

Authors: Hyunsoo Cho, Choonghyun Park, Junyeop Kim, Hyuhng Joon Kim, Kang Min Yoo, Sang-goo Lee

Abstract: As the size of the pre-trained language model (PLM) continues to increase, numerous parameter-efficient transfer learning methods have been proposed recently to compensate for the tremendous cost of fine-tuning. Despite the impressive results achieved by large pre-trained language models (PLMs) and various parameter-efficient transfer learning (PETL) methods on sundry benchmarks, it remains unclear if they can handle inputs that have been distributionally shifted effectively. In this study, we systematically explore how the ability to detect out-of-distribution (OOD) changes as the size of the PLM grows or the transfer methods are altered. Specifically, we evaluated various PETL techniques, including fine-tuning, Adapter, LoRA, and prefix-tuning, on three different intention classification tasks, each utilizing various language models with different scales.

Submitted 13 June, 2023; v1 submitted 27 January, 2023; originally announced January 2023.

Comments: *SEM 2023

arXiv:2301.11520 [pdf, other]

SNeRL: Semantic-aware Neural Radiance Fields for Reinforcement Learning

Authors: Dongseok Shim, Seungjae Lee, H. ** Kim

Abstract: As previous representations for reinforcement learning cannot effectively incorporate a human-intuitive understanding of the 3D environment, they usually suffer from sub-optimal performances. In this paper, we present Semantic-aware Neural Radiance Fields for Reinforcement Learning (SNeRL), which jointly optimizes semantic-aware neural radiance fields (NeRF) with a convolutional encoder to learn 3D-aware neural implicit representation from multi-view images. We introduce 3D semantic and distilled feature fields in parallel to the RGB radiance fields in NeRF to learn semantic and object-centric representation for reinforcement learning. SNeRL outperforms not only previous pixel-based representations but also recent 3D-aware representations both in model-free and model-based reinforcement learning.

Submitted 31 May, 2023; v1 submitted 26 January, 2023; originally announced January 2023.

Comments: ICML 2023. First two authors contributed equally. Order was determined by coin flip

arXiv:2301.08078 [pdf, other]

Stable Contact Guaranteeing Motion/Force Control for an Aerial Manipulator on an Arbitrarily Tilted Surface

Authors: Jeonghyun Byun, Byeongjun Kim, Changhyeon Kim, Donggeon David Oh, H. ** Kim

Abstract: This study aims to design a motion/force controller for an aerial manipulator which guarantees the tracking of time-varying motion/force trajectories as well as the stability during the transition between free and contact motions. To this end, we model the force exerted on the end-effector as the Kelvin-Voigt linear model and estimate its parameters by recursive least-squares estimator. Then, the gains of the disturbance-observer (DOB)-based motion/force controller are calculated based on the stability conditions considering both the model uncertainties in the dynamic equation and switching between the free and contact motions. To validate the proposed controller, we conducted the time-varying motion/force tracking experiments with different approach speeds and orientations of the surface. The results show that our controller enables the aerial manipulator to track the time-varying motion/force trajectories.

Submitted 19 January, 2023; originally announced January 2023.

Comments: to be presented at the 2023 IEEE International Conference on Robotics and Automation (ICRA), London, United Kingdom, 2023

arXiv:2301.06715 [pdf, other]

SwinDepth: Unsupervised Depth Estimation using Monocular Sequences via Swin Transformer and Densely Cascaded Network

Authors: Dongseok Shim, H. ** Kim

Abstract: Monocular depth estimation plays a critical role in various computer vision and robotics applications such as localization, mapping, and 3D object detection. Recently, learning-based algorithms achieve huge success in depth estimation by training models with a large amount of data in a supervised manner. However, it is challenging to acquire dense ground truth depth labels for supervised training, and the unsupervised depth estimation using monocular sequences emerges as a promising alternative. Unfortunately, most studies on unsupervised depth estimation explore loss functions or occlusion masks, and there is little change in model architecture in that ConvNet-based encoder-decoder structure becomes a de-facto standard for depth estimation. In this paper, we employ a convolution-free Swin Transformer as an image feature extractor so that the network can capture both local geometric features and global semantic features for depth estimation. Also, we propose a Densely Cascaded Multi-scale Network (DCMNet) that connects every feature map directly with another from different scales via a top-down cascade pathway. This densely cascaded connectivity reinforces the interconnection between decoding layers and produces high-quality multi-scale depth outputs. The experiments on two different datasets, KITTI and Make3D, demonstrate that our proposed method outperforms existing state-of-the-art unsupervised algorithms.

Submitted 17 January, 2023; originally announced January 2023.

Comments: ICRA 2023

arXiv:2301.04716 [pdf, other]

Measurement of the $B^{0} \rightarrow D^{*-} \ell^{+} ν_{\ell}$ branching ratio and $|V_{cb}|$ with a fully reconstructed accompanying $B$ meson in 2019-2021 Belle II data

Authors: F. Abudinén, I. Adachi, K. Adamczyk, L. Aggarwal, P. Ahlburg, H. Ahmed, J. K. Ahn, H. Aihara, N. Akopov, A. Aloisio, F. Ameli, L. Andricek, N. Anh Ky, D. M. Asner, H. Atmacan, V. Aulchenko, T. Aushev, V. Aushev, T. Aziz, V. Babu, H. Bae, S. Baehr, S. Bahinipati, A. M. Bakich, P. Bambade , et al. (561 additional authors not shown)

Abstract: We present a measurement of the $B^{0} \rightarrow D^{*-} \ell^{+} ν_{\ell}$ ($\ell=e,μ$) branching ratio and of the CKM parameter $|V_{cb}|$ using signal decays accompanied by a fully reconstructed $B$ meson. The Belle II data set of electron-positron collisions at the $Υ(4S)$ resonance, corresponding to 189.3$\,$fb$^{-1}$ of integrated luminosity, is analyzed. With the Caprini-Lellouch-Neubert form factor parameterization, the parameters $η_{\rm EW} F(1) |V_{cb}|$ and $ρ^{2}$ are extracted, where $η_{\rm EW}$ is an electroweak correction, $F(1)$ is a normalization factor and $ρ^{2}$ is a form factor shape parameter. We reconstruct 516 signal decays and thereby obtain $\mathcal{B} (B^{0} \rightarrow D^{*-} \ell^{+} ν_{\ell} ) = \left(5.27 \pm 0.22~\rm{\left(stat\right)} \pm 0.38~\rm{\left(syst\right)}\right) \%$, $η_{EW} F(1) |V_{cb}| \times 10^{3} = 34.6 \pm 1.8~\rm{\left(stat\right)} \pm 1.7~\rm{\left(syst\right)}$, and $ρ^{2} = 0.94 \pm 0.18~\rm{\left(stat\right)} \pm 0.11~\rm{\left(syst\right)}$.

Submitted 11 January, 2023; originally announced January 2023.

arXiv:2301.03916 [pdf, other]

doi 10.1103/PhysRevMaterials.7.L041001

External screening and lifetime of exciton population in single-layer ReSe$_2$ probed by time- and angle-resolved photoemission spectroscopy

Authors: Klara Volckaert, Byoung Ki Choi, Hyuk ** Kim, Deepnarayan Biswas, Denny Puntel, Simone Peli, Fulvio Parmigiani, Federico Cilento, Young Jun Chang, Søren Ulstrup

Abstract: The semiconductor ReSe$_2$ is characterized by a strongly anisotropic optical absorption and is therefore promising as an optically active component in two-dimensional heterostructures. However, the underlying femtosecond dynamics of photoinduced excitations in such materials has not been sufficiently explored. Here, we apply an infrared optical excitation to single-layer ReSe$_2$ grown on a bilayer graphene substrate and monitor the temporal evolution of the excited state signal using time- and angle-resolved photoemission spectroscopy. We measure an optical gap of $(1.53 \pm 0.02)$ eV, consistent with resonant excitation of the lowest exciton state. The exciton distribution is tunable via the linear polarization of the pump pulse and exhibits a biexponential decay with time constants given by $τ_1 = (110 \pm 10)$ fs and $τ_2 = (650 \pm 70)$ fs, facilitated by recombination via an in-gap state that is pinned at the Fermi level. By extracting the momentum-resolved exciton distribution we estimate its real-space radial extent to be greater than 17.1 Å, implying significant exciton delocalization due to screening from the bilayer graphene substrate.

Submitted 10 January, 2023; originally announced January 2023.

Comments: 6 pages, 4 figures

Journal ref: Phys. Rev. Materials 7, L041001 (2023)

arXiv:2301.02222 [pdf, other]

Computing nonsurjective primes associated to Galois representations of genus $2$ curves

Authors: Barinder S. Banwait, Armand Brumer, Hyun Jong Kim, Zev Klagsbrun, Jacob Mayle, Padmavathi Srinivasan, Isabel Vogt

Abstract: For a genus $2$ curve $C$ over $\mathbb{Q}$ whose Jacobian $A$ admits only trivial geometric endomorphisms, Serre's open image theorem for abelian surfaces asserts that there are only finitely many primes $\ell$ for which the Galois action on $\ell$-torsion points of $A$ is not maximal. Building on work of Dieulefait, we give a practical algorithm to compute this finite set. The key inputs are Mitchell's classification of maximal subgroups of $\mathrm{PSp_4}(\mathbb{F}_\ell)$, sampling of the characteristic polynomials of Frobenius, and the Khare--Wintenberger modularity theorem. The algorithm has been submitted for integration into Sage, executed on all of the genus~$2$ curves with trivial endomorphism ring in the LMFDB, and the results incorporated into the homepage of each such curve.

Submitted 10 July, 2023; v1 submitted 5 January, 2023; originally announced January 2023.

MSC Class: 11F80(primary); 11G10; 11Y16

arXiv:2212.10873 [pdf, other]

Prompt-Augmented Linear Probing: Scaling beyond the Limit of Few-shot In-Context Learners

Authors: Hyunsoo Cho, Hyuhng Joon Kim, Junyeob Kim, Sang-Woo Lee, Sang-goo Lee, Kang Min Yoo, Taeuk Kim

Abstract: Through in-context learning (ICL), large-scale language models are effective few-shot learners without additional model fine-tuning. However, the ICL performance does not scale well with the number of available training samples as it is limited by the inherent input length constraint of the underlying language model. Meanwhile, many studies have revealed that language models are also powerful feature extractors, allowing them to be utilized in a black-box manner and enabling the linear probing paradigm, where lightweight discriminators are trained on top of the pre-extracted input representations. This paper proposes prompt-augmented linear probing (PALP), a hybrid of linear probing and ICL, which leverages the best of both worlds. PALP inherits the scalability of linear probing and the capability of enforcing language models to derive more meaningful representations via tailoring input into a more conceivable form. Throughout in-depth investigations on various datasets, we verified that PALP significantly enhances the input representations closing the gap between ICL in the data-hungry scenario and fine-tuning in the data-abundant scenario with little training overhead, potentially making PALP a strong alternative in a black-box scenario.

Submitted 13 June, 2023; v1 submitted 21 December, 2022; originally announced December 2022.

Comments: AAAI 2023

arXiv:2212.02796 [pdf, other]

DiffuPose: Monocular 3D Human Pose Estimation via Denoising Diffusion Probabilistic Model

Authors: Jeongjun Choi, Dongseok Shim, H. ** Kim

Abstract: Thanks to the development of 2D keypoint detectors, monocular 3D human pose estimation (HPE) via 2D-to-3D uplifting approaches have achieved remarkable improvements. Still, monocular 3D HPE is a challenging problem due to the inherent depth ambiguities and occlusions. To handle this problem, many previous works exploit temporal information to mitigate such difficulties. However, there are many real-world applications where frame sequences are not accessible. This paper focuses on reconstructing a 3D pose from a single 2D keypoint detection. Rather than exploiting temporal information, we alleviate the depth ambiguity by generating multiple 3D pose candidates which can be mapped to an identical 2D keypoint. We build a novel diffusion-based framework to effectively sample diverse 3D poses from an off-the-shelf 2D detector. By considering the correlation between human joints by replacing the conventional denoising U-Net with graph convolutional network, our approach accomplishes further performance improvements. We evaluate our method on the widely adopted Human3.6M and HumanEva-I datasets. Comprehensive experiments are conducted to prove the efficacy of the proposed method, and they confirm that our model outperforms state-of-the-art multi-hypothesis 3D HPE methods.

Submitted 3 August, 2023; v1 submitted 6 December, 2022; originally announced December 2022.

Comments: Accepted to IROS 2023. First two authors contributed equally

arXiv:2212.00975 [pdf, other]

Relation-Aware Language-Graph Transformer for Question Answering

Authors: **young Park, Hyeong Kyu Choi, Juyeon Ko, Hyeon** Park, Ji-Hoon Kim, Jisu Jeong, Kyungmin Kim, Hyunwoo J. Kim

Abstract: Question Answering (QA) is a task that entails reasoning over natural language contexts, and many relevant works augment language models (LMs) with graph neural networks (GNNs) to encode the Knowledge Graph (KG) information. However, most existing GNN-based modules for QA do not take advantage of rich relational information of KGs and depend on limited information interaction between the LM and the KG. To address these issues, we propose Question Answering Transformer (QAT), which is designed to jointly reason over language and graphs with respect to entity relations in a unified manner. Specifically, QAT constructs Meta-Path tokens, which learn relation-centric embeddings based on diverse structural and semantic relations. Then, our Relation-Aware Self-Attention module comprehensively integrates different modalities via the Cross-Modal Relative Position Bias, which guides information exchange between relevant entities of different modalities. We validate the effectiveness of QAT on commonsense question answering datasets like CommonsenseQA and OpenBookQA, and on a medical question answering dataset, MedQA-USMLE. On all the datasets, our method achieves state-of-the-art performance. Our code is available at http://github.com/mlvlab/QAT.

Submitted 25 April, 2023; v1 submitted 2 December, 2022; originally announced December 2022.

Comments: AAAI2023 (accepted)

arXiv:2211.15270 [pdf, other]

Reconstruction of $B \to ρ\ell ν_\ell$ decays identified using hadronic decays of the recoil $B$ meson in 2019 -- 2021 Belle II data

Authors: Belle II Collaboration, F. Abudinén, I. Adachi, K. Adamczyk, L. Aggarwal, P. Ahlburg, H. Ahmed, J. K. Ahn, H. Aihara, N. Akopov, A. Aloisio, F. Ameli, L. Andricek, N. Anh Ky, D. M. Asner, H. Atmacan, V. Aulchenko, T. Aushev, V. Aushev, V. Babu, S. Bacher, H. Bae, S. Baehr, S. Bahinipati, A. M. Bakich , et al. (560 additional authors not shown)

Abstract: We present results on the semileptonic decays $B^0 \to ρ^- \ell^+ ν_\ell$ and $B^+ \to ρ^0 \ell^+ ν_\ell$ in a sample corresponding to 189.9/fb of Belle II data at the SuperKEKB $e^- e^+$ collider. Signal decays are identified using full reconstruction of the recoil $B$ meson in hadronic final states. We determine the total branching fractions via fits to the distributions of the square of the "missing" mass in the event and the dipion mass in the signal candidate and find ${\mathcal{B}(B^0\toρ^-\ell^+ ν_\ell) = (4.12 \pm 0.64(\mathrm{stat}) \pm 1.16(\mathrm{syst})) \times 10^{-4}}$ and ${\mathcal{B}({B^+\toρ^0\ell^+ν_\ell}) = (1.77 \pm 0.23 (\mathrm{stat}) \pm 0.36 (\mathrm{syst})) \times 10^{-4}}$ where the dominant systematic uncertainty comes from modeling the nonresonant $B\to (ππ)\ell^+ν_\ell$ contribution.

Submitted 28 November, 2022; originally announced November 2022.

Report number: BELLE2-CONF-PH-2022-029

arXiv:2211.08814 [pdf, other]

doi 10.1088/1367-2630/aced64

Microscopic origin of local electric polarization in NiPS$_3$

Authors: Hyeon Jung Kim, Ki-Seok Kim

Abstract: Recently, Zhang-Rice triplet to singlet excitations have been measured experimentally and verified numerically in a van der Waals antiferromagnet NiPS$_3$, which reveals a collective local change of an electronic structure. In particular, such numerical simulations predicted that these electronic excitations occur simultaneously with local electric polarizations. In this study, we uncover the microscopic origin of this local electric polarization in the Zhang-Rice triplet to singlet excitation. Our lattice-model calculation predicts that the electric polarization can be controlled by applied magnetic fields, where the atomic spin-orbit coupling plays an important role. We speculate emergence of real space Berry curvature to describe the electric polarization in this strongly correlated system.

Submitted 3 July, 2023; v1 submitted 16 November, 2022; originally announced November 2022.

Journal ref: New J. Phys. 25, 083029 (2023)

arXiv:2210.13143 [pdf, other]

Determination of $|V_{cb}|$ from $B\to D\ellν$ decays using 2019-2021 Belle II data

Authors: Belle II collaboration, F. Abudinén, I. Adachi, K. Adamczyk, L. Aggarwal, P. Ahlburg, H. Ahmed, J. K. Ahn, H. Aihara, N. Akopov, A. Aloisio, F. Ameli, L. Andricek, N. Anh Ky, D. M. Asner, H. Atmacan, V. Aulchenko, T. Aushev, V. Aushev, T. Aziz, V. Babu, S. Bacher, H. Bae, S. Baehr, S. Bahinipati , et al. (570 additional authors not shown)

Abstract: We present a determination of the magnitude of the Cabibbo-Kobayashi-Maskawa (CKM) matrix element $V_{cb}$ using $B\to D\ellν$ decays. The result is based on $e^+e^-\toΥ(4S)$ data recorded by the Belle II detector corresponding to 189.2/fb of integrated luminosity. The semileptonic decays $B^0\to D^-(\to K^+π^-π^-)\ell^+ν_\ell$ and $B^+\to\bar D^0(\to K^+π^-)\ell^+ν_\ell$ are reconstructed, where $\ell$ is either an electron or a muon. The second $B$ meson in the $Υ(4S)$ event is not explicitly reconstructed. Using the diamond-frame method, we determine the $B$ meson four-momentum and thus the hadronic recoil. We extract the partial decay rates as functions of $w$ and perform a fit to the decay form-factor and the CKM parameter $|V_{cb}|$ using the BGL parameterization of the form factor and lattice QCD input from the FNAL/MILC and HPQCD collaborations. We obtain $η_{EW}|V_{cb}|=(38.53\pm 1.15)\times 10^{-3}$, where $η_{EW}$ is an electroweak correction, and the error accounts for theoretical and experimental sources of uncertainty.

Submitted 24 October, 2022; originally announced October 2022.

Comments: 20 pages, 4 figures

Report number: BELLE2-CONF-PH-2022-010

arXiv:2210.10220 [pdf, other]

Measurement of the photon-energy spectrum in inclusive $B\rightarrow X_{s}γ$ decays identified using hadronic decays of the recoil $B$ meson in 2019-2021 Belle II data

Authors: Belle II Collaboration, F. Abudinén, I. Adachi, K. Adamczyk, L. Aggarwal, P. Ahlburg, H. Ahmed, J. K. Ahn, H. Aihara, N. Akopov, A. Aloisio, F. Ameli, L. Andricek, N. Anh Ky, D. M. Asner, H. Atmacan, V. Aulchenko, T. Aushev, V. Aushev, T. Aziz, V. Babu, S. Bacher, H. Bae, S. Baehr, S. Bahinipati , et al. (573 additional authors not shown)

Abstract: We measure the photon-energy spectrum in radiative bottom-meson ($B$) decays into inclusive final states involving a strange hadron and a photon. We use SuperKEKB electron-positron collisions corresponding to $189~\mathrm{fb}^{-1}$ of integrated luminosity collected at the $Υ(4S)$ resonance by the Belle II experiment. The partner $B$ candidates are fully reconstructed using a large number of hadronic channels. The $B \rightarrow X_s γ$ partial branching fractions are measured as a function of photon energy in the signal $B$ meson rest frame in eight bins above $1.8~\mathrm{GeV}$. The background-subtracted signal yield for this photon energy region is $343 \pm 122$ events. Integrated branching fractions for three photon energy thresholds of $1.8~\mathrm{GeV}$, $2.0~\mathrm{GeV}$, and $2.1~\mathrm{GeV}$ are also reported, and found to be in agreement with world averages.

Submitted 18 October, 2022; originally announced October 2022.

Comments: 14 pages, 3 figures

Report number: BELLE2-CONF-PH-2022-018

arXiv:2210.08176 [pdf, other]

Invertible Monotone Operators for Normalizing Flows

Authors: Byeongkeun Ahn, Chiyoon Kim, Youngjoon Hong, Hyunwoo J. Kim

Abstract: Normalizing flows model probability distributions by learning invertible transformations that transfer a simple distribution into complex distributions. Since the architecture of ResNet-based normalizing flows is more flexible than that of coupling-based models, ResNet-based normalizing flows have been widely studied in recent years. Despite their architectural flexibility, it is well-known that the current ResNet-based models suffer from constrained Lipschitz constants. In this paper, we propose the monotone formulation to overcome the issue of the Lipschitz constants using monotone operators and provide an in-depth theoretical analysis. Furthermore, we construct an activation function called Concatenated Pila (CPila) to improve gradient flow. The resulting model, Monotone Flows, exhibits an excellent performance on multiple density estimation benchmarks (MNIST, CIFAR-10, ImageNet32, ImageNet64). Code is available at https://github.com/mlvlab/MonotoneFlows.

Submitted 14 October, 2022; originally announced October 2022.

Comments: NeurIPS 2022

arXiv:2210.07562 [pdf, other]

TokenMixup: Efficient Attention-guided Token-level Data Augmentation for Transformers

Authors: Hyeong Kyu Choi, Joonmyung Choi, Hyunwoo J. Kim

Abstract: Mixup is a commonly adopted data augmentation technique for image classification. Recent advances in mixup methods primarily focus on mixing based on saliency. However, many saliency detectors require intense computation and are especially burdensome for parameter-heavy transformer models. To this end, we propose TokenMixup, an efficient attention-guided token-level data augmentation method that aims to maximize the saliency of a mixed set of tokens. TokenMixup provides x15 faster saliency-aware data augmentation compared to gradient-based methods. Moreover, we introduce a variant of TokenMixup which mixes tokens within a single instance, thereby enabling multi-scale feature augmentation. Experiments show that our methods significantly improve the baseline models' performance on CIFAR and ImageNet-1K, while being more efficient than previous methods. We also reach state-of-the-art performance on CIFAR-100 among from-scratch transformer models. Code is available at https://github.com/mlvlab/TokenMixup.

Submitted 14 October, 2022; originally announced October 2022.

Comments: Accepted paper at NeurIPS 2022

arXiv:2210.07319 [pdf, other]

doi 10.1063/5.0130680

Cryogenic Magneto-Terahertz Scanning Near-field Optical Microscope (cm-SNOM)

Authors: Richard H. J. Kim, Joong-Mok Park, Samuel J. Haeuser, Liang Luo, Jigang Wang

Abstract: We have developed a versatile near-field microscopy platform that can operate at high magnetic fields and below liquid-helium temperatures. We use this platform to demonstrate an extreme terahertz (THz) nanoscope operation and to obtain the first cryogenic magneto-THz time-domain nano-spectroscopy/imaging at temperatures as low as 1.8 K and magnetic fields of up to 5 T simultaneously. Our cryogenic magneto-THz scanning near-field optical microscopy, or cm-SNOM, instrument comprises three main pieces of equipment: i) a 5 T split pair magnetic cryostat with a custom made insert for mounting SNOM inside; ii) an atomic force microscope (AFM) unit that accepts ultrafast THz excitation and iii) a MHz repetition rate, femtosecond laser amplifier for high-field THz pulse generation and sensitive detection. We apply the cm-SNOM to obtain proof of principle measurements of superconducting and topological materials. The new capabilities demonstrated break ground for studying quantum materials that require the extreme environment of cryogenic operation and applied magnetic fields simultaneously in nanometer space, femtosecond time, and terahertz energy scales.

Submitted 13 October, 2022; originally announced October 2022.

Journal ref: Review of Scientific Instruments, 94, 043702 (2023)

Showing 51–100 of 535 results for author: Kim, H J