Skip to main content

Showing 51–100 of 6,930 results for author: Lee, H

.
  1. arXiv:2406.08612  [pdf, other

    astro-ph.HE

    Observation of Declination Dependence in the Cosmic Ray Energy Spectrum

    Authors: The Telescope Array Collaboration, R. U. Abbasi, T. Abu-Zayyad, M. Allen, J. W. Belz, D. R. Bergman, I. Buckland, W. Campbell, B. G. Cheon, K. Endo, A. Fedynitch, T. Fujii, K. Fujisue, K. Fujita, M. Fukushima, G. Furlich, Z. Gerber, N. Globus, W. Hanlon, N. Hayashida, H. He, K. Hibino, R. Higuchi, D. Ikeda, T. Ishii , et al. (101 additional authors not shown)

    Abstract: We report on an observation of the difference between northern and southern skies of the ultrahigh energy cosmic ray energy spectrum with a significance of ${\sim}8σ$. We use measurements from the two largest experiments$\unicode{x2014}$the Telescope Array observing the northern hemisphere and the Pierre Auger Observatory viewing the southern hemisphere. Since the comparison of two measurements fr… ▽ More

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: 8 pages, 6 figures

  2. arXiv:2406.08528  [pdf, other

    cs.CV cs.LG

    Adaptive Teaching with Shared Classifier for Knowledge Distillation

    Authors: Jaeyeon Jang, Young-Ik Kim, Jisu Lim, Hyeonseong Lee

    Abstract: Knowledge distillation (KD) is a technique used to transfer knowledge from an overparameterized teacher network to a less-parameterized student network, thereby minimizing the incurred performance loss. KD methods can be categorized into offline and online approaches. Offline KD leverages a powerful pretrained teacher network, while online KD allows the teacher network to be adjusted dynamically t… ▽ More

    Submitted 14 June, 2024; v1 submitted 12 June, 2024; originally announced June 2024.

  3. arXiv:2406.08402  [pdf, other

    eess.AS cs.CL cs.LG cs.SD

    Understanding Sounds, Missing the Questions: The Challenge of Object Hallucination in Large Audio-Language Models

    Authors: Chun-Yi Kuan, Wei-** Huang, Hung-yi Lee

    Abstract: Large audio-language models (LALMs) enhance traditional large language models by integrating audio perception capabilities, allowing them to tackle audio-related tasks. Previous research has primarily focused on assessing the performance of LALMs across various tasks, yet overlooking their reliability, particularly concerning issues like object hallucination. In our study, we introduce methods to… ▽ More

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: Accepted to Interspeech 2024

  4. arXiv:2406.08301  [pdf, other

    nucl-ex

    Jet modification via $π^0$-hadron correlations in Au$+$Au collisions at $\sqrt{s_{_{NN}}}=200$ GeV

    Authors: PHENIX Collaboration, N. J. Abdulameer, U. Acharya, A. Adare, S. Afanasiev, C. Aidala, N. N. Ajitanand, Y. Akiba, H. Al-Bataineh, J. Alexander, M. Alfred, K. Aoki, N. Apadula, L. Aphecetche, J. Asai, H. Asano, E. T. Atomssa, R. Averbeck, T. C. Awes, B. Azmoun, V. Babintsev, M. Bai, G. Baksay, L. Baksay, A. Baldisseri , et al. (510 additional authors not shown)

    Abstract: High-momentum two-particle correlations are a useful tool for studying jet-quenching effects in the quark-gluon plasma. Angular correlations between neutral-pion triggers and charged hadrons with transverse momenta in the range 4--12~GeV/$c$ and 0.5--7~GeV/$c$, respectively, have been measured by the PHENIX experiment in 2014 for Au$+$Au collisions at $\sqrt{s_{_{NN}}}=200$~GeV. Suppression is obs… ▽ More

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: 534 authors from 83 institutions, 12 pages, 7 figures. v1 is version submitted to Physical Review C. HEPdata tables for the points plotted in figures for this and previous PHENIX publications are (or will be) publicly available at http://www.phenix.bnl.gov/papers.html

  5. arXiv:2406.08140  [pdf

    q-bio.NC

    Functional voxel hierarchy and afferent capacity revealed mental state transition on dynamic correlation resting-state fMRI

    Authors: Dong Soo Lee, Hyun Joo Kim, Youngmin Huh, Yeon Koo Kang, Wonseok Whi, Hyekyoung Lee, Hye** Kang

    Abstract: Voxel hierarchy on dynamic brain graphs is produced by k core percolation on functional dynamic amplitude correlation of resting-state fMRI. Directed graphs and their afferent/efferent capacities are produced by Markov modeling of the universal cover of undirected graphs simultaneously with the calculation of volume entropy. Positive and unsigned negative brain graphs were analyzed separately on s… ▽ More

    Submitted 12 June, 2024; originally announced June 2024.

  6. arXiv:2406.07783  [pdf, other

    astro-ph.GA

    One-sided H alpha Excess before the First Pericentre Passage in Galaxy Pairs

    Authors: Jiwon Chung, Joon Hyeop Lee, Hyun** Jeong

    Abstract: We present novel insights into the interplay between tidal forces and star formation in interacting galaxies before their first pericentre passage. We investigate seven close pair galaxies devoid of visible tidal disturbances, such as tails, bridges, and shells. Using integral field spectroscopy (IFS) data of extended Calar Alto Legacy Integral Field Area (eCALIFA), we unveil a previously unreport… ▽ More

    Submitted 11 June, 2024; originally announced June 2024.

    Comments: 7 pages, 4 figgures, Accepted for publication in MNRAS Letters

  7. arXiv:2406.07472  [pdf, other

    cs.CV

    4Real: Towards Photorealistic 4D Scene Generation via Video Diffusion Models

    Authors: Heng Yu, Chaoyang Wang, Peiye Zhuang, Willi Menapace, Aliaksandr Siarohin, Junli Cao, Laszlo A Jeni, Sergey Tulyakov, Hsin-Ying Lee

    Abstract: Existing dynamic scene generation methods mostly rely on distilling knowledge from pre-trained 3D generative models, which are typically fine-tuned on synthetic object datasets. As a result, the generated scenes are often object-centric and lack photorealism. To address these limitations, we introduce a novel pipeline designed for photorealistic text-to-4D scene generation, discarding the dependen… ▽ More

    Submitted 11 June, 2024; originally announced June 2024.

  8. arXiv:2406.07237  [pdf, other

    eess.AS cs.SD

    CodecFake: Enhancing Anti-Spoofing Models Against Deepfake Audios from Codec-Based Speech Synthesis Systems

    Authors: Haibin Wu, Yuan Tseng, Hung-yi Lee

    Abstract: Current state-of-the-art (SOTA) codec-based audio synthesis systems can mimic anyone's voice with just a 3-second sample from that specific unseen speaker. Unfortunately, malicious attackers may exploit these technologies, causing misuse and security issues. Anti-spoofing models have been developed to detect fake speech. However, the open question of whether current SOTA anti-spoofing models can e… ▽ More

    Submitted 11 June, 2024; originally announced June 2024.

    Comments: Accepted to Interspeech 2024, project page: https://codecfake.github.io/

  9. arXiv:2406.06913  [pdf

    cond-mat.str-el

    Frustrated phonon with charge density wave in vanadium Kagome metal

    Authors: Seung-Phil Heo, Choongjae Won, Heemin Lee, Hanbyul Kim, Eunyoung Park, Sung Yun Lee, Junha Hwang, Hyeongi Choi, Sang-Youn Park, Byungjune Lee, Woo-Suk Noh, Hoyoung Jang, Jae-Hoon Park, Dongbin Shin, Changyong Song

    Abstract: Crystals with unique ionic arrangements and strong electronic correlations serve as a fertile ground for the emergence of exotic phases, as evidenced by the coexistence of charge density wave (CDW) and superconductivity in vanadium Kagome metals, specifically AV3Sb5 (where A represents K, Rb, or Cs). The formation of a star of David CDW superstructure, resulting from the coordinated displacements… ▽ More

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: Manuscript: 20 pages, 4 figures, SI: 14 pages, 8 figures

  10. arXiv:2406.06648  [pdf, other

    cs.CL cs.AI cs.LG

    SignBLEU: Automatic Evaluation of Multi-channel Sign Language Translation

    Authors: Jung-Ho Kim, Mathew Huerta-Enochian, Changyong Ko, Du Hui Lee

    Abstract: Sign languages are multi-channel languages that communicate information through not just the hands (manual signals) but also facial expressions and upper body movements (non-manual signals). However, since automatic sign language translation is usually performed by generating a single sequence of glosses, researchers eschew non-manual and co-occurring manual signals in favor of a simplified list o… ▽ More

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: Published in LREC-Coling 2024

  11. arXiv:2406.06421  [pdf, ps, other

    math.CO

    Random matchings in linear hypergraphs

    Authors: Hyunwoo Lee

    Abstract: For a given hypergraph $H$ and a vertex $v\in V(H)$, consider a random matching $M$ chosen uniformly from the set of all matchings in $H.$ In $1995,$ Kahn conjectured that if $H$ is a $d$-regular linear $k$-uniform hypergraph, the probability that $M$ does not cover $v$ is $(1 + o_d(1))d^{-1/k}$ for all vertices $v\in V(H).$ This conjecture was proved for $k = 2$ by Kahn and Kim in $1998.$ In th… ▽ More

    Submitted 11 June, 2024; v1 submitted 10 June, 2024; originally announced June 2024.

    Comments: 15 pages

  12. arXiv:2406.06357  [pdf, other

    cs.CL cs.AI

    MASSW: A New Dataset and Benchmark Tasks for AI-Assisted Scientific Workflows

    Authors: Xingjian Zhang, Yutong Xie, ** Huang, **ge Ma, Zhaoying Pan, Qijia Liu, Ziyang Xiong, Tolga Ergen, Dongsub Shim, Honglak Lee, Qiaozhu Mei

    Abstract: Scientific innovation relies on detailed workflows, which include critical steps such as analyzing literature, generating ideas, validating these ideas, interpreting results, and inspiring follow-up research. However, scientific publications that document these workflows are extensive and unstructured. This makes it difficult for both human researchers and AI systems to effectively navigate and ex… ▽ More

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: arXiv admin note: text overlap with arXiv:1706.03762 by other authors

  13. arXiv:2406.06117  [pdf, other

    hep-ex

    Exclusion of the Cosmological Triangle in Reactor-Based Search for Axion-Like Particles

    Authors: Byung Ju Park, Jae ** Choi, Eunju Jeon, **yu Kim, Kyungwon Kim, Sung Hyun Kim, Sun Kee Kim, Yeongduk Kim, Young Ju Ko, Byoung-Cheol Koh, Chang Hyon Ha, Seo Hyun Lee, In Soo Lee, Hyunseok Lee, Hyun Su Lee, Jaison Lee, Yoomin Oh, Doo** Kim

    Abstract: We report new constraints on axion-like particle (ALP) using data corresponding to a sodium iodine target exposure of 3063 kg$\cdot$days from the neutrino elastic scattering observation with NaI (NEON) experiment. A 16.7 kg of thallium-doped sodium iodide target was located 23.7 meters from a 2.8 GW thermal power nuclear reactor. We searched for ALPs produced by high-flux photons by comparing the… ▽ More

    Submitted 11 June, 2024; v1 submitted 10 June, 2024; originally announced June 2024.

  14. arXiv:2406.06072  [pdf, other

    cs.CV cs.LG cs.RO

    Adapting Pretrained ViTs with Convolution Injector for Visuo-Motor Control

    Authors: Dongyoon Hwang, Byungkun Lee, Hojoon Lee, Hyunseung Kim, Jaegul Choo

    Abstract: Vision Transformers (ViT), when paired with large-scale pretraining, have shown remarkable performance across various computer vision tasks, primarily due to their weak inductive bias. However, while such weak inductive bias aids in pretraining scalability, this may hinder the effective adaptation of ViTs for visuo-motor control tasks as a result of the absence of control-centric inductive biases.… ▽ More

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: accepted to ICML 2024

  15. arXiv:2406.06050  [pdf, other

    cs.CV

    Generalizable Human Gaussians from Single-View Image

    Authors: **nan Chen, Chen Li, Jianfeng Zhang, Hanlin Chen, Buzhen Huang, Gim Hee Lee

    Abstract: In this work, we tackle the task of learning generalizable 3D human Gaussians from a single image. The main challenge for this task is to recover detailed geometry and appearance, especially for the unobserved regions. To this end, we propose single-view generalizable Human Gaussian model (HGM), a diffusion-guided framework for 3D human modeling from a single image. We design a diffusion-based coa… ▽ More

    Submitted 10 June, 2024; originally announced June 2024.

  16. arXiv:2406.06037  [pdf, other

    cs.LG cs.AI cs.CV

    Investigating Pre-Training Objectives for Generalization in Vision-Based Reinforcement Learning

    Authors: Donghu Kim, Hojoon Lee, Kyungmin Lee, Dongyoon Hwang, Jaegul Choo

    Abstract: Recently, various pre-training methods have been introduced in vision-based Reinforcement Learning (RL). However, their generalization ability remains unclear due to evaluations being limited to in-distribution environments and non-unified experimental setups. To address this, we introduce the Atari Pre-training Benchmark (Atari-PB), which pre-trains a ResNet-50 model on 10 million transitions fro… ▽ More

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: accepted to ICML 2024

  17. arXiv:2406.05965  [pdf, other

    eess.AS cs.AI

    MakeSinger: A Semi-Supervised Training Method for Data-Efficient Singing Voice Synthesis via Classifier-free Diffusion Guidance

    Authors: Semin Kim, Myeonghun Jeong, Hyeonseung Lee, Minchan Kim, Byoung ** Choi, Nam Soo Kim

    Abstract: In this paper, we propose MakeSinger, a semi-supervised training method for singing voice synthesis (SVS) via classifier-free diffusion guidance. The challenge in SVS lies in the costly process of gathering aligned sets of text, pitch, and audio data. MakeSinger enables the training of the diffusion-based SVS model from any speech and singing voice data regardless of its labeling, thereby enhancin… ▽ More

    Submitted 9 June, 2024; originally announced June 2024.

    Comments: Accepted to Interspeech 2024

  18. arXiv:2406.05964  [pdf, other

    stat.ML cs.LG

    Distributionally Robust Safe Sample Screening

    Authors: Hiroyuki Hanada, Aoyama Tatsuya, Akahane Satoshi, Tomonari Tanaka, Yoshito Okura, Yu Inatsu, Noriaki Hashimoto, Shion Takeno, Taro Murayama, Hanju Lee, Shinya Kojima, Ichiro Takeuchi

    Abstract: In this study, we propose a machine learning method called Distributionally Robust Safe Sample Screening (DRSSS). DRSSS aims to identify unnecessary training samples, even when the distribution of the training samples changes in the future. To achieve this, we effectively combine the distributionally robust (DR) paradigm, which aims to enhance model robustness against variations in data distributi… ▽ More

    Submitted 9 June, 2024; originally announced June 2024.

  19. arXiv:2406.05806  [pdf, other

    cs.CL cs.SD eess.AS

    Do Prompts Really Prompt? Exploring the Prompt Understanding Capability of Whisper

    Authors: Chih-Kai Yang, Kuan-Po Huang, Hung-yi Lee

    Abstract: This research explores the interaction between Whisper, a high-performing speech recognition model, and information in prompts. Our results unexpectedly show that Whisper may not fully grasp textual prompts as anticipated. Additionally, we find that performance improvement is not guaranteed even with stronger adherence to the topic information in textual prompts. It is also noted that English prom… ▽ More

    Submitted 9 June, 2024; originally announced June 2024.

    Comments: In progress

  20. arXiv:2406.05774  [pdf, other

    cs.CV

    VCR-GauS: View Consistent Depth-Normal Regularizer for Gaussian Surface Reconstruction

    Authors: Hanlin Chen, Fangyin Wei, Chen Li, Tianxin Huang, Yunsong Wang, Gim Hee Lee

    Abstract: Although 3D Gaussian Splatting has been widely studied because of its realistic and efficient novel-view synthesis, it is still challenging to extract a high-quality surface from the point-based representation. Previous works improve the surface by incorporating geometric priors from the off-the-shelf normal estimator. However, there are two main limitations: 1) Supervising normal rendered from 3D… ▽ More

    Submitted 9 June, 2024; originally announced June 2024.

  21. arXiv:2406.05649  [pdf, other

    cs.CV cs.AI

    GTR: Improving Large 3D Reconstruction Models through Geometry and Texture Refinement

    Authors: Peiye Zhuang, Songfang Han, Chaoyang Wang, Aliaksandr Siarohin, Jiaxu Zou, Michael Vasilkovsky, Vladislav Shakhrai, Sergey Korolev, Sergey Tulyakov, Hsin-Ying Lee

    Abstract: We propose a novel approach for 3D mesh reconstruction from multi-view images. Our method takes inspiration from large reconstruction models like LRM that use a transformer-based triplane generator and a Neural Radiance Field (NeRF) model trained on multi-view images. However, in our method, we introduce several important modifications that allow us to significantly enhance 3D reconstruction quali… ▽ More

    Submitted 13 June, 2024; v1 submitted 9 June, 2024; originally announced June 2024.

    Comments: 19 pages, 17 figures. Project page: https://snap-research.github.io/GTR/

  22. arXiv:2406.05464  [pdf, other

    cs.SD cs.AI cs.LG eess.AS

    DAISY: Data Adaptive Self-Supervised Early Exit for Speech Representation Models

    Authors: Tzu-Quan Lin, Hung-yi Lee, Hao Tang

    Abstract: Self-supervised speech models have shown to be useful for various tasks, but their large size limits the use in devices with low computing power and memory. In this work, we explore early exit, an approach for reducing latency by exiting the forward process of a network early. Most approaches of early exit need a separate early exit model for each task, with some even requiring fine-tuning of the… ▽ More

    Submitted 8 June, 2024; originally announced June 2024.

    Comments: Accepted by Interspeech 2024

  23. A Deep Learning-Augmented Stand-off Radar Scheme for Rapidly Detecting Tree Defects

    Authors: Jiwei Qian, Yee Hui Lee, Kaixuan Cheng, Qiqi Dai, Mohamed Lokman Mohd Yusof, Daryl Lee, Abdulkadir C. Yucel

    Abstract: Tree defect detection is crucial for the structural health screening of trees. Existing nondestructive testing (NDT) techniques for tree defect detection require time-consuming and labor-intensive measurement campaigns. This discourages their application for the routine structural health screening of whole populations of managed urban trees. To address this issue, this study proposes a deep-learni… ▽ More

    Submitted 8 June, 2024; originally announced June 2024.

    Comments: Accepted and to be published in IEEE Transactions on Geoscience and Remote Sensing

  24. arXiv:2406.05065  [pdf, ps, other

    eess.AS

    Emo-bias: A Large Scale Evaluation of Social Bias on Speech Emotion Recognition

    Authors: Yi-Cheng Lin, Haibin Wu, Huang-Cheng Chou, Chi-Chun Lee, Hung-yi Lee

    Abstract: The rapid growth of Speech Emotion Recognition (SER) has diverse global applications, from improving human-computer interactions to aiding mental health diagnostics. However, SER models might contain social bias toward gender, leading to unfair outcomes. This study analyzes gender bias in SER models trained with Self-Supervised Learning (SSL) at scale, exploring factors influencing it. SSL-based S… ▽ More

    Submitted 7 June, 2024; originally announced June 2024.

    Comments: Accepted by INTERSPEECH 2024

  25. arXiv:2406.04997  [pdf, ps, other

    eess.AS cs.LG

    On the social bias of speech self-supervised models

    Authors: Yi-Cheng Lin, Tzu-Quan Lin, Hsi-Che Lin, Andy T. Liu, Hung-yi Lee

    Abstract: Self-supervised learning (SSL) speech models have achieved remarkable performance in various tasks, yet the biased outcomes, especially affecting marginalized groups, raise significant concerns. Social bias refers to the phenomenon where algorithms potentially amplify disparate properties between social groups present in the data used for training. Bias in SSL models can perpetuate injustice by au… ▽ More

    Submitted 7 June, 2024; originally announced June 2024.

    Comments: Accepted by INTERSPEECH 2024

  26. arXiv:2406.04763  [pdf, other

    cond-mat.mes-hall physics.optics

    Observation of higher-order time-dislocation topological modes

    Authors: Jia-Hui Zhang, Feng Mei, Yi Li, Ching Hua Lee, Jie Ma, Liantuan Xiao, Suotang Jia

    Abstract: Topological dislocation modes resulting from the interplay between spatial dislocations and momentum-space topology have recently attracted significant interest. Here, we theoretically and experimentally demonstrate time-dislocation topological modes which are induced by the interplay between temporal dislocations and Floquet-band topology. By utilizing an extra physical dimension to represent the… ▽ More

    Submitted 7 June, 2024; originally announced June 2024.

    Comments: 9 pages, 4 figures

  27. arXiv:2406.04630  [pdf, other

    cs.CL

    Low-Resource Cross-Lingual Summarization through Few-Shot Learning with Large Language Models

    Authors: Gyutae Park, Seo** Hwang, Hwanhee Lee

    Abstract: Cross-lingual summarization (XLS) aims to generate a summary in a target language different from the source language document. While large language models (LLMs) have shown promising zero-shot XLS performance, their few-shot capabilities on this task remain unexplored, especially for low-resource languages with limited parallel data. In this paper, we investigate the few-shot XLS performance of va… ▽ More

    Submitted 7 June, 2024; originally announced June 2024.

    Comments: 7 pages,3 figures

  28. arXiv:2406.04582  [pdf, other

    eess.AS cs.SD

    Neural Codec-based Adversarial Sample Detection for Speaker Verification

    Authors: Xuanjun Chen, Jiawei Du, Haibin Wu, Jyh-Shing Roger Jang, Hung-yi Lee

    Abstract: Automatic Speaker Verification (ASV), increasingly used in security-critical applications, faces vulnerabilities from rising adversarial attacks, with few effective defenses available. In this paper, we propose a neural codec-based adversarial sample detection method for ASV. The approach leverages the codec's ability to discard redundant perturbations and retain essential information. Specificall… ▽ More

    Submitted 6 June, 2024; originally announced June 2024.

    Comments: Accepted by Interspeech 2024

  29. arXiv:2406.04064  [pdf, other

    cs.CL cs.AI cs.CY

    Ask LLMs Directly, "What shapes your bias?": Measuring Social Bias in Large Language Models

    Authors: Jisu Shin, Hoyun Song, Huije Lee, Soyeong Jeong, Jong C. Park

    Abstract: Social bias is shaped by the accumulation of social perceptions towards targets across various demographic identities. To fully understand such social bias in large language models (LLMs), it is essential to consider the composite of social perceptions from diverse perspectives among identities. Previous studies have either evaluated biases in LLMs by indirectly assessing the presence of sentiment… ▽ More

    Submitted 6 June, 2024; originally announced June 2024.

    Comments: Findings of ACL 2024

  30. arXiv:2406.03366  [pdf, other

    quant-ph cond-mat.mtrl-sci cond-mat.str-el physics.comp-ph

    Computational Supremacy of Quantum Eigensolver by Extension of Optimized Binary Configurations

    Authors: Hayun Park, Hunpyo Lee

    Abstract: We developed a quantum eigensolver (QE) which is based on an extension of optimized binary configurations measured by quantum annealing (QA) on a D-Wave Quantum Annealer (D-Wave QA). This approach performs iterative QA measurements to optimize the eigenstates $\vert ψ\rangle$ without the derivation of a classical computer. The computational cost is $ηM L$ for full eigenvalues $E$ and… ▽ More

    Submitted 5 June, 2024; originally announced June 2024.

    Comments: 5 pages and 4 figures

  31. arXiv:2406.03364  [pdf, other

    quant-ph cond-mat.mtrl-sci cond-mat.str-el physics.comp-ph

    Determination of Optimal Chain Coupling made by Embedding in D-Wave Quantum Annealer

    Authors: Hayun Park, Hunpyo Lee

    Abstract: The qubits in a D-wave quantum annealer (D-wave QA) are designed on a Pegasus graph that is different from structure of a combinatorial optimization problem. This situation requires embedding with the chains connected by ferromagnetic (FM) coupling $J_c$ between the qubits. Weak and strong $J_c$ values induce chain breaking and enforcement of chain energy, which reduce the accuracy of quantum anne… ▽ More

    Submitted 5 June, 2024; originally announced June 2024.

    Comments: 5 pages, 6 figures

  32. arXiv:2406.03111  [pdf, other

    eess.AS eess.SP

    Singing Voice Graph Modeling for SingFake Detection

    Authors: Xuanjun Chen, Haibin Wu, Jyh-Shing Roger Jang, Hung-yi Lee

    Abstract: Detecting singing voice deepfakes, or SingFake, involves determining the authenticity and copyright of a singing voice. Existing models for speech deepfake detection have struggled to adapt to unseen attacks in this unique singing voice domain of human vocalization. To bridge the gap, we present a groundbreaking SingGraph model. The model synergizes the capabilities of the MERT acoustic music unde… ▽ More

    Submitted 9 June, 2024; v1 submitted 5 June, 2024; originally announced June 2024.

    Comments: Accepted by Interspeech 2024; Our code is available at https://github.com/xjchenGit/SingGraph.git

  33. arXiv:2406.02989  [pdf, other

    cs.RO cs.AI

    Learning Semantic Traversability with Egocentric Video and Automated Annotation Strategy

    Authors: Yunho Kim, Jeong Hyun Lee, Choongin Lee, Juhyeok Mun, Donghoon Youm, Jeongsoo Park, Jemin Hwangbo

    Abstract: For reliable autonomous robot navigation in urban settings, the robot must have the ability to identify semantically traversable terrains in the image based on the semantic understanding of the scene. This reasoning ability is based on semantic traversability, which is frequently achieved using semantic segmentation models fine-tuned on the testing domain. This fine-tuning process often involves m… ▽ More

    Submitted 5 June, 2024; originally announced June 2024.

    Comments: Submitted to IEEE Robotics and Automation Letters (RA-L), First two authors contributed equally

  34. arXiv:2406.02963  [pdf, other

    cs.SD eess.AS

    Dataset-Distillation Generative Model for Speech Emotion Recognition

    Authors: Fabian Ritter-Gutierrez, Kuan-Po Huang, Jeremy H. M Wong, Dianwen Ng, Hung-yi Lee, Nancy F. Chen, Eng Siong Chng

    Abstract: Deep learning models for speech rely on large datasets, presenting computational challenges. Yet, performance hinges on training data size. Dataset Distillation (DD) aims to learn a smaller dataset without much performance degradation when training with it. DD has been investigated in computer vision but not yet in speech. This paper presents the first approach for DD to speech targeting Speech Em… ▽ More

    Submitted 5 June, 2024; originally announced June 2024.

    Comments: Accepted at Interspeech 2024

  35. arXiv:2406.02925  [pdf, other

    eess.AS cs.AI cs.CL cs.LG cs.SD

    Task Arithmetic can Mitigate Synthetic-to-Real Gap in Automatic Speech Recognition

    Authors: Hsuan Su, Hua Farn, Fan-Yun Sun, Shang-Tse Chen, Hung-yi Lee

    Abstract: Synthetic data is widely used in speech recognition due to the availability of text-to-speech models, which facilitate adapting models to previously unseen text domains. However, existing methods suffer in performance when they fine-tune an automatic speech recognition (ASR) model on synthetic data as they suffer from the distributional shift commonly referred to as the synthetic-to-real gap. In t… ▽ More

    Submitted 15 June, 2024; v1 submitted 5 June, 2024; originally announced June 2024.

  36. arXiv:2406.02847  [pdf, other

    cs.LG stat.ML

    Exact Conversion of In-Context Learning to Model Weights in Linearized-Attention Transformers

    Authors: Brian K Chen, Tianyang Hu, Hui **, Hwee Kuan Lee, Kenji Kawaguchi

    Abstract: In-Context Learning (ICL) has been a powerful emergent property of large language models that has attracted increasing attention in recent years. In contrast to regular gradient-based learning, ICL is highly interpretable and does not require parameter updates. In this paper, we show that, for linearized transformer networks, ICL can be made explicit and permanent through the inclusion of bias ter… ▽ More

    Submitted 6 June, 2024; v1 submitted 4 June, 2024; originally announced June 2024.

    Comments: Accepted to ICML 2024

  37. arXiv:2406.02596  [pdf, other

    cs.LG cs.AI

    Slow and Steady Wins the Race: Maintaining Plasticity with Hare and Tortoise Networks

    Authors: Hojoon Lee, Hyeonseo Cho, Hyunseung Kim, Donghu Kim, Dugki Min, Jaegul Choo, Clare Lyle

    Abstract: This study investigates the loss of generalization ability in neural networks, revisiting warm-starting experiments from Ash & Adams. Our empirical analysis reveals that common methods designed to enhance plasticity by maintaining trainability provide limited benefits to generalization. While reinitializing the network can be effective, it also risks losing valuable prior knowledge. To this end, w… ▽ More

    Submitted 1 June, 2024; originally announced June 2024.

    Comments: accepted to ICML 2024

  38. arXiv:2406.01512  [pdf, other

    cs.CL

    MAD: Multi-Alignment MEG-to-Text Decoding

    Authors: Yiqian Yang, Hyejeong Jo, Yiqun Duan, Qiang Zhang, **ni Zhou, Won Hee Lee, Ren**g Xu, Hui Xiong

    Abstract: Deciphering language from brain activity is a crucial task in brain-computer interface (BCI) research. Non-invasive cerebral signaling techniques including electroencephalography (EEG) and magnetoencephalography (MEG) are becoming increasingly popular due to their safety and practicality, avoiding invasive electrode implantation. However, current works under-investigated three points: 1) a predomi… ▽ More

    Submitted 3 June, 2024; originally announced June 2024.

  39. arXiv:2406.01020  [pdf, other

    cs.CV

    CLIP-Guided Attribute Aware Pretraining for Generalizable Image Quality Assessment

    Authors: Daekyu Kwon, Dongyoung Kim, Sehwan Ki, Younghyun Jo, Hyong-Euk Lee, Seon Joo Kim

    Abstract: In no-reference image quality assessment (NR-IQA), the challenge of limited dataset sizes hampers the development of robust and generalizable models. Conventional methods address this issue by utilizing large datasets to extract rich representations for IQA. Also, some approaches propose vision language models (VLM) based IQA, but the domain gap between generic VLM and IQA constrains their scalabi… ▽ More

    Submitted 3 June, 2024; originally announced June 2024.

  40. arXiv:2406.01007  [pdf, other

    hep-ex

    Measurement of Electron Antineutrino Oscillation Amplitude and Frequency via Neutron Capture on Hydrogen at Daya Bay

    Authors: Daya Bay collaboration, F. P. An, W. D. Bai, A. B. Balantekin, M. Bishai, S. Blyth, G. F. Cao, J. Cao, J. F. Chang, Y. Chang, H. S. Chen, H. Y. Chen, S. M. Chen, Y. Chen, Y. X. Chen, Z. Y. Chen, J. Cheng, J. Cheng, Y. -C. Cheng, Z. K. Cheng, J. J. Cherwinka, M. C. Chu, J. P. Cummings, O. Dalager, F. S. Deng , et al. (177 additional authors not shown)

    Abstract: This Letter reports the first measurement of the oscillation amplitude and frequency of reactor antineutrinos at Daya Bay via neutron capture on hydrogen using 1958 days of data. With over 3.6 million signal candidates, an optimized candidate selection, improved treatment of backgrounds and efficiencies, refined energy calibration, and an energy response model for the capture-on-hydrogen sensitive… ▽ More

    Submitted 3 June, 2024; originally announced June 2024.

  41. arXiv:2406.00945  [pdf, other

    astro-ph.HE astro-ph.IM gr-qc

    General relativistic self-gravitating equilibrium disks around rotating neutron stars

    Authors: Yoonsoo Kim, **ho Kim, Hee Il Kim, Hyung Mok Lee

    Abstract: In modeling a relativistic disk around a compact object, the self-gravity of the disk is often neglected while it needs to be incorporated for more accurate descriptions in several circumstances. Extending the Komatsu-Eriguchi-Hachisu self-consistent field method, we present numerical models of a rapidly rotating neutron star with a self-gravitating disk in stationary equilibrium. In particular, o… ▽ More

    Submitted 2 June, 2024; originally announced June 2024.

    Comments: 15 pages, 12 figures

  42. arXiv:2406.00823  [pdf, other

    stat.ML cs.LG

    Lasso Bandit with Compatibility Condition on Optimal Arm

    Authors: Harin Lee, Taehyun Hwang, Min-hwan Oh

    Abstract: We consider a stochastic sparse linear bandit problem where only a sparse subset of context features affects the expected reward function, i.e., the unknown reward parameter has sparse structure. In the existing Lasso bandit literature, the compatibility conditions together with additional diversity conditions on the context features are imposed to achieve regret bounds that only depend logarithmi… ▽ More

    Submitted 2 June, 2024; originally announced June 2024.

  43. arXiv:2406.00810  [pdf, other

    cs.CR

    Expanding the Attack Scenarios of SAE J1939: A Comprehensive Analysis of Established and Novel Vulnerabilities in Transport Protocol

    Authors: Hwejae Lee, Hyosun Lee, Saehee Jun, Huy Kang Kim

    Abstract: Following the enactment of the UN Regulation, substantial efforts have been directed toward implementing intrusion detection and prevention systems (IDPSs) and vulnerability analysis in Controller Area Network (CAN). However, Society of Automotive Engineers (SAE) J1939 protocol, despite its extensive application in cam** cars and commercial vehicles, has seen limited vulnerability identification… ▽ More

    Submitted 2 June, 2024; originally announced June 2024.

    Comments: 18 pages, 7 figures, 5 tables; This is the accepted version of ESCAR USA 2024

    MSC Class: 68M25 ACM Class: K.6.5

  44. arXiv:2406.00756  [pdf, other

    hep-ph

    An AI-Inspired Numerical Method in the Quark Model: Application to Finding the Wave Functions for Heavy Tetraquark States

    Authors: Daeho Park, Su Houng Lee

    Abstract: The current ongoing advancements in AI have shed light on the landscape of numerical analysis in science. Inspired by the path of achievement of AI, we have developed a method to construct accurate ground state wave functions of multiquark configurations within a quark model. We successfully tested our method through comparisons with meson-type two-body systems with analytic and numerical solution… ▽ More

    Submitted 2 June, 2024; originally announced June 2024.

    Comments: 7 pages, 5 figures

  45. arXiv:2406.00669  [pdf

    eess.SY econ.GN

    Multi-technology co-optimization approach for sustainable hydrogen and electricity supply chains considering variability and demand scale

    Authors: Sunwoo Kim, Joungho Park, Jay H. Lee

    Abstract: In the pursuit of a carbon-neutral future, hydrogen emerges as a pivotal element, serving as a carbon-free energy carrier and feedstock. As efforts to decarbonize sectors such as heating and transportation intensify, understanding and navigating through the dynamics of hydrogen demand expansion becomes critical. Transitioning to hydrogen economy is complicated by varying regional scales and types… ▽ More

    Submitted 2 June, 2024; originally announced June 2024.

  46. arXiv:2406.00665   

    econ.GN eess.SY

    Integrating solid direct air capture systems with green hydrogen production: Economic synergy of sector coupling

    Authors: Sunwoo Kim, Joungho Park, Jay H. Lee

    Abstract: In the global pursuit of sustainable energy solutions, mitigating carbon dioxide (CO2) emissions stands as a pivotal challenge. With escalating atmospheric CO2 levels, the imperative of direct air capture (DAC) systems becomes evident. Simultaneously, green hydrogen (GH) emerges as a pivotal medium for renewable energy. Nevertheless, the substantial expenses associated with these technologies impe… ▽ More

    Submitted 28 June, 2024; v1 submitted 2 June, 2024; originally announced June 2024.

    Comments: Some of the results of our previous preprint paper are flawed, and we are withdrawing them to prevent the spread of incorrect knowledge

  47. arXiv:2406.00324  [pdf, other

    cs.LG cs.AI

    Do's and Don'ts: Learning Desirable Skills with Instruction Videos

    Authors: Hyunseung Kim, Byungkun Lee, Hojoon Lee, Dongyoon Hwang, Donghu Kim, Jaegul Choo

    Abstract: Unsupervised skill discovery is a learning paradigm that aims to acquire diverse behaviors without explicit rewards. However, it faces challenges in learning complex behaviors and often leads to learning unsafe or undesirable behaviors. For instance, in various continuous control tasks, current unsupervised skill discovery methods succeed in learning basic locomotions like standing but struggle wi… ▽ More

    Submitted 1 June, 2024; originally announced June 2024.

  48. arXiv:2406.00014  [pdf, other

    cs.DB cs.AI cs.CL cs.IR

    KU-DMIS at EHRSQL 2024:Generating SQL query via question templatization in EHR

    Authors: Hajung Kim, Chanhwi Kim, Hoonick Lee, Kyochul Jang, Jiwoo Lee, Kyungjae Lee, Gangwoo Kim, Jaewoo Kang

    Abstract: Transforming natural language questions into SQL queries is crucial for precise data retrieval from electronic health record (EHR) databases. A significant challenge in this process is detecting and rejecting unanswerable questions that request information beyond the database's scope or exceed the system's capabilities. In this paper, we introduce a novel text-to-SQL framework that robustly handle… ▽ More

    Submitted 19 June, 2024; v1 submitted 21 May, 2024; originally announced June 2024.

    Comments: Published at ClinicalNLP workshop @ NAACL 2024

  49. arXiv:2405.20729  [pdf, other

    cs.CV

    Extreme Point Supervised Instance Segmentation

    Authors: Hyeonjun Lee, Sehyun Hwang, Suha Kwak

    Abstract: This paper introduces a novel approach to learning instance segmentation using extreme points, i.e., the topmost, leftmost, bottommost, and rightmost points, of each object. These points are readily available in the modern bounding box annotation process while offering strong clues for precise segmentation, and thus allows to improve performance at the same annotation cost with box-supervised meth… ▽ More

    Submitted 3 June, 2024; v1 submitted 31 May, 2024; originally announced May 2024.

    Comments: Accepted to CVPR 2024

  50. arXiv:2405.20672  [pdf, other

    cs.CV

    Investigating and unmasking feature-level vulnerabilities of CNNs to adversarial perturbations

    Authors: Davide Coppola, Hwee Kuan Lee

    Abstract: This study explores the impact of adversarial perturbations on Convolutional Neural Networks (CNNs) with the aim of enhancing the understanding of their underlying mechanisms. Despite numerous defense methods proposed in the literature, there is still an incomplete understanding of this phenomenon. Instead of treating the entire model as vulnerable, we propose that specific feature maps learned du… ▽ More

    Submitted 31 May, 2024; originally announced May 2024.

    Comments: 22 pages, 15 figures (including appendix)