
Showing 1–47 of 47 results for author: Hung, Y

  1. arXiv:2407.01146  [pdf, other]

    eess.IV cs.CV

    Cross-Slice Attention and Evidential Critical Loss for Uncertainty-Aware Prostate Cancer Detection

    Authors: Alex Ling Yu Hung, Haoxin Zheng, Kai Zhao, Kaifeng Pang, Demetri Terzopoulos, Kyunghyun Sung

    Abstract: Current deep learning-based models typically analyze medical images in either 2D or 3D albeit disregarding volumetric information or suffering sub-optimal performance due to the anisotropic resolution of MR data. Furthermore, providing an accurate uncertainty estimation is beneficial to clinicians, as it indicates how confident a model is about its prediction. We propose a novel 2.5D cross-slice a…

    Submitted 1 July, 2024; originally announced July 2024.

  2. arXiv:2405.08483  [pdf, other]

    cs.CV cs.AI

    RDPN6D: Residual-based Dense Point-wise Network for 6Dof Object Pose Estimation Based on RGB-D Images

    Authors: Zong-Wei Hong, Yen-Yang Hung, Chu-Song Chen

    Abstract: In this work, we introduce a novel method for calculating the 6DoF pose of an object using a single RGB-D image. Unlike existing methods that either directly predict objects' poses or rely on sparse keypoints for pose recovery, our approach addresses this challenging task using dense correspondence, i.e., we regress the object coordinates for each visible pixel. Our method leverages existing objec…

    Submitted 14 May, 2024; originally announced May 2024.

    Comments: Accepted by CVPR Workshop DLGC, 2024

  3. arXiv:2402.04885  [pdf, other]

    stat.ML cs.AI cs.LG

    A Unified Gaussian Process for Branching and Nested Hyperparameter Optimization

    Authors: Jiazhao Zhang, Ying Hung, Chung-Ching Lin, Zicheng Liu

    Abstract: Choosing appropriate hyperparameters plays a crucial role in the success of neural networks as hyper-parameters directly control the behavior and performance of the training algorithms. To obtain efficient tuning, Bayesian optimization methods based on Gaussian process (GP) models are widely used. Despite numerous applications of Bayesian optimization in deep learning, the existing methodologies a…

    Submitted 19 January, 2024; originally announced February 2024.

  4. arXiv:2311.04942  [pdf, other]

    eess.IV cs.CV

    CSAM: A 2.5D Cross-Slice Attention Module for Anisotropic Volumetric Medical Image Segmentation

    Authors: Alex Ling Yu Hung, Haoxin Zheng, Kai Zhao, Xiaoxi Du, Kaifeng Pang, Qi Miao, Steven S. Raman, Demetri Terzopoulos, Kyunghyun Sung

    Abstract: A large portion of volumetric medical data, especially magnetic resonance imaging (MRI) data, is anisotropic, as the through-plane resolution is typically much lower than the in-plane resolution. Both 3D and purely 2D deep learning-based segmentation methods are deficient in dealing with such volumetric data since the performance of 3D methods suffers when confronting anisotropic data, and 2D meth…

    Submitted 26 November, 2023; v1 submitted 7 November, 2023; originally announced November 2023.

  5. arXiv:2311.03318  [pdf, other]

    cs.SD cs.IR eess.AS

    A Foundation Model for Music Informatics

    Authors: Minz Won, Yun-Ning Hung, Duc Le

    Abstract: This paper investigates foundation models tailored for music informatics, a domain currently challenged by the scarcity of labeled data and generalization issues. To this end, we conduct an in-depth comparative study among various foundation model variants, examining key determinants such as model architectures, tokenization methods, temporal resolution, data, and model scalability. This research…

    Submitted 6 November, 2023; originally announced November 2023.

    Comments: 5 pages

  6. arXiv:2310.11515  [pdf, ps, other]

    cs.LG

    Value-Biased Maximum Likelihood Estimation for Model-based Reinforcement Learning in Discounted Linear MDPs

    Authors: Yu-Heng Hung, Ping-Chun Hsieh, Akshay Mete, P. R. Kumar

    Abstract: We consider the infinite-horizon linear Markov Decision Processes (MDPs), where the transition probabilities of the dynamic model can be linearly parameterized with the help of a predefined low-dimensional feature mapping. While the existing regression-based approaches have been theoretically shown to achieve nearly-optimal regret, they are computationally rather inefficient due to the need for a…

    Submitted 17 October, 2023; originally announced October 2023.

  7. arXiv:2310.01353  [pdf, other]

    eess.AS cs.SD

    Scaling Up Music Information Retrieval Training with Semi-Supervised Learning

    Authors: Yun-Ning Hung, Ju-Chiang Wang, Minz Won, Duc Le

    Abstract: In the era of data-driven Music Information Retrieval (MIR), the scarcity of labeled data has been one of the major concerns to the success of an MIR task. In this work, we leverage the semi-supervised teacher-student training approach to improve MIR tasks. For training, we scale up the unlabeled music data to 240k hours, which is much larger than any public MIR datasets. We iteratively create and…

    Submitted 2 October, 2023; originally announced October 2023.

  8. arXiv:2309.02612  [pdf, other]

    cs.SD eess.AS

    Music Source Separation with Band-Split RoPE Transformer

    Authors: Wei-Tsung Lu, Ju-Chiang Wang, Qiuqiang Kong, Yun-Ning Hung

    Abstract: Music source separation (MSS) aims to separate a music recording into multiple musically distinct stems, such as vocals, bass, drums, and more. Recently, deep learning approaches such as convolutional neural networks (CNNs) and recurrent neural networks (RNNs) have been used, but the improvement is still limited. In this paper, we propose a novel frequency-domain approach based on a Band-Split RoP…

    Submitted 9 September, 2023; v1 submitted 5 September, 2023; originally announced September 2023.

    Comments: This paper explains the SAMI-ByteDance MSS system submitted to Sound Demixing Challenge (SDX23) Music Separation Track. Version 2 of paper fixed some typos

  9. arXiv:2308.10848  [pdf, other]

    cs.CL

    AgentVerse: Facilitating Multi-Agent Collaboration and Exploring Emergent Behaviors

    Authors: Weize Chen, Yusheng Su, Jingwei Zuo, Cheng Yang, Chenfei Yuan, Chi-Min Chan, Heyang Yu, Yaxi Lu, Yi-Hsin Hung, Chen Qian, Yujia Qin, Xin Cong, Ruobing Xie, Zhiyuan Liu, Maosong Sun, Jie Zhou

    Abstract: Autonomous agents empowered by Large Language Models (LLMs) have undergone significant improvements, enabling them to generalize across a broad spectrum of tasks. However, in real-world scenarios, cooperation among individuals is often required to enhance the efficiency and effectiveness of task accomplishment. Hence, inspired by human group dynamics, we propose a multi-agent framework \framework…

    Submitted 23 October, 2023; v1 submitted 21 August, 2023; originally announced August 2023.

    Comments: Under review. Code at https://github.com/OpenBMB/AgentVerse/

  10. arXiv:2307.11926  [pdf, other]

    eess.IV cs.CV

    PartDiff: Image Super-resolution with Partial Diffusion Models

    Authors: Kai Zhao, Alex Ling Yu Hung, Kaifeng Pang, Haoxin Zheng, Kyunghyun Sung

    Abstract: Denoising diffusion probabilistic models (DDPMs) have achieved impressive performance on various image generation tasks, including image super-resolution. By learning to reverse the process of gradually diffusing the data distribution into Gaussian noise, DDPMs generate new data by iteratively denoising from random noise. Despite their impressive performance, diffusion-based generative models suff…

    Submitted 21 July, 2023; originally announced July 2023.

  11. arXiv:2306.10785  [pdf, other]

    cs.SD cs.LG eess.AS

    Multitrack Music Transcription with a Time-Frequency Perceiver

    Authors: Wei-Tsung Lu, Ju-Chiang Wang, Yun-Ning Hung

    Abstract: Multitrack music transcription aims to transcribe a music audio input into the musical notes of multiple instruments simultaneously. It is a very challenging task that typically requires a more complex model to achieve satisfactory result. In addition, prior works mostly focus on transcriptions of regular instruments, however, neglecting vocals, which are usually the most important signal source i…

    Submitted 19 June, 2023; originally announced June 2023.

    Comments: ICASSP 2023

  12. arXiv:2306.10756  [pdf, other]

    cs.CV cs.AI

    A HRNet-based Rehabilitation Monitoring System

    Authors: Yi-Ching Hung, Yu-Qing Jiang, Fong-Syuan Liou, Yu-Hsuan Tsao, Zi-Cing Chiang, Min-Te Sun

    Abstract: The rehabilitation treatment helps to heal minor sports and occupational injuries. In a traditional rehabilitation process, a therapist will assign certain actions to a patient to perform in between hospital visits, and it will rely on the patient to remember actions correctly and the schedule to perform them. Unfortunately, many patients forget to perform actions or fail to recall actions in deta…

    Submitted 14 July, 2023; v1 submitted 19 June, 2023; originally announced June 2023.

  13. arXiv:2305.05139   

    cs.SD cs.MM eess.AS

    Temporal Convolution Network Based Onset Detection and Query by Humming System Design

    Authors: Yu Cheng Hung, Jian-Jiun Ding

    Abstract: Onsets are a key factor to split audio into several notes. In this paper, we ensemble multiple temporal convolution network (TCN) based model and utilize a restricted frequency range spectrogram to achieve more robust onset detection. Different from the present onset detection of QBH system which is only available in a clean scenario, our proposal of onset detection and speech enhancement can prev…

    Submitted 7 June, 2023; v1 submitted 8 May, 2023; originally announced May 2023.

    Comments: This paper has been withdrawn by the author due to a crucial definition of probability threshold and several grammar and vocabulary mistakes

  14. arXiv:2305.03982  [pdf]

    cs.SD cs.MM eess.AS

    Pitch Estimation by Denoising Preprocessor and Hybrid Estimation Model

    Authors: Yu Cheng Hung, ** Hung Chen, Jian Jiun Ding

    Abstract: Pitch estimation is to estimate the fundamental frequency and the midi number and plays a critical role in music signal analysis and vocal signal processing. In this work, we proposed a new architecture based on a learning-based enhancement preprocessor and a combination of several traditional and deep learning pitch estimation methods to achieve better pitch estimation performance in both noisy a…

    Submitted 6 May, 2023; originally announced May 2023.

    Comments: From ICCE-Taiwan

  15. arXiv:2304.14894  [pdf, other]

    eess.IV cs.CV

    Making the Invisible Visible: Toward High-Quality Terahertz Tomographic Imaging via Physics-Guided Restoration

    Authors: Weng-Tai Su, Yi-Chun Hung, Po-Jen Yu, Shang-Hua Yang, Chia-Wen Lin

    Abstract: Terahertz (THz) tomographic imaging has recently attracted significant attention thanks to its non-invasive, non-destructive, non-ionizing, material-classification, and ultra-fast nature for object exploration and inspection. However, its strong water absorption nature and low noise tolerance lead to undesired blurs and distortions of reconstructed THz images. The diffraction-limited THz signals h…

    Submitted 28 April, 2023; originally announced April 2023.

    Comments: 34 pages, 13 figures

  16. arXiv:2302.00286  [pdf, other]

    cs.SD cs.AI cs.LG eess.AS

    Jointist: Simultaneous Improvement of Multi-instrument Transcription and Music Source Separation via Joint Training

    Authors: Kin Wai Cheuk, Keunwoo Choi, Qiuqiang Kong, Bochen Li, Minz Won, Ju-Chiang Wang, Yun-Ning Hung, Dorien Herremans

    Abstract: In this paper, we introduce Jointist, an instrument-aware multi-instrument framework that is capable of transcribing, recognizing, and separating multiple musical instruments from an audio clip. Jointist consists of an instrument recognition module that conditions the other two modules: a transcription module that outputs instrument-specific piano rolls, and a source separation module that utilize…

    Submitted 1 February, 2023; v1 submitted 1 February, 2023; originally announced February 2023.

    Comments: arXiv admin note: text overlap with arXiv:2206.10805

  17. arXiv:2211.15787  [pdf, other]

    cs.SD eess.AS

    MuSFA: Improving Music Structural Function Analysis with Partially Labeled Data

    Authors: Ju-Chiang Wang, Jordan B. L. Smith, Yun-Ning Hung

    Abstract: Music structure analysis (MSA) systems aim to segment a song recording into non-overlapping sections with useful labels. Previous MSA systems typically predict abstract labels in a post-processing step and require the full context of the song. By contrast, we recently proposed a supervised framework, called "Music Structural Function Analysis" (MuSFA), that models and predicts meaningful labels li…

    Submitted 28 November, 2022; originally announced November 2022.

    Comments: ISMIR2022, LBD paper

  18. arXiv:2211.01317  [pdf, other]

    cs.SD cs.AI cs.LG cs.NE eess.AS

    Low-Resource Music Genre Classification with Cross-Modal Neural Model Reprogramming

    Authors: Yun-Ning Hung, Chao-Han Huck Yang, Pin-Yu Chen, Alexander Lerch

    Abstract: Transfer learning (TL) approaches have shown promising results when handling tasks with limited training data. However, considerable memory and computational resources are often required for fine-tuning pre-trained neural networks with target domain data. In this work, we introduce a novel method for leveraging pre-trained models for low-resource (music) classification based on the concept of Neur…

    Submitted 3 May, 2023; v1 submitted 2 November, 2022; originally announced November 2022.

    Comments: Accepted to IEEE ICASSP 2023. The implementation is available at https://github.com/biboamy/music-repro

  19. A Flexible-Frame-Rate Vision-Aided Inertial Object Tracking System for Mobile Devices

    Authors: Yo-Chung Lau, Kuan-Wei Tseng, I-Ju Hsieh, Hsiao-Ching Tseng, Yi-Ping Hung

    Abstract: Real-time object pose estimation and tracking is challenging but essential for emerging augmented reality (AR) applications. In general, state-of-the-art methods address this problem using deep neural networks which indeed yield satisfactory results. Nevertheless, the high computational cost of these methods makes them unsuitable for mobile devices where real-world applications usually take place.…

    Submitted 22 October, 2022; originally announced October 2022.

  20. arXiv:2210.01292  [pdf, other]

    cs.RO

    Data-Efficient Characterization of the Global Dynamics of Robot Controllers with Confidence Guarantees

    Authors: Ewerton R. Vieira, Aravind Sivaramakrishnan, Yao Song, Edgar Granados, Marcio Gameiro, Konstantin Mischaikow, Ying Hung, Kostas E. Bekris

    Abstract: This paper proposes an integration of surrogate modeling and topology to significantly reduce the amount of data required to describe the underlying global dynamics of robot controllers, including closed-box ones. A Gaussian Process (GP), trained with randomized short trajectories over the state-space, acts as a surrogate model for the underlying dynamical system. Then, a combinatorial representat…

    Submitted 3 October, 2022; originally announced October 2022.

  21. arXiv:2206.04850  [pdf, other]

    eess.AS cs.SD

    Feature-informed Embedding Space Regularization For Audio Classification

    Authors: Yun-Ning Hung, Alexander Lerch

    Abstract: Feature representations derived from models pre-trained on large-scale datasets have shown their generalizability on a variety of audio analysis tasks. Despite this generalizability, however, task-specific features can outperform if sufficient training data is available, as specific task-relevant properties can be learned. Furthermore, the complex pre-trained models bring considerable computationa…

    Submitted 9 June, 2022; originally announced June 2022.

  22. arXiv:2205.14701  [pdf, other]

    cs.SD eess.AS

    Modeling Beats and Downbeats with a Time-Frequency Transformer

    Authors: Yun-Ning Hung, Ju-Chiang Wang, Xuchen Song, Wei-Tsung Lu, Minz Won

    Abstract: Transformer is a successful deep neural network (DNN) architecture that has shown its versatility not only in natural language processing but also in music information retrieval (MIR). In this paper, we present a novel Transformer-based approach to tackle beat and downbeat tracking. This approach employs SpecTNT (Spectral-Temporal Transformer in Transformer), a variant of Transformer that models b…

    Submitted 29 May, 2022; originally announced May 2022.

    Comments: This paper is accepted for publication at ICASSP 2022

  23. arXiv:2205.14700  [pdf, other]

    eess.AS cs.SD

    To catch a chorus, verse, intro, or anything else: Analyzing a song with structural functions

    Authors: Ju-Chiang Wang, Yun-Ning Hung, Jordan B. L. Smith

    Abstract: Conventional music structure analysis algorithms aim to divide a song into segments and to group them with abstract labels (e.g., 'A', 'B', and 'C'). However, explicitly identifying the function of each segment (e.g., 'verse' or 'chorus') is rarely attempted, but has many applications. We introduce a multi-task deep learning framework to model these structural semantic labels directly from audio b…

    Submitted 29 May, 2022; originally announced May 2022.

    Comments: This manuscript is accepted by ICASSP 2022

  24. arXiv:2203.15163  [pdf, other]

    eess.IV cs.CV

    CAT-Net: A Cross-Slice Attention Transformer Model for Prostate Zonal Segmentation in MRI

    Authors: Alex Ling Yu Hung, Haoxin Zheng, Qi Miao, Steven S. Raman, Demetri Terzopoulos, Kyunghyun Sung

    Abstract: Prostate cancer is the second leading cause of cancer death among men in the United States. The diagnosis of prostate MRI often relies on the accurate prostate zonal segmentation. However, state-of-the-art automatic segmentation methods often fail to produce well-contained volumetric segmentation of the prostate zones since certain slices of prostate MRI, such as base and apex slices, are harder t…

    Submitted 16 June, 2022; v1 submitted 28 March, 2022; originally announced March 2022.

  25. arXiv:2203.13482  [pdf]

    physics.optics cs.CV cs.NE

    Polarization Multiplexed Diffractive Computing: All-Optical Implementation of a Group of Linear Transformations Through a Polarization-Encoded Diffractive Network

    Authors: Jingxi Li, Yi-Chun Hung, Onur Kulce, Deniz Mengu, Aydogan Ozcan

    Abstract: Research on optical computing has recently attracted significant attention due to the transformative advances in machine learning. Among different approaches, diffractive optical networks composed of spatially-engineered transmissive surfaces have been demonstrated for all-optical statistical inference and performing arbitrary linear transformations using passive, free-space optical layers. Here,…

    Submitted 25 March, 2022; originally announced March 2022.

    Comments: 31 pages, 7 figures

    Journal ref: Light: Science & Applications (2022)

  26. arXiv:2203.04192  [pdf, ps, other]

    cs.LG stat.ML

    Reward-Biased Maximum Likelihood Estimation for Neural Contextual Bandits

    Authors: Yu-Heng Hung, Ping-Chun Hsieh

    Abstract: Reward-biased maximum likelihood estimation (RBMLE) is a classic principle in the adaptive control literature for tackling explore-exploit trade-offs. This paper studies the stochastic contextual bandit problem with general bounded reward functions and proposes NeuralRBMLE, which adapts the RBMLE principle by adding a bias term to the log-likelihood to enforce exploration. NeuralRBMLE leverages th…

    Submitted 29 May, 2022; v1 submitted 8 March, 2022; originally announced March 2022.

  27. The Role of Pleura and Adipose in Lung Ultrasound AI

    Authors: Gautam Rajendrakumar Gare, Wanwen Chen, Alex Ling Yu Hung, Edward Chen, Hai V. Tran, Tom Fox, Pete Lowery, Kevin Zamora, Bennett P deBoisblanc, Ricardo Luis Rodriguez, John Michael Galeotti

    Abstract: In this paper, we study the significance of the pleura and adipose tissue in lung ultrasound AI analysis. We highlight their more prominent appearance when using high-frequency linear (HFL) instead of curvilinear ultrasound probes, showing HFL reveals better pleura detail. We compare the diagnostic utility of the pleura and adipose tissue using an HFL ultrasound probe. Masking the adipose tissue d…

    Submitted 18 January, 2022; originally announced January 2022.

    Comments: Published in MICCAI 2021 workshop on Lessons Learned from the development and application of medical imaging-based AI technologies for combating COVID-19 (LL-COVID19). The first two authors contributed equally to this work

    Journal ref: LL-COVID19 2021. Lecture Notes in Computer Science, vol 12969. Springer, Cham

  28. arXiv:2111.01320  [pdf, other]

    eess.AS cs.SD

    AVASpeech-SMAD: A Strongly Labelled Speech and Music Activity Detection Dataset with Label Co-Occurrence

    Authors: Yun-Ning Hung, Karn N. Watcharasupat, Chih-Wei Wu, Iroro Orife, Kelian Li, Pavan Seshadri, Junyoung Lee

    Abstract: We propose a dataset, AVASpeech-SMAD, to assist speech and music activity detection research. With frame-level music labels, the proposed dataset extends the existing AVASpeech dataset, which originally consists of 45 hours of audio and speech activity labels. To the best of our knowledge, the proposed AVASpeech-SMAD is the first open-source dataset that features strong polyphonic labels for both…

    Submitted 1 November, 2021; originally announced November 2021.

  29. arXiv:2103.16932  [pdf, other]

    cs.MM eess.IV

    Seeing through a Black Box: Toward High-Quality Terahertz Tomographic Imaging via Multi-Scale Spatio-Spectral Image Fusion

    Authors: Weng-tai Su, Yi-Chun Hung, Ta-Hsuan Chao, Po-Jen Yu, Shang-Hua Yang, Chia-Wen Lin

    Abstract: Terahertz (THz) imaging has recently attracted significant attention thanks to its non-invasive, non-destructive, non-ionizing, material-classification, and ultra-fast nature for object exploration and inspection. However, its strong water absorption nature and low noise tolerance lead to undesired blurs and distortions of reconstructed THz images. The performances of existing restoration methods…

    Submitted 29 December, 2021; v1 submitted 31 March, 2021; originally announced March 2021.

    Comments: 10 pages, 9 figures

  30. Good and Bad Boundaries in Ultrasound Compounding: Preserving Anatomic Boundaries While Suppressing Artifacts

    Authors: Alex Ling Yu Hung, John Galeotti

    Abstract: Ultrasound 3D compounding is important for volumetric reconstruction, but as of yet there is no consensus on best practices for compounding. Ultrasound images depend on probe direction and the path sound waves pass through, so when multiple intersecting B-scans of the same spot from different perspectives yield different pixel values, there is not a single, ideal representation for compounding (i.…

    Submitted 10 August, 2021; v1 submitted 24 November, 2020; originally announced November 2020.

    Comments: Int J CARS (2021)

  31. Weakly- and Semi-Supervised Probabilistic Segmentation and Quantification of Ultrasound Needle-Reverberation Artifacts to Allow Better AI Understanding of Tissue Beneath Needles

    Authors: Alex Ling Yu Hung, Edward Chen, John Galeotti

    Abstract: Ultrasound image quality has continually been improving. However, when needles or other metallic objects are operating inside the tissue, the resulting reverberation artifacts can severely corrupt the surrounding image quality. Such effects are challenging for existing computer vision algorithms for medical image analysis. Needle reverberation artifacts can be hard to identify at times and affect…

    Submitted 3 June, 2021; v1 submitted 24 November, 2020; originally announced November 2020.

  32. Ultrasound Confidence Maps of Intensity and Structure Based on Directed Acyclic Graphs and Artifact Models

    Authors: Alex Ling Yu Hung, Wanwen Chen, John Galeotti

    Abstract: Ultrasound imaging has been improving, but continues to suffer from inherent artifacts that are challenging to model, such as attenuation, shadowing, diffraction, speckle, etc. These artifacts can potentially confuse image analysis algorithms unless an attempt is made to assess the certainty of individual pixel values. Our novel confidence algorithms analyze pixel values using a directed acyclic g…

    Submitted 27 April, 2021; v1 submitted 24 November, 2020; originally announced November 2020.

    Comments: 5 pages, conference

  33. arXiv:2010.11904  [pdf, other]

    cs.SD cs.LG eess.AS

    Transcription Is All You Need: Learning to Separate Musical Mixtures with Score as Supervision

    Authors: Yun-Ning Hung, Gordon Wichern, Jonathan Le Roux

    Abstract: Most music source separation systems require large collections of isolated sources for training, which can be difficult to obtain. In this work, we use musical scores, which are comparatively easy to obtain, as a weak label for training a source separation system. In contrast with previous score-informed separation approaches, our system does not require isolated sources, and score is used only as…

    Submitted 22 October, 2020; originally announced October 2020.

  34. arXiv:2010.04091  [pdf, ps, other]

    cs.LG stat.ML

    Reward-Biased Maximum Likelihood Estimation for Linear Stochastic Bandits

    Authors: Yu-Heng Hung, Ping-Chun Hsieh, Xi Liu, P. R. Kumar

    Abstract: Modifying the reward-biased maximum likelihood method originally proposed in the adaptive control literature, we propose novel learning algorithms to handle the explore-exploit trade-off in linear bandits problems as well as generalized linear bandits problems. We develop novel index policies that we prove achieve order-optimality, and show that they achieve empirical performance competitive with…

    Submitted 8 October, 2020; originally announced October 2020.

  35. arXiv:2008.00616  [pdf, other]

    eess.AS cs.LG

    Multitask learning for instrument activation aware music source separation

    Authors: Yun-Ning Hung, Alexander Lerch

    Abstract: Music source separation is a core task in music information retrieval which has seen a dramatic improvement in the past years. Nevertheless, most of the existing systems focus exclusively on the problem of source separation itself and ignore the utilization of other, possibly related, MIR tasks which could lead to additional quality gains. In this work, we propose a novel multitask structure t…

    Submitted 2 August, 2020; originally announced August 2020.

  36. arXiv:2008.00203  [pdf, other]

    eess.AS cs.IR cs.LG

    Score-informed Networks for Music Performance Assessment

    Authors: Jiawen Huang, Yun-Ning Hung, Ashis Pati, Siddharth Kumar Gururani, Alexander Lerch

    Abstract: The assessment of music performances in most cases takes into account the underlying musical score being performed. While there have been several automatic approaches for objective music performance assessment (MPA) based on extracted features from both the performance audio and the score, deep neural network-based methods incorporating score information into MPA models have not yet been investiga…

    Submitted 1 August, 2020; originally announced August 2020.

    Comments: To appear at 21st International Society for Music Information Retrieval Conference, Montréal, Canada, 2020

  37. arXiv:2006.05021  [pdf, other]

    stat.ML cs.LG

    CLAIMED: A CLAssification-Incorporated Minimum Energy Design to explore a multivariate response surface with feasibility constraints

    Authors: Mert Y. Sengul, Yao Song, Linglin He, Adri C. T. van Duin, Ying Hung, Tirthankar Dasgupta

    Abstract: Motivated by the problem of optimization of force-field systems in physics using large-scale computer simulations, we consider exploration of a deterministic complex multivariate response surface. The objective is to find input combinations that generate output close to some desired or "target" vector. In spite of reducing the problem to exploration of the input space with respect to a one-dimensi…

    Submitted 13 September, 2021; v1 submitted 8 June, 2020; originally announced June 2020.

  38. arXiv:1912.07821  [pdf]

    cs.ET physics.app-ph

    Valley-Coupled-Spintronic Non-Volatile Memories with Compute-In-Memory Support

    Authors: Sandeep Thirumala, Yi-Tse Hung, Shubham Jain, Arnab Raha, Niharika Thakuria, Vijay Raghunathan, Anand Raghunathan, Zhihong Chen, Sumeet Gupta

    Abstract: In this work, we propose valley-coupled spin-hall memories (VSH-MRAMs) based on monolayer WSe2. The key features of the proposed memories are (a) the ability to switch magnets with perpendicular magnetic anisotropy (PMA) via VSH effect and (b) an integrated gate that can modulate the charge/spin current (IC/IS) flow. The former attribute results in high energy efficiency (compared to the Giant-Spi…

    Submitted 17 December, 2019; originally announced December 2019.

  39. arXiv:1910.06562  [pdf, other]

    cs.LG stat.ML

    Compacting, Picking and Growing for Unforgetting Continual Learning

    Authors: Steven C. Y. Hung, Cheng-Hao Tu, Cheng-En Wu, Chien-Hung Chen, Yi-Ming Chan, Chu-Song Chen

    Abstract: Continual lifelong learning is essential to many applications. In this paper, we propose a simple but effective approach to continual deep learning. Our approach leverages the principles of deep model compression, critical weights selection, and progressive networks expansion. By enforcing their integration in an iterative manner, we introduce an incremental learning method that is scalable to the…

    Submitted 30 October, 2019; v1 submitted 15 October, 2019; originally announced October 2019.

    Comments: To appear in NeurIPS 2019

  40. arXiv:1905.13567  [pdf, other]

    eess.AS cs.SD

    Musical Composition Style Transfer via Disentangled Timbre Representations

    Authors: Yun-Ning Hung, I-Tung Chiang, Yi-An Chen, Yi-Hsuan Yang

    Abstract: Music creation involves not only composing the different parts (e.g., melody, chords) of a musical work but also arranging/selecting the instruments to play the different parts. While the former has received increasing attention, the latter has not been much investigated. This paper presents, to the best of our knowledge, the first deep learning models for rearranging music of arbitrary genres. Sp…

    Submitted 30 May, 2019; originally announced May 2019.

    Comments: Accepted by the 28th International Joint Conference on Artificial Intelligence. arXiv admin note: text overlap with arXiv:1811.03271

  41. arXiv:1811.09986  [pdf, other]

    cs.CV

    Learning Conditional Random Fields with Augmented Observations for Partially Observed Action Recognition

    Authors: Shih-Yao Lin, Yen-Yu Lin, Chu-Song Chen, Yi-Ping Hung

    Abstract: This paper aims at recognizing partially observed human actions in videos. Action videos acquired in uncontrolled environments often contain corrupt frames, which make actions partially observed. Furthermore, these frames can last for arbitrary lengths of time and appear irregularly. They are inconsistent with training data and degrade the performance of pre-trained action recognition systems. We…

    Submitted 5 December, 2018; v1 submitted 25 November, 2018; originally announced November 2018.

  42. arXiv:1811.03271  [pdf, other]

    cs.SD eess.AS

    Learning Disentangled Representations for Timber and Pitch in Music Audio

    Authors: Yun-Ning Hung, Yi-An Chen, Yi-Hsuan Yang

    Abstract: Timbre and pitch are the two main perceptual properties of musical sounds. Depending on the target applications, we sometimes prefer to focus on one of them, while reducing the effect of the other. Researchers have managed to hand-craft such timbre-invariant or pitch-invariant features using domain knowledge and signal processing techniques, but it remains difficult to disentangle them in the resu…

    Submitted 8 November, 2018; originally announced November 2018.

  43. arXiv:1811.01143  [pdf, other]

    cs.SD eess.AS

    Multitask learning for frame-level instrument recognition

    Authors: Yun-Ning Hung, Yi-An Chen, Yi-Hsuan Yang

    Abstract: For many music analysis problems, we need to know the presence of instruments for each time frame in a multi-instrument musical piece. However, such a frame-level instrument recognition task remains difficult, mainly due to the lack of labeled datasets. To address this issue, we present in this paper a large-scale dataset that contains synthetic polyphonic music with frame-level pitch and instrume…

    Submitted 18 February, 2019; v1 submitted 2 November, 2018; originally announced November 2018.

    Comments: This is a pre-print version of an ICASSP 2019 paper

  44. arXiv:1806.09587  [pdf, other]

    cs.SD eess.AS

    Frame-level Instrument Recognition by Timbre and Pitch

    Authors: Yun-Ning Hung, Yi-Hsuan Yang

    Abstract: Instrument recognition is a fundamental task in music information retrieval, yet little has been done to predict the presence of instruments in multi-instrument music for each time frame. This task is important for not only automatic transcription but also many retrieval problems. In this paper, we use the newly released MusicNet dataset to study this front, by building and evaluating a convolutio…

    Submitted 25 June, 2018; originally announced June 2018.

  45. arXiv:1710.10814  [pdf, other]

    stat.ML cs.IR

    Hit Song Prediction for Pop Music by Siamese CNN with Ranking Loss

    Authors: Lang-Chi Yu, Yi-Hsuan Yang, Yun-Ning Hung, Yi-An Chen

    Abstract: A model for hit song prediction can be used in the pop music industry to identify emerging trends and potential artists or songs before they are marketed to the public. While most previous work formulates hit song prediction as a regression or classification problem, we present in this paper a convolutional neural network (CNN) model that treats it as a ranking problem. Specifically, we use a comm…

    Submitted 30 October, 2017; originally announced October 2017.

  46. arXiv:1506.05870  [pdf, other]

    cs.CV

    To Know Where We Are: Vision-Based Positioning in Outdoor Environments

    Authors: Kuan-Wen Chen, Chun-Hsin Wang, Xiao Wei, Qiao Liang, Ming-Hsuan Yang, Chu-Song Chen, Yi-Ping Hung

    Abstract: Augmented reality (AR) displays have recently become increasingly popular because of their high intuitiveness for humans and the rapid development of high-quality head-mounted displays. To achieve such displays with augmented information, highly accurate image registration or ego-positioning is required, but little attention has been paid to outdoor environments. This paper presents a method for ego-p…

    Submitted 18 June, 2015; originally announced June 2015.

    Comments: 11 pages, 14 figures

  47. arXiv:1011.2009  [pdf, ps, other]

    cs.IT

    Comparison of Spearman's rho and Kendall's tau in Normal and Contaminated Normal Models

    Authors: Weichao Xu, Yunhe Hou, Y. S. Hung, Yuexian Zou

    Abstract: This paper analyzes the performances of the Spearman's rho (SR) and Kendall's tau (KT) with respect to samples drawn from bivariate normal and bivariate contaminated normal populations. The exact analytical formulae of the variance of SR and the covariance between SR and KT are obtained based on the Childs's reduction formula for the quadrivariate normal positive orthant probabilities. Close form…

    Submitted 9 November, 2010; originally announced November 2010.