Showing 1–15 of 15 results for author: Cho, M

Searching in archive eess.
  1. arXiv:2308.06472  [pdf, other]

    cs.SD cs.LG eess.AS

    Flexible Keyword Spotting based on Homogeneous Audio-Text Embedding

    Authors: Kumari Nishu, Minsik Cho, Paul Dixon, Devang Naik

    Abstract: Spotting user-defined/flexible keywords represented in text frequently uses an expensive text encoder for joint analysis with an audio encoder in an embedding space, which can suffer from heterogeneous modality representation (i.e., large mismatch) and increased complexity. In this work, we propose a novel architecture to efficiently detect arbitrary keywords based on an audio-compliant text encod…

    Submitted 12 August, 2023; originally announced August 2023.

  2. arXiv:2306.05245  [pdf, other]

    eess.AS cs.LG cs.SD

    Matching Latent Encoding for Audio-Text based Keyword Spotting

    Authors: Kumari Nishu, Minsik Cho, Devang Naik

    Abstract: Using audio and text embeddings jointly for Keyword Spotting (KWS) has shown high-quality results, but the key challenge of how to semantically align two embeddings for multi-word keywords of different sequence lengths remains largely unsolved. In this paper, we propose an audio-text-based end-to-end model architecture for flexible keyword spotting (KWS), which builds upon learned audio and text e…

    Submitted 8 June, 2023; originally announced June 2023.

  3. arXiv:2303.08253  [pdf, other]

    cs.LG cs.CV cs.PF eess.IV

    R2 Loss: Range Restriction Loss for Model Compression and Quantization

    Authors: Arnav Kundu, Chungkuk Yoo, Srijan Mishra, Minsik Cho, Saurabh Adya

    Abstract: Model quantization and compression is widely used techniques to reduce usage of computing resource at inference time. While state-of-the-art works have been achieved reasonable accuracy with higher bit such as 4bit or 8bit, but still it is challenging to quantize/compress a model further, e.g., 1bit or 2bit. To overcome the challenge, we focus on outliers in weights of a pre-trained model which di…

    Submitted 11 February, 2024; v1 submitted 14 March, 2023; originally announced March 2023.

  4. arXiv:2301.13729  [pdf, other]

    cs.DC eess.SY

    Low-rank LQR Optimal Control Design over Wireless Communication Networks

    Authors: Myung Cho, Abdallah Abdallah, Mohammad Rasouli

    Abstract: This paper considers a LQR optimal control design problem for distributed control systems with multi-agents. To control large-scale distributed systems such as smart-grid and multi-agent robotic systems over wireless communication networks, it is desired to design a feedback controller by considering various constraints on communication such as limited power, limited energy, or limited communicati…

    Submitted 31 January, 2023; originally announced January 2023.

    Comments: 10 pages

  5. arXiv:2212.04396  [pdf, other]

    eess.SY math.DS

    On Attack Detection and Identification for the Cyber-Physical System using Lifted System Model

    Authors: Dawei Sun, Minhyun Cho, Inseok Hwang

    Abstract: Motivated by the safety and security issues related to cyber-physical systems with potentially multi-rate, delayed, and nonuniformly sampled measurements, we investigate the attack detection and identification using the lifted system model in this paper. Attack detectability and identifiability based on the lifted system model are formally defined and rigorously characterized in a novel approach.…

    Submitted 8 December, 2022; originally announced December 2022.

    Comments: It is the preprint of a paper submitted to Automatica

  6. arXiv:2212.02929  [pdf, ps, other]

    cs.DC eess.SY

    Deep Neural Networks Based on Iterative Thresholding and Projection Algorithms for Sparse LQR Control Design

    Authors: Myung Cho

    Abstract: In this paper, we consider an LQR design problem for distributed control systems. For large-scale distributed systems, finding a solution might be computationally demanding due to communications among agents. To this aim, we deal with LQR minimization problem with a regularization for sparse feedback matrix, which can lead to achieve the reduction of the communication links in the distributed cont…

    Submitted 6 December, 2022; originally announced December 2022.

    Comments: 14 pages

  7. arXiv:2211.03885  [pdf, other]

    cs.CV eess.IV

    Learned Smartphone ISP on Mobile GPUs with Deep Learning, Mobile AI & AIM 2022 Challenge: Report

    Authors: Andrey Ignatov, Radu Timofte, Shuai Liu, Chaoyu Feng, Furui Bai, Xiaotao Wang, Lei Lei, Ziyao Yi, Yan Xiang, Zibin Liu, Shaoqing Li, Keming Shi, Dehui Kong, Ke Xu, Minsu Kwon, Yaqi Wu, Jiesi Zheng, Zhihao Fan, Xun Wu, Feng Zhang, Albert No, Minhyeok Cho, Zewen Chen, Xiaze Zhang, Ran Li, et al. (13 additional authors not shown)

    Abstract: The role of mobile cameras increased dramatically over the past few years, leading to more and more research in automatic image quality enhancement and RAW photo processing. In this Mobile AI challenge, the target was to develop an efficient end-to-end AI-based image signal processing (ISP) pipeline replacing the standard mobile ISPs that can run on modern smartphone GPUs using TensorFlow Lite. Th…

    Submitted 7 November, 2022; originally announced November 2022.

  8. arXiv:2210.15425  [pdf, other]

    eess.AS cs.LG cs.SD

    HEiMDaL: Highly Efficient Method for Detection and Localization of wake-words

    Authors: Arnav Kundu, Mohammad Samragh Razlighi, Minsik Cho, Priyanka Padmanabhan, Devang Naik

    Abstract: Streaming keyword spotting is a widely used solution for activating voice assistants. Deep Neural Networks with Hidden Markov Model (DNN-HMM) based methods have proven to be efficient and widely adopted in this space, primarily because of the ability to detect and identify the start and end of the wake-up word at low compute cost. However, such hybrid systems suffer from loss metric mismatch when…

    Submitted 26 October, 2022; originally announced October 2022.

  9. arXiv:2210.13567  [pdf, ps, other]

    cs.CV cs.LG cs.SD eess.AS

    I see what you hear: a vision-inspired method to localize words

    Authors: Mohammad Samragh, Arnav Kundu, Ting-Yao Hu, Minsik Cho, Aman Chadha, Ashish Shrivastava, Oncel Tuzel, Devang Naik

    Abstract: This paper explores the possibility of using visual object detection techniques for word localization in speech data. Object detection has been thoroughly studied in the contemporary literature for visual data. Noting that an audio can be interpreted as a 1-dimensional image, object localization techniques can be fundamentally useful for word localization. Building upon this idea, we propose a lig…

    Submitted 24 October, 2022; originally announced October 2022.

  10. arXiv:2204.09578  [pdf, other]

    eess.SP cs.LG stat.AP

    Restructuring TCAD System: Teaching Traditional TCAD New Tricks

    Authors: Sanghoon Myung, Wonik Jang, Seonghoon **, Jae Myung Choe, Changwook Jeong, Dae Sin Kim

    Abstract: Traditional TCAD simulation has succeeded in predicting and optimizing the device performance; however, it still faces a massive challenge - a high computational cost. There have been many attempts to replace TCAD with deep learning, but it has not yet been completely replaced. This paper presents a novel algorithm restructuring the traditional TCAD system. The proposed algorithm predicts three-di…

    Submitted 19 April, 2022; originally announced April 2022.

    Comments: In Proceedings of 2021 IEEE International Electron Devices Meeting (IEDM)

    Journal ref: Proc. of IEDM 2021, 18.2.1-18.2.4 (2021)

  11. arXiv:2204.02455  [pdf, other]

    cs.SD cs.LG eess.AS

    Improving Voice Trigger Detection with Metric Learning

    Authors: Prateeth Nayak, Takuya Higuchi, Anmol Gupta, Shivesh Ranjan, Stephen Shum, Siddharth Sigtia, Erik Marchi, Varun Lakshminarasimhan, Minsik Cho, Saurabh Adya, Chandra Dhir, Ahmed Tewfik

    Abstract: Voice trigger detection is an important task, which enables activating a voice assistant when a target user speaks a keyword phrase. A detector is typically trained on speech data independent of speaker information and used for the voice trigger detection task. However, such a speaker independent voice trigger detector typically suffers from performance degradation on speech from underrepresented…

    Submitted 13 September, 2022; v1 submitted 5 April, 2022; originally announced April 2022.

    Comments: Accepted at InterSpeech 2022

  12. arXiv:2203.04314  [pdf, other]

    eess.IV cs.CV cs.LG

    PyNET-QxQ: An Efficient PyNET Variant for QxQ Bayer Pattern Demosaicing in CMOS Image Sensors

    Authors: Minhyeok Cho, Haechang Lee, Hyunwoo Je, Kijeong Kim, Dongil Ryu, Albert No

    Abstract: Deep learning-based image signal processor (ISP) models for mobile cameras can generate high-quality images that rival those of professional DSLR cameras. However, their computational demands often make them unsuitable for mobile settings. Additionally, modern mobile cameras employ non-Bayer color filter arrays (CFA) such as Quad Bayer, Nona Bayer, and QxQ Bayer to enhance image quality, yet most…

    Submitted 5 May, 2023; v1 submitted 8 March, 2022; originally announced March 2022.

    Comments: Accepted by IEEE Access

  13. arXiv:2108.02716  [pdf, other]

    cs.NI eess.SP

    Link Quality-Guaranteed Minimum-Cost Millimeter-Wave Base Station Deployment

    Authors: Miaomiao Dong, Taejoon Kim, Minsung Cho, Kangeun Lee, Sungrok Yoon

    Abstract: Today's growth in the volume of wireless devices coupled with the promise of supporting data-intensive 5G-&-beyond use cases is driving the industry to deploy more millimeter-wave (mmWave) base stations (BSs). Although mmWave cellular systems can carry a larger volume of traffic, dense deployment, in turn, increases the BS installation and maintenance cost, which has been largely ignored in their…

    Submitted 5 August, 2021; originally announced August 2021.

    Comments: 16 pages, submitted to IEEE Transactions on Wireless Communications

  14. arXiv:2008.01944  [pdf, ps, other]

    q-bio.QM cs.IT eess.SP stat.AP

    Optimal Pooling Matrix Design for Group Testing with Dilution (Row Degree) Constraints

    Authors: Jirong Yi, Myung Cho, Xiaodong Wu, Raghu Mudumbai, Weiyu Xu

    Abstract: In this paper, we consider the problem of designing optimal pooling matrix for group testing (for example, for COVID-19 virus testing) with the constraint that no more than $r>0$ samples can be pooled together, which we call "dilution constraint". This problem translates to designing a matrix with elements being either 0 or 1 that has no more than $r$ '1's in each row and has a certain performance…

    Submitted 5 August, 2020; originally announced August 2020.

    Comments: group testing design, COVID-19

  15. arXiv:1306.2665  [pdf, ps, other]

    cs.IT cs.LG eess.SY math.OC stat.ML

    Precisely Verifying the Null Space Conditions in Compressed Sensing: A Sandwiching Algorithm

    Authors: Myung Cho, Weiyu Xu

    Abstract: In this paper, we propose new efficient algorithms to verify the null space condition in compressed sensing (CS). Given an $(n-m) \times n$ ($m>0$) CS matrix $A$ and a positive $k$, we are interested in computing $\displaystyle \alpha_k = \max_{\{z: Az=0,\, z\neq 0\}}\max_{\{K: |K|\leq k\}} \frac{\|z_K\|_{1}}{\|z\|_{1}}$, where $K$ represents subsets of $\{1,2,...,n\}$, and $|K|$ is the cardinality of $K$.…

    Submitted 9 August, 2013; v1 submitted 11 June, 2013; originally announced June 2013.

    Comments: 30 pages