Showing 1–39 of 39 results for author: Hwang, K

  1. arXiv:2404.15154  [pdf, other]

    cs.CL cs.AI

    Do not think pink elephant!

    Authors: Kyomin Hwang, Suyoung Kim, JunHoo Lee, Nojun Kwak

    Abstract: Large Models (LMs) have heightened expectations for the potential of general AI as they are akin to human intelligence. This paper shows that recent large models such as Stable Diffusion and DALL-E3 also share the vulnerability of human intelligence, namely the "white bear phenomenon". We investigate the causes of the white bear phenomenon by analyzing their representation space. Based on this ana…

    Submitted 22 April, 2024; originally announced April 2024.

    Comments: This paper is accepted in CVPRW

  2. arXiv:2403.17329  [pdf, other]

    cs.LG cs.AI

    Deep Support Vectors

    Authors: Junhoo Lee, Hyunho Lee, Kyomin Hwang, Nojun Kwak

    Abstract: Deep learning has achieved tremendous success. However, unlike SVMs, which provide direct decision criteria and can be trained with a small dataset, it still has significant weaknesses due to its requirement for massive datasets during training and the black-box characteristics on decision criteria. This paper addresses these issues by identifying support vectors in deep learning models.…

    Submitted 27 June, 2024; v1 submitted 25 March, 2024; originally announced March 2024.

  3. arXiv:2403.02496  [pdf]

    cs.CL

    Choose Your Own Adventure: Interactive E-Books to Improve Word Knowledge and Comprehension Skills

    Authors: Stephanie Day, ** K. Hwang, Tracy Arner, Danielle McNamara, Carol Connor

    Abstract: The purpose of this feasibility study was to examine the potential impact of reading digital interactive e-books on essential skills that support reading comprehension with third-fifth grade students. Students read two e-Books that taught word learning and comprehension monitoring strategies in the service of learning difficult vocabulary and targeted science concepts about hurricanes. We investig…

    Submitted 4 March, 2024; originally announced March 2024.

  4. arXiv:2403.01344  [pdf, other]

    cs.LG cs.CV

    Mitigating the Bias in the Model for Continual Test-Time Adaptation

    Authors: Inseop Chung, Kyomin Hwang, Jayeon Yoo, Nojun Kwak

    Abstract: Continual Test-Time Adaptation (CTA) is a challenging task that aims to adapt a source pre-trained model to continually changing target domains. In the CTA setting, a model does not know when the target domain changes, thus facing a drastic change in the distribution of streaming inputs during the test-time. The key challenge is to keep adapting the model to the continually changing target domains…

    Submitted 2 March, 2024; originally announced March 2024.

  5. arXiv:2311.00737  [pdf]

    cs.LG physics.ins-det physics.med-ph

    Real-Time Magnetic Tracking and Diagnosis of COVID-19 via Machine Learning

    Authors: Dang Nguyen, Phat K. Huynh, Vinh Duc An Bui, Kee Young Hwang, Nityanand Jain, Chau Nguyen, Le Huu Nhat Minh, Le Van Truong, Xuan Thanh Nguyen, Dinh Hoang Nguyen, Le Tien Dung, Trung Q. Le, Manh-Huong Phan

    Abstract: The COVID-19 pandemic underscored the importance of reliable, noninvasive diagnostic tools for robust public health interventions. In this work, we fused magnetic respiratory sensing technology (MRST) with machine learning (ML) to create a diagnostic platform for real-time tracking and diagnosis of COVID-19 and other respiratory diseases. The MRST precisely captures breathing patterns through thre…

    Submitted 1 November, 2023; originally announced November 2023.

  6. arXiv:2309.01961  [pdf, other]

    cs.CV

    NICE: CVPR 2023 Challenge on Zero-shot Image Captioning

    Authors: Taehoon Kim, Pyunghwan Ahn, Sangyun Kim, Sihaeng Lee, Mark Marsden, Alessandra Sala, Seung Hwan Kim, Bohyung Han, Kyoung Mu Lee, Honglak Lee, Kyounghoon Bae, Xiangyu Wu, Yi Gao, Hailiang Zhang, Yang Yang, Weili Guo, Jianfeng Lu, Youngtaek Oh, Jae Won Cho, Dong-** Kim, In So Kweon, Junmo Kim, Wooyoung Kang, Won Young Jhoo, Byungseok Roh, et al. (17 additional authors not shown)

    Abstract: In this report, we introduce NICE (New frontiers for zero-shot Image Captioning Evaluation) project and share the results and outcomes of 2023 challenge. This project is designed to challenge the computer vision community to develop robust image captioning models that advance the state-of-the-art both in terms of accuracy and fairness. Through the challenge, the image captioning models were tested…

    Submitted 10 September, 2023; v1 submitted 5 September, 2023; originally announced September 2023.

    Comments: Tech report, project page https://nice.lgresearch.ai/

  7. arXiv:2308.16415  [pdf, other]

    cs.CL eess.AS

    Knowledge Distillation from Non-streaming to Streaming ASR Encoder using Auxiliary Non-streaming Layer

    Authors: Kyuhong Shim, **kyu Lee, Simyung Chang, Kyuwoong Hwang

    Abstract: Streaming automatic speech recognition (ASR) models are restricted from accessing future context, which results in worse performance compared to the non-streaming models. To improve the performance of streaming ASR, knowledge distillation (KD) from the non-streaming to streaming model has been studied, mainly focusing on aligning the output token probabilities. In this paper, we propose a layer-to…

    Submitted 30 August, 2023; originally announced August 2023.

    Comments: Accepted to Interspeech 2023

  8. arXiv:2307.05517  [pdf, other]

    cs.LG

    Adaptive Graph Convolution Networks for Traffic Flow Forecasting

    Authors: Zhengdao Li, Wei Li, Kai Hwang

    Abstract: Traffic flow forecasting is a highly challenging task due to the dynamic spatial-temporal road conditions. Graph neural networks (GNN) has been widely applied in this task. However, most of these GNNs ignore the effects of time-varying road conditions due to the fixed range of the convolution receptive field. In this paper, we propose a novel Adaptive Graph Convolution Networks (AGC-net) to addres…

    Submitted 7 July, 2023; originally announced July 2023.

  9. arXiv:2301.03169  [pdf, other]

    cs.CV cs.AI

    A Study on the Generality of Neural Network Structures for Monocular Depth Estimation

    Authors: **woo Bae, Kyumin Hwang, Sunghoon Im

    Abstract: Monocular depth estimation has been widely studied, and significant improvements in performance have been recently reported. However, most previous works are evaluated on a few benchmark datasets, such as KITTI datasets, and none of the works provide an in-depth analysis of the generalization performance of monocular depth estimation. In this paper, we deeply investigate the various backbone netwo…

    Submitted 10 December, 2023; v1 submitted 8 January, 2023; originally announced January 2023.

    Comments: Accepted in TPAMI

  10. arXiv:2211.06400  [pdf, other]

    physics.acc-ph cs.LG

    Prior-mean-assisted Bayesian optimization application on FRIB Front-End tunning

    Authors: Kilean Hwang, Tomofumi Maruta, Alexander Plastun, Kei Fukushima, Tong Zhang, Qiang Zhao, Peter Ostroumov, Yue Hao

    Abstract: Bayesian optimization (BO) is often used for accelerator tuning due to its high sample efficiency. However, the computational scalability of training over large data-set can be problematic and the adoption of historical data in a computationally efficient way is not trivial. Here, we exploit a neural network model trained over historical data as a prior mean of BO for FRIB Front-End tuning.

    Submitted 11 November, 2022; originally announced November 2022.

  11. arXiv:2211.01629  [pdf, other]

    cs.CV cs.LG

    Image-based Early Detection System for Wildfires

    Authors: Omkar Ranadive, Jisu Kim, Serin Lee, Youngseo Cha, Heechan Park, Minkook Cho, Young K. Hwang

    Abstract: Wildfires are a disastrous phenomenon which cause damage to land, loss of property, air pollution, and even loss of human life. Due to the warmer and drier conditions created by climate change, more severe and uncontrollable wildfires are expected to occur in the coming years. This could lead to a global wildfire crisis and have dire consequences on our planet. Hence, it has become imperative to u…

    Submitted 3 November, 2022; originally announced November 2022.

    Comments: Published in Tackling Climate Change with Machine Learning workshop, Thirty-sixth Conference on Neural Information Processing Systems (NeurIPS 2022)

  12. arXiv:2205.09185  [pdf, other]

    physics.ins-det cs.LG hep-ex nucl-ex physics.comp-ph

    AI-assisted Optimization of the ECCE Tracking System at the Electron Ion Collider

    Authors: C. Fanelli, Z. Papandreou, K. Suresh, J. K. Adkins, Y. Akiba, A. Albataineh, M. Amaryan, I. C. Arsene, C. Ayerbe Gayoso, J. Bae, X. Bai, M. D. Baker, M. Bashkanov, R. Bellwied, F. Benmokhtar, V. Berdnikov, J. C. Bernauer, F. Bock, W. Boeglin, M. Borysova, E. Brash, P. Brindza, W. J. Briscoe, M. Brooks, S. Bueltmann, et al. (258 additional authors not shown)

    Abstract: The Electron-Ion Collider (EIC) is a cutting-edge accelerator facility that will study the nature of the "glue" that binds the building blocks of the visible matter in the universe. The proposed experiment will be realized at Brookhaven National Laboratory in approximately 10 years from now, with detector design and R&D currently ongoing. Notably, EIC is one of the first large-scale facilities to…

    Submitted 19 May, 2022; v1 submitted 18 May, 2022; originally announced May 2022.

    Comments: 16 pages, 18 figures, 2 appendices, 3 tables

  13. LOCAT: Low-Overhead Online Configuration Auto-Tuning of Spark SQL Applications

    Authors: **han Xin, Kai Hwang, Zhibin Yu

    Abstract: Spark SQL has been widely deployed in industry but it is challenging to tune its performance. Recent studies try to employ machine learning (ML) to solve this problem, but suffer from two drawbacks. First, it takes a long time (high overhead) to collect training samples. Second, the optimal configuration for one input data size of the same application might not be optimal for others. To address th…

    Submitted 7 November, 2022; v1 submitted 28 March, 2022; originally announced March 2022.

    Comments: 16 pages, 21 figures, SIGMOD '22. This arxiv version is an extended version of the SIGMOD '22 paper with same title, allowed by conference chairs

  14. arXiv:2202.10612  [pdf, other]

    cs.MA cs.AI

    A Decentralized Communication Framework based on Dual-Level Recurrence for Multi-Agent Reinforcement Learning

    Authors: **gchen Li, Haobin Shi, Kao-Shing Hwang

    Abstract: We propose a model enabling decentralized multiple agents to share their perception of environment in a fair and adaptive way. In our model, both the current message and historical observation are taken into account, and they are handled in the same recurrent model but in different forms. We present a dual-level recurrent communication framework for multi-agent systems, in which the first recurren…

    Submitted 21 February, 2022; originally announced February 2022.

  15. arXiv:2202.05093  [pdf, other]

    cs.AI cs.DC cs.LG cs.NE eess.SY

    Two-Stage Deep Anomaly Detection with Heterogeneous Time Series Data

    Authors: Kyeong-Joong Jeong, **-Duk Park, Kyusoon Hwang, Seong-Lyun Kim, Won-Yong Shin

    Abstract: We introduce a data-driven anomaly detection framework using a manufacturing dataset collected from a factory assembly line. Given heterogeneous time series data consisting of operation cycle signals and sensor signals, we aim at discovering abnormal events. Motivated by our empirical findings that conventional single-stage benchmark approaches may not exhibit satisfactory performance under our ch…

    Submitted 10 February, 2022; originally announced February 2022.

    Comments: 10 pages, 4 figures, 4 tables; published in the IEEE Access (Please cite our journal version.)

  16. arXiv:2201.05724  [pdf, ps, other]

    cs.DM physics.bio-ph q-bio.BM

    StemP: A fast and deterministic Stem-graph approach for RNA and protein folding prediction

    Authors: Mengyi Tang, Kumbit Hwang, Sung Ha Kang

    Abstract: We propose a new deterministic methodology to predict RNA sequence and protein folding. Is stem enough for structure prediction? The main idea is to consider all possible stem formation in the given sequence. With the stem loop energy and the strength of stem, we explore how to deterministically utilize stem information for RNA sequence and protein folding structure prediction. We use graph notati…

    Submitted 14 January, 2022; originally announced January 2022.

    MSC Class: 92-10 (Primary) 68R99 (Secondary) ACM Class: G.2.3

  17. arXiv:2103.13620  [pdf, other]

    cs.SD cs.AI

    SubSpectral Normalization for Neural Audio Data Processing

    Authors: Simyung Chang, Hyoungwoo Park, Janghoon Cho, Hyunsin Park, Sungrack Yun, Kyuwoong Hwang

    Abstract: Convolutional Neural Networks are widely used in various machine learning domains. In image processing, the features can be obtained by applying 2D convolution to all spatial dimensions of the input. However, in the audio case, frequency domain input like Mel-Spectrogram has different and unique characteristics in the frequency dimension. Thus, there is a need for a method that allows the 2D convo…

    Submitted 25 March, 2021; originally announced March 2021.

    Comments: 4 pages, ICASSP '21 accepted

  18. arXiv:2011.01156  [pdf, other]

    cs.LG stat.ML

    SapAugment: Learning A Sample Adaptive Policy for Data Augmentation

    Authors: Ting-Yao Hu, Ashish Shrivastava, Jen-Hao Rick Chang, Hema Koppula, Stefan Braun, Kyuyeon Hwang, Ozlem Kalinli, Oncel Tuzel

    Abstract: Data augmentation methods usually apply the same augmentation (or a mix of them) to all the training samples. For example, to perturb data with noise, the noise is sampled from a Normal distribution with a fixed standard deviation, for all samples. We hypothesize that a hard sample with high training loss already provides strong training signal to update the model parameters and should be perturbe…

    Submitted 15 February, 2021; v1 submitted 2 November, 2020; originally announced November 2020.

    Comments: Accepted at ICASSP 2021

  19. arXiv:2007.10878  [pdf, other]

    cs.LG eess.SP

    DeepNetQoE: Self-adaptive QoE Optimization Framework of Deep Networks

    Authors: Rui Wang, Min Chen, Nadra Guizani, Yong Li, Hamid Gharavi, Kai Hwang

    Abstract: Future advances in deep learning and its impact on the development of artificial intelligence (AI) in all fields depends heavily on data size and computational power. Sacrificing massive computing resources in exchange for better precision rates of the network model is recognized by many researchers. This leads to huge computing consumption and satisfactory results are not always expected when com…

    Submitted 16 July, 2020; originally announced July 2020.

  20. Integrating Deep Learning into CAD/CAE System: Generative Design and Evaluation of 3D Conceptual Wheel

    Authors: Soyoung Yoo, Sunghee Lee, Seongsin Kim, Kwang Hyeon Hwang, Jong Ho Park, Namwoo Kang

    Abstract: Engineering design research integrating artificial intelligence (AI) into computer-aided design (CAD) and computer-aided engineering (CAE) is actively being conducted. This study proposes a deep learning-based CAD/CAE framework in the conceptual design phase that automatically generates 3D CAD designs and evaluates their engineering performance. The proposed framework comprises seven stages: (1) 2…

    Submitted 13 June, 2021; v1 submitted 25 May, 2020; originally announced June 2020.

    Journal ref: Structural and Multidisciplinary Optimization, 64(4), pp. 2725-2747 (2021)

  21. arXiv:2002.03493  [pdf, other]

    cs.DC cs.PF

    AI-oriented Medical Workload Allocation for Hierarchical Cloud/Edge/Device Computing

    Authors: Tianshu Hao, Jianfeng Zhan, Kai Hwang, Wanling Gao, Xu Wen

    Abstract: In a hierarchically-structured cloud/edge/device computing environment, workload allocation can greatly affect the overall system performance. This paper deals with AI-oriented medical workload generated in emergency rooms (ER) or intensive care units (ICU) in metropolitan areas. The goal is to optimize AI-workload allocation to cloud clusters, edge servers, and end devices so that minimum respons…

    Submitted 9 February, 2020; originally announced February 2020.

  22. arXiv:1910.06790  [pdf, other]

    cs.SD cs.LG eess.AS stat.ML

    Weakly Labeled Sound Event Detection Using Tri-training and Adversarial Learning

    Authors: Hyoungwoo Park, Sungrack Yun, Jungyun Eum, Janghoon Cho, Kyuwoong Hwang

    Abstract: This paper considers a semi-supervised learning framework for weakly labeled polyphonic sound event detection problems for the DCASE 2019 challenge's task4 by combining both the tri-training and adversarial learning. The goal of the task4 is to detect onsets and offsets of multiple sound events in a single audio clip. The entire dataset consists of the synthetic data with a strong label (sound eve…

    Submitted 14 October, 2019; originally announced October 2019.

    Comments: 5 pages, DCASE 2019 Workshop

  23. arXiv:1910.06784  [pdf, other]

    cs.SD cs.LG eess.AS stat.ML

    Acoustic Scene Classification Based on a Large-margin Factorized CNN

    Authors: Janghoon Cho, Sungrack Yun, Hyoungwoo Park, Jungyun Eum, Kyuwoong Hwang

    Abstract: In this paper, we present an acoustic scene classification framework based on a large-margin factorized convolutional neural network (CNN). We adopt the factorized CNN to learn the patterns in the time-frequency domain by factorizing the 2D kernel into two separate 1D kernels. The factorized kernel leads to learn the main component of two patterns: the long-term ambient and short-term event sounds…

    Submitted 14 October, 2019; originally announced October 2019.

    Comments: 5 pages, DCASE 2019 Workshop

  24. arXiv:1910.05171  [pdf, other]

    cs.LG cs.CL eess.AS stat.ML

    Query-by-example on-device keyword spotting

    Authors: Byeonggeun Kim, Mingu Lee, **kyu Lee, Yeonseok Kim, Kyuwoong Hwang

    Abstract: A keyword spotting (KWS) system determines the existence of, usually predefined, keyword in a continuous speech stream. This paper presents a query-by-example on-device KWS system which is user-specific. The proposed system consists of two main steps: query enrollment and testing. In query enrollment step, phonetic posteriors are output by a small-footprint automatic speech recognition model based…

    Submitted 13 January, 2020; v1 submitted 11 October, 2019; originally announced October 2019.

    Comments: IEEE ASRU 2019

  25. arXiv:1910.04500  [pdf, other]

    cs.LG eess.AS stat.ML

    Orthogonality Constrained Multi-Head Attention For Keyword Spotting

    Authors: Mingu Lee, **kyu Lee, Hye ** Jang, Byeonggeun Kim, Wonil Chang, Kyuwoong Hwang

    Abstract: Multi-head attention mechanism is capable of learning various representations from sequential data while paying attention to different subsequences, e.g., word-pieces or syllables in a spoken word. From the subsequences, it retrieves richer information than a single-head attention which only summarizes the whole sequence into one context vector. However, a naive use of the multi-head attention doe…

    Submitted 10 October, 2019; originally announced October 2019.

    Comments: Accepted to ASRU 2019

  26. arXiv:1909.06326  [pdf, other]

    q-bio.QM cs.CV cs.LG eess.IV physics.med-ph

    Automatic Hip Fracture Identification and Functional Subclassification with Deep Learning

    Authors: Justin D Krogue, Kaiyang V Cheng, Kevin M Hwang, Paul Toogood, Eric G Meinberg, Erik J Geiger, Musa Zaid, Kevin C McGill, Rina Patel, Jae Ho Sohn, Alexandra Wright, Bryan F Darger, Kevin A Padrez, Eugene Ozhinsky, Sharmila Majumdar, Valentina Pedoia

    Abstract: Purpose: Hip fractures are a common cause of morbidity and mortality. Automatic identification and classification of hip fractures using deep learning may improve outcomes by reducing diagnostic errors and decreasing time to operation. Methods: Hip and pelvic radiographs from 1118 studies were reviewed and 3034 hips were labeled via bounding boxes and classified as normal, displaced femoral neck f…

    Submitted 10 September, 2019; originally announced September 2019.

    Comments: Presented at Orthopaedic Research Society, Austin, TX, Feb 2, 2019, currently in submission for publication

  27. arXiv:1908.02612  [pdf, ps, other]

    eess.AS cs.LG cs.SD stat.ML

    An End-to-End Text-independent Speaker Verification Framework with a Keyword Adversarial Network

    Authors: Sungrack Yun, Janghoon Cho, Jungyun Eum, Wonil Chang, Kyuwoong Hwang

    Abstract: This paper presents an end-to-end text-independent speaker verification framework by jointly considering the speaker embedding (SE) network and automatic speech recognition (ASR) network. The SE network learns to output an embedding vector which distinguishes the speaker characteristics of the input utterance, while the ASR network learns to recognize the phonetic context of the input. In training…

    Submitted 6 August, 2019; originally announced August 2019.

    Comments: Will be appeared in INTERSPEECH 2019

  28. arXiv:1908.01924  [pdf, ps, other]

    cs.PF cs.DC

    Edge AIBench: Towards Comprehensive End-to-end Edge Computing Benchmarking

    Authors: Tianshu Hao, Yunyou Huang, Xu Wen, Wanling Gao, Fan Zhang, Chen Zheng, Lei Wang, Hainan Ye, Kai Hwang, Zujie Ren, Jianfeng Zhan

    Abstract: In edge computing scenarios, the distribution of data and collaboration of workloads on different layers are serious concerns for performance, privacy, and security issues. So for edge computing benchmarking, we must take an end-to-end view, considering all three layers: client-side devices, edge computing layer, and cloud servers. Unfortunately, the previous work ignores this most important point…

    Submitted 5 August, 2019; originally announced August 2019.

  29. arXiv:1611.06342  [pdf, other]

    cs.LG cs.NE

    Quantized neural network design under weight capacity constraint

    Authors: Sungho Shin, Kyuyeon Hwang, Wonyong Sung

    Abstract: The complexity of deep neural network algorithms for hardware implementation can be lowered either by scaling the number of units or reducing the word-length of weights. Both approaches, however, can accompany the performance degradation although many types of research are conducted to relieve this problem. Thus, it is an important question which one, between the network size scaling and the weigh…

    Submitted 19 November, 2016; originally announced November 2016.

    Comments: This paper is accepted at NIPS 2016 workshop on Efficient Methods for Deep Neural Networks (EMDNN). arXiv admin note: text overlap with arXiv:1511.06488

  30. arXiv:1610.00552  [pdf, other]

    cs.CL cs.LG cs.SD

    FPGA-Based Low-Power Speech Recognition with Recurrent Neural Networks

    Authors: Minjae Lee, Kyuyeon Hwang, **hwan Park, Sungwook Choi, Sungho Shin, Wonyong Sung

    Abstract: In this paper, a neural network based real-time speech recognition (SR) system is developed using an FPGA for very low-power operation. The implemented system employs two recurrent neural networks (RNNs); one is a speech-to-character RNN for acoustic modeling (AM) and the other is for character-level language modeling (LM). The system also employs a statistical word-level LM to improve the recogni…

    Submitted 30 September, 2016; originally announced October 2016.

    Comments: Accepted to SiPS 2016

  31. arXiv:1609.03777  [pdf, ps, other]

    cs.LG cs.CL cs.NE

    Character-Level Language Modeling with Hierarchical Recurrent Neural Networks

    Authors: Kyuyeon Hwang, Wonyong Sung

    Abstract: Recurrent neural network (RNN) based character-level language models (CLMs) are extremely useful for modeling out-of-vocabulary words by nature. However, their performance is generally much worse than the word-level language models (WLMs), since CLMs need to consider longer history of tokens to properly predict the next one. We address this problem by proposing hierarchical RNN architectures, whic…

    Submitted 2 February, 2017; v1 submitted 13 September, 2016; originally announced September 2016.

    Comments: Submitted to NIPS 2016 on May 20, 2016 (v1), accepted to ICASSP 2017 (v2)

  32. arXiv:1608.04077  [pdf, other]

    cs.LG

    Generative Knowledge Transfer for Neural Language Models

    Authors: Sungho Shin, Kyuyeon Hwang, Wonyong Sung

    Abstract: In this paper, we propose a generative knowledge transfer technique that trains an RNN based language model (student network) using text and output probabilities generated from a previously trained RNN (teacher network). The text generation can be conducted by either the teacher or the student network. We can also improve the performance by taking the ensemble of soft labels obtained from multiple…

    Submitted 28 February, 2017; v1 submitted 14 August, 2016; originally announced August 2016.

  33. Character-Level Incremental Speech Recognition with Recurrent Neural Networks

    Authors: Kyuyeon Hwang, Wonyong Sung

    Abstract: In real-time speech recognition applications, the latency is an important issue. We have developed a character-level incremental speech recognition (ISR) system that responds quickly even during the speech, where the hypotheses are gradually improved while the speaking proceeds. The algorithm employs a speech-to-character unidirectional recurrent neural network (RNN), which is end-to-end trained w…

    Submitted 28 January, 2016; v1 submitted 25 January, 2016; originally announced January 2016.

    Comments: To appear in ICASSP 2016

  34. arXiv:1512.08903  [pdf, ps, other]

    cs.CL cs.LG cs.NE

    Online Keyword Spotting with a Character-Level Recurrent Neural Network

    Authors: Kyuyeon Hwang, Minjae Lee, Wonyong Sung

    Abstract: In this paper, we propose a context-aware keyword spotting model employing a character-level recurrent neural network (RNN) for spoken term detection in continuous speech. The RNN is end-to-end trained with connectionist temporal classification (CTC) to generate the probabilities of character and word-boundary labels. There is no need for the phonetic transcription, senone modeling, or system dict…

    Submitted 30 December, 2015; originally announced December 2015.

  35. arXiv:1512.08571  [pdf]

    cs.NE cs.LG stat.ML

    Structured Pruning of Deep Convolutional Neural Networks

    Authors: Sajid Anwar, Kyuyeon Hwang, Wonyong Sung

    Abstract: Real time application of deep learning algorithms is often hindered by high computational complexity and frequent memory accesses. Network pruning is a promising technique to solve this problem. However, pruning usually results in irregular network connections that not only demand extra representation efforts but also do not fit well on parallel computation. We introduce structured sparsity at var…

    Submitted 28 December, 2015; originally announced December 2015.

    Comments: 11 pages, 8 figures, 1 table

  36. Fixed-Point Performance Analysis of Recurrent Neural Networks

    Authors: Sungho Shin, Kyuyeon Hwang, Wonyong Sung

    Abstract: Recurrent neural networks have shown excellent performance in many applications, however they require increased complexity in hardware or software based implementations. The hardware complexity can be much lowered by minimizing the word-length of weights and signals. This work analyzes the fixed-point performance of recurrent neural networks using a retrain based quantization method. The quantizat…

    Submitted 27 September, 2016; v1 submitted 4 December, 2015; originally announced December 2015.

  37. arXiv:1511.06841  [pdf, ps, other]

    cs.LG cs.NE

    Online Sequence Training of Recurrent Neural Networks with Connectionist Temporal Classification

    Authors: Kyuyeon Hwang, Wonyong Sung

    Abstract: Connectionist temporal classification (CTC) based supervised sequence training of recurrent neural networks (RNNs) has shown great success in many machine learning areas including end-to-end speech and handwritten character recognition. For the CTC training, however, it is required to unroll (or unfold) the RNN by the length of an input sequence. This unrolling requires a lot of memory and hinders…

    Submitted 2 February, 2017; v1 submitted 21 November, 2015; originally announced November 2015.

    Comments: Final version: Kyuyeon Hwang and Wonyong Sung, "Sequence to Sequence Training of CTC-RNNs with Partial Windowing," Proceedings of The 33rd International Conference on Machine Learning, pp. 2178-2187, 2016. URL: http://www.jmlr.org/proceedings/papers/v48/hwanga16.html

  38. arXiv:1511.06488  [pdf, other]

    cs.LG cs.NE

    Resiliency of Deep Neural Networks under Quantization

    Authors: Wonyong Sung, Sungho Shin, Kyuyeon Hwang

    Abstract: The complexity of deep neural network algorithms for hardware implementation can be much lowered by optimizing the word-length of weights and signals. Direct quantization of floating-point weights, however, does not show good performance when the number of bits assigned is small. Retraining of quantized networks has been developed to relieve this problem. In this work, the effects of retraining ar…

    Submitted 7 January, 2016; v1 submitted 19 November, 2015; originally announced November 2015.

  39. Single stream parallelization of generalized LSTM-like RNNs on a GPU

    Authors: Kyuyeon Hwang, Wonyong Sung

    Abstract: Recurrent neural networks (RNNs) have shown outstanding performance on processing sequence data. However, they suffer from long training time, which demands parallel implementations of the training procedure. Parallelization of the training algorithms for RNNs are very challenging because internal recurrent paths form dependencies between two different time frames. In this paper, we first propose…

    Submitted 10 March, 2015; originally announced March 2015.

    Comments: Accepted by the 40th IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2015