
Showing 1–34 of 34 results for author: Kang, M

Searching in archive eess.
  1. arXiv:2406.13502  [pdf, other]

    cs.CL cs.SD eess.AS

    ManWav: The First Manchu ASR Model

    Authors: Jean Seo, Minha Kang, Sungjoo Byun, Sangah Lee

    Abstract: This study addresses the widening gap in Automatic Speech Recognition (ASR) research between high resource and extremely low resource languages, with a particular focus on Manchu, a critically endangered language. Manchu exemplifies the challenges faced by marginalized linguistic communities in accessing state-of-the-art technologies. In a pioneering effort, we introduce the first-ever Manchu ASR…

    Submitted 19 June, 2024; originally announced June 2024.

    Comments: ACL2024/Field Matters

  2. arXiv:2405.19346  [pdf, other]

    eess.SP cs.AI cs.LG

    Subject-Adaptive Transfer Learning Using Resting State EEG Signals for Cross-Subject EEG Motor Imagery Classification

    Authors: Sion An, Myeongkyun Kang, Soopil Kim, Philip Chikontwe, Li Shen, Sang Hyun Park

    Abstract: Electroencephalography (EEG) motor imagery (MI) classification is a fundamental yet challenging task due to the variation of signals between individuals, i.e., inter-subject variability. Previous approaches try to mitigate this using task-specific (TS) EEG signals from the target subject in training. However, recording TS EEG signals requires time and limits its applicability in various fields. In…

    Submitted 17 May, 2024; originally announced May 2024.

    Comments: Early Accepted at MICCAI 2024

  3. arXiv:2404.14019  [pdf]

    cs.CV eess.SP stat.AP

    A Multimodal Feature Distillation with CNN-Transformer Network for Brain Tumor Segmentation with Incomplete Modalities

    Authors: Ming Kang, Fung Fung Ting, Raphaël C. -W. Phan, Zongyuan Ge, Chee-Ming Ting

    Abstract: Existing brain tumor segmentation methods usually utilize multiple Magnetic Resonance Imaging (MRI) modalities in brain tumor images for segmentation, which can achieve better segmentation performance. However, in clinical applications, some modalities are missing due to resource constraints, leading to severe degradation in the performance of methods applying complete modality segmentation. In th…

    Submitted 22 April, 2024; originally announced April 2024.

    MSC Class: 68U10 (Primary) 68T10; 68T07; 62P10 (Secondary) ACM Class: I.4.6; I.5.1; J.3

  4. arXiv:2404.13388  [pdf]

    eess.IV cs.CV cs.LG

    Diagnosis of Multiple Fundus Disorders Amidst a Scarcity of Medical Experts Via Self-supervised Machine Learning

    Authors: Yong Liu, Mengtian Kang, Shuo Gao, Chi Zhang, Ying Liu, Shiming Li, Yue Qi, Arokia Nathan, Wenjun Xu, Chenyu Tang, Edoardo Occhipinti, Mayinuer Yusufu, Ningli Wang, Weiling Bai, Luigi Occhipinti

    Abstract: Fundus diseases are major causes of visual impairment and blindness worldwide, especially in underdeveloped regions, where the shortage of ophthalmologists hinders timely diagnosis. AI-assisted fundus image analysis has several advantages, such as high accuracy, reduced workload, and improved accessibility, but it requires a large amount of expert-annotated data to build reliable models. To addres…

    Submitted 23 April, 2024; v1 submitted 20 April, 2024; originally announced April 2024.

  5. arXiv:2404.13386  [pdf]

    eess.IV cs.CV cs.LG

    SSVT: Self-Supervised Vision Transformer For Eye Disease Diagnosis Based On Fundus Images

    Authors: Jiaqi Wang, Mengtian Kang, Yong Liu, Chi Zhang, Ying Liu, Shiming Li, Yue Qi, Wenjun Xu, Chenyu Tang, Edoardo Occhipinti, Mayinuer Yusufu, Ningli Wang, Weiling Bai, Shuo Gao, Luigi G. Occhipinti

    Abstract: Machine learning-based fundus image diagnosis technologies trigger worldwide interest owing to their benefits such as reducing medical resource power and providing objective evaluation results. However, current methods are commonly based on supervised methods, bringing in a heavy workload to biomedical staff and hence suffering in expanding effective databases. To address this issue, in this artic…

    Submitted 20 April, 2024; originally announced April 2024.

    Comments: ISBI 2024

  6. arXiv:2403.01137  [pdf, other]

    cs.CV cs.GR eess.IV

    Neural radiance fields-based holography [Invited]

    Authors: Minsung Kang, Fan Wang, Kai Kumano, Tomoyoshi Ito, Tomoyoshi Shimobaba

    Abstract: This study presents a novel approach for generating holograms based on the neural radiance fields (NeRF) technique. Generating three-dimensional (3D) data is difficult in hologram computation. NeRF is a state-of-the-art technique for 3D light-field reconstruction from 2D images based on volume rendering. NeRF can rapidly predict new-view images that are not included in the training dataset. In this s…

    Submitted 9 May, 2024; v1 submitted 2 March, 2024; originally announced March 2024.

  7. arXiv:2402.00148  [pdf, other]

    physics.optics eess.IV physics.med-ph

    Ptychographic lensless coherent endomicroscopy through a flexible fiber bundle

    Authors: Gil Weinberg, Munkyu Kang, Wonjun Choi, Wonshik Choi, Ori Katz

    Abstract: Conventional fiber-bundle-based endoscopes allow minimally invasive imaging through flexible multi-core fiber (MCF) bundles by placing a miniature lens at the distal tip and using each core as an imaging pixel. In recent years, lensless imaging through MCFs was made possible by correcting the core-to-core phase distortions pre-measured in a calibration procedure. However, temporally varying wavefr…

    Submitted 31 January, 2024; originally announced February 2024.

  8. arXiv:2401.16886  [pdf]

    cs.CV eess.SP stat.AP

    CAFCT: Contextual and Attentional Feature Fusions of Convolutional Neural Networks and Transformer for Liver Tumor Segmentation

    Authors: Ming Kang, Chee-Ming Ting, Fung Fung Ting, Raphaël Phan

    Abstract: Medical image semantic segmentation techniques can help identify tumors automatically from computed tomography (CT) scans. In this paper, we propose a Contextual and Attentional feature Fusions enhanced Convolutional Neural Network (CNN) and Transformer hybrid network (CAFCT) model for liver tumor segmentation. In the proposed model, three other modules are introduced in the network architecture:…

    Submitted 30 January, 2024; originally announced January 2024.

    MSC Class: 68T07; 68T10; 68U10; 62P10 ACM Class: I.4.6; I.5.1; J.3

  9. arXiv:2312.06458  [pdf]

    cs.CV eess.SP stat.AP

    ASF-YOLO: A Novel YOLO Model with Attentional Scale Sequence Fusion for Cell Instance Segmentation

    Authors: Ming Kang, Chee-Ming Ting, Fung Fung Ting, Raphaël C. -W. Phan

    Abstract: We propose a novel Attentional Scale Sequence Fusion based You Only Look Once (YOLO) framework (ASF-YOLO) which combines spatial and scale features for accurate and fast cell instance segmentation. Built on the YOLO segmentation framework, we employ the Scale Sequence Feature Fusion (SSFF) module to enhance the multi-scale information extraction capability of the network, and the Triple Feature En…

    Submitted 10 May, 2024; v1 submitted 11 December, 2023; originally announced December 2023.

    MSC Class: 68U10 (Primary) 68T10; 68T07; 62P10 (Secondary) ACM Class: I.4.6; I.5.1; J.3

    Journal ref: Image Vis. Comput. 147 (2024) 105057

  10. arXiv:2311.05844  [pdf, other]

    cs.CV cs.AI cs.CL cs.MM cs.SD eess.AS

    Face-StyleSpeech: Improved Face-to-Voice latent mapping for Natural Zero-shot Speech Synthesis from a Face Image

    Authors: Minki Kang, Wooseok Han, Eunho Yang

    Abstract: Generating a voice from a face image is crucial for developing virtual humans capable of interacting using their unique voices, without relying on pre-recorded human speech. In this paper, we propose Face-StyleSpeech, a zero-shot Text-To-Speech (TTS) synthesis model that generates natural speech conditioned on a face image rather than reference speech. We hypothesize that learning both speaker ide…

    Submitted 25 September, 2023; originally announced November 2023.

    Comments: Submitted to ICASSP 2024

  11. arXiv:2309.12585  [pdf]

    cs.CV eess.SP stat.AP

    BGF-YOLO: Enhanced YOLOv8 with Multiscale Attentional Feature Fusion for Brain Tumor Detection

    Authors: Ming Kang, Chee-Ming Ting, Fung Fung Ting, Raphaël C. -W. Phan

    Abstract: You Only Look Once (YOLO)-based object detectors have shown remarkable accuracy for automated brain tumor detection. In this paper, we develop a novel BGF-YOLO architecture by incorporating Bi-level Routing Attention (BRA), Generalized feature pyramid networks (GFPN), and a fourth detecting head into YOLOv8. BGF-YOLO contains an attention mechanism to focus more on important features, and feature py…

    Submitted 25 September, 2023; v1 submitted 21 September, 2023; originally announced September 2023.

    MSC Class: 68U10 (Primary) 68T10; 68T07; 62P10 (Secondary) ACM Class: I.4.6; I.5.1; J.3

  12. arXiv:2307.16412  [pdf]

    cs.CV eess.SP stat.AP stat.ML

    RCS-YOLO: A Fast and High-Accuracy Object Detector for Brain Tumor Detection

    Authors: Ming Kang, Chee-Ming Ting, Fung Fung Ting, Raphaël C. -W. Phan

    Abstract: With an excellent balance between speed and accuracy, cutting-edge YOLO frameworks have become one of the most efficient algorithms for object detection. However, the performance of using YOLO networks is scarcely investigated in brain tumor detection. We propose a novel YOLO architecture with Reparameterized Convolution based on channel Shuffle (RCS-YOLO). We present RCS and a One-Shot Aggregatio…

    Submitted 3 October, 2023; v1 submitted 31 July, 2023; originally announced July 2023.

    MSC Class: 68U10 (Primary) 68T10; 68T07; 62P10 (Secondary) ACM Class: I.4.6; I.5.1; J.3

    Journal ref: In MICCAI 2023 LNCS vol. 14223 600-610 (2023)

  13. arXiv:2307.04377  [pdf, other]

    cs.SD eess.AS

    HCLAS-X: Hierarchical and Cascaded Lyrics Alignment System Using Multimodal Cross-Correlation

    Authors: Minsung Kang, Soochul Park, Keunwoo Choi

    Abstract: In this work, we address the challenge of lyrics alignment, which involves aligning the lyrics and vocal components of songs. This problem requires the alignment of two distinct modalities, namely text and audio. To overcome this challenge, we propose a model that is trained in a supervised manner, utilizing the cross-correlation matrix of latent representations between vocals and lyrics. Our syst…

    Submitted 10 July, 2023; originally announced July 2023.

  14. arXiv:2307.04292  [pdf, other]

    eess.AS cs.AI

    A Demand-Driven Perspective on Generative Audio AI

    Authors: Sangshin Oh, Minsung Kang, Hyeongi Moon, Keunwoo Choi, Ben Sangbae Chon

    Abstract: To achieve successful deployment of AI research, it is crucial to understand the demands of the industry. In this paper, we present the results of a survey conducted with professional audio engineers, in order to determine research priorities and define various research tasks. We also summarize the current challenges in audio quality and controllability based on the survey. Our analysis emphasizes…

    Submitted 9 July, 2023; originally announced July 2023.

    Comments: 10 pages, 7 figures

  15. arXiv:2306.14590  [pdf]

    cs.CV eess.SP stat.AP stat.ML

    CST-YOLO: A Novel Method for Blood Cell Detection Based on Improved YOLOv7 and CNN-Swin Transformer

    Authors: Ming Kang, Chee-Ming Ting, Fung Fung Ting, Raphaël Phan

    Abstract: Blood cell detection is a typical small-scale object detection problem in computer vision. In this paper, we propose a CST-YOLO model for blood cell detection based on YOLOv7 architecture and enhance it with the CNN-Swin Transformer (CST), which is a new attempt at CNN-Transformer fusion. We also introduce three other useful modules: Weighted Efficient Layer Aggregation Networks (W-ELAN), Multisca…

    Submitted 26 June, 2023; originally announced June 2023.

    MSC Class: 68T07; 68T10; 68U10; 62P10 ACM Class: I.4.6; I.5.1; J.3

  16. arXiv:2306.13020  [pdf]

    eess.IV cs.AI cs.CV

    Toward Automated Detection of Microbleeds with Anatomical Scale Localization: A Complete Clinical Diagnosis Support Using Deep Learning

    Authors: Jun-Ho Kim, Young Noh, Haejoon Lee, Seul Lee, Woo-Ram Kim, Koung Mi Kang, Eung Yeop Kim, Mohammed A. Al-masni, Dong-Hyun Kim

    Abstract: Cerebral Microbleeds (CMBs) are chronic deposits of small blood products in the brain tissues, which have explicit relation to various cerebrovascular diseases depending on their anatomical location, including cognitive decline, intracerebral hemorrhage, and cerebral infarction. However, manual detection of CMBs is a time-consuming and error-prone process because of their sparse and tiny structura…

    Submitted 22 June, 2023; originally announced June 2023.

    Comments: 16 pages, 10 figures, 3 tables

  17. arXiv:2306.09807  [pdf, other]

    eess.AS cs.LG cs.SD

    FALL-E: A Foley Sound Synthesis Model and Strategies

    Authors: Minsung Kang, Sangshin Oh, Hyeongi Moon, Kyungyun Lee, Ben Sangbae Chon

    Abstract: This paper introduces FALL-E, a foley synthesis system and its training/inference strategies. The FALL-E model employs a cascaded approach comprising low-resolution spectrogram generation, spectrogram super-resolution, and a vocoder. We trained every sound-related model from scratch using our extensive datasets, and utilized a pre-trained language model. We conditioned the model with dataset-speci…

    Submitted 10 August, 2023; v1 submitted 16 June, 2023; originally announced June 2023.

    Comments: 5 pages, 3 figures

  18. arXiv:2305.13831  [pdf, other]

    cs.SD cs.CL eess.AS

    ZET-Speech: Zero-shot adaptive Emotion-controllable Text-to-Speech Synthesis with Diffusion and Style-based Models

    Authors: Minki Kang, Wooseok Han, Sung Ju Hwang, Eunho Yang

    Abstract: Emotional Text-To-Speech (TTS) is an important task in the development of systems (e.g., human-like dialogue agents) that require natural and emotional speech. Existing approaches, however, only aim to produce emotional TTS for seen speakers during training, without consideration of the generalization to unseen speakers. In this paper, we propose ZET-Speech, a zero-shot adaptive emotion-controllab…

    Submitted 23 May, 2023; originally announced May 2023.

    Comments: Accepted by INTERSPEECH 2023

  19. arXiv:2305.00892  [pdf, other]

    eess.IV physics.med-ph

    A Novel Low-Rank Tensor Method for Undersampling Artifact Removal in Respiratory Motion-Resolved Multi-Echo 3D Cones MRI

    Authors: Seongho Jeong, MungSoo Kang, Gerald Behr, Heechul Jeong, Youngwook Kee

    Abstract: We propose a novel low-rank tensor method for respiratory motion-resolved multi-echo image reconstruction. The key idea is to construct a 3-way image tensor (space $\times$ echo $\times$ motion state) from the conventional gridding reconstruction of highly undersampled multi-echo k-space raw data, and exploit low-rank tensor structure to separate it from undersampling artifacts. Healthy volunteers…

    Submitted 1 May, 2023; originally announced May 2023.

  20. arXiv:2303.15144  [pdf, other]

    eess.IV

    Joint Multi-Echo/Respiratory Motion-Resolved Compressed Sensing Reconstruction of Free-Breathing Non-Cartesian Abdominal MRI

    Authors: Youngwook Kee, MungSoo Kang, Seongho Jeong, Gerald Behr

    Abstract: We propose a novel respiratory motion-resolved MR image reconstruction method that jointly treats multi-echo k-space raw data. Continuously acquired non-Cartesian multi-echo/multi-coil k-space data with free breathing are sorted/binned into the motion states from end-expiratory to end-inspiratory phases based on a respiratory motion signal. Temporal total variation applied to the motion state dime…

    Submitted 3 April, 2023; v1 submitted 27 March, 2023; originally announced March 2023.

  21. arXiv:2211.09383  [pdf, other]

    eess.AS cs.AI cs.SD

    Grad-StyleSpeech: Any-speaker Adaptive Text-to-Speech Synthesis with Diffusion Models

    Authors: Minki Kang, Dongchan Min, Sung Ju Hwang

    Abstract: There has been significant progress in Text-To-Speech (TTS) synthesis technology in recent years, thanks to the advancement in neural generative modeling. However, existing methods on any-speaker adaptive TTS have achieved unsatisfactory performance, due to their suboptimal accuracy in mimicking the target speakers' styles. In this work, we present Grad-StyleSpeech, which is an any-speaker adapt…

    Submitted 13 March, 2023; v1 submitted 17 November, 2022; originally announced November 2022.

    Comments: ICASSP 2023

  22. arXiv:2210.11946  [pdf, other]

    eess.SY cs.CV

    RT-MOT: Confidence-Aware Real-Time Scheduling Framework for Multi-Object Tracking Tasks

    Authors: Donghwa Kang, Seunghoon Lee, Hoon Sung Chwa, Seung-Hwan Bae, Chang Mook Kang, Jinkyu Lee, Hyeongboo Baek

    Abstract: Different from existing MOT (Multi-Object Tracking) techniques that usually aim at improving tracking accuracy and average FPS, real-time systems such as autonomous vehicles necessitate new requirements of MOT under limited computing resources: (R1) guarantee of timely execution and (R2) high tracking accuracy. In this paper, we propose RT-MOT, a novel system design for multiple MOT tasks, which a…

    Submitted 19 October, 2022; originally announced October 2022.

    Comments: Accepted to 2022 Real-Time Systems Symposium (RTSS)

  23. arXiv:2207.10760  [pdf, ps, other]

    cs.SD cs.AI cs.MM eess.AS

    A Proposal for Foley Sound Synthesis Challenge

    Authors: Keunwoo Choi, Sangshin Oh, Minsung Kang, Brian McFee

    Abstract: "Foley" refers to sound effects that are added to multimedia during post-production to enhance its perceived acoustic properties, e.g., by simulating the sounds of footsteps, ambient environmental sounds, or visible objects on the screen. While foley is traditionally produced by foley artists, there is increasing interest in automatic or machine-assisted techniques building upon recent advances in…

    Submitted 21 July, 2022; originally announced July 2022.

  24. arXiv:2206.11558  [pdf, other]

    eess.AS cs.SD

    Adversarial Multi-Task Learning for Disentangling Timbre and Pitch in Singing Voice Synthesis

    Authors: Tae-Woo Kim, Min-Su Kang, Gyeong-Hoon Lee

    Abstract: Recently, deep learning-based generative models have been introduced to generate singing voices. One approach is to predict the parametric vocoder features consisting of explicit speech parameters. This approach has the advantage that the meaning of each feature is explicitly distinguished. Another approach is to predict mel-spectrograms for a neural vocoder. However, parametric vocoders have limi…

    Submitted 13 June, 2024; v1 submitted 23 June, 2022; originally announced June 2022.

    Comments: Accepted by Interspeech 2022

  25. arXiv:2206.09479  [pdf, other]

    cs.CV cs.LG eess.IV

    StudioGAN: A Taxonomy and Benchmark of GANs for Image Synthesis

    Authors: Minguk Kang, Joonghyuk Shin, Jaesik Park

    Abstract: Generative Adversarial Network (GAN) is one of the state-of-the-art generative models for realistic image synthesis. While training and evaluating GAN becomes increasingly important, the current GAN research ecosystem does not provide reliable benchmarks for which the evaluation is conducted consistently and fairly. Furthermore, because there are few validated GAN implementations, researchers devo…

    Submitted 18 August, 2023; v1 submitted 19 June, 2022; originally announced June 2022.

    Comments: 32 pages, IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI, 2023)

  26. arXiv:2106.11171  [pdf, other]

    eess.AS cs.AI cs.LG cs.SD eess.SP

    UniTTS: Residual Learning of Unified Embedding Space for Speech Style Control

    Authors: Minsu Kang, Sungjae Kim, Injung Kim

    Abstract: We propose a novel high-fidelity expressive speech synthesis model, UniTTS, that learns and controls overlapping style attributes avoiding interference. UniTTS represents multiple style attributes in a single unified embedding space by the residuals between the phoneme embeddings before and after applying the attributes. The proposed method is especially effective in controlling multiple attribute…

    Submitted 28 February, 2022; v1 submitted 21 June, 2021; originally announced June 2021.

    Comments: 20 pages, 11 figures

  27. arXiv:2105.11711  [pdf, other]

    cs.CV eess.IV

    High-Frequency aware Perceptual Image Enhancement

    Authors: Hyungmin Roh, Myungjoo Kang

    Abstract: In this paper, we introduce a novel deep neural network suitable for multi-scale analysis and propose efficient model-agnostic methods that help the network extract information from high-frequency domains to reconstruct clearer images. Our model can be applied to multi-scale image enhancement problems including denoising, deblurring and single image super-resolution. Experiments on SIDD, Flickr2K,…

    Submitted 25 May, 2021; originally announced May 2021.

  28. arXiv:2104.00624  [pdf, ps, other]

    eess.AS cs.AI cs.LG

    Fast DCTTS: Efficient Deep Convolutional Text-to-Speech

    Authors: Minsu Kang, Jihyun Lee, Simin Kim, Injung Kim

    Abstract: We propose an end-to-end speech synthesizer, Fast DCTTS, that synthesizes speech in real time on a single CPU thread. The proposed model is composed of a carefully-tuned lightweight network designed by applying multiple network reduction and fidelity improvement techniques. In addition, we propose a novel group highway activation that can compromise between computational efficiency and the regular…

    Submitted 1 April, 2021; originally announced April 2021.

    Comments: 5 pages, 1 figure, to be published in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2021

  29. arXiv:2103.14255  [pdf, other]

    eess.IV cs.CV

    Mixing-AdaSIN: Constructing a De-biased Dataset using Adaptive Structural Instance Normalization and Texture Mixing

    Authors: Myeongkyun Kang, Philip Chikontwe, Miguel Luna, Kyung Soo Hong, June Hong Ahn, Sang Hyun Park

    Abstract: Following the pandemic outbreak, several works have proposed to diagnose COVID-19 with deep learning in computed tomography (CT), reporting performance on par with experts. However, models trained/tested on the same in-distribution data may rely on the inherent data biases for successful prediction, failing to generalize on out-of-distribution samples or CT with different scanning protocols. Early…

    Submitted 31 July, 2021; v1 submitted 26 March, 2021; originally announced March 2021.

  30. arXiv:2102.06536  [pdf, other]

    cs.AR eess.IV eess.SP

    CrossStack: A 3-D Reconfigurable RRAM Crossbar Inference Engine

    Authors: Jason K. Eshraghian, Kyoungrok Cho, Sung Mo Kang

    Abstract: Deep neural network inference accelerators are rapidly growing in importance as we turn to massively parallelized processing beyond GPUs and ASICs. The dominant operation in feedforward inference is the multiply-and-accumulate process, where each column in a crossbar generates the current response of a single neuron. As a result, memristor crossbar arrays parallelize inference and image processing…

    Submitted 7 February, 2021; originally announced February 2021.

    Comments: 5 pages, 4 figures

  31. Deep Metric Learning-based Image Retrieval System for Chest Radiograph and its Clinical Applications in COVID-19

    Authors: Aoxiao Zhong, Xiang Li, Dufan Wu, Hui Ren, Kyungsang Kim, Younggon Kim, Varun Buch, Nir Neumark, Bernardo Bizzo, Won Young Tak, Soo Young Park, Yu Rim Lee, Min Kyu Kang, Jung Gil Park, Byung Seok Kim, Woo Jin Chung, Ning Guo, Ittai Dayan, Mannudeep K. Kalra, Quanzheng Li

    Abstract: In recent years, deep learning-based image analysis methods have been widely applied in computer-aided detection, diagnosis and prognosis, and have shown their value during the public health crisis of the novel coronavirus disease 2019 (COVID-19) pandemic. Chest radiograph (CXR) has been playing a crucial role in COVID-19 patient triaging, diagnosing and monitoring, particularly in the United States.…

    Submitted 25 November, 2020; originally announced December 2020.

    Comments: Aoxiao Zhong and Xiang Li contribute equally to this work

    Journal ref: Medical Image Analysis. 70 (2021) 101993

  32. arXiv:2009.12610  [pdf]

    eess.IV cs.CV cs.LG

    Deep Learning-based Four-region Lung Segmentation in Chest Radiography for COVID-19 Diagnosis

    Authors: Young-Gon Kim, Kyungsang Kim, Dufan Wu, Hui Ren, Won Young Tak, Soo Young Park, Yu Rim Lee, Min Kyu Kang, Jung Gil Park, Byung Seok Kim, Woo Jin Chung, Mannudeep K. Kalra, Quanzheng Li

    Abstract: Purpose. Imaging plays an important role in assessing severity of COVID 19 pneumonia. However, semantic interpretation of chest radiography (CXR) findings does not include quantitative description of radiographic opacities. Most current AI assisted CXR image analysis frameworks do not quantify regional variations of disease. To address these, we proposed a four region lung segmentation method t…

    Submitted 26 September, 2020; originally announced September 2020.

  33. arXiv:2009.04991  [pdf, ps, other]

    eess.SP cs.CY cs.LG

    Proximity Sensing: Modeling and Understanding Noisy RSSI-BLE Signals and Other Mobile Sensor Data for Digital Contact Tracing

    Authors: Sheshank Shankar, Rishank Kanaparti, Ayush Chopra, Rohan Sukumaran, Parth Patwa, Myungsun Kang, Abhishek Singh, Kevin P. McPherson, Ramesh Raskar

    Abstract: As we await a vaccine, social-distancing via efficient contact tracing has emerged as the primary health strategy to dampen the spread of COVID-19. To enable efficient digital contact tracing, we present a novel system to estimate pair-wise individual proximity, via a joint model of Bluetooth Low Energy (BLE) signals with other on-device sensors (accelerometer, magnetometer, gyroscope). We explore…

    Submitted 24 December, 2020; v1 submitted 3 September, 2020; originally announced September 2020.

    Comments: Accepted to IEEE/ICACT 2021: International Conference on Advanced Communication Technology. Also presented at the Machine Learning for Mobile Health workshop at NeurIPS 2020

  34. arXiv:2005.04117  [pdf, other]

    cs.CV eess.IV

    NTIRE 2020 Challenge on Real Image Denoising: Dataset, Methods and Results

    Authors: Abdelrahman Abdelhamed, Mahmoud Afifi, Radu Timofte, Michael S. Brown, Yue Cao, Zhilu Zhang, Wangmeng Zuo, Xiaoling Zhang, Jiye Liu, Wendong Chen, Changyuan Wen, Meng Liu, Shuailin Lv, Yunchao Zhang, Zhihong Pan, Baopu Li, Teng Xi, Yanwen Fan, Xiyu Yu, Gang Zhang, Jingtuo Liu, Junyu Han, Errui Ding, Songhyun Yu, Bumjun Park, et al. (65 additional authors not shown)

    Abstract: This paper reviews the NTIRE 2020 challenge on real image denoising with focus on the newly introduced dataset, the proposed methods and their results. The challenge is a new version of the previous NTIRE 2019 challenge on real image denoising that was based on the SIDD benchmark. This challenge is based on newly collected validation and testing image datasets, and hence, named SIDD+. This chall…

    Submitted 8 May, 2020; originally announced May 2020.