-
A Novel Approach to Evaluating Battery Charger Controller Design with Nonlinear PID Controller in an Extendable CHIL Setup
Authors:
Shervin Salehi Rad,
Micheal Muhlbaier,
Oleg Fishman,
Javad Chevinly,
Elias Nadi,
Hua Zhang,
Fei Lu
Abstract:
The design and development of power electronics converters pose a multitude of challenges. The evaluation of power electronics converters, particularly when operating at high power levels, presents a significant task, offering designers a deeper understanding of the functionality. Several methodologies have been devised to conduct hardware-in-the-loop (HIL) tests, which are classified into two main categories: controller hardware-in-the-loop (CHIL) and power hardware-in-the-loop (PHIL) tests. This paper explores the advantages and drawbacks of these two approaches and introduces a straightforward and cost-effective CHIL method for initial proof of concept and risk-free controller training. This method is based on the interaction between MATLAB/Simulink and Python. To assess the operation of the proposed system, a modeled battery pack (96S1P) is charged using a modeled battery charger converter, employing a non-linear PID controller. In this scenario, the controller benefits from the CC-CV technique for charging the battery pack. The results are presented in the final section.
Submitted 21 May, 2024;
originally announced May 2024.
-
Gallium Nitride (GaN) based High-Power Multilevel H-Bridge Inverter for Wireless Power Transfer of Electric Vehicles
Authors:
Javad Chevinly,
Shervin Salehi Rad,
Elias Nadi,
Bogdan Proca,
John Wolgemuth,
Anthony Calabro,
Hua Zhang,
Fei Lu
Abstract:
This paper presents the design and implementation of a high-power Gallium Nitride (GaN)-based multilevel H-bridge inverter to excite wireless charging coils for the wireless power transfer of electric vehicles (EVs). Compared to traditional conductive charging, wireless charging technology offers a safer and more convenient way to charge EVs. Due to the increasing demand for fast charging, high-power inverters play a crucial role in exciting the wireless charging coils within a wireless power transfer system. This paper details the system specifications for the wireless charging of EVs, providing theoretical analysis and a control strategy for the modular design of a 75-kW 3-level and 4-level H-bridge inverter. The goal is to deliver a low-distortion excitation voltage to the wireless charging coils. LTspice simulation results, including output voltage and Fast Fourier Transform (FFT) analysis for both 3-level and 4-level H-bridge inverters, are presented to validate the control strategy and demonstrate the elimination of output harmonic components in the modular design. A GaN-based inverter prototype was employed to deliver 85-kHz power to the wireless charging pads of the wireless power transfer system. Experimental results at two different voltage and power levels, 100V-215W and 150V-489W, validate the successful performance of the GaN inverter in the wireless charging system.
Submitted 17 May, 2024;
originally announced May 2024.
-
Robust Dual-Modal Speech Keyword Spotting for XR Headsets
Authors:
Zhuojiang Cai,
Yuhan Ma,
Feng Lu
Abstract:
While speech interaction finds widespread utility within the Extended Reality (XR) domain, conventional vocal speech keyword spotting systems continue to grapple with formidable challenges, including suboptimal performance in noisy environments, impracticality in situations requiring silence, and susceptibility to inadvertent activations when others speak nearby. These challenges, however, can potentially be surmounted through the cost-effective fusion of voice and lip movement information. Consequently, we propose a novel vocal-echoic dual-modal keyword spotting system designed for XR headsets. We devise two different modal fusion approaches and conduct experiments to test the system's performance across diverse scenarios. The results show that our dual-modal system not only consistently outperforms its single-modal counterparts, demonstrating higher precision in both typical and noisy environments, but also excels in accurately identifying silent utterances. Furthermore, we have successfully applied the system in real-time demonstrations, achieving promising results. The code is available at https://github.com/caizhuojiang/VE-KWS.
Submitted 26 January, 2024;
originally announced January 2024.
-
FS-BAND: A Frequency-Sensitive Banding Detector
Authors:
Zijian Chen,
Wei Sun,
Zicheng Zhang,
Ru Huang,
Fangfang Lu,
Xiongkuo Min,
Guangtao Zhai,
Wenjun Zhang
Abstract:
The banding artifact, also known as the staircase-like contour, is a common quality annoyance that occurs in scenarios such as compression and transmission, and it largely affects the user's quality of experience (QoE). The banding distortion typically appears as relatively small pixel-wise variations in smooth backgrounds, which are difficult to analyze in the spatial domain but easily reflected in the frequency domain. In this paper, we therefore study the banding artifact from the frequency aspect and propose a no-reference banding detection model to capture and evaluate banding artifacts, called the Frequency-Sensitive BANding Detector (FS-BAND). The proposed detector is able to generate a pixel-wise banding map with a perception-correlated quality score. Experimental results show that the proposed FS-BAND method outperforms state-of-the-art image quality assessment (IQA) approaches with higher accuracy in the banding classification task.
Submitted 29 November, 2023;
originally announced November 2023.
-
Energy-Aware Routing Algorithm for Mobile Ground-to-Air Charging
Authors:
Bill Cai,
Fei Lu,
Lifeng Zhou
Abstract:
We investigate the problem of energy-constrained planning for a cooperative system of an Unmanned Ground Vehicle (UGV) and an Unmanned Aerial Vehicle (UAV). In scenarios where the UGV serves as a mobile base to ferry the UAV and as a charging station to recharge the UAV, we formulate a novel energy-constrained routing problem. To tackle this problem, we design an energy-aware routing algorithm, aiming to minimize the overall mission duration under the energy limitations of both vehicles. The algorithm first solves a Traveling Salesman Problem (TSP) to generate a guided tour. Then, it employs the Monte-Carlo Tree Search (MCTS) algorithm to refine the tour and generate paths for the two vehicles. We evaluate the performance of our algorithm through extensive simulations and a proof-of-concept experiment. The results show that our algorithm consistently achieves near-optimal mission time and maintains fast running time across a wide range of problem instances.
Submitted 29 September, 2023;
originally announced October 2023.
-
Review of X-ray pulsar spacecraft autonomous navigation
Authors:
Yidi Wang,
Wei Zheng,
Shuangnan Zhang,
Minyu Ge,
Liansheng Li,
Kun Jiang,
Xiaoqian Chen,
Xiang Zhang,
Shijie Zheng,
Fangjun Lu
Abstract:
This article provides a review on X-ray pulsar-based navigation (XNAV). The review starts with the basic concept of XNAV, and briefly introduces the past, present and future projects concerning XNAV. This paper focuses on the advances of the key techniques supporting XNAV, including the navigation pulsar database, the X-ray detection system, and the pulse time of arrival estimation. Moreover, the methods to improve the estimation performance of XNAV are reviewed. Finally, some remarks on the future development of XNAV are provided.
Submitted 9 April, 2023;
originally announced April 2023.
-
The Loop Game: Quality Assessment and Optimization for Low-Light Image Enhancement
Authors:
Baoliang Chen,
Lingyu Zhu,
Hanwei Zhu,
Wenhan Yang,
Fangbo Lu,
Shiqi Wang
Abstract:
There is an increasing consensus that the design and optimization of low-light image enhancement methods need to be fully driven by perceptual quality. With numerous approaches proposed to enhance low-light images, much less work has been dedicated to quality assessment and quality optimization of low-light enhancement. In this paper, to close the gap between enhancement and assessment, we propose a loop enhancement framework that produces a clear picture of how the enhancement of low-light images could be optimized towards better visual quality. In particular, we create a large-scale database for QUality assessment Of The Enhanced LOw-Light Image (QUOTE-LOL), which serves as the foundation in studying and developing objective quality assessment measures. The objective quality assessment measure plays a critical bridging role between visual quality and enhancement and is further incorporated in the optimization in learning the enhancement model towards perceptual optimality. Finally, we iteratively perform the enhancement and optimization tasks, enhancing the low-light images continuously. The superiority of the proposed scheme is validated based on various low-light scenes. The database as well as the code will be available.
Submitted 20 February, 2022;
originally announced February 2022.
-
Explainable Diabetic Retinopathy Detection and Retinal Image Generation
Authors:
Yuhao Niu,
Lin Gu,
Yitian Zhao,
Feng Lu
Abstract:
Though deep learning has shown successful performance in classifying the label and severity stage of certain diseases, most such models give little explanation of how they make predictions. Inspired by Koch's Postulates, the foundation in evidence-based medicine (EBM) to identify the pathogen, we propose to exploit the interpretability of deep learning applications in medical diagnosis. By determining and isolating the neuron activation patterns on which a diabetic retinopathy (DR) detector relies to make decisions, we demonstrate the direct relation between the isolated neuron activation and lesions for a pathological explanation. To be specific, we first define novel pathological descriptors using activated neurons of the DR detector to encode both spatial and appearance information of lesions. Then, to visualize the symptom encoded in the descriptor, we propose Patho-GAN, a new network to synthesize medically plausible retinal images. By manipulating these descriptors, we could even arbitrarily control the position, quantity, and categories of generated lesions. We also show that our synthesized images carry the symptoms directly related to diabetic retinopathy diagnosis. Our generated images are both qualitatively and quantitatively superior to the ones by previous methods. Besides, compared to existing methods that take hours to generate an image, our second-level speed endows the potential to be an effective solution for data augmentation.
Submitted 1 July, 2021;
originally announced July 2021.
-
Overcoming the limitations of patch-based learning to detect cancer in whole slide images
Authors:
Ozan Ciga,
Tony Xu,
Sharon Nofech-Mozes,
Shawna Noy,
Fang-I Lu,
Anne L. Martel
Abstract:
Whole slide images (WSIs) pose unique challenges when training deep learning models. They are very large, which makes it necessary to break each image down into smaller patches for analysis; image features have to be extracted at multiple scales in order to capture both detail and context; and extreme class imbalances may exist. Significant progress has been made in the analysis of these images, thanks largely to the availability of public annotated datasets. We postulate, however, that even if a method scores well on a challenge task, this success may not translate to good performance in a more clinically relevant workflow. Many datasets consist of image patches which may suffer from data curation bias; other datasets are only labelled at the whole slide level and the lack of annotations across an image may mask erroneous local predictions so long as the final decision is correct. In this paper, we outline the differences between patch or slide-level classification versus methods that need to localize or segment cancer accurately across the whole slide, and we experimentally verify that best practices differ in both cases. We apply a binary cancer detection network on post neoadjuvant therapy breast cancer WSIs to find the tumor bed outlining the extent of cancer, a task which requires sensitivity and precision across the whole slide. We extensively study multiple design choices and their effects on the outcome, including architectures and augmentations. Furthermore, we propose a negative data sampling strategy, which drastically reduces the false positive rate (7% on slide level) and improves each metric pertinent to our problem, with a 15% reduction in the error of tumor extent.
Submitted 1 December, 2020;
originally announced December 2020.
-
Auxiliary Diagnosing Coronary Stenosis Using Machine Learning
Authors:
Weijun Zhu,
Fengyuan Lu,
Xiaoyu Yang,
En Li
Abstract:
How can we accurately classify and diagnose whether an individual has Coronary Stenosis (CS) without invasive physical examination? This problem has not been solved satisfactorily. To this end, four machine learning (ML) algorithms, i.e., Boosted Tree (BT), Decision Tree (DT), Logistic Regression (LR) and Random Forest (RF), are employed in this paper. First, eleven features, including basic information of an individual, symptoms and results of routine physical examination, are selected, and one label is specified, indicating whether an individual suffers from different severity of coronary artery stenosis or not. On this basis, a sample set is constructed. Second, each of these four ML algorithms learns from the sample set to obtain the corresponding optimal classified results. The experimental results show that RF performs better than the other three algorithms, classifying whether an individual has CS with an accuracy of 95.7% (=90/94).
Submitted 7 September, 2021; v1 submitted 16 July, 2020;
originally announced July 2020.
-
DVI: Depth Guided Video Inpainting for Autonomous Driving
Authors:
Miao Liao,
Feixiang Lu,
Dingfu Zhou,
Sibo Zhang,
Wei Li,
Ruigang Yang
Abstract:
To get clear street-view and photo-realistic simulation in autonomous driving, we present an automatic video inpainting algorithm that can remove traffic agents from videos and synthesize missing regions with the guidance of depth/point cloud. By building a dense 3D map from stitched point clouds, frames within a video are geometrically correlated via this common 3D map. In order to fill a target inpainting area in a frame, it is straightforward to transform pixels from other frames into the current one with correct occlusion. Furthermore, we are able to fuse multiple videos through 3D point cloud registration, making it possible to inpaint a target video with multiple source videos. The motivation is to solve the long-time occlusion problem where an occluded area has never been visible in the entire video. To our knowledge, we are the first to fuse multiple videos for video inpainting. To verify the effectiveness of our approach, we build a large inpainting dataset in the real urban road environment with synchronized images and Lidar data including many challenging scenes, e.g., long-time occlusion. The experimental results show that the proposed approach outperforms the state-of-the-art approaches for all the criteria; in particular, the RMSE (Root Mean Squared Error) has been reduced by about 13%.
Submitted 17 July, 2020;
originally announced July 2020.
-
An Integrated Enhancement Solution for 24-hour Colorful Imaging
Authors:
Feifan Lv,
Yinqiang Zheng,
Yicheng Li,
Feng Lu
Abstract:
The current industry practice for 24-hour outdoor imaging is to use a silicon camera supplemented with near-infrared (NIR) illumination. This will result in color images with poor contrast at daytime and absence of chrominance at nighttime. For this dilemma, all existing solutions try to capture RGB and NIR images separately. However, they need additional hardware support and suffer from various drawbacks, including short service life, high price, specific usage scenario, etc. In this paper, we propose a novel and integrated enhancement solution that produces clear color images, whether at abundant sunlight daytime or extremely low-light nighttime. Our key idea is to separate the VIS and NIR information from mixed signals, and enhance the VIS signal adaptively with the NIR signal as assistance. To this end, we build an optical system to collect a new VIS-NIR-MIX dataset and present a physically meaningful image processing algorithm based on CNN. Extensive experiments show outstanding results, which demonstrate the effectiveness of our solution.
Submitted 10 May, 2020;
originally announced May 2020.
-
A Sparse Representation Based Joint Demosaicing Method for Single-Chip Polarized Color Sensor
Authors:
Sijia Wen,
Yinqiang Zheng,
Feng Lu
Abstract:
The emergence of the single-chip polarized color sensor now allows for simultaneously capturing chromatic and polarimetric information of the scene on a monochromatic image plane. However, unlike the usual camera with an embedded demosaicing method, the latest polarized color camera is not delivered with an in-built demosaicing tool. For demosaicing, the users have to down-sample the captured images or use traditional interpolation techniques. Neither of them can perform well since the polarization and color are interdependent. Therefore, joint chromatic and polarimetric demosaicing is the key to obtaining high-quality polarized color images. In this paper, we propose a joint chromatic and polarimetric demosaicing model to address this challenging problem. Instead of mechanically demosaicing for the multi-channel polarized color image, we further present a sparse representation-based optimization strategy that utilizes chromatic information and polarimetric information to jointly optimize the model. To avoid the interaction between color and polarization during demosaicing, we separately construct the corresponding dictionaries. We also build an optical data acquisition system to collect a dataset, which contains various sources of polarization, such as illumination, reflectance and birefringence. Results of both qualitative and quantitative experiments have shown that our method is capable of faithfully recovering full RGB information of four polarization angles for each pixel from a single mosaic input image. Moreover, the proposed method can perform well not only on the synthetic data but also on the real captured data.
Submitted 7 April, 2021; v1 submitted 16 December, 2019;
originally announced December 2019.
-
Adaptive Leader-Follower Formation Control and Obstacle Avoidance via Deep Reinforcement Learning
Authors:
Yanlin Zhou,
Fan Lu,
George Pu,
Xiyao Ma,
Runhan Sun,
Hsi-Yuan Chen,
Xiaolin Li,
Dapeng Wu
Abstract:
We propose a deep reinforcement learning (DRL) methodology for the tracking, obstacle avoidance, and formation control of nonholonomic robots. By separating vision-based control into a perception module and a controller module, we can train a DRL agent without sophisticated physics or 3D modeling. In addition, the modular framework averts daunting retrains of an image-to-action end-to-end neural network, and provides flexibility in transferring the controller to different robots. First, we train a convolutional neural network (CNN) to accurately localize in an indoor setting with dynamic foreground/background. Then, we design a new DRL algorithm named Momentum Policy Gradient (MPG) for continuous control tasks and prove its convergence. We also show that MPG is robust at tracking varying leader movements and can naturally be extended to problems of formation control. Leveraging reward shaping, features such as collision and obstacle avoidance can be easily integrated into a DRL controller.
Submitted 15 November, 2019;
originally announced November 2019.
-
Zap Q-Learning With Nonlinear Function Approximation
Authors:
Shuhang Chen,
Adithya M. Devraj,
Fan Lu,
Ana Bušić,
Sean P. Meyn
Abstract:
Zap Q-learning is a recent class of reinforcement learning algorithms, motivated primarily as a means to accelerate convergence. Stability theory has been absent outside of two restrictive classes: the tabular setting, and optimal stopping. This paper introduces a new framework for analysis of a more general class of recursive algorithms known as stochastic approximation. Based on this general theory, it is shown that Zap Q-learning is consistent under a non-degeneracy assumption, even when the function approximation architecture is nonlinear. Zap Q-learning with neural network function approximation emerges as a special case, and is tested on examples from OpenAI Gym. Based on multiple experiments with a range of neural network sizes, it is found that the new algorithms converge quickly and are robust to choice of function approximation architecture.
Submitted 15 July, 2020; v1 submitted 11 October, 2019;
originally announced October 2019.
-
Multi-Beam Multi-Stream Communications for 5G and Beyond Mobile User Equipment and UAV Proof of Concept Designs
Authors:
Yiming Huo,
Franklin Lu,
Felix Wu,
Xiaodai Dong
Abstract:
Millimeter-wave (mmWave) and massive multiple-input multiple-output (MIMO) are expected to play a crucial role for 5G and beyond cellular and next-generation wireless local area network (WLAN) communications. Moreover, unmanned aerial vehicles (UAVs) are also considered an important component of next-generation networks. In this paper, we propose and present a mmWave distributed phased-arrays (DPA) architecture and proof-of-concept (PoC) designs for user equipment (UE) and UAVs which will be used in 5G/Beyond 5G wireless communication networks. Through enabling a multi-stream multi-beam communication mode, the UE PoC achieves a peak downlink speed of more than 4 Gbps with optimized thermal distribution performance. Furthermore, based on the DPA topology, the UAV aerial base station (ABS) prototype is designed and demonstrates for the first time an aggregated peak downlink data rate of 2.2 Gbps in real-world field tests supporting multi-user (MU) application scenarios.
Submitted 28 September, 2019;
originally announced September 2019.
-
Attention Guided Low-light Image Enhancement with a Large Scale Low-light Simulation Dataset
Authors:
Feifan Lv,
Yu Li,
Feng Lu
Abstract:
Low-light image enhancement is challenging in that it needs to consider not only brightness recovery but also complex issues like color distortion and noise, which usually hide in the dark. Simply adjusting the brightness of a low-light image will inevitably amplify those artifacts. To address this difficult problem, this paper proposes a novel end-to-end attention-guided method based on a multi-branch convolutional neural network. To this end, we first construct a synthetic dataset with carefully designed low-light simulation strategies. The dataset is much larger and more diverse than existing ones. With the new dataset for training, our method learns two attention maps to guide the brightness enhancement and denoising tasks respectively. The first attention map distinguishes underexposed regions from well-lit regions, and the second attention map distinguishes noise from real textures. With their guidance, the proposed multi-branch decomposition-and-fusion enhancement network works in an input-adaptive way. Moreover, a reinforcement-net further enhances color and contrast of the output image. Extensive experiments on multiple datasets demonstrate that our method can produce high-fidelity enhancement results for low-light images and outperforms the current state-of-the-art methods by a large margin both quantitatively and visually.
Submitted 14 March, 2020; v1 submitted 1 August, 2019;
originally announced August 2019.
-
Understanding Adversarial Attacks on Deep Learning Based Medical Image Analysis Systems
Authors:
Xingjun Ma,
Yuhao Niu,
Lin Gu,
Yisen Wang,
Yitian Zhao,
James Bailey,
Feng Lu
Abstract:
Deep neural networks (DNNs) have become popular for medical image analysis tasks like cancer diagnosis and lesion detection. However, a recent study demonstrates that medical deep learning systems can be compromised by carefully-engineered adversarial examples/attacks with small imperceptible perturbations. This raises safety concerns about the deployment of these systems in clinical settings. In this paper, we provide a deeper understanding of adversarial examples in the context of medical images. We find that medical DNN models can be more vulnerable to adversarial attacks compared to models for natural images, according to two different viewpoints. Surprisingly, we also find that medical adversarial attacks can be easily detected, i.e., simple detectors can achieve over 98% detection AUC against state-of-the-art attacks, due to fundamental feature differences compared to normal examples. We believe these findings may be a useful basis to approach the design of more explainable and secure medical deep learning systems.
Submitted 13 March, 2020; v1 submitted 24 July, 2019;
originally announced July 2019.
-
I4U Submission to NIST SRE 2018: Leveraging from a Decade of Shared Experiences
Authors:
Kong Aik Lee,
Ville Hautamaki,
Tomi Kinnunen,
Hitoshi Yamamoto,
Koji Okabe,
Ville Vestman,
Jing Huang,
Guohong Ding,
Hanwu Sun,
Anthony Larcher,
Rohan Kumar Das,
Haizhou Li,
Mickael Rouvier,
Pierre-Michel Bousquet,
Wei Rao,
Qing Wang,
Chunlei Zhang,
Fahimeh Bahmaninezhad,
Hector Delgado,
Jose Patino,
Qiongqiong Wang,
Ling Guo,
Takafumi Koshinaka,
Jiacen Zhang,
Koichi Shinoda
, et al. (21 additional authors not shown)
Abstract:
The I4U consortium was established to facilitate a joint entry to NIST speaker recognition evaluations (SRE). The latest edition of such a joint submission was in SRE 2018, in which the I4U submission was among the best-performing systems. SRE'18 also marks the 10-year anniversary of the I4U consortium's participation in the NIST SRE series of evaluations. The primary objective of the current paper is to summarize the results and lessons learned based on the twelve sub-systems and their fusion submitted to SRE'18. It is also our intention to present a shared view on the advancements, progress, and major paradigm shifts that we have witnessed as an SRE participant in the past decade from SRE'08 to SRE'18. In this regard, we have seen, among others, a paradigm shift from supervector representation to deep speaker embedding, and a switch of research challenge from channel compensation to domain adaptation.
Submitted 15 April, 2019;
originally announced April 2019.