-
Flow-Mixup: Classifying Multi-labeled Medical Images with Corrupted Labels
Authors:
Jintai Chen,
Hongyun Yu,
Ruiwei Feng,
Danny Z. Chen,
Jian Wu
Abstract:
In clinical practice, medical image interpretation often involves multi-labeled classification, since the affected parts of a patient tend to present multiple symptoms or comorbidities. Recently, deep learning based frameworks have attained expert-level performance on medical image interpretation, which can be attributed partially to large amounts of accurate annotations. However, manually annotating massive amounts of medical images is impractical, while automatic annotation is fast but imprecise (possibly introducing corrupted labels). In this work, we propose a new regularization approach, called Flow-Mixup, for multi-labeled medical image classification with corrupted labels. Flow-Mixup guides the models to capture robust features for each abnormality, thus helping handle corrupted labels effectively and making it possible to apply automatic annotation. Specifically, Flow-Mixup decouples the extracted features by adding constraints to the hidden states of the models. Also, Flow-Mixup is more stable and effective compared to other known regularization methods, as shown by theoretical and empirical analyses. Experiments on two electrocardiogram datasets and a chest X-ray dataset containing corrupted labels verify that Flow-Mixup is effective and insensitive to corrupted labels.
Submitted 9 February, 2021;
originally announced February 2021.
-
Doctor Imitator: Hand-Radiography-based Bone Age Assessment by Imitating Scoring Methods
Authors:
Jintai Chen,
Bohan Yu,
Biwen Lei,
Ruiwei Feng,
Danny Z. Chen,
Jian Wu
Abstract:
Bone age assessment is challenging in clinical practice due to the complicated bone age assessment process. Current automatic bone age assessment methods were designed with rare consideration of the diagnostic logistics and thus may yield certain uninterpretable hidden states and outputs. Consequently, doctors can find it hard to cooperate with such models harmoniously because it is difficult to check the correctness of the model predictions. In this work, we propose a new graph-based deep learning framework for bone age assessment with hand radiographs, called Doctor Imitator (DI). The architecture of DI is designed to learn the diagnostic logistics of doctors using the scoring methods (e.g., the Tanner-Whitehouse method) for bone age assessment. Specifically, the convolutions of DI capture the local features of the anatomical regions of interest (ROIs) on hand radiographs and predict the ROI scores by our proposed Anatomy-based Group Convolution, summing up for bone age prediction. Besides, we develop a novel Dual Graph-based Attention module to compute patient-specific attention for ROI features and context attention for ROI scores. As far as we know, DI is the first automatic bone age assessment framework following the scoring methods without fully supervised hand radiographs. Experiments on hand radiographs with only bone age supervision verify that DI can achieve excellent performance with sparse parameters and provide more interpretability.
Submitted 24 April, 2023; v1 submitted 10 February, 2021;
originally announced February 2021.
-
Separability Problems in Creative Telescoping
Authors:
Shaoshi Chen,
Ruyong Feng,
Pingchuan Ma,
Michael F. Singer
Abstract:
For given multivariate functions specified by algebraic, differential or difference equations, the separability problem is to decide whether they satisfy linear differential or difference equations in one variable. In this paper, we will explain how separability problems arise naturally in creative telescoping and present some criteria for testing the separability for several classes of special functions, including rational functions, hyperexponential functions, hypergeometric terms, and algebraic functions.
Submitted 6 February, 2021;
originally announced February 2021.
-
MoDL-QSM: Model-based Deep Learning for Quantitative Susceptibility Mapping
Authors:
Ruimin Feng,
Jiayi Zhao,
He Wang,
Baofeng Yang,
Jie Feng,
Yuting Shi,
Ming Zhang,
Chunlei Liu,
Yuyao Zhang,
Jie Zhuang,
Hongjiang Wei
Abstract:
Quantitative susceptibility mapping (QSM) has demonstrated great potential in quantifying tissue susceptibility in various brain diseases. However, the intrinsic ill-posed inverse problem relating the tissue phase to the underlying susceptibility distribution affects the accuracy for quantifying tissue susceptibility. Recently, deep learning has shown promising results to improve accuracy by reducing the streaking artifacts. However, there exists a mismatch between the observed phase and the theoretical forward phase estimated by the susceptibility label. In this study, we proposed a model-based deep learning architecture that followed the STI (susceptibility tensor imaging) physical model, referred to as MoDL-QSM. Specifically, MoDL-QSM accounts for the relationship between STI-derived phase contrast induced by the susceptibility tensor terms ($\chi_{13}$, $\chi_{23}$, $\chi_{33}$) and the acquired single-orientation phase. The convolutional neural networks are embedded into the physical model to learn a regularization term containing prior information. $\chi_{33}$ and phase induced by the $\chi_{13}$ and $\chi_{23}$ terms were used as the labels for network training. Quantitative evaluation metrics (RMSE, SSIM, and HFEN) were compared with recently developed deep learning QSM methods. The results showed that MoDL-QSM achieved superior performance, demonstrating its potential for future applications.
Submitted 20 May, 2021; v1 submitted 20 January, 2021;
originally announced January 2021.
-
Telescopers for differential forms with one parameter
Authors:
Shaoshi Chen,
Ruyong Feng,
Ziming Li,
Michael F. Singer,
Stephen Watt
Abstract:
Telescopers for a function are linear differential (resp. difference) operators annihilated by the definite integral (resp. definite sum) of this function. They play a key role in Wilf-Zeilberger theory and algorithms for computing them have been extensively studied in the past thirty years. In this paper, we introduce the notion of telescopers for differential forms with $D$-finite function coefficients. These telescopers appear in several areas of mathematics, for instance parametrized differential Galois theory and mirror symmetry. We give a sufficient and necessary condition for the existence of telescopers for a differential form and describe a method to compute them if they exist. Algorithms for verifying this condition are also given.
Submitted 19 January, 2021; v1 submitted 16 January, 2021;
originally announced January 2021.
-
Smart Black Box 2.0: Efficient High-bandwidth Driving Data Collection based on Video Anomalies
Authors:
Ryan Feng,
Yu Yao,
Ella Atkins
Abstract:
Autonomous vehicles require fleet-wide data collection for continuous algorithm development and validation. The Smart Black Box (SBB) intelligent event data recorder has been proposed as a system for prioritized high-bandwidth data capture. This paper extends the SBB by applying anomaly detection and action detection methods for generalized event-of-interest (EOI) detection. An updated SBB pipeline is proposed for the real-time capture of driving video data. A video dataset is constructed to evaluate the SBB on real-world data for the first time. SBB performance is assessed by comparing the compression of normal and anomalous data and by comparing our prioritized data recording with a FIFO strategy. Results show that SBB data compression can increase the anomalous-to-normal memory ratio by ~25%, while the prioritized recording strategy increases the anomalous-to-normal count ratio when compared to a FIFO strategy. We compare the real-world dataset SBB results to a baseline SBB given ground-truth anomaly labels and conclude that improved general EOI detection methods will greatly improve SBB performance.
Submitted 8 February, 2021; v1 submitted 3 January, 2021;
originally announced January 2021.
-
Geometric Brownian motion with affine drift and its time-integral
Authors:
Runhuan Feng,
Pingping Jiang,
Hans Volkmer
Abstract:
The joint distribution of a geometric Brownian motion and its time-integral was derived in a seminal paper by Yor (1992) using Lamperti's transformation, leading to explicit solutions in terms of modified Bessel functions. In this paper, we revisit this classic result using the simple Laplace transform approach in connection to the Heun differential equation. We extend the methodology to the geometric Brownian motion with affine drift and show that the joint distribution of this process and its time-integral can be determined by a doubly-confluent Heun equation. Furthermore, the joint Laplace transform of the process and its time-integral is derived from the asymptotics of the solutions. In addition, we provide an application by using the results for the asymptotics of the doubly-confluent Heun equation in pricing Asian options. Numerical results show the accuracy and efficiency of this new method.
Submitted 17 December, 2020;
originally announced December 2020.
-
Pandemic risk management: resources contingency planning and allocation
Authors:
Xiaowei Chen,
Wing Fung Chong,
Runhuan Feng,
Linfeng Zhang
Abstract:
Repeated history of pandemics, such as SARS, H1N1, Ebola, Zika, and COVID-19, has shown that pandemic risk is inevitable. Extraordinary shortages of medical resources have been observed in many parts of the world. Some attributing factors include the lack of sufficient stockpiles and the lack of coordinated efforts to deploy existing resources to the location of greatest needs. The paper investigates contingency planning and resources allocation from a risk management perspective, as opposed to the prevailing supply chain perspective. The key idea is that the competition of limited critical resources is not only present in different geographical locations but also at different stages of a pandemic. This paper draws on an analogy between risk aggregation and capital allocation in finance and pandemic resources planning and allocation for healthcare systems. The main contribution is to introduce new strategies for optimal stockpiling and allocation balancing spatio-temporal competitions of medical supply and demand.
Submitted 6 December, 2020;
originally announced December 2020.
-
Content-Adaptive Pixel Discretization to Improve Model Robustness
Authors:
Ryan Feng,
Wu-chi Feng,
Atul Prakash
Abstract:
Preprocessing defenses such as pixel discretization are appealing to remove adversarial attacks due to their simplicity. However, they have been shown to be ineffective except on simple datasets like MNIST. We hypothesize that existing discretization approaches failed because using a fixed codebook for the entire dataset limits their ability to balance image representation and codeword separability. We first formally prove that adaptive codebooks can provide stronger robustness guarantees than fixed codebooks as a preprocessing defense on some datasets. Based on that insight, we propose a content-adaptive pixel discretization defense called Essential Features, which discretizes the image to a per-image adaptive codebook to reduce the color space. We then find that Essential Features can be further optimized by applying adaptive blurring before the discretization to push perturbed pixel values back to their original value before determining the codebook. Against adaptive attacks, we show that content-adaptive pixel discretization extends the range of datasets that benefit in terms of both L_2 and L_infinity robustness where previously fixed codebooks were found to have failed. Our findings suggest that content-adaptive pixel discretization should be part of the repertoire for making models robust.
Submitted 11 October, 2022; v1 submitted 2 December, 2020;
originally announced December 2020.
-
Analyzing the Machine Learning Conference Review Process
Authors:
David Tran,
Alex Valtchanov,
Keshav Ganapathy,
Raymond Feng,
Eric Slud,
Micah Goldblum,
Tom Goldstein
Abstract:
Mainstream machine learning conferences have seen a dramatic increase in the number of participants, along with a growing range of perspectives, in recent years. Members of the machine learning community are likely to overhear allegations ranging from randomness of acceptance decisions to institutional bias. In this work, we critically analyze the review process through a comprehensive study of papers submitted to ICLR between 2017 and 2020. We quantify reproducibility/randomness in review scores and acceptance decisions, and examine whether scores correlate with paper impact. Our findings suggest strong institutional bias in accept/reject decisions, even after controlling for paper quality. Furthermore, we find evidence for a gender gap, with female authors receiving lower scores, lower acceptance rates, and fewer citations per paper than their male counterparts. We conclude our work with recommendations for future conference organizers.
Submitted 25 November, 2020; v1 submitted 24 November, 2020;
originally announced November 2020.
-
Causal Contextual Prediction for Learned Image Compression
Authors:
Zongyu Guo,
Zhizheng Zhang,
Runsen Feng,
Zhibo Chen
Abstract:
Over the past several years, we have witnessed impressive progress in the field of learned image compression. Recent learned image codecs are commonly based on autoencoders, that first encode an image into low-dimensional latent representations and then decode them for reconstruction purposes. To capture spatial dependencies in the latent space, prior works exploit hyperprior and spatial context model to build an entropy model, which estimates the bit-rate for end-to-end rate-distortion optimization. However, such an entropy model is suboptimal from two aspects: (1) It fails to capture spatially global correlations among the latents. (2) Cross-channel relationships of the latents are still underexplored. In this paper, we propose the concept of separate entropy coding to leverage a serial decoding process for causal contextual entropy prediction in the latent space. A causal context model is proposed that separates the latents across channels and makes use of cross-channel relationships to generate highly informative contexts. Furthermore, we propose a causal global prediction model, which is able to find global reference points for accurate predictions of unknown points. Both these two models facilitate entropy estimation without the transmission of overhead. In addition, we further adopt a new separate attention module to build more powerful transform networks. Experimental results demonstrate that our full image compression model outperforms standard VVC/H.266 codec on Kodak dataset in terms of both PSNR and MS-SSIM, yielding the state-of-the-art rate-distortion performance.
Submitted 31 October, 2021; v1 submitted 19 November, 2020;
originally announced November 2020.
-
SeqMobile: A Sequence Based Efficient Android Malware Detection System Using RNN on Mobile Devices
Authors:
Ruitao Feng,
Jing Qiang Lim,
Sen Chen,
Shang-Wei Lin,
Yang Liu
Abstract:
With the proliferation of Android malware, the demand for an effective and efficient malware detection system is on the rise. The existing device-end learning based solutions tend to extract limited syntax features (e.g., permissions and API calls) to meet a certain time constraint of mobile devices. However, syntax features lack the semantics which can represent the potential malicious behaviors and further result in a more robust model with high accuracy for malware detection. In this paper, we propose an efficient Android malware detection system, named SeqMobile, which adopts behavior-based sequence features and leverages customized deep neural networks on mobile devices instead of the server. Different from the traditional sequence-based approaches on the server, to meet the performance demand, SeqMobile adopts three effective performance optimization methods to reduce the time cost. To evaluate the effectiveness and efficiency of our system, we conduct experiments from the following aspects: 1) the detection accuracy of different recurrent neural networks; 2) the feature extraction performance on different mobile devices; 3) the detection accuracy and prediction time cost of different sequence lengths. The results unveil that SeqMobile can effectively detect malware with high accuracy. Moreover, our performance optimization methods have proven to improve the performance of training and prediction by at least twofold. Additionally, to discover the potential performance optimization from the SOTA TensorFlow model optimization toolkit for our approach, we also provide an evaluation on the toolkit, which can serve as guidance for other systems leveraging sequence-based learning approaches. Overall, we conclude that our sequence-based approach, together with our performance optimization methods, enables us to detect malware under the performance demands of mobile devices.
Submitted 10 November, 2020;
originally announced November 2020.
-
Probing and Fine-tuning Reading Comprehension Models for Few-shot Event Extraction
Authors:
Rui Feng,
Jie Yuan,
Chao Zhang
Abstract:
We study the problem of event extraction from text data, which requires both detecting target event types and their arguments. Typically, both the event detection and argument detection subtasks are formulated as supervised sequence labeling problems. We argue that the event extraction models so trained are inherently label-hungry, and can generalize poorly across domains and text genres. We propose a reading comprehension framework for event extraction. Specifically, we formulate event detection as a textual entailment prediction problem, and argument detection as a question answering problem. By constructing proper query templates, our approach can effectively distill rich knowledge about tasks and label semantics from pretrained reading comprehension models. Moreover, our model can be fine-tuned with a small amount of data to boost its performance. Our experiment results show that our method performs strongly for zero-shot and few-shot event extraction, and it achieves state-of-the-art performance on the ACE 2005 benchmark when trained with full supervision.
Submitted 21 October, 2020;
originally announced October 2020.
-
An Open Review of OpenReview: A Critical Analysis of the Machine Learning Conference Review Process
Authors:
David Tran,
Alex Valtchanov,
Keshav Ganapathy,
Raymond Feng,
Eric Slud,
Micah Goldblum,
Tom Goldstein
Abstract:
Mainstream machine learning conferences have seen a dramatic increase in the number of participants, along with a growing range of perspectives, in recent years. Members of the machine learning community are likely to overhear allegations ranging from randomness of acceptance decisions to institutional bias. In this work, we critically analyze the review process through a comprehensive study of papers submitted to ICLR between 2017 and 2020. We quantify reproducibility/randomness in review scores and acceptance decisions, and examine whether scores correlate with paper impact. Our findings suggest strong institutional bias in accept/reject decisions, even after controlling for paper quality. Furthermore, we find evidence for a gender gap, with female authors receiving lower scores, lower acceptance rates, and fewer citations per paper than their male counterparts. We conclude our work with recommendations for future conference organizers.
Submitted 26 October, 2020; v1 submitted 10 October, 2020;
originally announced October 2020.
-
Transformer-Based Neural Text Generation with Syntactic Guidance
Authors:
Yinghao Li,
Rui Feng,
Isaac Rehg,
Chao Zhang
Abstract:
We study the problem of using (partial) constituency parse trees as syntactic guidance for controlled text generation. Existing approaches to this problem use recurrent structures, which not only suffer from the long-term dependency problem but also fall short in modeling the tree structure of the syntactic guidance. We propose to leverage the parallelism of Transformer to better incorporate parse trees. Our method first expands a partial template constituency parse tree to a full-fledged parse tree tailored for the input source text, and then uses the expanded tree to guide text generation. The effectiveness of our model in this process hinges upon two new attention mechanisms: 1) a path attention mechanism that forces one node to attend to only other nodes located in its path in the syntax tree to better incorporate syntax guidance; 2) a multi-encoder attention mechanism that allows the decoder to dynamically attend to information from multiple encoders. Our experiments in the controlled paraphrasing task show that our method outperforms SOTA models both semantically and syntactically, improving the best baseline's BLEU score from 11.83 to 26.27.
Submitted 4 October, 2020;
originally announced October 2020.
-
Edge Dislocations Can Control Yield Strength in Refractory Body-Centered-Cubic High Entropy Alloys
Authors:
Francesco Maresca,
Chanho Lee,
Rui Feng,
Yi Chou,
Tamas Ungar,
Michael Widom,
Ke An,
John Poplawsky,
Yi-Chia Chou,
Peter Liaw,
William Curtin
Abstract:
Energy efficiency is motivating the search for new high-temperature metals. Some new body-centered-cubic random multicomponent "high entropy alloys (HEAs)" based on refractory elements (Cr-Mo-Nb-Ta-V-W-Hf-Ti-Zr) possess exceptional strengths at high temperatures but the physical origins of this outstanding behavior are not known. Here we show, using integrated neutron-diffraction (ND), high-resolution transmission electron microscopy (HRTEM), and theory, that the high strength and strength retention of a NbTaVTi alloy and a new high-strength/low-density CrMoNbV alloy are attributable to edge dislocations. This is surprising because plastic-flow in BCC elemental metals and dilute alloys is universally accepted to be controlled by screw dislocations. We use the insight and theory to perform a computationally-guided search over $10^7$ BCC HEAs and identify over $10^6$ possible ultra-strong high-temperature alloy compositions for future exploration.
Submitted 26 August, 2020;
originally announced August 2020.
-
On distance matrices of distance-regular graphs
Authors:
Hui Zhou,
Rongquan Feng
Abstract:
In this paper, we give a characterization of distance matrices of distance-regular graphs to be invertible.
Submitted 22 August, 2020;
originally announced August 2020.
-
Look, Listen, and Attend: Co-Attention Network for Self-Supervised Audio-Visual Representation Learning
Authors:
Ying Cheng,
Ruize Wang,
Zhihao Pan,
Rui Feng,
Yuejie Zhang
Abstract:
When watching videos, the occurrence of a visual event is often accompanied by an audio event, e.g., the voice of lip motion, the music of playing instruments. There is an underlying correlation between audio and visual events, which can be utilized as free supervised information to train a neural network by solving the pretext task of audio-visual synchronization. In this paper, we propose a novel self-supervised framework with co-attention mechanism to learn generic cross-modal representations from unlabelled videos in the wild, and further benefit downstream tasks. Specifically, we explore three different co-attention modules to focus on discriminative visual regions correlated to the sounds and introduce the interactions between them. Experiments show that our model achieves state-of-the-art performance on the pretext task while having fewer parameters compared with existing methods. To further evaluate the generalizability and transferability of our approach, we apply the pre-trained model on two downstream tasks, i.e., sound source localization and action recognition. Extensive experiments demonstrate that our model provides competitive results with other self-supervised methods, and also indicate that our approach can tackle the challenging scenes which contain multiple sound sources.
Submitted 13 August, 2020;
originally announced August 2020.
-
Exploring Multi-Scale Feature Propagation and Communication for Image Super Resolution
Authors:
Ruicheng Feng,
Weipeng Guan,
Yu Qiao,
Chao Dong
Abstract:
Multi-scale techniques have achieved great success in a wide range of computer vision tasks. However, while this technique is incorporated in existing works, there is still no comprehensive investigation of the variants of multi-scale convolution in image super resolution. In this work, we present a unified formulation over widely-used multi-scale structures. With this framework, we systematically explore the two factors of multi-scale convolution -- feature propagation and cross-scale communication. Based on the investigation, we propose a generic and efficient multi-scale convolution unit -- Multi-Scale cross-Scale Share-weights convolution (MS$^3$-Conv). Extensive experiments demonstrate that the proposed MS$^3$-Conv can achieve better SR performance than the standard convolution with fewer parameters and lower computational cost. Beyond quantitative analysis, we comprehensively study the visual quality, which shows that MS$^3$-Conv is better at recovering high-frequency details.
Submitted 14 August, 2020; v1 submitted 1 August, 2020;
originally announced August 2020.
-
Learning Error-Driven Curriculum for Crowd Counting
Authors:
Wenxi Li,
Zhuoqun Cao,
Qian Wang,
Songjian Chen,
Rui Feng
Abstract:
Density regression has been widely employed in crowd counting. However, the frequency imbalance of pixel values in the density map is still an obstacle to improve the performance. In this paper, we propose a novel learning strategy for learning error-driven curriculum, which uses an additional network to supervise the training of the main network. A tutoring network called TutorNet is proposed to repetitively indicate the critical errors of the main network. TutorNet generates pixel-level weights to formulate the curriculum for the main network during training, so that the main network will assign a higher weight to those hard examples than easy examples. Furthermore, we scale the density map by a factor to enlarge the distance among inter-examples, which is well known to improve the performance. Extensive experiments on two challenging benchmark datasets show that our method has achieved state-of-the-art performance.
Submitted 19 July, 2020;
originally announced July 2020.
-
On Noise Injection in Generative Adversarial Networks
Authors:
Ruili Feng,
Deli Zhao,
Zhengjun Zha
Abstract:
Noise injection has been proved to be one of the key technique advances in generating high-fidelity images. Despite its successful usage in GANs, the mechanism of its validity is still unclear. In this paper, we propose a geometric framework to theoretically analyze the role of noise injection in GANs. Based on Riemannian geometry, we successfully model the noise injection framework as fuzzy equivalence on the geodesic normal coordinates. Guided by our theories, we find that the existing method is incomplete and a new strategy for noise injection is devised. Experiments on image generation and GAN inversion demonstrate the superiority of our method.
Submitted 22 May, 2021; v1 submitted 10 June, 2020;
originally announced June 2020.
-
A Performance-Sensitive Malware Detection System Using Deep Learning on Mobile Devices
Authors:
Ruitao Feng,
Sen Chen,
Xiaofei Xie,
Guozhu Meng,
Shang-Wei Lin,
Yang Liu
Abstract:
Currently, Android malware detection is mostly performed on server side against the increasing number of malware. Powerful computing resource provides more exhaustive protection for app markets than maintaining detection by a single user. However, apart from the applications provided by the official market, apps from unofficial markets and third-party resources are always causing serious security threats to end-users. Meanwhile, it is a time-consuming task if the app is downloaded first and then uploaded to the server side for detection, because the network transmission has a lot of overhead. In addition, the uploading process also suffers from the security threats of attackers. Consequently, a last line of defense on mobile devices is necessary and much-needed. In this paper, we propose an effective Android malware detection system, MobiTive, leveraging customized deep neural networks to provide a real-time and responsive detection environment on mobile devices. MobiTive is a preinstalled solution rather than an app scanning and monitoring engine using after installation, which is more practical and secure. Original deep learning models cannot be directly deployed and executed on mobile devices due to various performance limitations, such as computation power, memory size, and energy. Therefore, we evaluate and investigate the following key points:(1) the performance of different feature extraction methods based on source code or binary code;(2) the performance of different feature type selections for deep learning on mobile devices;(3) the detection accuracy of different deep neural networks on mobile devices;(4) the real-time detection performance and accuracy on different mobile devices;(5) the potential based on the evolution trend of mobile devices' specifications; and finally we further propose a practical solution (MobiTive) to detect Android malware on mobile devices.
Submitted 3 September, 2020; v1 submitted 11 May, 2020;
originally announced May 2020.
-
Rational Solutions of First Order Algebraic Ordinary Differential Equations
Authors:
Ruyong Feng,
Shuang Feng
Abstract:
Let $f(t, y,y')=\sum_{i=0}^d a_i(t, y)y'^i=0$ be a first order ordinary differential equation with polynomial coefficients. Eremenko in 1999 proved that there exists a constant $C$ such that every rational solution of $f(t, y,y')=0$ is of degree not greater than $C$. Examples show that this degree bound $C$ depends not only on the degrees of $f$ in $t,y,y'$ but also on the coefficients of $f$ viewed as polynomial in $t,y,y'$. In this paper, we show that if $$\max_{i=0}^d \{{\rm deg}(a_i,y)-2(d-i)\}>0 $$ then the degree bound $C$ only depends on the degrees of $f$, and furthermore we present an explicit expression for $C$ in terms of the degrees of $f$.
Submitted 4 May, 2020;
originally announced May 2020.
-
Learned Video Compression with Feature-level Residuals
Authors:
Runsen Feng,
Yaojun Wu,
Zongyu Guo,
Zhizheng Zhang,
Xin **,
Zhibo Chen
Abstract:
In this paper, we present an end-to-end video compression network for the P-frame challenge at CLIC. We focus on deep neural network (DNN) based video compression and improve the current frameworks in three respects. First, we notice that pixel-space residuals are sensitive to the prediction errors of optical-flow-based motion compensation. To suppress this influence, we propose to compress the residuals of image features rather than the residuals of image pixels. Furthermore, we combine the advantages of both pixel-level and feature-level residual compression methods by model ensembling. Finally, we propose a step-by-step training strategy to improve the training efficiency of the whole framework. Experimental results indicate that our proposed method achieves 0.9968 MS-SSIM on the CLIC validation set and 0.9967 MS-SSIM on the test set.
Submitted 21 April, 2020; v1 submitted 17 April, 2020;
originally announced April 2020.
-
3-D Context Entropy Model for Improved Practical Image Compression
Authors:
Zongyu Guo,
Yaojun Wu,
Runsen Feng,
Zhizheng Zhang,
Zhibo Chen
Abstract:
In this paper, we present our image compression framework designed for the CLIC 2020 competition. Our method is based on a Variational AutoEncoder (VAE) architecture strengthened with residual structures. In short, we make three noteworthy improvements. First, we propose a 3-D context entropy model which can take advantage of known latent representations in current spatial locations for better entropy estimation. Second, a lightweight residual structure is adopted for feature learning during entropy estimation. Finally, an effective training strategy is introduced for practical adaptation to different resolutions. Experimental results indicate that our image compression method achieves 0.9775 MS-SSIM on the CLIC validation set and 0.9809 MS-SSIM on the test set.
Submitted 17 April, 2020;
originally announced April 2020.
-
On Sufficient and Necessary Conditions in Bounded CTL: A Forgetting Approach
Authors:
Renyan Feng,
Erman Acar,
Stefan Schlobach,
Yisong Wang,
Wanwei Liu
Abstract:
Computation Tree Logic (CTL) is one of the central formalisms in formal verification. As a specification language, it is used to express a property that the system at hand is expected to satisfy. From both the verification and the system design points of view, some information content of such property might become irrelevant for the system due to various reasons, e.g., it might become obsolete by time, or perhaps infeasible due to practical difficulties. Then, the problem arises on how to subtract such piece of information without altering the relevant system behaviour or violating the existing specifications over a given signature. Moreover, in such a scenario, two crucial notions are informative: the strongest necessary condition (SNC) and the weakest sufficient condition (WSC) of a given property. To address such a scenario in a principled way, we introduce a forgetting-based approach in CTL and show that it can be used to compute SNC and WSC of a property under a given model and over a given signature. We study its theoretical properties and also show that our notion of forgetting satisfies existing essential postulates of knowledge forgetting. Furthermore, we analyse the computational complexity of some basic reasoning tasks for the fragment CTL_AF in particular.
Submitted 3 July, 2020; v1 submitted 13 March, 2020;
originally announced March 2020.
-
GRAPHITE: Generating Automatic Physical Examples for Machine-Learning Attacks on Computer Vision Systems
Authors:
Ryan Feng,
Neal Mangaokar,
Jiefeng Chen,
Earlence Fernandes,
Somesh Jha,
Atul Prakash
Abstract:
This paper investigates an adversary's ease of attack in generating adversarial examples for real-world scenarios. We address three key requirements for practical attacks for the real-world: 1) automatically constraining the size and shape of the attack so it can be applied with stickers, 2) transform-robustness, i.e., robustness of a attack to environmental physical variations such as viewpoint and lighting changes, and 3) supporting attacks in not only white-box, but also black-box hard-label scenarios, so that the adversary can attack proprietary models. In this work, we propose GRAPHITE, an efficient and general framework for generating attacks that satisfy the above three key requirements. GRAPHITE takes advantage of transform-robustness, a metric based on expectation over transforms (EoT), to automatically generate small masks and optimize with gradient-free optimization. GRAPHITE is also flexible as it can easily trade-off transform-robustness, perturbation size, and query count in black-box settings. On a GTSRB model in a hard-label black-box setting, we are able to find attacks on all possible 1,806 victim-target class pairs with averages of 77.8% transform-robustness, perturbation size of 16.63% of the victim images, and 126K queries per pair. For digital-only attacks where achieving transform-robustness is not a requirement, GRAPHITE is able to find successful small-patch attacks with an average of only 566 queries for 92.2% of victim-target pairs. GRAPHITE is also able to find successful attacks using perturbations that modify small areas of the input image against PatchGuard, a recently proposed defense against patch-based attacks.
Submitted 28 February, 2022; v1 submitted 17 February, 2020;
originally announced February 2020.
-
QoE-driven Coupled Uplink and Downlink Rate Adaptation for 360-degree Video Live Streaming
Authors:
Jie Li,
Ransheng Feng,
Zhi Liu,
Wei Sun,
Qiyue Li
Abstract:
360-degree video provides an immersive 360-degree viewing experience and has been widely used in many areas. The 360-degree video live streaming systems involve capturing, compression, uplink (camera to video server) and downlink (video server to user) transmissions. However, few studies have jointly investigated such complex systems, especially the rate adaptation for the coupled uplink and downlink in the 360-degree video streaming under limited bandwidth constraints. In this letter, we propose a quality of experience (QoE)-driven 360-degree video live streaming system, in which a video server performs rate adaptation based on the uplink and downlink bandwidths and information concerning each user's real-time field-of-view (FOV). We formulate it as a nonlinear integer programming problem and propose an algorithm, which combines the Karush-Kuhn-Tucker (KKT) condition and branch and bound method, to solve it. The numerical results show that the proposed optimization model can improve users' QoE significantly in comparison with other baseline schemes.
Submitted 10 January, 2020;
originally announced January 2020.
-
$^{27}\text{Al }$ NMR chemical shift of $\text{Al}(\text{OH})_{4}^{-}$ from first principles. Assessment of error cancellation in NMR chemical shift computations in chemically distinct reference and targeted systems
Authors:
Ernesto Martinez-Baez,
Rulin Feng,
Carolyn I. Pearce,
Gregory K. Schenter,
Aurora E. Clark
Abstract:
Predicting accurate NMR chemical shieldings relies upon cancellation of different types of error in the ab initio methodology used to calculate the shielding tensor of the analyte of interest and the reference. Often the intrinsic error in computed shieldings due to basis sets, approximations in the Hamiltonian, description of the wave function, and dynamic effects, is nearly identical between the analyte and reference, yet if the electronic structure or sensitivity to local environment differs dramatically, this cannot be taken for granted. Detailed prior work has examined the octahedral trivalent cation $\text{Al}(\text{H}_{2}\text{O})_{6}^{3+}$ , accounting for ab initio intrinsic errors. However, the fact that this analyte is used as a reference for the chemically distinct tetrahedral anion $\text{Al}(\text{OH})_{4}^{-}$ inspires the study of how these errors cancel in an attempt to understand the limits of predictive capability for accurately determining $^{27}\text{Al }$ shielding in $\text{Al}(\text{OH})_{4}^{-}$. In this work, we estimate the absolute shielding of $^{27}\text{Al }$ nucleus in $\text{Al}(\text{OH})_{4}^{-}$ at the coupled cluster level (515.1 $\pm$ 5.3 ppm). Shielding sensitivity to the choice of method approximation and atomic basis sets treatment has been evaluated. Solvent and thermal effects are assessed through ensemble averaging techniques using ab-initio molecular dynamics. The contribution of each type of intrinsic error is assessed for $\text{Al}(\text{H}_{2}\text{O})_{6}^{3+}$ and $\text{Al}(\text{OH})_{4}^{-}$ ions, revealing significant differences that fundamentally hamper the ability to accurately calculate the $^{27}\text{Al }$ chemical shift of $\text{Al}(\text{OH})_{4}^{-}$ from first principles.
Submitted 31 December, 2019;
originally announced January 2020.
-
Note on the construction of Picard-Vessiot rings for linear differential equations
Authors:
Ruyong Feng
Abstract:
In this note, we describe a method to construct the Picard-Vessiot ring of a given linear differential equation.
Submitted 30 September, 2019;
originally announced October 2019.
-
Learning Fixed Points in Generative Adversarial Networks: From Image-to-Image Translation to Disease Detection and Localization
Authors:
Md Mahfuzur Rahman Siddiquee,
Zongwei Zhou,
Nima Tajbakhsh,
Ruibin Feng,
Michael B. Gotway,
Yoshua Bengio,
Jianming Liang
Abstract:
Generative adversarial networks (GANs) have ushered in a revolution in image-to-image translation. The development and proliferation of GANs raises an interesting question: can we train a GAN to remove an object, if present, from an image while otherwise preserving the image? Specifically, can a GAN "virtually heal" anyone by turning his medical image, with an unknown health status (diseased or healthy), into a healthy one, so that diseased regions could be revealed by subtracting those two images? Such a task requires a GAN to identify a minimal subset of target pixels for domain translation, an ability that we call fixed-point translation, which no GAN is equipped with yet. Therefore, we propose a new GAN, called Fixed-Point GAN, trained by (1) supervising same-domain translation through a conditional identity loss, and (2) regularizing cross-domain translation through revised adversarial, domain classification, and cycle consistency loss. Based on fixed-point translation, we further derive a novel framework for disease detection and localization using only image-level annotation. Qualitative and quantitative evaluations demonstrate that the proposed method outperforms the state of the art in multi-domain image-to-image translation and that it surpasses predominant weakly-supervised localization methods in both disease detection and localization. Implementation is available at https://github.com/jlianglab/Fixed-Point-GAN.
Submitted 29 August, 2019; v1 submitted 16 August, 2019;
originally announced August 2019.
-
Models Genesis: Generic Autodidactic Models for 3D Medical Image Analysis
Authors:
Zongwei Zhou,
Vatsal Sodha,
Md Mahfuzur Rahman Siddiquee,
Ruibin Feng,
Nima Tajbakhsh,
Michael B. Gotway,
Jianming Liang
Abstract:
Transfer learning from natural image to medical image has established as one of the most practical paradigms in deep learning for medical image analysis. However, to fit this paradigm, 3D imaging tasks in the most prominent imaging modalities (e.g., CT and MRI) have to be reformulated and solved in 2D, losing rich 3D anatomical information and inevitably compromising the performance. To overcome this limitation, we have built a set of models, called Generic Autodidactic Models, nicknamed Models Genesis, because they are created ex nihilo (with no manual labeling), self-taught (learned by self-supervision), and generic (served as source models for generating application-specific target models). Our extensive experiments demonstrate that our Models Genesis significantly outperform learning from scratch in all five target 3D applications covering both segmentation and classification. More importantly, learning a model from scratch simply in 3D may not necessarily yield performance better than transfer learning from ImageNet in 2D, but our Models Genesis consistently top any 2D approaches including fine-tuning the models pre-trained from ImageNet as well as fine-tuning the 2D versions of our Models Genesis, confirming the importance of 3D anatomical information and significance of our Models Genesis for 3D medical imaging. This performance is attributed to our unified self-supervised learning framework, built on a simple yet powerful observation: the sophisticated yet recurrent anatomy in medical images can serve as strong supervision signals for deep models to learn common anatomical representation automatically via self-supervision. As open science, all pre-trained Models Genesis are available at https://github.com/MrGiovanni/ModelsGenesis.
Submitted 19 August, 2019;
originally announced August 2019.
-
Zeros of repeated derivatives of random polynomials
Authors:
Renjie Feng,
Dong Yao
Abstract:
It has been shown that zeros of Kac polynomials $K_n(z)$ of degree $n$ cluster asymptotically near the unit circle as $n\to\infty$ under some assumptions. This property remains unchanged for the $l$-th derivative of the Kac polynomials $K^{(l)}_n(z)$ for any fixed order $l$. So it's natural to study the situation when the number of the derivatives we take depends on $n$, i.e., $l=N_n$. We will show that the limiting global behavior of zeros of $K_n^{(N_n)}(z)$ depends on the limit of the ratio $N_n/n$. In particular, we prove that when the limit of the ratio is strictly positive, the property of the uniform clustering around the unit circle fails; when the ratio is close to 1, the zeros have some rescaling phenomenon. Then we study such problem for random polynomials with more general coefficients. But things, especially the rescaling phenomenon, become very complicated for the general case when $N_n/n\to 1$, where we compute the case of the random elliptic polynomials to illustrate this.
Submitted 2 August, 2019;
originally announced August 2019.
-
Suppressing Model Overfitting for Image Super-Resolution Networks
Authors:
Ruicheng Feng,
**** Gu,
Yu Qiao,
Chao Dong
Abstract:
Large deep networks have demonstrated competitive performance in single image super-resolution (SISR), with a huge volume of data involved. However, in real-world scenarios, due to the limited accessible training pairs, large models exhibit undesirable behaviors such as overfitting and memorization. To suppress model overfitting and further enjoy the merits of large model capacity, we thoroughly investigate generic approaches for supplying additional training data pairs. In particular, we introduce a simple learning principle MixUp to train networks on interpolations of sample pairs, which encourages networks to support linear behavior in-between training samples. In addition, we propose a data synthesis method with learned degradation, enabling models to use extra high-quality images with higher content diversity. This strategy proves to be successful in reducing biases of data. By combining these components -- MixUp and synthetic training data, large models can be trained without overfitting under very limited data samples and achieve satisfactory generalization performance. Our method won the second place in NTIRE2019 Real SR Challenge.
Submitted 11 June, 2019;
originally announced June 2019.
-
Robot-Assisted Feeding: Generalizing Skewering Strategies across Food Items on a Realistic Plate
Authors:
Ryan Feng,
Youngsun Kim,
Gilwoo Lee,
Ethan K. Gordon,
Matt Schmittle,
Shivaum Kumar,
Tapomayukh Bhattacharjee,
Siddhartha S. Srinivasa
Abstract:
A robot-assisted feeding system must successfully acquire many different food items. A key challenge is the wide variation in the physical properties of food, demanding diverse acquisition strategies that are also capable of adapting to previously unseen items. Our key insight is that items with similar physical properties will exhibit similar success rates across an action space, allowing the robot to generalize its actions to previously unseen items. To better understand which skewering strategy works best for each food item, we collected a dataset of 2450 robot bite acquisition trials for 16 food items with varying properties. Analyzing the dataset provided insights into how the food items' surrounding environment, fork pitch, and fork roll angles affect bite acquisition success. We then developed a bite acquisition framework that takes the image of a full plate as an input, segments it into food items, and then applies our Skewering-Position-Action network (SPANet) to choose a target food item and a corresponding action so that the bite acquisition success rate is maximized. SPANet also uses the surrounding environment features of food items to predict action success rates. We used this framework to perform multiple experiments on uncluttered and cluttered plates. Results indicate that our integrated system can successfully generalize skewering strategies to many previously unseen food items.
Submitted 6 September, 2019; v1 submitted 5 June, 2019;
originally announced June 2019.
-
The Berry-Esseen Theorem for Circular $β$-ensemble
Authors:
Renjie Feng,
Gang Tian,
Dongyi Wei
Abstract:
We will prove the Berry-Esseen theorem for the number counting function of the circular $β$-ensemble (C$β$E), which will imply the central limit theorem for the number of points in arcs of the unit circle in mesoscopic and macroscopic scales. We will prove the main result by estimating the characteristic functions of the Prüfer phases and the number counting function, which will imply the uniform upper and lower bounds of their variance. We also show that the similar results hold for the Sine$_β$ process. As a direct application of the uniform variance bound, we can prove the normality of the linear statistics when the test function $f(θ)\in W^{1,p}(S^1)$ for some $p\in(1,+\infty)$.
Submitted 14 December, 2023; v1 submitted 22 May, 2019;
originally announced May 2019.
-
Learning Fair Representations via an Adversarial Framework
Authors:
Rui Feng,
Yang Yang,
Yuehan Lyu,
Chenhao Tan,
Yizhou Sun,
Chun** Wang
Abstract:
Fairness has become a central issue for our research community as classification algorithms are adopted in societally critical domains such as recidivism prediction and loan approval. In this work, we consider the potential bias based on protected attributes (e.g., race and gender), and tackle this problem by learning latent representations of individuals that are statistically indistinguishable between protected groups while sufficiently preserving other information for classification. To do that, we develop a minimax adversarial framework with a generator to capture the data distribution and generate latent representations, and a critic to ensure that the distributions across different protected groups are similar. Our framework provides a theoretical guarantee with respect to statistical parity and individual fairness. Empirical results on four real-world datasets also show that the learned representation can effectively be used for classification tasks such as credit risk prediction while obstructing information related to protected groups, especially when removing protected attributes is not sufficient for fair classification.
Submitted 30 April, 2019;
originally announced April 2019.
-
Comment on "Orientational Distribution of Free O-H Groups of Interfacial Water is Exponential"
Authors:
Wei Gan,
Ran-ran Feng,
Hong-Fei Wang
Abstract:
In a recent letter (PRL,121,246101,2018), Sun et al. reported that combined MD simulation and sum frequency generation vibrational spectroscopy (SFG-VS) measurements led to conclusions of a broad and exponentially decaying orientational distribution, and the presence of the free O-H group pointing down to the bulk at the air/water interface. In this comment, we show that their main conclusions are based on questionable interpretation of the SFG-VS data presented in the letter [1], and are also contrary to the established data analysis and interpretations in the literature [2-5].
Submitted 3 July, 2019; v1 submitted 17 April, 2019;
originally announced April 2019.
-
Small gaps of GOE
Authors:
Renjie Feng,
Gang Tian,
Dongyi Wei
Abstract:
In this article, we study the smallest gaps of the Gaussian orthogonal ensemble. The main result is that the smallest gaps, after being normalized by $n$, will tend to a Poisson distribution, and the limiting density of the $k$-th normalized smallest gap is $2x^{2k-1}e^{-x^{2}}/(k-1)!$.
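As a quick sanity check (not taken from the paper), the stated limiting density integrates to one on $(0,\infty)$, as a probability density should. Substituting $u=x^2$, so that $du = 2x\,dx$:

$$\int_0^\infty \frac{2x^{2k-1}e^{-x^{2}}}{(k-1)!}\,dx = \frac{1}{(k-1)!}\int_0^\infty u^{k-1}e^{-u}\,du = \frac{\Gamma(k)}{(k-1)!} = 1.$$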
Submitted 6 January, 2019;
originally announced January 2019.
-
An Orchestrated Empirical Study on Deep Learning Frameworks and Platforms
Authors:
Qianyu Guo,
Xiaofei Xie,
Lei Ma,
Qiang Hu,
Ruitao Feng,
Li Li,
Yang Liu,
Jianjun Zhao,
Xiaohong Li
Abstract:
Deep learning (DL) has recently achieved tremendous success in a variety of cutting-edge applications, e.g., image recognition, speech and natural language processing, and autonomous driving. Besides the available big data and hardware evolution, DL frameworks and platforms play a key role in catalyzing the research, development, and deployment of DL intelligent solutions. However, the difference in computation paradigm, architecture design and implementation of existing DL frameworks and platforms brings challenges for DL software development, deployment, maintenance, and migration. To date, a comprehensive study of how current diverse DL frameworks and platforms influence the DL software development process is still lacking.
In this paper, we take a first step towards investigating how existing state-of-the-art DL frameworks (i.e., TensorFlow, Theano, and Torch) and platforms (i.e., server/desktop, web, and mobile) support the DL software development activities. We perform an in-depth and comparative evaluation on metrics such as learning accuracy, DL model size, robustness, and performance, on state-of-the-art DL frameworks across platforms using two popular datasets, MNIST and CIFAR-10. Our study reveals that existing DL frameworks still suffer from compatibility issues, which become even more severe when it comes to different platforms. We pinpoint the current challenges and opportunities towards developing high-quality and compatible DL systems. To ignite further investigation along this direction to address urgent industrial demands of intelligent solutions, we make all of our assembled feasible toolchain and dataset publicly available.
Submitted 13 November, 2018;
originally announced November 2018.
-
Effects of free-ranging livestock on sympatric herbivores at fine spatiotemporal scales
Authors:
Rongna Feng,
Xinyue Lu,
Tianming Wang,
Jiawei Feng,
Yifei Sun,
Wenhong Xiao,
Yu Guan,
Limin Feng,
James L. D. Smith,
Jianping Ge
Abstract:
Understanding wildlife-livestock interactions is crucial for the design and management of protected areas that aim to conserve large mammal communities undergoing conflicts with humans worldwide. An example of the need to quantify the strength and direction of species interactions is the conservation of big cats in newly established protected areas in China. Currently, free-ranging livestock degrade the food and habitat of the endangered Amur tiger and Amur leopard in the forest landscapes of Northeast China, but quantitative assessments of how livestock affect the use of habitat by the major ungulate prey of these predators are very limited. Here, we examined livestock-ungulate interactions using large-scale camera-trap data in the newly established Tiger and Leopard National Park in Northeast China, which borders Russia. We used N-mixture models, two-species occupancy models and activity pattern overlap to understand the effects of cattle grazing on three ungulate species (wild boar, roe deer and sika deer) at a fine spatiotemporal scale. Our results showed that incorporating the biotic interactions with cattle had significant negative effects on encounters with three ungulates; sika deer were particularly displaced as more cattle encroached on forest habitat, as they exhibited low levels of co-occurrence with cattle in terms of habitat use. These results, combined with spatiotemporal overlap, suggested fine-scale avoidance behaviours, and they can help to refine strategies for the conservation of tigers, leopards and their prey in human-dominated transboundary landscapes. Progressively controlling cattle and the impact of cattle on biodiversity while simultaneously addressing the economic needs of local communities should be key priority actions for the Chinese government.
Submitted 23 January, 2020; v1 submitted 26 October, 2018;
originally announced October 2018.
-
On $s$-distance-transitive graphs
Authors:
Hui Zhou,
Cheryl Praeger,
Michael Giudici,
Rongquan Feng,
Xingui Fang
Abstract:
Distance-regular graphs have many beautiful combinatorial properties. Distance-transitive graphs have very strong symmetries, and they are distance-regular, i.e. distance-transitivity implies distance-regularity. In this paper, we give similar results, i.e. for special $s$ and graphs with other restrictions we show that $s$-distance-transitivity implies distance-regularity.
Submitted 21 October, 2018;
originally announced October 2018.
-
PIRM Challenge on Perceptual Image Enhancement on Smartphones: Report
Authors:
Andrey Ignatov,
Radu Timofte,
Thang Van Vu,
Tung Minh Luu,
Trung X Pham,
Cao Van Nguyen,
Yongwoo Kim,
Jae-Seok Choi,
Munchurl Kim,
Jie Huang,
Jiewen Ran,
Chen Xing,
Xingguang Zhou,
Pengfei Zhu,
Mingrui Geng,
Yawei Li,
Eirikur Agustsson,
Shuhang Gu,
Luc Van Gool,
Etienne de Stoutz,
Nikolay Kobyshev,
Kehui Nie,
Yan Zhao,
Gen Li,
Tong Tong
, et al. (23 additional authors not shown)
Abstract:
This paper reviews the first challenge on efficient perceptual image enhancement with the focus on deploying deep learning models on smartphones. The challenge consisted of two tracks. In the first one, participants were solving the classical image super-resolution problem with a bicubic downscaling factor of 4. The second track was aimed at real-world photo enhancement, and the goal was to map low-quality photos from the iPhone 3GS device to the same photos captured with a DSLR camera. The target metric used in this challenge combined the runtime, PSNR scores and solutions' perceptual results measured in the user study. To ensure the efficiency of the submitted models, we additionally measured their runtime and memory requirements on Android smartphones. The proposed solutions significantly improved baseline results defining the state-of-the-art for image enhancement on smartphones.
Submitted 3 October, 2018;
originally announced October 2018.
-
Bulk-sensitive Imaging of Twin Domains in La$_{2-x}$Sr$_x$CuO$_4$ under Uniaxial Pressure
Authors:
Xin Yu Zheng,
Renfei Feng,
D. S. Ellis,
Young-June Kim
Abstract:
We report our study of twin domains in La$_{2-x}$Sr$_x$CuO$_4$ under uniaxial pressure. Using bulk-sensitive x-ray microdiffraction in Laue geometry, we image the distribution of twin domains at room temperature. When compressive uniaxial pressure is applied along one of the in-plane crystallographic axes, the domain population changes dramatically. We observe that the twin domain with shorter lattice parameter along the direction of pressure is unstable under compression, and disappears completely with only moderate pressure. On the other hand, application of tensile pressure changes the domain structure only slightly, demonstrating the asymmetric response of the sample to uniaxial pressure. Our observations suggest that a crystal's response to uniaxial pressure is complex and could deviate easily from the linear-response regime.
Submitted 3 August, 2018;
originally announced August 2018.
-
Directed Strongly Regular Cayley Graphs on Dihedral groups $D_n$
Authors:
Yiqin He,
Bicheng Zhang,
Rongquan Feng
Abstract:
In this paper, we characterize certain directed strongly regular Cayley graphs on dihedral groups $D_{n}$, where $n\geqslant 3$ is a positive integer.
Submitted 23 July, 2018;
originally announced July 2018.
-
Large gaps of CUE and GUE
Authors:
Renjie Feng,
Dongyi Wei
Abstract:
In this article, we will study the largest gaps of the classical random matrices of CUE and GUE. The main result is that, in both cases, the rescaled largest gaps will converge to Poisson point processes, and the limiting densities are given by the Gumbel distributions.
Submitted 13 November, 2022; v1 submitted 5 July, 2018;
originally announced July 2018.
-
Spectrum of SYK model II: Central limit theorem
Authors:
Renjie Feng,
Gang Tian,
Dongyi Wei
Abstract:
In our previous paper \cite{FTD1}, we derived the almost sure convergence of the global density of eigenvalues of random matrices of the SYK model. In this paper, we will prove the central limit theorem for the linear statistic of eigenvalues and compute its variance.
Submitted 12 June, 2018;
originally announced June 2018.
-
Spectrum of SYK model III: Large deviations and concentration of measures
Authors:
Renjie Feng,
Gang Tian,
Dongyi Wei
Abstract:
In \cite{FTD1}, we proved the almost sure convergence of eigenvalues of the SYK model, which can be viewed as a type of \emph{law of large numbers} in probability theory; in \cite{FTD2}, we proved that the linear statistic of eigenvalues satisfies the \emph{central limit theorem}. In this article, we continue to study another important theorem in probability theory\,-- the \emph{concentration of measure theorem}, especially for the Gaussian SYK model. We will prove a \emph{large deviation principle} (LDP) for the normalized empirical measure of eigenvalues when $q_n=2$, in which case the eigenvalues can be expressed in terms of those of Gaussian random antisymmetric matrices. Such an LDP result is of independent interest in random matrix theory. For general $q_n\geq 3$, we cannot prove the LDP; instead, we will prove a concentration of measure theorem by estimating the Lipschitz norm of the Gaussian SYK model.
Submitted 12 June, 2018;
originally announced June 2018.
-
Small gaps of circular $β$-ensemble
Authors:
Renjie Feng,
Dongyi Wei
Abstract:
In this article, we study the smallest gaps of the log-gas $β$-ensemble on the unit circle (C$β$E), where $β$ is any positive integer. The main result is that the smallest gaps, after being normalized by $n^{\frac {β+2}{β+1}}$, will converge in distribution to a Poisson point process with some explicit intensity. And thus one can derive the limiting density of the $k$-th smallest gap, which is proportional to $x^{k(β+1)-1}e^{-x^{β+1}}$. In particular, the result applies to the classical COE, CUE and CSE in random matrix theory. The essential part of the proof is to derive several identities and inequalities regarding the Selberg integral, which should have their own interest.
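As a side calculation (not taken from the paper), the normalizing constant of the stated density can be worked out directly, taking the form $x^{k(β+1)-1}e^{-x^{β+1}}$ literally in the variable $x$. Substituting $u = x^{\beta+1}$, so that $du = (\beta+1)x^{\beta}\,dx$ and $x^{k(\beta+1)-1} = u^{k-1}x^{\beta}$:

$$\int_0^\infty x^{k(\beta+1)-1}e^{-x^{\beta+1}}\,dx = \frac{1}{\beta+1}\int_0^\infty u^{k-1}e^{-u}\,du = \frac{(k-1)!}{\beta+1}.$$

Under that reading, the normalized density would be $(\beta+1)\,x^{k(\beta+1)-1}e^{-x^{\beta+1}}/(k-1)!$, which for $\beta=1$ reduces to $2x^{2k-1}e^{-x^{2}}/(k-1)!$.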
Submitted 21 September, 2020; v1 submitted 5 June, 2018;
originally announced June 2018.
-
Robust Real-time Ellipse Fitting Based on Lagrange Programming Neural Network and Locally Competitive Algorithm
Authors:
Hao Wang,
Chi-Sing Leung,
Hing Cheung So,
Junli Liang,
Ruibin Feng,
Zifa Han
Abstract:
Given a set of 2-dimensional (2-D) scattering points, which are usually obtained from the edge detection process, the aim of ellipse fitting is to construct an elliptic equation that best fits the collected observations. However, some of the scattering points may be outliers due to imperfect edge detection. To address this issue, we devise a robust real-time ellipse fitting approach based on two kinds of analog neural network, Lagrange programming neural network (LPNN) and locally competitive algorithm (LCA). First, to alleviate the influence of these outliers, the fitting task is formulated as a nonsmooth constrained optimization problem in which the objective function is either an l1-norm or l0-norm term. This is because, compared with the l2-norm used in some traditional ellipse fitting models, the lp-norm with p<2 is less sensitive to outliers. Then, to calculate a real-time solution of this optimization problem, LPNN is applied. As the LPNN model cannot handle the non-differentiable term in its objective, the concept of LCA is introduced and combined with the LPNN framework. Simulation and experimental results show that the proposed ellipse fitting approach is superior to several state-of-the-art algorithms.
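The claim that an l1 objective is less sensitive to outliers than an l2 objective can be illustrated with a small sketch. It does not implement LPNN or LCA: it swaps in iteratively reweighted least squares (IRLS) as a simple stand-in solver, and it assumes an algebraic conic model $Ax^2+Bxy+Cy^2+Dx+Ey=1$ (which requires that the curve not pass through the origin). All function names and the toy data are hypothetical.

```python
import numpy as np

def conic_design(points):
    """Design matrix for the assumed model A x^2 + B xy + C y^2 + D x + E y = 1."""
    x, y = points[:, 0], points[:, 1]
    return np.column_stack([x**2, x * y, y**2, x, y])

def fit_l2(points):
    """Ordinary least-squares (l2) algebraic fit."""
    M = conic_design(points)
    theta, *_ = np.linalg.lstsq(M, np.ones(len(points)), rcond=None)
    return theta

def fit_l1_irls(points, iters=100, eps=1e-8):
    """Approximate l1 fit via IRLS (a stand-in, not the paper's LPNN/LCA)."""
    M = conic_design(points)
    b = np.ones(len(points))
    theta = fit_l2(points)  # start from the l2 solution
    for _ in range(iters):
        r = M @ theta - b
        # Square root of the IRLS weight 1/|r|, floored at eps for stability.
        w = 1.0 / np.sqrt(np.maximum(np.abs(r), eps))
        theta, *_ = np.linalg.lstsq(M * w[:, None], b * w, rcond=None)
    return theta

# Toy data: 40 exact points on the circle x^2 + y^2 = 4, plus 4 outliers.
t = np.linspace(0.0, 2.0 * np.pi, 40, endpoint=False)
inliers = np.column_stack([2.0 * np.cos(t), 2.0 * np.sin(t)])
outliers = np.array([[1.0, 1.0], [-1.0, 0.5], [0.5, -1.0], [-0.5, -0.5]])
points = np.vstack([inliers, outliers])

true_theta = np.array([0.25, 0.0, 0.25, 0.0, 0.0])  # x^2/4 + y^2/4 = 1
err_l2 = np.linalg.norm(fit_l2(points) - true_theta)
err_l1 = np.linalg.norm(fit_l1_irls(points) - true_theta)
```

On this toy set the l1-style fit is expected to land much closer to the true coefficients than the l2 fit, since the l2 objective squares the outlier residuals and lets them pull the estimate away from the inliers.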
Submitted 30 May, 2018;
originally announced June 2018.