-
Correspondence Research of the Most Probable Transition Paths between a Stochastic Interacting Particle System and its Mean Field Limit System
Authors:
Jianyu Chen,
Jianyu Hu,
Zibo Wang,
Ting Gao,
Jinqiao Duan
Abstract:
This paper derives an indirect approximation theorem for the most probable transition pathway of a stochastic interacting particle system in the mean field sense. It studies the indirect approximation of the most probable transition pathway of an interacting particle system (i.e., a high-dimensional stochastic dynamical system) by that of its mean field limit equation (a McKean-Vlasov stochastic differential equation). Starting from the Onsager-Machlup action functional, the problem is reformulated as an optimal control problem, and the derivation is completed with the stochastic Pontryagin's Maximum Principle. The paper proves an existence and uniqueness theorem for the solution of the mean field optimal control problem for McKean-Vlasov stochastic differential equations, and establishes the systems of equations satisfied by the control parameters $θ^{*}$ and $θ^{N}$, respectively. Few studies address the most probable transition pathways of stochastic interacting particle systems, and it remains a great challenge to solve for them directly or to approximate them with the mean field limit system. Therefore, this paper first proves the correspondence between the core equations of Pontryagin's Maximum Principle, namely the Hamiltonian extremal condition equations, for the two systems. This correspondence indirectly explains the correspondence between the most probable transition pathways of the stochastic interacting particle system and those of the mean field system.
Submitted 11 April, 2024;
originally announced April 2024.
-
Representation Learning of Tangled Key-Value Sequence Data for Early Classification
Authors:
Tao Duan,
Junzhou Zhao,
Shuo Zhang,
Jing Tao,
Pinghui Wang
Abstract:
Key-value sequence data has become ubiquitous and naturally appears in a variety of real-world applications, ranging from user-product purchasing sequences in e-commerce to network packet sequences forwarded by routers. Classifying these key-value sequences is important in many scenarios, such as user profiling and malicious application identification. In many time-sensitive scenarios, besides classifying a key-value sequence accurately, it is also desirable to classify it early in order to respond fast. However, these two goals conflict by nature, and it is challenging to achieve them simultaneously. In this work, we formulate a novel tangled key-value sequence early classification problem, where a tangled key-value sequence is a mixture of several concurrent key-value sequences with different keys. The goal is to classify each individual key-value sequence sharing the same key both accurately and early. To address this problem, we propose a novel method, Key-Value sequence Early Co-classification (KVEC), which leverages both inner- and inter-correlations of items in a tangled key-value sequence through key correlation and value correlation to learn a better sequence representation. Meanwhile, a time-aware halting policy decides when to stop the ongoing key-value sequence and classify it based on the current sequence representation. Experiments on both real-world and synthetic datasets demonstrate that our method significantly outperforms state-of-the-art baselines. KVEC improves the prediction accuracy by up to $4.7 - 17.5\%$ under the same prediction earliness condition, and improves the harmonic mean of accuracy and earliness by up to $3.7 - 14.0\%$.
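As a rough illustration of the last metric, the harmonic mean of accuracy and earliness can be sketched as below. The abstract does not define earliness; the `earliness_score` helper is a hypothetical definition (one minus the fraction of items observed before halting, so higher means earlier) and may differ from the one used in the paper.

```python
def earliness_score(observed_items: int, total_items: int) -> float:
    """Hypothetical earliness score in [0, 1]: higher means fewer items were observed.

    Assumption for illustration only; the paper's own definition may differ.
    """
    if total_items <= 0 or not 0 <= observed_items <= total_items:
        raise ValueError("need 0 <= observed_items <= total_items and total_items > 0")
    return 1.0 - observed_items / total_items


def harmonic_mean(accuracy: float, earliness: float) -> float:
    """Harmonic mean of two non-negative scores; 0.0 when both are zero."""
    if accuracy < 0 or earliness < 0:
        raise ValueError("scores must be non-negative")
    total = accuracy + earliness
    if total == 0:
        return 0.0
    return 2.0 * accuracy * earliness / total


if __name__ == "__main__":
    # A classifier that is 90% accurate after seeing 40 of 100 items.
    e = earliness_score(observed_items=40, total_items=100)
    print(harmonic_mean(0.9, e))
```

Because the harmonic mean is dominated by the smaller of the two scores, a method cannot score well by being very early but inaccurate, or very accurate but late.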
Submitted 10 April, 2024;
originally announced April 2024.
-
AGRNav: Efficient and Energy-Saving Autonomous Navigation for Air-Ground Robots in Occlusion-Prone Environments
Authors:
Junming Wang,
Zekai Sun,
Xiuxian Guan,
Tianxiang Shen,
Zongyuan Zhang,
Tianyang Duan,
Dong Huang,
Shixiong Zhao,
Heming Cui
Abstract:
The exceptional mobility and long endurance of air-ground robots are raising interest in using them to navigate complex environments (e.g., forests and large buildings). However, such environments often contain occluded and unknown regions, and without accurate prediction of unobserved obstacles, the air-ground robot often follows a suboptimal trajectory under existing mapping-based and learning-based navigation methods. In this work, we present AGRNav, a novel framework designed to search for safe and energy-saving air-ground hybrid paths. AGRNav contains a lightweight semantic scene completion network (SCONet) with self-attention to enable accurate obstacle predictions by capturing contextual information and occlusion area features. The framework subsequently employs a query-based method for low-latency updates of prediction results to the grid map. Finally, based on the updated map, the hierarchical path planner efficiently searches for energy-saving paths for navigation. We validate AGRNav's performance through benchmarks in both simulated and real-world environments, demonstrating its superiority over classical and state-of-the-art methods. The open-source code is available at https://github.com/jmwang0117/AGRNav.
Submitted 18 March, 2024;
originally announced March 2024.
-
Distributionally Robust Cross Subject EEG Decoding
Authors:
Tiehang Duan,
Zhenyi Wang,
Gianfranco Doretto,
Fang Li,
Cui Tao,
Donald Adjeroh
Abstract:
Recently, deep learning has been shown to be effective for Electroencephalography (EEG) decoding tasks. Yet, its performance can be negatively influenced by two key factors: 1) the high variance and different types of corruption that are inherent in the signal, and 2) the relatively small size of EEG datasets, given the acquisition cost, annotation cost and amount of effort needed. Data augmentation approaches for alleviating this problem have been empirically studied, with augmentation operations on the spatial, time or frequency domain handcrafted based on domain expertise. In this work, we propose a principled approach that performs dynamic evolution on the data to improve decoding robustness. The approach is based on distributionally robust optimization and achieves robustness by optimizing on a family of evolved data distributions instead of the single training data distribution. We derive a general data evolution framework based on Wasserstein gradient flow (WGF) and provide two different forms of evolution within the framework. Intuitively, the evolution process helps the EEG decoder to learn more robust and diverse features. It is worth mentioning that the proposed approach can be readily integrated with other data augmentation approaches for further improvements. We performed extensive experiments on the proposed approach and tested its performance on different types of corrupted EEG signals. The model significantly outperforms competitive baselines on challenging decoding scenarios.
Submitted 19 August, 2023;
originally announced August 2023.
-
Carbon emissions and sustainability of launching 5G mobile networks in China
Authors:
Tong Li,
Li Yu,
Yibo Ma,
Tong Duan,
Wenzhen Huang,
Yan Zhou,
Depeng Jin,
Yong Li,
Tao Jiang
Abstract:
Since 2021, China has deployed more than 2.1 million 5G base stations to increase the network capacity and provide ubiquitous digital connectivity for mobile terminals. However, the launch of 5G networks also exacerbates the misalignment between cellular traffic and energy consumption, which reduces carbon efficiency, i.e., the amount of network traffic that can be delivered for each unit of carbon emission. In this study, we develop a large-scale data-driven framework to estimate the carbon emissions induced by mobile networks. We show that the decline in carbon efficiency leads to a carbon efficiency trap, estimated to cause additional carbon emissions of 23.82 ± 1.07 megatons in China. To mitigate the misalignment and improve energy efficiency, we propose DeepEnergy, an energy-saving method leveraging collaborative deep reinforcement learning and graph neural networks. DeepEnergy models complex collaboration among cells, making it possible to effectively coordinate the working state of tens of thousands of cells, which could help over 71% of Chinese provinces avoid carbon efficiency traps. In addition, applying DeepEnergy is estimated to reduce carbon emissions by 20.90 ± 0.98 megatons at the national level in 2023. We further assess the effects of adopting renewable energy and find that the mobile network could accomplish more than 50% of its net-zero goal by integrating DeepEnergy and solar energy systems. Our study provides insight into carbon emission mitigation for 5G network infrastructure rollout in China and worldwide, paving the way towards achieving sustainable development goals and future net-zero mobile networks.
Submitted 14 June, 2023;
originally announced June 2023.
-
Meta-Learning with Less Forgetting on Large-Scale Non-Stationary Task Distributions
Authors:
Zhenyi Wang,
Li Shen,
Le Fang,
Qiuling Suo,
Donglin Zhan,
Tiehang Duan,
Mingchen Gao
Abstract:
The paradigm of machine intelligence is moving from purely supervised learning to a more practical scenario in which many loosely related unlabeled data are available and labeled data is scarce. Most existing algorithms assume that the underlying task distribution is stationary. Here we consider a more realistic and challenging setting in which task distributions evolve over time. We name this problem Semi-supervised meta-learning with Evolving Task diStributions, abbreviated as SETS. Two key challenges arise in this more realistic setting: (i) how to use unlabeled data in the presence of a large amount of unlabeled out-of-distribution (OOD) data; and (ii) how to prevent catastrophic forgetting on previously learned task distributions due to the task distribution shift. We propose an OOD Robust and knowleDge presErved semi-supeRvised meta-learning approach (ORDER) to tackle these two major challenges. Specifically, ORDER introduces a novel mutual information regularization to robustify the model with unlabeled OOD data and adopts an optimal transport regularization to remember previously learned knowledge in feature space. In addition, we test our method on a very challenging dataset: SETS on large-scale non-stationary semi-supervised task distributions consisting of (at least) 72K tasks. With extensive experiments, we demonstrate that the proposed ORDER alleviates forgetting on evolving task distributions and is more robust to OOD data than related strong baselines.
Submitted 3 September, 2022;
originally announced September 2022.
-
Improving Task-free Continual Learning by Distributionally Robust Memory Evolution
Authors:
Zhenyi Wang,
Li Shen,
Le Fang,
Qiuling Suo,
Tiehang Duan,
Mingchen Gao
Abstract:
Task-free continual learning (CL) aims to learn from a non-stationary data stream without explicit task definitions and without forgetting previous knowledge. First, the widely adopted memory replay approach could gradually become less effective for long data streams, as the model may memorize the stored examples and overfit the memory buffer. Second, existing methods overlook the high uncertainty in the memory data distribution, since there is a big gap between the memory data distribution and the distribution of all the previous data examples. To address these problems, for the first time, we propose a principled memory evolution framework that dynamically evolves the memory data distribution by making the memory buffer gradually harder to memorize with distributionally robust optimization (DRO). We then derive a family of methods to evolve the memory buffer data in the continuous probability measure space with Wasserstein gradient flow (WGF). The proposed DRO is with respect to the worst-case evolved memory data distribution, and thus guarantees the model performance and learns significantly more robust features than existing memory-replay-based methods. Extensive experiments on existing benchmarks demonstrate the effectiveness of the proposed methods for alleviating forgetting. As a by-product of the proposed framework, our method is more robust to adversarial examples than existing task-free CL methods. Code is available on GitHub: https://github.com/joey-wang123/DRO-Task-free
Submitted 20 August, 2022; v1 submitted 14 July, 2022;
originally announced July 2022.
-
Utterance Rewriting with Contrastive Learning in Multi-turn Dialogue
Authors:
Zhihao Wang,
Tangjian Duan,
Zihao Wang,
Minghui Yang,
Zujie Wen,
Yongliang Wang
Abstract:
Context modeling plays a significant role in building multi-turn dialogue systems. To make full use of context information, systems can use Incomplete Utterance Rewriting (IUR) methods to simplify a multi-turn dialogue into a single-turn one by merging the current utterance and context information into a self-contained utterance. However, previous approaches ignore the intent consistency between the original query and the rewritten query, and the detection of omitted or coreferred locations in the original query can be further improved. In this paper, we introduce contrastive learning and multi-task learning to jointly model the problem. Our method benefits from carefully designed self-supervised objectives, which act as auxiliary tasks to capture semantics at both the sentence level and the token level. The experiments show that our proposed model achieves state-of-the-art performance on several public datasets.
Submitted 22 March, 2022;
originally announced March 2022.
-
Uncertainty Detection and Reduction in Neural Decoding of EEG Signals
Authors:
Tiehang Duan,
Zhenyi Wang,
Sheng Liu,
Sargur N. Srihari,
Hui Yang
Abstract:
EEG decoding systems based on deep neural networks have been widely used in decision making for brain-computer interfaces (BCI). Their predictions, however, can be unreliable given the significant variance and noise in EEG signals. Previous work on EEG analysis mainly focuses on exploring noise patterns in the source signal, while the uncertainty during the decoding process is largely unexplored. Automatically detecting and reducing such decoding uncertainty is important for BCI motor imagery applications such as robotic arm control. In this work, we propose an uncertainty estimation and reduction model (UNCER) to quantify and mitigate the uncertainty during the EEG decoding process. It combines a dropout-oriented method with a Bayesian neural network for uncertainty estimation, incorporating both the uncertainty in the input signal and the uncertainty in the model parameters. We further propose a data-augmentation-based approach for uncertainty reduction. The model can be integrated into current widely used EEG neural decoders without changing their architecture. We perform extensive experiments on uncertainty estimation and reduction in both intra-subject and cross-subject EEG decoding on two public motor imagery datasets, where the proposed model achieves significant improvement in both the quality of estimated uncertainty and the effectiveness of uncertainty reduction.
Submitted 1 October, 2022; v1 submitted 28 December, 2021;
originally announced January 2022.
-
Persistent large anisotropic magnetoresistance and insulator to metal transition in spin-orbit coupled antiferromagnets Sr2(Ir1-xGax)O4
Authors:
Haowen Wang,
Wei Wang,
Ni Hu,
Tianci Duan,
Songliu Yuan,
Shuai Dong,
Chengliang Lu,
Jun-Ming Liu
Abstract:
Antiferromagnetic (AFM) spintronics, where magneto-transport is governed by an antiferromagnet instead of a ferromagnet, opens fascinating new perspectives for both fundamental research and device technology, owing to intrinsically appealing properties such as rigidity against magnetic fields, absence of stray fields, and ultrafast spin dynamics. One of the urgent challenges hindering the realization of the full potential of AFM spintronics has been the performance gap between AFM metals and insulators. Here, we demonstrate an insulator-metal transition and persistently large anisotropic magnetoresistance (AMR) in single crystals of Sr2(Ir1-xGax)O4 (0<x<0.09), which host the same basal-plane AFM lattice with strong spin-orbit coupling. Undoped Sr2IrO4 shows insulating transport with an AMR as large as ~16.8% at 50 K. Ga substitution for Ir allows a gradual reduction of electrical resistivity, and a clear insulator-to-metal transition is identified in doped samples with x above 0.05, while the AMR remains ~1%, sizable in comparison with values reported so far for AFM metals. Our experiments reveal that all the samples have a similar fourfold AMR symmetry, which can be well understood in terms of magnetocrystalline anisotropy. These results suggest that the spin-orbit coupled antiferromagnets Sr2(Ir1-xGax)O4 are promising candidate materials for AFM spintronics, providing a rare opportunity to integrate the superior spintronic functionalities of AFM metals and insulators.
Submitted 25 October, 2021;
originally announced October 2021.
-
Meta Learning on a Sequence of Imbalanced Domains with Difficulty Awareness
Authors:
Zhenyi Wang,
Tiehang Duan,
Le Fang,
Qiuling Suo,
Mingchen Gao
Abstract:
Recognizing new objects by learning from a few labeled examples in an evolving environment is crucial to obtain excellent generalization ability for real-world machine learning systems. A typical setting across current meta learning algorithms assumes a stationary task distribution during meta training. In this paper, we explore a more practical and challenging setting where task distribution changes over time with domain shift. Particularly, we consider realistic scenarios where task distribution is highly imbalanced with domain labels unavailable in nature. We propose a kernel-based method for domain change detection and a difficulty-aware memory management mechanism that jointly considers the imbalanced domain size and domain importance to learn across domains continuously. Furthermore, we introduce an efficient adaptive task sampling method during meta training, which significantly reduces task gradient variance with theoretical guarantees. Finally, we propose a challenging benchmark with imbalanced domain sequences and varied domain difficulty. We have performed extensive evaluations on the proposed benchmark, demonstrating the effectiveness of our method. We made our code publicly available.
Submitted 28 September, 2021;
originally announced September 2021.
-
Active Deep Learning on Entity Resolution by Risk Sampling
Authors:
Youcef Nafa,
Qun Chen,
Zhaoqiang Chen,
Xingyu Lu,
Haiyang He,
Tianyi Duan,
Zhanhuai Li
Abstract:
While the state-of-the-art performance on entity resolution (ER) has been achieved by deep learning, its effectiveness depends on large quantities of accurately labeled training data. To alleviate the data labeling burden, Active Learning (AL) presents itself as a feasible solution that focuses on data deemed useful for model training.
Building upon the recent advances in risk analysis for ER, which can provide a more refined estimate on label misprediction risk than the simpler classifier outputs, we propose a novel AL approach of risk sampling for ER. Risk sampling leverages misprediction risk estimation for active instance selection. Based on the core-set characterization for AL, we theoretically derive an optimization model which aims to minimize core-set loss with non-uniform Lipschitz continuity. Since the defined weighted K-medoids problem is NP-hard, we then present an efficient heuristic algorithm. Finally, we empirically verify the efficacy of the proposed approach on real data by a comparative study. Our extensive experiments have shown that it outperforms the existing alternatives by considerable margins. Using ER as a test case, we demonstrate that risk sampling is a promising approach potentially applicable to other challenging classification tasks.
Submitted 23 December, 2020;
originally announced December 2020.
-
Adaptive Deep Learning for Entity Resolution by Risk Analysis
Authors:
Zhaoqiang Chen,
Qun Chen,
Youcef Nafa,
Tianyi Duan,
Wei Pan,
Lijun Zhang,
Zhanhuai Li
Abstract:
The state-of-the-art performance on entity resolution (ER) has been achieved by deep learning. However, deep models are usually trained on large quantities of accurately labeled training data, and cannot be easily tuned towards a target workload. Unfortunately, in real scenarios, there may not be sufficient labeled training data, and even worse, its distribution is usually more or less different from that of the target workload even when both come from the same domain.
To alleviate the said limitations, this paper proposes a novel risk-based approach to tune a deep model towards a target workload by its particular characteristics. Built on the recent advances on risk analysis for ER, the proposed approach first trains a deep model on labeled training data, and then fine-tunes it by minimizing its estimated misprediction risk on unlabeled target data. Our theoretical analysis shows that risk-based adaptive training can correct the label status of a mispredicted instance with a fairly good chance. We have also empirically validated the efficacy of the proposed approach on real benchmark data by a comparative study. Our extensive experiments show that it can considerably improve the performance of deep models. Furthermore, in the scenario of distribution misalignment, it can similarly outperform the state-of-the-art alternative of transfer learning by considerable margins. Using ER as a test case, we demonstrate that risk-based adaptive training is a promising approach potentially applicable to various challenging classification tasks.
Submitted 10 April, 2022; v1 submitted 7 December, 2020;
originally announced December 2020.
-
Attention based Writer Independent Handwriting Verification
Authors:
Mohammad Abuzar Shaikh,
Tiehang Duan,
Mihir Chauhan,
Sargur Srihari
Abstract:
The task of writer verification is to provide a likelihood score for whether the queried and known handwritten image samples belong to the same writer or not. Such a task calls for the neural network to make its outcome interpretable, i.e., to provide a view into the network's decision-making process. We implement and integrate cross-attention and soft-attention mechanisms to capture the highly correlated and salient points in the feature space of 2D inputs. The attention maps serve as an explanation premise for the network's output likelihood score. The attention mechanism also allows the network to focus more on relevant areas of the input, thus improving the classification performance. Our proposed approach achieves a precision of 86% for detecting intra-writer cases in the CEDAR cursive "AND" dataset. Furthermore, we generate meaningful explanations for the provided decision by extracting attention maps from multiple levels of the network.
Submitted 30 September, 2020; v1 submitted 7 September, 2020;
originally announced September 2020.
-
Ultra Efficient Transfer Learning with Meta Update for Cross Subject EEG Classification
Authors:
Tiehang Duan,
Mihir Chauhan,
Mohammad Abuzar Shaikh,
Jun Chu,
Sargur Srihari
Abstract:
The pattern of the Electroencephalogram (EEG) signal differs significantly across subjects, which poses challenges for EEG classifiers in terms of 1) effectively adapting a learned classifier to a new subject, and 2) retaining knowledge of known subjects after the adaptation. We propose an efficient transfer learning method, named Meta UPdate Strategy (MUPS-EEG), for continuous EEG classification across different subjects. The model learns effective representations with a meta update, which accelerates adaptation to a new subject and at the same time mitigates forgetting of knowledge about previous subjects. The proposed mechanism originates from meta learning and works to 1) find a feature representation that is broadly suitable for different subjects, and 2) maximize the sensitivity of the loss function for fast adaptation to a new subject. The method can be applied to all deep-learning-oriented models. Extensive experiments on two public datasets demonstrate the effectiveness of the proposed model, which outperforms current state-of-the-art methods by a large margin in terms of both adapting to a new subject and retaining knowledge of learned subjects.
Submitted 1 March, 2021; v1 submitted 13 March, 2020;
originally announced March 2020.
-
Randomized Smoothing of All Shapes and Sizes
Authors:
Greg Yang,
Tony Duan,
J. Edward Hu,
Hadi Salman,
Ilya Razenshteyn,
Jerry Li
Abstract:
Randomized smoothing is the current state-of-the-art defense with provable robustness against $\ell_2$ adversarial attacks. Many works have devised new randomized smoothing schemes for other metrics, such as $\ell_1$ or $\ell_\infty$; however, substantial effort was needed to derive such new guarantees. This begs the question: can we find a general theory for randomized smoothing?
We propose a novel framework for devising and analyzing randomized smoothing schemes, and validate its effectiveness in practice. Our theoretical contributions are: (1) we show that for an appropriate notion of "optimal", the optimal smoothing distributions for any "nice" norms have level sets given by the norm's *Wulff Crystal*; (2) we propose two novel and complementary methods for deriving provably robust radii for any smoothing distribution; and, (3) we show fundamental limits to current randomized smoothing techniques via the theory of *Banach space cotypes*. By combining (1) and (2), we significantly improve the state-of-the-art certified accuracy in $\ell_1$ on standard datasets. Meanwhile, we show using (3) that with only label statistics under random input perturbations, randomized smoothing cannot achieve nontrivial certified accuracy against perturbations of $\ell_p$-norm $Ω(\min(1, d^{\frac{1}{p} - \frac{1}{2}}))$, when the input dimension $d$ is large. We provide code in github.com/tonyduan/rs4a.
Submitted 23 July, 2020; v1 submitted 19 February, 2020;
originally announced February 2020.
-
Towards Interpretable and Learnable Risk Analysis for Entity Resolution
Authors:
Zhaoqiang Chen,
Qun Chen,
Boyi Hou,
Tianyi Duan,
Zhanhuai Li,
Guoliang Li
Abstract:
Machine-learning-based entity resolution has been widely studied. However, some entity pairs may be mislabeled by machine learning models and existing studies do not study the risk analysis problem -- predicting and interpreting which entity pairs are mislabeled. In this paper, we propose an interpretable and learnable framework for risk analysis, which aims to rank the labeled pairs based on their risks of being mislabeled. We first describe how to automatically generate interpretable risk features, and then present a learnable risk model and its training technique. Finally, we empirically evaluate the performance of the proposed approach on real data. Our extensive experiments have shown that the learning risk model can identify the mislabeled pairs with considerably higher accuracy than the existing alternatives.
Submitted 5 December, 2019;
originally announced December 2019.
-
Missingness as Stability: Understanding the Structure of Missingness in Longitudinal EHR data and its Impact on Reinforcement Learning in Healthcare
Authors:
Scott L. Fleming,
Kuhan Jeyapragasan,
Tony Duan,
Daisy Ding,
Saurabh Gombar,
Nigam Shah,
Emma Brunskill
Abstract:
There is an emerging trend in the reinforcement learning for healthcare literature. In order to prepare longitudinal, irregularly sampled, clinical datasets for reinforcement learning algorithms, many researchers will resample the time series data to short, regular intervals and use last-observation-carried-forward (LOCF) imputation to fill in these gaps. Typically, they will not maintain any explicit information about which values were imputed. In this work, we (1) call attention to this practice and discuss its potential implications; (2) propose an alternative representation of the patient state that addresses some of these issues; and (3) demonstrate in a novel but representative clinical dataset that our alternative representation yields consistently better results for achieving optimal control, as measured by off-policy policy evaluation, compared to representations that do not incorporate missingness information.
Submitted 16 November, 2019;
originally announced November 2019.
-
Graph Embedding VAE: A Permutation Invariant Model of Graph Structure
Authors:
Tony Duan,
Juho Lee
Abstract:
Generative models of graph structure have applications in biology and social sciences. The state of the art is GraphRNN, which decomposes the graph generation process into a series of sequential steps. While effective for modest sizes, it loses its permutation invariance for larger graphs. Instead, we present a permutation invariant latent-variable generative model relying on graph embeddings to encode structure. Using tools from the random graph literature, our model is highly scalable to large graphs with likelihood evaluation and generation in $O(|V| + |E|)$.
Submitted 17 October, 2019;
originally announced October 2019.
-
NGBoost: Natural Gradient Boosting for Probabilistic Prediction
Authors:
Tony Duan,
Anand Avati,
Daisy Yi Ding,
Khanh K. Thai,
Sanjay Basu,
Andrew Y. Ng,
Alejandro Schuler
Abstract:
We present Natural Gradient Boosting (NGBoost), an algorithm for generic probabilistic prediction via gradient boosting. Typical regression models return a point estimate, conditional on covariates, but probabilistic regression models output a full probability distribution over the outcome space, conditional on the covariates. This allows for predictive uncertainty estimation -- crucial in applications like healthcare and weather forecasting. NGBoost generalizes gradient boosting to probabilistic regression by treating the parameters of the conditional distribution as targets for a multiparameter boosting algorithm. Furthermore, we show how the Natural Gradient is required to correct the training dynamics of our multiparameter boosting approach. NGBoost can be used with any base learner, any family of distributions with continuous parameters, and any scoring rule. NGBoost matches or exceeds the performance of existing methods for probabilistic prediction while offering additional benefits in flexibility, scalability, and usability. An open-source implementation is available at github.com/stanfordmlgroup/ngboost.
Submitted 9 June, 2020; v1 submitted 8 October, 2019;
originally announced October 2019.
-
Counterfactual Reasoning for Fair Clinical Risk Prediction
Authors:
Stephen Pfohl,
Tony Duan,
Daisy Yi Ding,
Nigam H. Shah
Abstract:
The use of machine learning systems to support decision making in healthcare raises questions as to what extent these systems may introduce or exacerbate disparities in care for historically underrepresented and mistreated groups, due to biases implicitly embedded in observational data in electronic health records. To address this problem in the context of clinical risk prediction models, we develop an augmented counterfactual fairness criteria to extend the group fairness criteria of equalized odds to an individual level. We do so by requiring that the same prediction be made for a patient, and a counterfactual patient resulting from changing a sensitive attribute, if the factual and counterfactual outcomes do not differ. We investigate the extent to which the augmented counterfactual fairness criteria may be applied to develop fair models for prolonged inpatient length of stay and mortality with observational electronic health records data. As the fairness criteria is ill-defined without knowledge of the data generating process, we use a variational autoencoder to perform counterfactual inference in the context of an assumed causal graph. While our technique provides a means to trade off maintenance of fairness with reduction in predictive performance in the context of a learned generative model, further work is needed to assess the generality of this approach.
Submitted 14 July, 2019;
originally announced July 2019.
-
A cost-reducing partial labeling estimator in text classification problem
Authors:
Jiangning Chen,
Zhibo Dai,
Juntao Duan,
Qianli Hu,
Ruilin Li,
Heinrich Matzinger,
Ionel Popescu,
Haoyan Zhai
Abstract:
We propose a new approach to address the text classification problems when learning with partial labels is beneficial. Instead of offering each training sample a set of candidate labels, we assign negative-oriented labels to the ambiguous training examples if they are unlikely to fall into certain classes. We construct our new maximum likelihood estimators with a self-correction property, and prove that under some conditions, our estimators converge faster. We also discuss the advantages of applying one of our estimators to a fully supervised learning problem. The proposed method has potential applicability in many areas, such as crowdsourcing, natural language processing and medical image analysis.
Submitted 9 June, 2019;
originally announced June 2019.
-
Naive Bayes with Correlation Factor for Text Classification Problem
Authors:
Jiangning Chen,
Zhibo Dai,
Juntao Duan,
Heinrich Matzinger,
Ionel Popescu
Abstract:
The Naive Bayes estimator is widely used in text classification problems. However, it does not perform well with small training datasets. We propose a new method based on the Naive Bayes estimator to solve this problem. A correlation factor is introduced to incorporate the correlation among different classes. Experimental results show that our estimator achieves better accuracy than traditional Naive Bayes on real-world data.
Submitted 8 May, 2019;
originally announced May 2019.
-
Parallel Clustering of Single Cell Transcriptomic Data with Split-Merge Sampling on Dirichlet Process Mixtures
Authors:
Tiehang Duan,
José P. Pinto,
Xiaohui Xie
Abstract:
Motivation: With the development of droplet based systems, massive single cell transcriptome data has become available, which enables analysis of cellular and molecular processes at single cell resolution and is instrumental to understanding many biological processes. While state-of-the-art clustering methods have been applied to the data, they face challenges in the following aspects: (1) the clustering quality still needs to be improved; (2) most models need prior knowledge on number of clusters, which is not always available; (3) there is a demand for faster computational speed. Results: We propose to tackle these challenges with Parallel Split Merge Sampling on Dirichlet Process Mixture Model (the Para-DPMM model). Unlike classic DPMM methods that perform sampling on each single data point, the split merge mechanism samples on the cluster level, which significantly improves convergence and optimality of the result. The model is highly parallelized and can utilize the computing power of high performance computing (HPC) clusters, enabling massive clustering on huge datasets. Experiment results show the model outperforms current widely used models in both clustering quality and computational speed. Availability: Source code is publicly available on https://github.com/tiehangd/Para_DPMM/tree/master/Para_DPMM_package
Submitted 25 December, 2018;
originally announced December 2018.
-
Sequential Embedding Induced Text Clustering, a Non-parametric Bayesian Approach
Authors:
Tiehang Duan,
Qi Lou,
Sargur N. Srihari,
Xiaohui Xie
Abstract:
Current state-of-the-art nonparametric Bayesian text clustering methods model documents through multinomial distribution on bags of words. Although these methods can effectively utilize the word burstiness representation of documents and achieve decent performance, they do not explore the sequential information of text and relationships among synonyms. In this paper, the documents are modeled as the joint of bags of words, sequential features and word embeddings. We proposed Sequential Embedding induced Dirichlet Process Mixture Model (SiDPMM) to effectively exploit this joint document representation in text clustering. The sequential features are extracted by the encoder-decoder component. Word embeddings produced by the continuous-bag-of-words (CBOW) model are introduced to handle synonyms. Experimental results demonstrate the benefits of our model in two major aspects: 1) improved performance across multiple diverse text datasets in terms of the normalized mutual information (NMI); 2) more accurate inference of ground truth cluster numbers with regularization effect on tiny outlier clusters.
Submitted 29 November, 2018;
originally announced November 2018.
-
Proton-transfer Ferroelectricity / Multiferroicity in Rutile Oxyhydroxides
Authors:
Menghao Wu,
Tianci Duan,
Chengliang Lu,
Huahua Fu,
Shuai Dong,
Junming Liu
Abstract:
Oxyhydroxide minerals like FeOOH have been a research focus in geology for studying the Earth interior, and also in chemistry for studying oxygen electrocatalysis activity. In this paper we provide first-principles evidence of a new class of ferroelectrics or multiferroics among them: GaOOH, InOOH, CrOOH, FeOOH, which are earth-abundant minerals and have been experimentally verified to possess distorted rutile structures, are ferroelectric with considerable polarizations (up to 24 $\mu$C/cm$^2$) and piezoelectric coefficients. Their atomic-thick layer may possess vertical polarization robust against depolarizing field due to the formation of O-H O bonds that can hardly be symmetrized. Moreover, CrOOH (guyanaite) is revealed to be a combination of high-Tc in-plane type-I multiferroics and vertical type-II multiferroics, which is strain-tunable and may render a desirable coupling between magnetism and ferroelectricity. Supported by experimental evidence on reversible conversion between metal oxyhydroxides and dioxides and their nice lattice match that renders convenient epitaxial growth, heterostructure composed of oxyhydroxides and prevalent metal dioxides (e.g. TiO2, SnO2 and CrO2) may be constructed for various applications like ferroelectric field-effect transistors and multiferroic tunneling junctions.
Submitted 5 October, 2018;
originally announced November 2018.
-
Countdown Regression: Sharp and Calibrated Survival Predictions
Authors:
Anand Avati,
Tony Duan,
Sharon Zhou,
Kenneth Jung,
Nigam H. Shah,
Andrew Ng
Abstract:
Probabilistic survival predictions from models trained with Maximum Likelihood Estimation (MLE) can have high, and sometimes unacceptably high variance. The field of meteorology, where the paradigm of maximizing sharpness subject to calibration is popular, has addressed this problem by using scoring rules beyond MLE, such as the Continuous Ranked Probability Score (CRPS). In this paper we present the \emph{Survival-CRPS}, a generalization of the CRPS to the survival prediction setting, with right-censored and interval-censored variants. We evaluate our ideas on the mortality prediction task using two different Electronic Health Record (EHR) data sets (STARR and MIMIC-III) covering millions of patients, with suitable deep neural network architectures: a Recurrent Neural Network (RNN) for STARR and a Fully Connected Network (FCN) for MIMIC-III. We compare results between the two scoring rules while keeping the network architecture and data fixed, and show that models trained with Survival-CRPS result in sharper predictive distributions compared to those trained by MLE, while still maintaining calibration.
Submitted 18 June, 2019; v1 submitted 21 June, 2018;
originally announced June 2018.
-
Unveiling the superconducting mechanism of Ba$_{0.51}$K$_{0.49}$BiO$_3$
Authors:
C. H. P. Wen,
H. C. Xu,
Q. Yao,
R. Peng,
X. H. Niu,
Q. Y. Chen,
Z. T. Liu,
D. W. Shen,
Q. Song,
X. Lou,
Y. F. Fang,
X. S. Liu,
Y. H. Song,
Y. J. Jiao,
T. F. Duan,
H. H. Wen,
P. Dudin,
G. Kotliar,
Z. P. Yin,
D. L. Feng
Abstract:
Bismuthates were the first family of oxide high-temperature superconductors, exhibiting superconducting transition temperatures (Tc) up to 32K, but the superconducting mechanism remains under debate despite more than 30 years of extensive research. Our angle-resolved photoemission spectroscopy studies on Ba$_{0.51}$K$_{0.49}$BiO$_3$ reveal an unexpectedly 34% larger bandwidth than in conventional density functional theory calculations. This can be reproduced by calculations that fully account for long-range Coulomb interactions --- the first direct demonstration of bandwidth expansion due to the Fock exchange term, a long-accepted and yet uncorroborated fundamental effect in many body physics. Furthermore, we observe an isotropic superconducting gap with 2Δ$_0$/k$_B$ T$_c$ = 3.51 $\pm$ 0.05, and strong electron-phonon interactions with a coupling constant λ$\sim$ 1.3 $\pm$ 0.2. These findings solve a long-standing mystery --- Ba$_{0.51}$K$_{0.49}$BiO$_3$ is an extraordinary Bardeen-Cooper-Schrieffer (BCS) superconductor, where long-range Coulomb interactions expand the bandwidth, enhance electron-phonon coupling, and generate the high Tc. Such effects will also be critical for finding new superconductors.
Submitted 28 February, 2018;
originally announced February 2018.
-
MURA: Large Dataset for Abnormality Detection in Musculoskeletal Radiographs
Authors:
Pranav Rajpurkar,
Jeremy Irvin,
Aarti Bagul,
Daisy Ding,
Tony Duan,
Hershel Mehta,
Brandon Yang,
Kaylie Zhu,
Dillon Laird,
Robyn L. Ball,
Curtis Langlotz,
Katie Shpanskaya,
Matthew P. Lungren,
Andrew Y. Ng
Abstract:
We introduce MURA, a large dataset of musculoskeletal radiographs containing 40,561 images from 14,863 studies, where each study is manually labeled by radiologists as either normal or abnormal. To evaluate models robustly and to get an estimate of radiologist performance, we collect additional labels from six board-certified Stanford radiologists on the test set, consisting of 207 musculoskeletal studies. On this test set, the majority vote of a group of three radiologists serves as gold standard. We train a 169-layer DenseNet baseline model to detect and localize abnormalities. Our model achieves an AUROC of 0.929, with an operating point of 0.815 sensitivity and 0.887 specificity. We compare our model and radiologists on the Cohen's kappa statistic, which expresses the agreement of our model and of each radiologist with the gold standard. Model performance is comparable to the best radiologist performance in detecting abnormalities on finger and wrist studies. However, model performance is lower than best radiologist performance in detecting abnormalities on elbow, forearm, hand, humerus, and shoulder studies. We believe that the task is a good challenge for future research. To encourage advances, we have made our dataset freely available at https://stanfordmlgroup.github.io/competitions/mura .
Submitted 22 May, 2018; v1 submitted 11 December, 2017;
originally announced December 2017.
-
CheXNet: Radiologist-Level Pneumonia Detection on Chest X-Rays with Deep Learning
Authors:
Pranav Rajpurkar,
Jeremy Irvin,
Kaylie Zhu,
Brandon Yang,
Hershel Mehta,
Tony Duan,
Daisy Ding,
Aarti Bagul,
Curtis Langlotz,
Katie Shpanskaya,
Matthew P. Lungren,
Andrew Y. Ng
Abstract:
We develop an algorithm that can detect pneumonia from chest X-rays at a level exceeding practicing radiologists. Our algorithm, CheXNet, is a 121-layer convolutional neural network trained on ChestX-ray14, currently the largest publicly available chest X-ray dataset, containing over 100,000 frontal-view X-ray images with 14 diseases. Four practicing academic radiologists annotate a test set, on which we compare the performance of CheXNet to that of radiologists. We find that CheXNet exceeds average radiologist performance on the F1 metric. We extend CheXNet to detect all 14 diseases in ChestX-ray14 and achieve state of the art results on all 14 diseases.
Submitted 25 December, 2017; v1 submitted 14 November, 2017;
originally announced November 2017.
-
Image matting with normalized weight and semi-supervised learning
Authors:
** Li,
Tingyan Duan,
Yongfeng Cao
Abstract:
Image matting is an important vision problem. The mainstream methods for it combine sampling-based methods and propagation-based methods. In this paper, we deal with the combination with a normalized weighting parameter, which could well control the relative relationship between information from sampling and from propagation. A reasonable value range for this parameter is given based on statistics from the standard benchmark dataset. The matting is further improved by introducing semi-supervised learning iterations, which automatically refine the trimap without user interaction. This is especially beneficial when the trimap is coarse. The experimental results on the standard benchmark dataset have shown that both the normalized weighting parameter and the semi-supervised learning iteration could significantly improve the matting performance.
Submitted 27 October, 2017;
originally announced October 2017.
-
ProtoDESI: First On-Sky Technology Demonstration for the Dark Energy Spectroscopic Instrument
Authors:
Parker Fagrelius,
Behzad Abareshi,
Lori Allen,
Otger Ballester,
Charles Baltay,
Robert Besuner,
Elizabeth Buckley-Geer,
Karen Butler,
Laia Cardiel,
Arjun Dey,
Ann Elliott,
William Emmet,
Irena Gershkovich,
Klaus Honscheid,
Jose M. Illa,
Jorge Jimenez,
Michael Levi,
Christopher Manser,
Robert Marshall,
Paul Martini,
Anthony Paat,
Ronald Probst,
David Rabinowitz,
Kevin Reil,
Amy Robertson
, et al. (11 additional authors not shown)
Abstract:
The Dark Energy Spectroscopic Instrument (DESI) is under construction to measure the expansion history of the universe using the baryon acoustic oscillations technique. The spectra of 35 million galaxies and quasars over 14,000 square degrees will be measured during a 5-year survey. A new prime focus corrector for the Mayall telescope at Kitt Peak National Observatory will deliver light to 5,000 individually targeted fiber-fed robotic positioners. The fibers in turn feed ten broadband multi-object spectrographs. We describe the ProtoDESI experiment, that was installed and commissioned on the 4-m Mayall telescope from August 14 to September 30, 2016. ProtoDESI was an on-sky technology demonstration with the goal to reduce technical risks associated with aligning optical fibers with targets using robotic fiber positioners and maintaining the stability required to operate DESI. The ProtoDESI prime focus instrument, consisting of three fiber positioners, illuminated fiducials, and a guide camera, was installed behind the existing Mosaic corrector on the Mayall telescope. A Fiber View Camera was mounted in the Cassegrain cage of the telescope and provided feedback metrology for positioning the fibers. ProtoDESI also provided a platform for early integration of hardware with the DESI Instrument Control System that controls the subsystems, provides communication with the Telescope Control System, and collects instrument telemetry data. Lacking a spectrograph, ProtoDESI monitored the output of the fibers using a Fiber Photometry Camera mounted on the prime focus instrument. ProtoDESI was successful in acquiring targets with the robotically positioned fibers and demonstrated that the DESI guiding requirements can be met.
Submitted 2 May, 2018; v1 submitted 24 October, 2017;
originally announced October 2017.
-
Non-normal limiting distribution for optimal alignment scores of strings in binary alphabets
Authors:
Jun Tao Duan,
Heinrich Matzinger,
Ionel Popescu
Abstract:
We consider two independent binary i.i.d. random strings $X$ and $Y$ of equal length $n$ and the optimal alignments according to a symmetric scoring function only. We decompose the space of scoring functions into five components. Two of these components add a part to the optimal score which does not depend on the alignment and which is asymptotically normal.
We show that when we restrict the number of gaps sufficiently and add them only into one sequence, then the alignment score can be decomposed into a part which is normal and has order $O(\sqrt{n})$ and a part which is on a smaller order and tends to a Tracy-Widom distribution. Adding gaps only into one sequence is equivalent to aligning a string with its descendants in case of mutations and deletes. For testing relatedness of strings, the normal part is irrelevant, since it does not depend on the alignment hence it can be safely removed from the test statistic.
Submitted 16 March, 2017;
originally announced March 2017.
-
The DESI Experiment Part II: Instrument Design
Authors:
DESI Collaboration,
Amir Aghamousa,
Jessica Aguilar,
Steve Ahlen,
Shadab Alam,
Lori E. Allen,
Carlos Allende Prieto,
James Annis,
Stephen Bailey,
Christophe Balland,
Otger Ballester,
Charles Baltay,
Lucas Beaufore,
Chris Bebek,
Timothy C. Beers,
Eric F. Bell,
José Luis Bernal,
Robert Besuner,
Florian Beutler,
Chris Blake,
Hannes Bleuler,
Michael Blomqvist,
Robert Blum,
Adam S. Bolton,
Cesar Briceno
, et al. (268 additional authors not shown)
Abstract:
DESI (Dark Energy Spectroscopic Instrument) is a Stage IV ground-based dark energy experiment that will study baryon acoustic oscillations and the growth of structure through redshift-space distortions with a wide-area galaxy and quasar redshift survey. The DESI instrument is a robotically-actuated, fiber-fed spectrograph capable of taking up to 5,000 simultaneous spectra over a wavelength range from 360 nm to 980 nm. The fibers feed ten three-arm spectrographs with resolution $R= λ/Δλ$ between 2000 and 5500, depending on wavelength. The DESI instrument will be used to conduct a five-year survey designed to cover 14,000 deg$^2$. This powerful instrument will be installed at prime focus on the 4-m Mayall telescope in Kitt Peak, Arizona, along with a new optical corrector, which will provide a three-degree diameter field of view. The DESI collaboration will also deliver a spectroscopic pipeline and data management system to reduce and archive all data for eventual public use.
Submitted 13 December, 2016; v1 submitted 31 October, 2016;
originally announced November 2016.
-
The DESI Experiment Part I: Science, Targeting, and Survey Design
Authors:
DESI Collaboration,
Amir Aghamousa,
Jessica Aguilar,
Steve Ahlen,
Shadab Alam,
Lori E. Allen,
Carlos Allende Prieto,
James Annis,
Stephen Bailey,
Christophe Balland,
Otger Ballester,
Charles Baltay,
Lucas Beaufore,
Chris Bebek,
Timothy C. Beers,
Eric F. Bell,
José Luis Bernal,
Robert Besuner,
Florian Beutler,
Chris Blake,
Hannes Bleuler,
Michael Blomqvist,
Robert Blum,
Adam S. Bolton,
Cesar Briceno
, et al. (268 additional authors not shown)
Abstract:
DESI (Dark Energy Spectroscopic Instrument) is a Stage IV ground-based dark energy experiment that will study baryon acoustic oscillations (BAO) and the growth of structure through redshift-space distortions with a wide-area galaxy and quasar redshift survey. To trace the underlying dark matter distribution, spectroscopic targets will be selected in four classes from imaging data. We will measure luminous red galaxies up to $z=1.0$. To probe the Universe out to even higher redshift, DESI will target bright [O II] emission line galaxies up to $z=1.7$. Quasars will be targeted both as direct tracers of the underlying dark matter distribution and, at higher redshifts ($ 2.1 < z < 3.5$), for the Ly-$α$ forest absorption features in their spectra, which will be used to trace the distribution of neutral hydrogen. When moonlight prevents efficient observations of the faint targets of the baseline survey, DESI will conduct a magnitude-limited Bright Galaxy Survey comprising approximately 10 million galaxies with a median $z\approx 0.2$. In total, more than 30 million galaxy and quasar redshifts will be obtained to measure the BAO feature and determine the matter power spectrum, including redshift space distortions.
Submitted 13 December, 2016; v1 submitted 31 October, 2016;
originally announced November 2016.