Showing 1–16 of 16 results for author: Xia, W

Searching in archive stat.
  1. arXiv:2403.03669  [pdf, other]

    stat.ML cs.LG

    Spectral Algorithms on Manifolds through Diffusion

    Authors: Weichun Xia, Lei Shi

    Abstract: The existing research on spectral algorithms, applied within a Reproducing Kernel Hilbert Space (RKHS), has primarily focused on general kernel functions, often neglecting the inherent structure of the input feature space. Our paper introduces a new perspective, asserting that input data are situated within a low-dimensional manifold embedded in a higher-dimensional Euclidean space. We study the c…

    Submitted 7 March, 2024; v1 submitted 6 March, 2024; originally announced March 2024.

  2. arXiv:2309.08489  [pdf, other]

    eess.AS cs.LG cs.SD stat.ML

    Towards Word-Level End-to-End Neural Speaker Diarization with Auxiliary Network

    Authors: Yiling Huang, Weiran Wang, Guanlong Zhao, Hank Liao, Wei Xia, Quan Wang

    Abstract: While standard speaker diarization attempts to answer the question "who spoke when", most relevant applications in reality are more interested in determining "who spoke what". Whether it is the conventional modularized approach or the more recent end-to-end neural diarization (EEND), an additional automatic speech recognition (ASR) model and an orchestration algorithm are required to associat…

    Submitted 15 September, 2023; originally announced September 2023.

  3. arXiv:2309.03842  [pdf, other]

    stat.ML cs.LG

    Early warning indicators via latent stochastic dynamical systems

    Authors: Lingyu Feng, Ting Gao, Wang Xiao, Jinqiao Duan

    Abstract: Detecting early warning indicators for abrupt dynamical transitions in complex systems or high-dimensional observation data is essential in many real-world applications, such as brain diseases, natural disasters, and engineering reliability. To this end, we develop a novel approach: the directed anisotropic diffusion map that captures the latent evolutionary dynamics in the low-dimensional manifol…

    Submitted 5 April, 2024; v1 submitted 7 September, 2023; originally announced September 2023.

  4. arXiv:2210.06650  [pdf, other]

    cs.LG cs.AI cs.NE cs.RO stat.ML

    Interpreting Neural Policies with Disentangled Tree Representations

    Authors: Tsun-Hsuan Wang, Wei Xiao, Tim Seyde, Ramin Hasani, Daniela Rus

    Abstract: The advancement of robots, particularly those functioning in complex human-centric environments, relies on control solutions that are driven by machine learning. Understanding how learning-based controllers make decisions is crucial since robots are often safety-critical systems. This urges a formal and quantitative understanding of the explanatory factors in the interpretability of robot learning…

    Submitted 12 November, 2023; v1 submitted 12 October, 2022; originally announced October 2022.

  5. arXiv:2106.10056  [pdf]

    cs.LG stat.ML

    A Vertical Federated Learning Framework for Horizontally Partitioned Labels

    Authors: Wensheng Xia, Ying Li, Lan Zhang, Zhonghai Wu, Xiaoyong Yuan

    Abstract: Vertical federated learning is a collaborative machine learning framework to train deep learning models on vertically partitioned data with privacy-preservation. It attracts much attention both from academia and industry. Unfortunately, applying most existing vertical federated learning methods in real-world applications still faces two daunting challenges. First, most existing vertical federated l…

    Submitted 18 June, 2021; originally announced June 2021.

    Comments: 10 pages, 6 figures

  6. arXiv:2007.07203  [pdf, other]

    cs.IR cs.LG stat.ML

    Deep Retrieval: Learning A Retrievable Structure for Large-Scale Recommendations

    Authors: Weihao Gao, Xiangjun Fan, Chong Wang, Jiankai Sun, Kai Jia, Wenzhi Xiao, Ruofan Ding, Xingyan Bin, Hui Yang, Xiaobing Liu

    Abstract: One of the core problems in large-scale recommendations is to retrieve top relevant candidates accurately and efficiently, preferably in sub-linear time. Previous approaches are mostly based on a two-step procedure: first learn an inner-product model, and then use some approximate nearest neighbor (ANN) search algorithm to find top candidates. In this paper, we present Deep Retrieval (DR), to lear…

    Submitted 18 May, 2021; v1 submitted 12 July, 2020; originally announced July 2020.

    Comments: 9 pages, 6 figures

  7. arXiv:2006.09446  [pdf, ps, other]

    cs.LG cs.RO stat.ML

    Real-Time Regression with Dividing Local Gaussian Processes

    Authors: Armin Lederer, Alejandro Jose Ordonez Conejo, Korbinian Maier, Wenxin Xiao, Jonas Umlauft, Sandra Hirche

    Abstract: The increased demand for online prediction and the growing availability of large data sets drive the need for computationally efficient models. While exact Gaussian process regression shows various favorable theoretical properties (uncertainty estimate, unlimited expressive power), the poor scaling with respect to the training set size prohibits its application in big data regimes in real-time. T…

    Submitted 30 July, 2021; v1 submitted 16 June, 2020; originally announced June 2020.

  8. arXiv:2004.08704  [pdf, other]

    cs.LG stat.ML

    Efficient Synthesis of Compact Deep Neural Networks

    Authors: Wenhan Xia, Hongxu Yin, Niraj K. Jha

    Abstract: Deep neural networks (DNNs) have been deployed in myriad machine learning applications. However, advances in their accuracy are often achieved with increasingly complex and deep network architectures. These large, deep models are often unsuitable for real-world applications, due to their massive computational cost, high memory bandwidth, and long latency. For example, autonomous driving requires f…

    Submitted 18 April, 2020; originally announced April 2020.

  9. arXiv:1907.01636  [pdf, other]

    cs.IR cs.CL cs.LG stat.ML

    Analyses of Multi-collection Corpora via Compound Topic Modeling

    Authors: Clint P. George, Wei Xia, George Michailidis

    Abstract: As electronically stored data grow in daily life, obtaining novel and relevant information becomes challenging in text mining. Thus people have sought statistical methods based on term frequency, matrix algebra, or topic modeling for text mining. Popular topic models have centered on one single text collection, which is deficient for comparative text analyses. We consider a setting where one can p…

    Submitted 17 June, 2019; originally announced July 2019.

    MSC Class: 62F15; 60J22 ACM Class: I.2; I.7; G.3

  10. arXiv:1811.03567  [pdf, other]

    cs.LG cs.AI cs.CV cs.NE stat.ML

    Biologically-plausible learning algorithms can scale to large datasets

    Authors: Will Xiao, Honglin Chen, Qianli Liao, Tomaso Poggio

    Abstract: The backpropagation (BP) algorithm is often thought to be biologically implausible in the brain. One of the main reasons is that BP requires symmetric weight matrices in the feedforward and feedback pathways. To address this "weight transport problem" (Grossberg, 1987), two more biologically plausible algorithms, proposed by Liao et al. (2016) and Lillicrap et al. (2016), relax BP's weight symmetr…

    Submitted 20 December, 2018; v1 submitted 8 November, 2018; originally announced November 2018.

  11. arXiv:1712.01521  [pdf, ps, other]

    stat.AP stat.CO stat.ML

    An Online Algorithm for Nonparametric Correlations

    Authors: Wei Xiao

    Abstract: Nonparametric correlations such as Spearman's rank correlation and Kendall's tau correlation are widely applied in scientific and engineering fields. This paper investigates the problem of computing nonparametric correlations on the fly for streaming data. Standard batch algorithms are generally too slow to handle real-world big data applications. They also require too much memory because all the…

    Submitted 5 December, 2017; originally announced December 2017.

  12. arXiv:1702.05698  [pdf, ps, other]

    cs.LG cs.CV stat.AP stat.CO stat.ML

    Online Robust Principal Component Analysis with Change Point Detection

    Authors: Wei Xiao, Xiaolin Huang, Jorge Silva, Saba Emrani, Arin Chaudhuri

    Abstract: Robust PCA methods are typically batch algorithms that require loading all observations into memory before processing. This makes them inefficient for processing big data. In this paper, we develop an efficient online robust principal component method, namely online moving window robust principal component analysis (OMWRPCA). Unlike existing algorithms, OMWRPCA can successfully track not only slowl…

    Submitted 20 March, 2017; v1 submitted 18 February, 2017; originally announced February 2017.

  13. arXiv:1606.05382  [pdf, other]

    cs.LG stat.AP stat.ML

    Sampling Method for Fast Training of Support Vector Data Description

    Authors: Arin Chaudhuri, Deovrat Kakde, Maria Jahja, Wei Xiao, Hansi Jiang, Seunghyun Kong, Sergiy Peredriy

    Abstract: Support Vector Data Description (SVDD) is a popular outlier detection technique which constructs a flexible description of the input data. SVDD computation time is high for large training datasets, which limits its use in big-data process-monitoring applications. We propose a new iterative sampling-based method for SVDD training. The method incrementally learns the training data description at each…

    Submitted 25 September, 2016; v1 submitted 16 June, 2016; originally announced June 2016.

  14. arXiv:1604.03648  [pdf, ps, other]

    stat.ME

    Robust regression for optimal individualized treatment rules

    Authors: Wei Xiao, Hao Helen Zhang, Wenbin Lu

    Abstract: Because different patients may respond quite differently to the same drug or treatment, there is increasing interest in discovering individualized treatment rules. In particular, people are eager to find the optimal individualized treatment rules, which if followed by the whole patient population would lead to the "best" outcome. In this paper, we propose new estimators based on robust regression…

    Submitted 13 April, 2016; originally announced April 2016.

  15. arXiv:1603.05770  [pdf, ps, other]

    stat.ML stat.AP

    A Probabilistic Machine Learning Approach to Detect Industrial Plant Faults

    Authors: Wei Xiao

    Abstract: Fault detection in industrial plants is a hot research area as more and more sensor data are being collected throughout the industrial process. Automatic data-driven approaches are widely needed and seen as a promising area of investment. This paper proposes an effective machine learning algorithm to predict industrial plant faults based on classification methods such as penalized logistic regress…

    Submitted 18 March, 2016; originally announced March 2016.

  16. arXiv:1210.0286   

    q-bio.QM q-bio.MN stat.AP

    MiRank: A bioinformatics tool for gene/miRNA ranking and pathway profiling with TCGA-KEGG data sets

    Authors: Siddharth G. Reddy, Weimin Xiao, Preethi H. Gunaratne

    Abstract: The Cancer Genome Atlas (TCGA) provides researchers with clinicopathological data and genomic characterizations of various carcinomas. These data sets include expression microarrays for genes and microRNAs -- short, non-coding strands of RNA that downregulate gene expression through RNA interference -- as well as days_to_death and days_to_last_followup fields for each tumor sample. Our aim is to d…

    Submitted 4 July, 2013; v1 submitted 1 October, 2012; originally announced October 2012.

    Comments: Withdrawn