Showing 1–50 of 103 results for author: Tao, D

Searching in archive stat. Search in all archives.
  1. arXiv:2402.02687  [pdf, other]

    cs.LG cs.AI stat.ML

    Poisson Process for Bayesian Optimization

    Authors: Xiaoxing Wang, Jiaxing Li, Chao Xue, Wei Liu, Weifeng Liu, Xiaokang Yang, Junchi Yan, Dacheng Tao

    Abstract: Bayesian Optimization (BO) is a sample-efficient black-box optimizer, and extensive methods have been proposed to build the absolute function response of the black-box function through a probabilistic surrogate model, including Tree-structured Parzen Estimator (TPE), random forest (SMAC), and Gaussian process (GP). However, few methods have been explored to estimate the relative rankings of candidat… ▽ More

    Submitted 4 February, 2024; originally announced February 2024.

  2. arXiv:2402.02399  [pdf, other]

    cs.LG cs.AI stat.AP stat.ML

    FreDF: Learning to Forecast in Frequency Domain

    Authors: Hao Wang, Licheng Pan, Zhichao Chen, Degui Yang, Sen Zhang, Yifei Yang, Xinggao Liu, Haoxuan Li, Dacheng Tao

    Abstract: Time series modeling is uniquely challenged by the presence of autocorrelation in both historical and label sequences. Current research predominantly focuses on handling autocorrelation within the historical sequence but often neglects its presence in the label sequence. Specifically, emerging forecast models mainly conform to the direct forecast (DF) paradigm, generating multi-step forecasts unde… ▽ More

    Submitted 4 February, 2024; originally announced February 2024.

  3. arXiv:2306.03679  [pdf, other]

    cs.CV cs.AI cs.CR cs.LG stat.ML

    Human-imperceptible, Machine-recognizable Images

    Authors: Fusheng Hao, Fengxiang He, Yikai Wang, Fuxiang Wu, Jing Zhang, Jun Cheng, Dacheng Tao

    Abstract: Massive human-related data is collected to train neural networks for computer vision tasks. A major conflict is exposed relating to software engineers between better developing AI systems and distancing from the sensitive training data. To reconcile this conflict, this paper proposes an efficient privacy-preserving learning paradigm, where images are first encrypted to become ``human-imperceptible… ▽ More

    Submitted 6 June, 2023; originally announced June 2023.

  4. arXiv:2306.03266  [pdf, other]

    cs.LG stat.ML

    Extending the Design Space of Graph Neural Networks by Rethinking Folklore Weisfeiler-Lehman

    Authors: Jiarui Feng, Lecheng Kong, Hao Liu, Dacheng Tao, Fuhai Li, Muhan Zhang, Yixin Chen

    Abstract: Message passing neural networks (MPNNs) have emerged as the most popular framework of graph neural networks (GNNs) in recent years. However, their expressive power is limited by the 1-dimensional Weisfeiler-Lehman (1-WL) test. Some works are inspired by $k$-WL/FWL (Folklore WL) and design the corresponding neural versions. Despite the high expressive power, there are serious limitations in this li… ▽ More

    Submitted 14 January, 2024; v1 submitted 5 June, 2023; originally announced June 2023.

    Comments: Accepted to NeurIPS 2023

  5. arXiv:2306.02913  [pdf, other]

    cs.LG cs.CY cs.DC eess.SY stat.ML

    Decentralized SGD and Average-direction SAM are Asymptotically Equivalent

    Authors: Tongtian Zhu, Fengxiang He, Kaixuan Chen, Mingli Song, Dacheng Tao

    Abstract: Decentralized stochastic gradient descent (D-SGD) allows collaborative learning on massive devices simultaneously without the control of a central server. However, existing theories claim that decentralization invariably undermines generalization. In this paper, we challenge the conventional belief and present a completely new perspective for understanding decentralized learning. We prove that D-S… ▽ More

    Submitted 9 November, 2023; v1 submitted 5 June, 2023; originally announced June 2023.

    Comments: 40th International Conference on Machine Learning (ICML 2023)

  6. arXiv:2302.11085  [pdf, other]

    cs.LG stat.ML

    Learning to Generalize Provably in Learning to Optimize

    Authors: Junjie Yang, Tianlong Chen, Mingkang Zhu, Fengxiang He, Dacheng Tao, Yingbin Liang, Zhangyang Wang

    Abstract: Learning to optimize (L2O) has gained increasing popularity, which automates the design of optimizers by data-driven approaches. However, current L2O methods often suffer from poor generalization performance in at least two folds: (i) applying the L2O-learned optimizer to unseen optimizees, in terms of lowering their loss function values (optimizer generalization, or ``generalizable learning of op… ▽ More

    Submitted 28 March, 2023; v1 submitted 21 February, 2023; originally announced February 2023.

    Comments: This paper is accepted in AISTATS 2023

  7. arXiv:2301.08015  [pdf, other]

    cs.GT cs.LG math.OC stat.ML

    Global Nash Equilibrium in Non-convex Multi-player Game: Theory and Algorithms

    Authors: Guanpu Chen, Gehui Xu, Fengxiang He, Yiguang Hong, Leszek Rutkowski, Dacheng Tao

    Abstract: A wide range of machine learning tasks can be formulated as non-convex multi-player games, where Nash equilibrium (NE) is an acceptable solution to all players, since no one can benefit from changing its strategy unilaterally. Attributed to the non-convexity, obtaining the existence condition of global NE is challenging, let alone designing theoretically guaranteed realization algorithms. This paper takes co… ▽ More

    Submitted 19 January, 2023; originally announced January 2023.

  8. arXiv:2210.05955  [pdf, other]

    stat.ML cs.LG

    Identifiability and Asymptotics in Learning Homogeneous Linear ODE Systems from Discrete Observations

    Authors: Yuanyuan Wang, Wei Huang, Mingming Gong, Xi Geng, Tongliang Liu, Kun Zhang, Dacheng Tao

    Abstract: Ordinary Differential Equations (ODEs) have recently gained a lot of attention in machine learning. However, the theoretical aspects, e.g., identifiability and asymptotic properties of statistical estimation are still obscure. This paper derives a sufficient condition for the identifiability of homogeneous linear ODE systems from a sequence of equally-spaced error-free observations sampled from a… ▽ More

    Submitted 2 June, 2024; v1 submitted 12 October, 2022; originally announced October 2022.

    Journal ref: Journal of Machine Learning Research 25 (2024) 1-50

  9. arXiv:2210.05579  [pdf, other]

    cs.GT cs.AI cs.LG stat.ML

    Benefits of Permutation-Equivariance in Auction Mechanisms

    Authors: Tian Qin, Fengxiang He, Dingfeng Shi, Wenbing Huang, Dacheng Tao

    Abstract: Designing an incentive-compatible auction mechanism that maximizes the auctioneer's revenue while minimizing the bidders' ex-post regret is an important yet intricate problem in economics. Remarkable progress has been achieved through learning the optimal auction mechanism by neural networks. In this paper, we consider the popular additive valuation and symmetric valuation setting; i.e., the valuat… ▽ More

    Submitted 11 October, 2022; originally announced October 2022.

    Comments: NeurIPS 2022

  10. arXiv:2208.14092  [pdf, ps, other]

    cs.LG cs.AI stat.ML

    Super-model ecosystem: A domain-adaptation perspective

    Authors: Fengxiang He, Dacheng Tao

    Abstract: This paper attempts to establish the theoretical foundation for the emerging super-model paradigm via domain adaptation, where one first trains a very large-scale model, {\it i.e.}, super model (or foundation model in some other papers), on a large amount of data and then adapts it to various specific domains. Super-model paradigms help reduce computational and data cost and carbon emission, which… ▽ More

    Submitted 30 August, 2022; originally announced August 2022.

  11. arXiv:2206.12680  [pdf, other]

    cs.LG stat.ML

    Topology-aware Generalization of Decentralized SGD

    Authors: Tongtian Zhu, Fengxiang He, Lan Zhang, Zhengyang Niu, Mingli Song, Dacheng Tao

    Abstract: This paper studies the algorithmic stability and generalizability of decentralized stochastic gradient descent (D-SGD). We prove that the consensus model learned by D-SGD is $\mathcal{O}{(N^{-1}+m^{-1} +λ^2)}$-stable in expectation in the non-convex non-smooth setting, where $N$ is the total sample size, $m$ is the worker number, and $1-λ$ is the spectral gap that measures the connectivity of the… ▽ More

    Submitted 4 February, 2023; v1 submitted 25 June, 2022; originally announced June 2022.

    Comments: Accepted for publication in the 39th International Conference on Machine Learning (ICML 2022)

  12. arXiv:2206.01515  [pdf, other]

    cs.LG cs.AI stat.ML

    Understanding Deep Learning via Decision Boundary

    Authors: Shiye Lei, Fengxiang He, Yancheng Yuan, Dacheng Tao

    Abstract: This paper discovers that the neural network with lower decision boundary (DB) variability has better generalizability. Two new notions, algorithm DB variability and $(ε, η)$-data DB variability, are proposed to measure the decision boundary variability from the algorithm and data perspectives. Extensive experiments show significant negative correlations between the decision boundary variability a… ▽ More

    Submitted 24 December, 2023; v1 submitted 3 June, 2022; originally announced June 2022.

    Comments: Accepted by IEEE TNNLS

  13. arXiv:2203.12964  [pdf, other]

    cs.LG cs.AI stat.ML

    Knowledge Removal in Sampling-based Bayesian Inference

    Authors: Shaopeng Fu, Fengxiang He, Dacheng Tao

    Abstract: The right to be forgotten has been legislated in many countries, but its enforcement in the AI industry would cause unbearable costs. When single data deletion requests come, companies may need to delete the whole models learned with massive resources. Existing works propose methods to remove knowledge learned from data for explicitly parameterized models, which however are not applicable to the sa… ▽ More

    Submitted 24 March, 2022; originally announced March 2022.

    Comments: In International Conference on Learning Representations, 2022

  14. arXiv:2112.06281  [pdf, other]

    cs.LG cs.AI stat.ML

    Spatial-Temporal-Fusion BNN: Variational Bayesian Feature Layer

    Authors: Shiye Lei, Zhuozhuo Tu, Leszek Rutkowski, Feng Zhou, Li Shen, Fengxiang He, Dacheng Tao

    Abstract: Bayesian neural networks (BNNs) have become a principal approach to alleviate overconfident predictions in deep learning, but they often suffer from scaling issues due to a large number of distribution parameters. In this paper, we discover that the first layer of a deep network possesses multiple disparate optima when solely retrained. This indicates a large posterior variance when the first laye… ▽ More

    Submitted 12 December, 2021; originally announced December 2021.

  15. arXiv:2112.03467  [pdf, other]

    cs.LG cs.AI stat.ML

    Spectral Complexity-scaled Generalization Bound of Complex-valued Neural Networks

    Authors: Haowen Chen, Fengxiang He, Shiye Lei, Dacheng Tao

    Abstract: Complex-valued neural networks (CVNNs) have been widely applied to various fields, especially signal processing and image recognition. However, few works focus on the generalization of CVNNs, albeit it is vital to ensure the performance of CVNNs on unseen data. This paper is the first work that proves a generalization bound for the complex-valued neural network. The bound scales with the spectral… ▽ More

    Submitted 6 December, 2021; originally announced December 2021.

  16. arXiv:2108.10015  [pdf, other]

    cs.CL stat.ML

    Semantic-Preserving Adversarial Text Attacks

    Authors: Xinghao Yang, Weifeng Liu, James Bailey, Dacheng Tao, Wei Liu

    Abstract: Deep neural networks (DNNs) are known to be vulnerable to adversarial images, while their robustness in text classification is rarely studied. Several lines of text attack methods have been proposed in the literature, including character-level, word-level, and sentence-level attacks. However, it is still a challenge to minimize the number of word changes necessary to induce misclassification, whil… ▽ More

    Submitted 2 March, 2023; v1 submitted 23 August, 2021; originally announced August 2021.

    Comments: 12 pages, 3 figures, 10 tables

  17. arXiv:2101.06417  [pdf, other]

    cs.LG cs.AI stat.ML

    Bayesian Inference Forgetting

    Authors: Shaopeng Fu, Fengxiang He, Yue Xu, Dacheng Tao

    Abstract: The right to be forgotten has been legislated in many countries but the enforcement in machine learning would cause unbearable costs: companies may need to delete whole models learned from massive resources due to single individual requests. Existing works propose to remove the knowledge learned from the requested data via its influence function which is no longer naturally well-defined in Bayesia… ▽ More

    Submitted 18 February, 2021; v1 submitted 16 January, 2021; originally announced January 2021.

  18. arXiv:2101.05490  [pdf, other]

    cs.LG cs.AI stat.ML

    Neural networks behave as hash encoders: An empirical study

    Authors: Fengxiang He, Shiye Lei, Jianmin Ji, Dacheng Tao

    Abstract: The input space of a neural network with ReLU-like activations is partitioned into multiple linear regions, each corresponding to a specific activation pattern of the included ReLU-like activations. We demonstrate that this partition exhibits the following encoding properties across a variety of deep learning models: (1) {\it determinism}: almost every linear region contains at most one training e… ▽ More

    Submitted 14 January, 2021; originally announced January 2021.

  19. arXiv:2012.13573  [pdf, other]

    cs.LG cs.AI cs.CR stat.ML

    Robustness, Privacy, and Generalization of Adversarial Training

    Authors: Fengxiang He, Shaopeng Fu, Bohan Wang, Dacheng Tao

    Abstract: Adversarial training can considerably robustify deep neural networks to resist adversarial attacks. However, some works suggested that adversarial training might compromise the privacy-preserving and generalization abilities. This paper establishes and quantifies the privacy-robustness trade-off and generalization-robustness trade-off in adversarial training from both theoretical and empirical aspec… ▽ More

    Submitted 25 December, 2020; originally announced December 2020.

  20. arXiv:2012.10931  [pdf, other]

    cs.LG stat.ML

    Recent advances in deep learning theory

    Authors: Fengxiang He, Dacheng Tao

    Abstract: Deep learning is usually described as an experiment-driven field under continuous criticism for lacking theoretical foundations. This problem has been partially fixed by a large volume of literature which has so far not been well organized. This paper reviews and organizes the recent advances in deep learning theory. The literature is categorized in six groups: (1) complexity and capacity-based ap… ▽ More

    Submitted 11 March, 2021; v1 submitted 20 December, 2020; originally announced December 2020.

  21. arXiv:2011.05885  [pdf, other]

    cs.LG stat.ML

    Leveraged Matrix Completion with Noise

    Authors: Xinjian Huang, Weiwei Liu, Bo Du, Dacheng Tao

    Abstract: Completing low-rank matrices from subsampled measurements has received much attention in the past decade. Existing works indicate that $\mathcal{O}(nr\log^2(n))$ datums are required to theoretically secure the completion of an $n \times n$ noisy matrix of rank $r$ with high probability, under some quite restrictive assumptions: (1) the underlying matrix must be incoherent; (2) observations follow… ▽ More

    Submitted 14 August, 2023; v1 submitted 11 November, 2020; originally announced November 2020.

    Comments: This manuscript has been accepted for publication as a regular paper in the IEEE Transactions on Cybernetics

  22. arXiv:2007.09371  [pdf, ps, other]

    cs.LG cs.CR stat.ML

    Tighter Generalization Bounds for Iterative Differentially Private Learning Algorithms

    Authors: Fengxiang He, Bohan Wang, Dacheng Tao

    Abstract: This paper studies the relationship between generalization and privacy preservation in iterative learning algorithms by two sequential steps. We first establish an alignment between generalization and privacy preservation for any learning algorithm. We prove that $(\varepsilon, δ)$-differential privacy implies an on-average generalization bound for multi-database learning algorithms which further… ▽ More

    Submitted 7 August, 2020; v1 submitted 18 July, 2020; originally announced July 2020.

  23. arXiv:2006.07836  [pdf, other]

    cs.LG stat.ML

    Part-dependent Label Noise: Towards Instance-dependent Label Noise

    Authors: Xiaobo Xia, Tongliang Liu, Bo Han, Nannan Wang, Mingming Gong, Haifeng Liu, Gang Niu, Dacheng Tao, Masashi Sugiyama

    Abstract: Learning with the \textit{instance-dependent} label noise is challenging, because it is hard to model such real-world noise. Note that there are psychological and physiological evidences showing that we humans perceive instances by decomposing them into parts. Annotators are therefore more likely to annotate instances based on the parts rather than the whole instances, where a wrong mapping from p… ▽ More

    Submitted 2 December, 2020; v1 submitted 14 June, 2020; originally announced June 2020.

  24. Knowledge Distillation: A Survey

    Authors: Jianping Gou, Baosheng Yu, Stephen John Maybank, Dacheng Tao

    Abstract: In recent years, deep neural networks have been successful in both industry and academia, especially for computer vision tasks. The great success of deep learning is mainly due to its scalability to encode large-scale data and to maneuver billions of model parameters. However, it is a challenge to deploy these cumbersome deep models on devices with limited resources, e.g., mobile phones and embedd… ▽ More

    Submitted 20 May, 2021; v1 submitted 9 June, 2020; originally announced June 2020.

    Comments: It has been accepted for publication in International Journal of Computer Vision (2021)

  25. arXiv:2005.14456  [pdf, other]

    cs.LG stat.ML

    DC-NAS: Divide-and-Conquer Neural Architecture Search

    Authors: Yunhe Wang, Yixing Xu, Dacheng Tao

    Abstract: Most applications demand high-performance deep neural architectures costing limited resources. Neural architecture searching is a way of automatically exploring optimal deep neural networks in a given huge search space. However, all sub-networks are usually evaluated using the same criterion; that is, early stopping on a small proportion of the training dataset, which is an inaccurate and highly c… ▽ More

    Submitted 29 May, 2020; originally announced May 2020.

  26. arXiv:2004.03112  [pdf, other]

    cs.LG stat.ML

    Repulsive Mixture Models of Exponential Family PCA for Clustering

    Authors: Maoying Qiao, Tongliang Liu, Jun Yu, Wei Bian, Dacheng Tao

    Abstract: The mixture extension of exponential family principal component analysis (EPCA) was designed to encode much more structural information about data distribution than the traditional EPCA does. For example, due to the linearity of EPCA's essential form, nonlinear cluster structures cannot be easily handled, but they are explicitly modeled by the mixing extensions. However, the traditional mixture of… ▽ More

    Submitted 7 April, 2020; originally announced April 2020.

  27. arXiv:2004.02842  [pdf, other]

    cs.LG stat.ML

    Detecting Communities in Heterogeneous Multi-Relational Networks: A Message Passing based Approach

    Authors: Maoying Qiao, Jun Yu, Wei Bian, Dacheng Tao

    Abstract: Community is a common characteristic of networks including social networks, biological networks, computer and information networks, to name a few. Community detection is a basic step for exploring and analysing these network data. Typically, a homogeneous network is a type of network which consists of only one type of object with one type of link connecting them. There has been a large body of dev… ▽ More

    Submitted 6 April, 2020; originally announced April 2020.

  28. arXiv:2003.12236  [pdf, ps, other]

    cs.LG stat.ML

    Piecewise linear activations substantially shape the loss surfaces of neural networks

    Authors: Fengxiang He, Bohan Wang, Dacheng Tao

    Abstract: Understanding the loss surface of a neural network is fundamentally important to the understanding of deep learning. This paper presents how piecewise linear activation functions substantially shape the loss surfaces of neural networks. We first prove that {\it the loss surfaces of many neural networks have infinite spurious local minima} which are defined as the local minima with higher empirical… ▽ More

    Submitted 27 March, 2020; originally announced March 2020.

    Comments: Published as a conference paper at ICLR 2020

  29. arXiv:2002.03673  [pdf, other]

    cs.LG stat.ML

    Rethinking Class-Prior Estimation for Positive-Unlabeled Learning

    Authors: Yu Yao, Tongliang Liu, Bo Han, Mingming Gong, Gang Niu, Masashi Sugiyama, Dacheng Tao

    Abstract: Given only positive (P) and unlabeled (U) data, PU learning can train a binary classifier without any negative data. It has two building blocks: PU class-prior estimation (CPE) and PU classification; the latter has been well studied while the former has received less attention. Hitherto, the distributional-assumption-free CPE methods rely on a critical assumption that the support of the positive d… ▽ More

    Submitted 3 June, 2022; v1 submitted 10 February, 2020; originally announced February 2020.

  30. arXiv:2002.01136  [pdf, other]

    cs.LG stat.ML

    On Positive-Unlabeled Classification in GAN

    Authors: Tianyu Guo, Chang Xu, Jiajun Huang, Yunhe Wang, Boxin Shi, Chao Xu, Dacheng Tao

    Abstract: This paper defines a positive and unlabeled classification problem for standard GANs, which then leads to a novel technique to stabilize the training of the discriminator in GANs. Traditionally, real data are taken as positive while generated data are negative. This positive-negative classification criterion was kept fixed all through the learning process of the discriminator without considering t… ▽ More

    Submitted 4 February, 2020; originally announced February 2020.

    Journal ref: CVPR 2020

  31. arXiv:2001.06937  [pdf, other]

    cs.LG stat.ML

    A Review on Generative Adversarial Networks: Algorithms, Theory, and Applications

    Authors: Jie Gui, Zhenan Sun, Yonggang Wen, Dacheng Tao, Jieping Ye

    Abstract: Generative adversarial networks (GANs) are a hot research topic recently. GANs have been widely studied since 2014, and a large number of algorithms have been proposed. However, there are few comprehensive studies explaining the connections among different GANs variants, and how they have evolved. In this paper, we attempt to provide a review on various GANs methods from the perspectives of algorithm… ▽ More

    Submitted 19 January, 2020; originally announced January 2020.

  32. arXiv:1912.01447  [pdf, other]

    cs.CV cs.LG eess.IV stat.ML

    Transform-Invariant Convolutional Neural Networks for Image Classification and Search

    Authors: Xu Shen, Xinmei Tian, Anfeng He, Shaoyan Sun, Dacheng Tao

    Abstract: Convolutional neural networks (CNNs) have achieved state-of-the-art results on many visual recognition tasks. However, current CNN models still exhibit a poor ability to be invariant to spatial transformations of images. Intuitively, with sufficient layers and parameters, hierarchical combinations of convolution (matrix multiplication and non-linear activation) and pooling operations should be abl… ▽ More

    Submitted 28 November, 2019; originally announced December 2019.

    Comments: Accepted by ACM Multimedia. arXiv admin note: text overlap with arXiv:1911.12682

  33. arXiv:1911.12682  [pdf, other]

    cs.CV cs.LG eess.IV stat.ML

    Patch Reordering: a Novel Way to Achieve Rotation and Translation Invariance in Convolutional Neural Networks

    Authors: Xu Shen, Xinmei Tian, Shaoyan Sun, Dacheng Tao

    Abstract: Convolutional Neural Networks (CNNs) have demonstrated state-of-the-art performance on many visual recognition tasks. However, the combination of convolution and pooling operations only shows invariance to small local location changes in meaningful objects in input. Sometimes, such networks are trained using data augmentation to encode this invariance into the parameters, which restricts the capac… ▽ More

    Submitted 28 November, 2019; originally announced November 2019.

    Comments: Accepted AAAI17

  34. arXiv:1911.12675  [pdf, other]

    cs.CV cs.LG cs.NE stat.ML

    Continuous Dropout

    Authors: Xu Shen, Xinmei Tian, Tongliang Liu, Fang Xu, Dacheng Tao

    Abstract: Dropout has been proven to be an effective algorithm for training robust deep networks because of its ability to prevent overfitting by avoiding the co-adaptation of feature detectors. Current explanations of dropout include bagging, naive Bayes, regularization, and sex in evolution. According to the activation patterns of neurons in the human brain, when faced with different situations, the firin… ▽ More

    Submitted 28 November, 2019; originally announced November 2019.

    Comments: Accepted by TNNLS

  35. arXiv:1910.03718  [pdf, ps, other]

    cs.LG math-ph math.ST quant-ph stat.ML

    On Dimension-free Tail Inequalities for Sums of Random Matrices and Applications

    Authors: Chao Zhang, Min-Hsiu Hsieh, Dacheng Tao

    Abstract: In this paper, we present a new framework to obtain tail inequalities for sums of random matrices. Compared with existing works, our tail inequalities have the following characteristics: 1) high feasibility--they can be used to study the tail behavior of various matrix functions, e.g., arbitrary matrix norms, the absolute value of the sum of the $j$ largest singular values (resp. eigenv… ▽ More

    Submitted 8 October, 2019; originally announced October 2019.

  36. arXiv:1909.09757  [pdf, other]

    cs.LG stat.ML

    Positive-Unlabeled Compression on the Cloud

    Authors: Yixing Xu, Yunhe Wang, Hanting Chen, Kai Han, Chunjing Xu, Dacheng Tao, Chang Xu

    Abstract: Many attempts have been made to extend the great success of convolutional neural networks (CNNs) achieved on high-end GPU servers to portable devices such as smart phones. Providing compression and acceleration service of deep learning models on the cloud is therefore of significance and is attractive for end users. However, existing network compression and acceleration approaches usually fine-tun… ▽ More

    Submitted 7 October, 2019; v1 submitted 20 September, 2019; originally announced September 2019.

  37. arXiv:1909.01525  [pdf, ps, other]

    stat.ML cs.LG

    Likelihood-Free Overcomplete ICA and Applications in Causal Discovery

    Authors: Chenwei Ding, Mingming Gong, Kun Zhang, Dacheng Tao

    Abstract: Causal discovery witnessed significant progress over the past decades. In particular, many recent causal discovery methods make use of independent, non-Gaussian noise to achieve identifiability of the causal models. Existence of hidden direct common causes, or confounders, generally makes causal discovery more difficult; whenever they are present, the corresponding causal discovery algorithms can… ▽ More

    Submitted 5 September, 2019; v1 submitted 3 September, 2019; originally announced September 2019.

    Comments: 10 pages, 3 figures. Accepted by NeurIPS 2019 as spotlight

  38. arXiv:1908.07307  [pdf, other]

    cs.LG eess.SP stat.ML

    Investigation of wind pressures on tall building under interference effects using machine learning techniques

    Authors: Gang Hu, Lingbo Liu, Dacheng Tao, Jie Song, K. C. S. Kwok

    Abstract: Interference effects of tall buildings have attracted numerous studies due to the boom of clusters of tall buildings in megacities. To fully understand the interference effects of buildings, it often requires a substantial amount of wind tunnel tests. Limited wind tunnel tests that only cover part of interference scenarios are unable to fully reveal the interference effects. This study used machin… ▽ More

    Submitted 20 August, 2019; originally announced August 2019.

    Comments: 15 pages, 14 figures

  39. arXiv:1907.06814  [pdf, other]

    cs.LG cs.CG cs.DS quant-ph stat.ML

    A Quantum-inspired Algorithm for General Minimum Conical Hull Problems

    Authors: Yuxuan Du, Min-Hsiu Hsieh, Tongliang Liu, Dacheng Tao

    Abstract: A wide range of fundamental machine learning tasks that are addressed by the maximum a posteriori estimation can be reduced to a general minimum conical hull problem. The best-known solution to tackle general minimum conical hull problems is the divide-and-conquer anchoring learning scheme (DCA), whose runtime complexity is polynomial in size. However, big data is pushing these polynomial algorith… ▽ More

    Submitted 15 July, 2019; originally announced July 2019.

    Journal ref: Phys. Rev. Research 2, 033199 (2020)

  40. arXiv:1906.10546  [pdf, other]

    cs.LG cs.CV stat.ML

    Knowledge Amalgamation from Heterogeneous Networks by Common Feature Learning

    Authors: Sihui Luo, Xinchao Wang, Gongfan Fang, Yao Hu, Dapeng Tao, Mingli Song

    Abstract: An increasing number of well-trained deep networks have been released online by researchers and developers, enabling the community to reuse them in a plug-and-play way without accessing the training annotations. However, due to the large number of network variants, such public-available trained models are often of different architectures, each of which being tailored for a specific task or dataset… ▽ More

    Submitted 24 June, 2019; originally announced June 2019.

    Comments: IJCAI 2019, 7 pages, the 28th International Joint Conference on Artificial Intelligence (IJCAI 2019)

  41. arXiv:1906.00495  [pdf, other]

    cs.LG cs.CV stat.ML

    Truncated Cauchy Non-negative Matrix Factorization

    Authors: Naiyang Guan, Tongliang Liu, Yangmuzi Zhang, Dacheng Tao, Larry S. Davis

    Abstract: Non-negative matrix factorization (NMF) minimizes the Euclidean distance between the data matrix and its low rank approximation, and it fails when applied to corrupted data because the loss function is sensitive to outliers. In this paper, we propose a Truncated CauchyNMF loss that handles outliers by truncating large errors, and develop a Truncated CauchyNMF to robustly learn the subspace on noisy… ▽ More

    Submitted 2 June, 2019; originally announced June 2019.

    Journal ref: IEEE Transactions on Pattern Analysis and Machine Intelligence (IEEE T-PAMI), vol. 41, no. 1, pp. 246-259, Jan. 2019

  42. arXiv:1905.05929  [pdf, other]

    cs.LG cs.CV stat.ML

    Orthogonal Deep Neural Networks

    Authors: Kui Jia, Shuai Li, Yuxin Wen, Tongliang Liu, Dacheng Tao

    Abstract: In this paper, we introduce the algorithms of Orthogonal Deep Neural Networks (OrthDNNs) to connect with recent interest of spectrally regularized deep learning methods. OrthDNNs are theoretically motivated by generalization analysis of modern DNNs, with the aim to find solution properties of network weights that guarantee better generalization. To this end, we first prove that DNNs are of local i… ▽ More

    Submitted 15 October, 2019; v1 submitted 14 May, 2019; originally announced May 2019.

    Comments: To Appear in IEEE Transactions on Pattern Analysis and Machine Intelligence

    Journal ref: IEEE Transactions on Pattern Analysis and Machine Intelligence, 2019

  43. arXiv:1905.05177  [pdf, other]

    cs.LG cs.DC stat.ML

    A Distributed Approach towards Discriminative Distance Metric Learning

    Authors: Jun Li, Xun Lin, Xiaoguang Rui, Yong Rui, Dacheng Tao

    Abstract: Distance metric learning is successful in discovering intrinsic relations in data. However, most algorithms are computationally demanding when the problem size becomes large. In this paper, we propose a discriminative metric learning algorithm, and develop a distributed scheme learning metrics on moderate-sized subsets of data, and aggregating the results into a global solution. The technique leve… ▽ More

    Submitted 11 May, 2019; originally announced May 2019.

  44. arXiv:1904.10100  [pdf, other]

    cs.LG cs.CV stat.ML

    Multiview Hessian Regularization for Image Annotation

    Authors: Weifeng Liu, Dacheng Tao

    Abstract: The rapid development of computer hardware and Internet technology makes large-scale, data-dependent models computationally tractable, and opens a bright avenue for annotating images through innovative machine learning algorithms. Semi-supervised learning (SSL) has consequently received intensive attention in recent years and has been successfully deployed in image annotation. One representative wo…

    Submitted 22 April, 2019; originally announced April 2019.

    Journal ref: IEEE Transactions on Image Processing, vol. 22, no. 7, pp. 2676 - 2687, 2013

  45. arXiv:1904.06685  [pdf, other]

    cs.LG stat.ML

    Exploring Representativeness and Informativeness for Active Learning

    Authors: Bo Du, Zengmao Wang, Lefei Zhang, Liangpei Zhang, Wei Liu, Jialie Shen, Dacheng Tao

    Abstract: How can we find a general way to choose the most suitable samples for training a classifier, even with very limited prior information? Active learning, which can be regarded as an iterative optimization procedure, plays a key role in constructing a refined training set to improve the classification performance in a variety of applications, such as text analysis, image recognition, social network mode…

    Submitted 14 April, 2019; originally announced April 2019.

  46. arXiv:1904.06593  [pdf, other]

    cs.CV cs.LG stat.ML

    Shakeout: A New Approach to Regularized Deep Neural Network Training

    Authors: Guoliang Kang, Jun Li, Dacheng Tao

    Abstract: Recent years have witnessed the success of deep neural networks in dealing with plenty of practical problems. Dropout has played an essential role in many successful deep neural networks, by inducing regularization in the model training. In this paper, we present a new regularized training approach: Shakeout. Instead of randomly discarding units as Dropout does at the training stage, Shakeout ra…

    Submitted 13 April, 2019; originally announced April 2019.

    Comments: Appears at T-PAMI 2018

  47. arXiv:1904.05335  [pdf, other]

    cs.SI cs.LG stat.ML

    Adapting Stochastic Block Models to Power-Law Degree Distributions

    Authors: Maoying Qiao, Jun Yu, Wei Bian, Qiang Li, Dacheng Tao

    Abstract: Stochastic block models (SBMs) have been playing an important role in modeling clusters or community structures of network data. However, they are incapable of handling several complex features ubiquitously exhibited in real-world networks, one of which is the power-law degree characteristic. To this end, we propose a new variant of SBM, termed power-law degree SBM (PLD-SBM), by introducing degree decay…

    Submitted 5 April, 2019; originally announced April 2019.

    Comments: 13 pages, 13 figures

    Journal ref: IEEE Transactions on Cybernetics, 49 (2019) 626-637

  48. arXiv:1904.04088  [pdf, other]

    stat.ML cs.CV cs.LG

    Large Margin Multi-modal Multi-task Feature Extraction for Image Classification

    Authors: Yong Luo, Yonggang Wen, Dacheng Tao, Jie Gui, Chao Xu

    Abstract: The features used in many image analysis-based applications are frequently of very high dimension. Feature extraction offers several advantages in high-dimensional cases, and many recent studies have used multi-task feature extraction approaches, which often outperform single-task feature extraction approaches. However, most of these methods are limited in that they only consider data represented…

    Submitted 8 April, 2019; originally announced April 2019.

    Journal ref: IEEE Transactions on Image Processing (Volume: 25, Issue: 1, Jan. 2016)

  49. Heterogeneous Multi-task Metric Learning across Multiple Domains

    Authors: Yong Luo, Yonggang Wen, Dacheng Tao

    Abstract: Distance metric learning (DML) plays a crucial role in diverse machine learning algorithms and applications. When the labeled information in the target domain is limited, transfer metric learning (TML) helps to learn the metric by leveraging the sufficient information from other related domains. Multi-task metric learning (MTML), which can be regarded as a special case of TML, performs transfer across…

    Submitted 8 April, 2019; originally announced April 2019.

    Journal ref: IEEE Transactions on Neural Networks and Learning Systems (Volume: 29, Issue: 9, Sept. 2018)

  50. arXiv:1904.04061  [pdf, other]

    stat.ML cs.CV cs.LG

    Transferring Knowledge Fragments for Learning Distance Metric from A Heterogeneous Domain

    Authors: Yong Luo, Yonggang Wen, Tongliang Liu, Dacheng Tao

    Abstract: The goal of transfer learning is to improve the performance of the target learning task by leveraging information (or transferring knowledge) from other related tasks. In this paper, we examine the problem of transfer distance metric learning (DML), which usually aims to mitigate the label information deficiency issue in the target DML. Most of the current Transfer DML (TDML) methods are not applicabl…

    Submitted 8 April, 2019; originally announced April 2019.

    Journal ref: IEEE Transactions on Pattern Analysis and Machine Intelligence (Volume: 41, Issue: 4, April 1 2019)