Showing 1–50 of 64 results for author: Han, B

Searching in archive stat.
  1. arXiv:2406.07955

    cs.LG stat.ML

    How Interpretable Are Interpretable Graph Neural Networks?

    Authors: Yongqiang Chen, Yatao Bian, Bo Han, James Cheng

    Abstract: Interpretable graph neural networks (XGNNs) are widely adopted in various scientific applications involving graph-structured data. Existing XGNNs predominantly adopt the attention-based mechanism to learn edge or node importance for extracting and making predictions with the interpretable subgraph. However, the representational properties and limitations of these methods remain inadequately explo…

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: ICML2024, 44 pages, 21 figures, 12 tables

  2. arXiv:2404.09960

    stat.ME stat.AP

    Pseudo P-values for Assessing Covariate Balance in a Finite Study Population with Application to the California Sugar Sweetened Beverage Tax Study

    Authors: Bing Han, Margo A. Sidell

    Abstract: Assessing covariate balance (CB) is a common practice in various types of evaluation studies. Two-sample descriptive statistics, such as the standardized mean difference, have been widely applied in the scientific literature to assess the goodness of CB. Studies in health policy, health services research, built and social environment research, and many other fields often involve a finite number of…

    Submitted 15 April, 2024; originally announced April 2024.

    Comments: 26 pages in total, 2 figures, 6 tables

  3. arXiv:2404.04865

    cs.LG cs.CV stat.ML

    On the Learnability of Out-of-distribution Detection

    Authors: Zhen Fang, Yixuan Li, Feng Liu, Bo Han, Jie Lu

    Abstract: Supervised learning aims to train a classifier under the assumption that training and test data are from the same distribution. To ease the above assumption, researchers have studied a more realistic setting: out-of-distribution (OOD) detection, where test data may come from classes that are unknown during training (i.e., OOD data). Due to the unavailability and diversity of OOD data, good general…

    Submitted 7 April, 2024; originally announced April 2024.

    Comments: Accepted by JMLR on 7 April 2024. This is a journal extension of the previous NeurIPS 2022 Outstanding Paper "Is Out-of-distribution Detection Learnable?" [arXiv:2210.14707]

  4. arXiv:2403.11497

    cs.CV cs.LG stat.ML

    Do CLIPs Always Generalize Better than ImageNet Models?

    Authors: Qizhou Wang, Yong Lin, Yongqiang Chen, Ludwig Schmidt, Bo Han, Tong Zhang

    Abstract: Large vision language models, such as CLIPs, have revolutionized modern machine learning. CLIPs have demonstrated great generalizability under distribution shifts, supported by an increasing body of literature. However, the evaluation datasets for CLIPs are variations primarily designed for ImageNet benchmarks, which may not fully reflect the extent to which CLIPs, e.g., pre-trained on LAION, robu…

    Submitted 18 March, 2024; originally announced March 2024.

    Comments: Qizhou Wang, Yong Lin, and Yongqiang Chen contributed equally. Project page: https://counteranimal.github.io

  5. arXiv:2402.03941

    cs.LG cs.AI stat.ME

    Discovery of the Hidden World with Large Language Models

    Authors: Chenxi Liu, Yongqiang Chen, Tongliang Liu, Mingming Gong, James Cheng, Bo Han, Kun Zhang

    Abstract: Science originates with discovering new causal knowledge from a combination of known facts and observations. Traditional causal discovery approaches mainly rely on high-quality measured variables, usually given by human experts, to find causal relations. However, the causal variables are usually unavailable in a wide range of real-world applications. The rise of large language models (LLMs) that a…

    Submitted 6 February, 2024; originally announced February 2024.

    Comments: Preliminary version of an ongoing project; Chenxi and Yongqiang contributed equally; 26 pages, 41 figures; Project page: https://causalcoat.github.io/

  6. arXiv:2310.19035

    cs.LG stat.ML

    Does Invariant Graph Learning via Environment Augmentation Learn Invariance?

    Authors: Yongqiang Chen, Yatao Bian, Kaiwen Zhou, Binghui Xie, Bo Han, James Cheng

    Abstract: Invariant graph representation learning aims to learn the invariance among data from different environments for out-of-distribution generalization on graphs. As the graph environment partitions are usually expensive to obtain, augmenting the environment information has become the de facto approach. However, the usefulness of the augmented environment information has never been verified. In this wo…

    Submitted 29 October, 2023; originally announced October 2023.

    Comments: NeurIPS 2023, 34 pages, 35 figures

  7. arXiv:2310.18910

    cs.LG cs.AI cs.CV stat.ML

    InstanT: Semi-supervised Learning with Instance-dependent Thresholds

    Authors: Muyang Li, Runze Wu, Haoyu Liu, Jun Yu, Xun Yang, Bo Han, Tongliang Liu

    Abstract: Semi-supervised learning (SSL) has been a fundamental challenge in machine learning for decades. The primary family of SSL algorithms, known as pseudo-labeling, involves assigning pseudo-labels to confident unlabeled instances and incorporating them into the training set. Therefore, the selection criteria of confident instances are crucial to the success of SSL. Recently, there has been growing in…

    Submitted 29 October, 2023; originally announced October 2023.

    Comments: Accepted as poster for NeurIPS 2023

  8. arXiv:2306.12658

    stat.ML cs.LG q-fin.MF

    Fitted Value Iteration Methods for Bicausal Optimal Transport

    Authors: Erhan Bayraktar, Bingyan Han

    Abstract: We develop a fitted value iteration (FVI) method to compute bicausal optimal transport (OT) where couplings have an adapted structure. Based on the dynamic programming formulation, FVI adopts a function class to approximate the value functions in bicausal OT. Under the concentrability condition and approximate completeness assumption, we prove the sample complexity using (local) Rademacher complex…

    Submitted 1 November, 2023; v1 submitted 21 June, 2023; originally announced June 2023.

    MSC Class: 49Q99; 90C39; 68T07; 90C59

  9. arXiv:2304.11327

    cs.LG stat.ML

    Understanding and Improving Feature Learning for Out-of-Distribution Generalization

    Authors: Yongqiang Chen, Wei Huang, Kaiwen Zhou, Yatao Bian, Bo Han, James Cheng

    Abstract: A common explanation for the failure of out-of-distribution (OOD) generalization is that the model trained with empirical risk minimization (ERM) learns spurious features instead of invariant features. However, several recent studies challenged this explanation and found that deep networks may have already learned sufficiently good features for OOD generalization. Despite the contradictions at fir…

    Submitted 29 October, 2023; v1 submitted 22 April, 2023; originally announced April 2023.

    Comments: Yongqiang Chen, Wei Huang, and Kaiwen Zhou contributed equally; NeurIPS 2023, 55 pages, 64 figures

  10. arXiv:2210.14707

    cs.LG stat.ML

    Is Out-of-Distribution Detection Learnable?

    Authors: Zhen Fang, Yixuan Li, Jie Lu, Jiahua Dong, Bo Han, Feng Liu

    Abstract: Supervised learning aims to train a classifier under the assumption that training and test data are from the same distribution. To ease the above assumption, researchers have studied a more realistic setting: out-of-distribution (OOD) detection, where test data may come from classes that are unknown during training (i.e., OOD data). Due to the unavailability and diversity of OOD data, good general…

    Submitted 23 February, 2023; v1 submitted 26 October, 2022; originally announced October 2022.

    Comments: NeurIPS 2022 Outstanding Paper

  11. arXiv:2206.07766

    cs.LG stat.ML

    Pareto Invariant Risk Minimization: Towards Mitigating the Optimization Dilemma in Out-of-Distribution Generalization

    Authors: Yongqiang Chen, Kaiwen Zhou, Yatao Bian, Binghui Xie, Bingzhe Wu, Yonggang Zhang, Kaili Ma, Han Yang, Peilin Zhao, Bo Han, James Cheng

    Abstract: Recently, there has been a growing surge of interest in enabling machine learning systems to generalize well to Out-of-Distribution (OOD) data. Most efforts are devoted to advancing optimization objectives that regularize models to capture the underlying invariance; however, there often are compromises in the optimization process of these OOD objectives: i) Many OOD objectives have to be relaxed a…

    Submitted 2 March, 2023; v1 submitted 15 June, 2022; originally announced June 2022.

    Comments: ICLR 2023, 50 pages, 58 figures

  12. arXiv:2205.03059

    cs.LG stat.ML

    Low-rank Tensor Learning with Nonconvex Overlapped Nuclear Norm Regularization

    Authors: Quanming Yao, Yaqing Wang, Bo Han, James Kwok

    Abstract: Nonconvex regularization has been popularly used in low-rank matrix learning. However, extending it for low-rank tensor learning is still computationally expensive. To address this problem, we develop an efficient solver for use with a nonconvex extension of the overlapped nuclear norm regularizer. Based on the proximal average algorithm, the proposed algorithm can avoid expensive tensor folding/u…

    Submitted 6 May, 2022; originally announced May 2022.

    Comments: Accepted to JMLR in 2022

  13. arXiv:2203.10571

    q-fin.MF stat.ML

    Distributionally robust risk evaluation with a causality constraint and structural information

    Authors: Bingyan Han

    Abstract: This work studies distributionally robust evaluation of expected function values over temporal data. A set of alternative measures is characterized by the causal optimal transport. We prove the strong duality and recast the causality constraint as minimization over an infinite-dimensional test function space. We approximate test functions by neural networks and prove the sample complexity with Rad…

    Submitted 9 April, 2023; v1 submitted 20 March, 2022; originally announced March 2022.

    Comments: Major revision. Correct a mistake in Lemma 4.3

  14. arXiv:2202.08057

    cs.LG cs.CR stat.ML

    Understanding and Improving Graph Injection Attack by Promoting Unnoticeability

    Authors: Yongqiang Chen, Han Yang, Yonggang Zhang, Kaili Ma, Tongliang Liu, Bo Han, James Cheng

    Abstract: Recently Graph Injection Attack (GIA) emerges as a practical attack scenario on Graph Neural Networks (GNNs), where the adversary can merely inject few malicious nodes instead of modifying existing nodes or edges, i.e., Graph Modification Attack (GMA). Although GIA has achieved promising results, little is known about why it is successful and whether there is any pitfall behind the success. To und…

    Submitted 5 April, 2022; v1 submitted 16 February, 2022; originally announced February 2022.

    Comments: ICLR2022, 42 pages, 22 figures

  15. arXiv:2201.12739

    cs.LG stat.ML

    Do We Need to Penalize Variance of Losses for Learning with Label Noise?

    Authors: Yexiong Lin, Yu Yao, Yuxuan Du, Jun Yu, Bo Han, Mingming Gong, Tongliang Liu

    Abstract: Algorithms which minimize the averaged loss have been widely designed for dealing with noisy labels. Intuitively, when there is a finite training sample, penalizing the variance of losses will improve the stability and generalization of the algorithms. Interestingly, we found that the variance should be increased for the problem of learning with noisy labels. Specifically, increasing the variance…

    Submitted 30 January, 2022; originally announced January 2022.

  16. arXiv:2109.14419

    cs.LG cs.AI stat.ML

    On the Estimation Bias in Double Q-Learning

    Authors: Zhizhou Ren, Guangxiang Zhu, Hao Hu, Beining Han, Jianglun Chen, Chongjie Zhang

    Abstract: Double Q-learning is a classical method for reducing overestimation bias, which is caused by taking maximum estimated values in the Bellman operation. Its variants in the deep Q-learning paradigm have shown great promise in producing reliable value prediction and improving learning performance. However, as shown by prior work, double Q-learning is not fully unbiased and suffers from underestimatio…

    Submitted 14 January, 2022; v1 submitted 29 September, 2021; originally announced September 2021.

    Comments: Thirty-Fifth Conference on Neural Information Processing Systems (NeurIPS 2021)

  17. arXiv:2109.02986

    stat.ML cs.LG

    Instance-dependent Label-noise Learning under a Structural Causal Model

    Authors: Yu Yao, Tongliang Liu, Mingming Gong, Bo Han, Gang Niu, Kun Zhang

    Abstract: Label noise will degenerate the performance of deep learning algorithms because deep neural networks easily overfit label errors. Let X and Y denote the instance and clean label, respectively. When Y is a cause of X, according to which many datasets have been constructed, e.g., SVHN and CIFAR, the distributions of P(X) and P(Y|X) are entangled. This means that the unsupervised instances are helpfu…

    Submitted 3 June, 2022; v1 submitted 7 September, 2021; originally announced September 2021.

  18. arXiv:2101.05467

    cs.LG stat.ML

    Tackling Instance-Dependent Label Noise via a Universal Probabilistic Model

    Authors: Qizhou Wang, Bo Han, Tongliang Liu, Gang Niu, Jian Yang, Chen Gong

    Abstract: The drastic increase of data quantity often brings the severe decrease of data quality, such as incorrect label annotations, which poses a great challenge for robustly training Deep Neural Networks (DNNs). Existing learning methods with label noise either employ ad-hoc heuristics or restrict to specific noise assumptions. However, more general situations, such as instance-dependent label no…

    Submitted 17 March, 2022; v1 submitted 14 January, 2021; originally announced January 2021.

  19. arXiv:2010.11415

    cs.LG stat.ML

    Maximum Mean Discrepancy Test is Aware of Adversarial Attacks

    Authors: Ruize Gao, Feng Liu, Jingfeng Zhang, Bo Han, Tongliang Liu, Gang Niu, Masashi Sugiyama

    Abstract: The maximum mean discrepancy (MMD) test could in principle detect any distributional discrepancy between two datasets. However, it has been shown that the MMD test is unaware of adversarial attacks: the MMD test failed to detect the discrepancy between natural and adversarial data. Given this phenomenon, we raise a question: are natural and adversarial data really from different distributions? T…

    Submitted 11 July, 2021; v1 submitted 21 October, 2020; originally announced October 2020.

  20. arXiv:2010.01875

    cs.LG stat.ML

    Pointwise Binary Classification with Pairwise Confidence Comparisons

    Authors: Lei Feng, Senlin Shu, Nan Lu, Bo Han, Miao Xu, Gang Niu, Bo An, Masashi Sugiyama

    Abstract: To alleviate the data requirement for training effective binary classifiers in binary classification, many weakly supervised learning settings have been proposed. Among them, some consider using pairwise but not pointwise labels, when pointwise labels are not accessible due to privacy, confidentiality, or security reasons. However, as a pairwise label denotes whether or not two data points share a…

    Submitted 13 January, 2022; v1 submitted 5 October, 2020; originally announced October 2020.

    Comments: Accepted to ICML 2021

  21. arXiv:2007.12322

    cs.LG cs.MA stat.ML

    Off-Policy Multi-Agent Decomposed Policy Gradients

    Authors: Yihan Wang, Beining Han, Tonghan Wang, Heng Dong, Chongjie Zhang

    Abstract: Multi-agent policy gradient (MAPG) methods have recently witnessed vigorous progress. However, there is a significant performance discrepancy between MAPG methods and state-of-the-art multi-agent value-based approaches. In this paper, we investigate causes that hinder the performance of MAPG algorithms and present a multi-agent decomposed policy gradient method (DOP). This method introduces the idea of v…

    Submitted 4 October, 2020; v1 submitted 23 July, 2020; originally announced July 2020.

  22. arXiv:2007.08929

    cs.LG stat.ML

    Provably Consistent Partial-Label Learning

    Authors: Lei Feng, Jiaqi Lv, Bo Han, Miao Xu, Gang Niu, Xin Geng, Bo An, Masashi Sugiyama

    Abstract: Partial-label learning (PLL) is a multi-class classification problem, where each training example is associated with a set of candidate labels. Even though many practical PLL methods have been proposed in the last two decades, there lacks a theoretical understanding of the consistency of those methods: none of the PLL methods hitherto possesses a generation process of candidate label sets, and then…

    Submitted 23 October, 2020; v1 submitted 17 July, 2020; originally announced July 2020.

    Comments: NeurIPS 2020 camera-ready version

  23. arXiv:2007.03938

    cs.LG cs.CV stat.ML

    Operation-Aware Soft Channel Pruning using Differentiable Masks

    Authors: Minsoo Kang, Bohyung Han

    Abstract: We propose a simple but effective data-driven channel pruning algorithm, which compresses deep neural networks in a differentiable way by exploiting the characteristics of operations. The proposed approach makes a joint consideration of batch normalization (BN) and rectified linear unit (ReLU) for channel pruning; it estimates how likely the two successive operations deactivate each feature map an…

    Submitted 21 July, 2020; v1 submitted 8 July, 2020; originally announced July 2020.

    Comments: ICML 2020

  24. arXiv:2006.07836

    cs.LG stat.ML

    Part-dependent Label Noise: Towards Instance-dependent Label Noise

    Authors: Xiaobo Xia, Tongliang Liu, Bo Han, Nannan Wang, Mingming Gong, Haifeng Liu, Gang Niu, Dacheng Tao, Masashi Sugiyama

    Abstract: Learning with the instance-dependent label noise is challenging, because it is hard to model such real-world noise. Note that there are psychological and physiological evidences showing that we humans perceive instances by decomposing them into parts. Annotators are therefore more likely to annotate instances based on the parts rather than the whole instances, where a wrong mapping from p…

    Submitted 2 December, 2020; v1 submitted 14 June, 2020; originally announced June 2020.

  25. arXiv:2006.07831

    cs.LG stat.ML

    Class2Simi: A Noise Reduction Perspective on Learning with Noisy Labels

    Authors: Songhua Wu, Xiaobo Xia, Tongliang Liu, Bo Han, Mingming Gong, Nannan Wang, Haifeng Liu, Gang Niu

    Abstract: Learning with noisy labels has attracted a lot of attention in recent years, where the mainstream approaches are in pointwise manners. Meanwhile, pairwise manners have shown great potential in supervised metric learning and unsupervised contrastive learning. Thus, a natural question is raised: does learning in a pairwise manner mitigate label noise? To give an affirmative answer, in this paper, we…

    Submitted 17 June, 2021; v1 submitted 14 June, 2020; originally announced June 2020.

  26. arXiv:2006.07805

    cs.LG stat.ML

    Dual T: Reducing Estimation Error for Transition Matrix in Label-noise Learning

    Authors: Yu Yao, Tongliang Liu, Bo Han, Mingming Gong, Jiankang Deng, Gang Niu, Masashi Sugiyama

    Abstract: The transition matrix, denoting the transition relationship from clean labels to noisy labels, is essential to build statistically consistent classifiers in label-noise learning. Existing methods for estimating the transition matrix rely heavily on estimating the noisy class posterior. However, the estimation error for noisy class posterior could be large due to the randomness of label noise, whic…

    Submitted 23 June, 2021; v1 submitted 14 June, 2020; originally announced June 2020.

  27. arXiv:2006.00587

    cs.LG cs.AI cs.MA stat.ML

    Towards Understanding Cooperative Multi-Agent Q-Learning with Value Factorization

    Authors: Jianhao Wang, Zhizhou Ren, Beining Han, Jianing Ye, Chongjie Zhang

    Abstract: Value factorization is a popular and promising approach to scaling up multi-agent reinforcement learning in cooperative settings, which balances the learning scalability and the representational capacity of value functions. However, the theoretical understanding of such methods is limited. In this paper, we formalize a multi-agent fitted Q-iteration framework for analyzing factorized multi-agent Q…

    Submitted 31 October, 2021; v1 submitted 31 May, 2020; originally announced June 2020.

    Comments: Thirty-fifth Conference on Neural Information Processing Systems (NeurIPS 2021)

  28. arXiv:2005.04176

    stat.ML cs.LG stat.AP

    In Pursuit of Interpretable, Fair and Accurate Machine Learning for Criminal Recidivism Prediction

    Authors: Caroline Wang, Bin Han, Bhrij Patel, Cynthia Rudin

    Abstract: Objectives: We study interpretable recidivism prediction using machine learning (ML) models and analyze performance in terms of prediction ability, sparsity, and fairness. Unlike previous works, this study trains interpretable models that output probabilities rather than binary predictions, and uses quantitative fairness definitions to assess the models. This study also examines whether models can…

    Submitted 11 March, 2022; v1 submitted 8 May, 2020; originally announced May 2020.

  29. arXiv:2002.11242

    cs.LG stat.ML

    Attacks Which Do Not Kill Training Make Adversarial Learning Stronger

    Authors: Jingfeng Zhang, Xilie Xu, Bo Han, Gang Niu, Lizhen Cui, Masashi Sugiyama, Mohan Kankanhalli

    Abstract: Adversarial training based on the minimax formulation is necessary for obtaining adversarial robustness of trained models. However, it is conservative or even pessimistic so that it sometimes hurts the natural generalization. In this paper, we raise a fundamental question: do we have to trade off natural generalization for adversarial robustness? We argue that adversarial training is to employ co…

    Submitted 5 September, 2020; v1 submitted 25 February, 2020; originally announced February 2020.

    Comments: Thirty-seventh International Conference on Machine Learning (ICML 2020)

  30. arXiv:2002.06508

    cs.LG stat.ML

    Multi-Class Classification from Noisy-Similarity-Labeled Data

    Authors: Songhua Wu, Xiaobo Xia, Tongliang Liu, Bo Han, Mingming Gong, Nannan Wang, Haifeng Liu, Gang Niu

    Abstract: A similarity label indicates whether two instances belong to the same class while a class label shows the class of the instance. Without class labels, a multi-class classifier could be learned from similarity-labeled pairwise data by meta classification learning. However, since the similarity label is less informative than the class label, it is more likely to be noisy. Deep neural networks can ea…

    Submitted 16 February, 2020; originally announced February 2020.

  31. arXiv:2002.03673

    cs.LG stat.ML

    Rethinking Class-Prior Estimation for Positive-Unlabeled Learning

    Authors: Yu Yao, Tongliang Liu, Bo Han, Mingming Gong, Gang Niu, Masashi Sugiyama, Dacheng Tao

    Abstract: Given only positive (P) and unlabeled (U) data, PU learning can train a binary classifier without any negative data. It has two building blocks: PU class-prior estimation (CPE) and PU classification; the latter has been well studied while the former has received less attention. Hitherto, the distributional-assumption-free CPE methods rely on a critical assumption that the support of the positive d…

    Submitted 3 June, 2022; v1 submitted 10 February, 2020; originally announced February 2020.

  32. arXiv:2001.03772

    cs.LG stat.ML

    Confidence Scores Make Instance-dependent Label-noise Learning Possible

    Authors: Antonin Berthon, Bo Han, Gang Niu, Tongliang Liu, Masashi Sugiyama

    Abstract: In learning with noisy labels, for every instance, its label can randomly walk to other classes following a transition distribution which is named a noise model. Well-studied noise models are all instance-independent, namely, the transition depends only on the original label but not the instance itself, and thus they are less practical in the wild. Fortunately, methods based on instance-dependent…

    Submitted 22 February, 2021; v1 submitted 11 January, 2020; originally announced January 2020.

  33. arXiv:1912.12927

    cs.LG stat.ML

    Learning with Multiple Complementary Labels

    Authors: Lei Feng, Takuo Kaneko, Bo Han, Gang Niu, Bo An, Masashi Sugiyama

    Abstract: A complementary label (CL) simply indicates an incorrect class of an example, but learning with CLs results in multi-class classifiers that can predict the correct class. Unfortunately, the problem setting only allows a single CL for each example, which notably limits its potential since our labelers may easily identify multiple CLs (MCLs) to one example. In this paper, we propose a novel problem…

    Submitted 6 August, 2022; v1 submitted 30 December, 2019; originally announced December 2019.

    Comments: Corrected typos in Lemma 2, accepted by ICML 2020

  34. arXiv:1911.13019

    cs.LG stat.ML

    Towards Oracle Knowledge Distillation with Neural Architecture Search

    Authors: Minsoo Kang, Jonghwan Mun, Bohyung Han

    Abstract: We present a novel framework of knowledge distillation that is capable of learning powerful and efficient student models from ensemble teacher networks. Our approach addresses the inherent model capacity issue between teacher and student and aims to maximize benefit from teacher models during distillation by reducing their capacity gap. Specifically, we employ a neural architecture search techniqu…

    Submitted 29 November, 2019; originally announced November 2019.

    Comments: accepted by AAAI-20

  35. arXiv:1911.08696

    cs.LG stat.ML

    Where is the Bottleneck of Adversarial Learning with Unlabeled Data?

    Authors: Jingfeng Zhang, Bo Han, Gang Niu, Tongliang Liu, Masashi Sugiyama

    Abstract: Deep neural networks (DNNs) are incredibly brittle due to adversarial examples. To robustify DNNs, adversarial training was proposed, which requires large-scale but well-labeled data. However, it is quite expensive to annotate large-scale data well. To compensate for this shortage, several seminal works are utilizing large-scale unlabeled data. In this paper, we observe that seminal works do not p…

    Submitted 19 November, 2019; originally announced November 2019.

  36. arXiv:1911.02377

    cs.LG stat.ML

    Searching to Exploit Memorization Effect in Learning from Corrupted Labels

    Authors: Quanming Yao, Hansi Yang, Bo Han, Gang Niu, James Kwok

    Abstract: Sample selection approaches are popular in robust learning from noisy labels. However, how to properly control the selection process so that deep networks can benefit from the memorization effect is a hard problem. In this paper, motivated by the success of automated machine learning (AutoML), we model this issue as a function approximation problem. Specifically, we design a domain-specific search…

    Submitted 18 September, 2020; v1 submitted 6 November, 2019; originally announced November 2019.

  37. Efficient Decoupled Neural Architecture Search by Structure and Operation Sampling

    Authors: Heung-Chang Lee, Do-Guk Kim, Bohyung Han

    Abstract: We propose a novel neural architecture search algorithm via reinforcement learning by decoupling structure and operation search processes. Our approach samples candidate models from the multinomial distribution on the policy vectors defined on the two search spaces independently. The proposed technique improves the efficiency of architecture search process significantly compared to the conventiona…

    Submitted 23 October, 2019; originally announced October 2019.

    Report number: 9053197

    Journal ref: IEEE ICASSP 2020

  38. arXiv:1910.05150

    cs.LG stat.ML

    SUM: Suboptimal Unitary Multi-task Learning Framework for Spatiotemporal Data Prediction

    Authors: Qichen Li, Jiaxin Pei, Jianding Zhang, Bo Han

    Abstract: The typical multi-task learning methods for spatio-temporal data prediction involve low-rank tensor computation. However, such a method has relatively weak performance when the task number is small, and we cannot integrate it into non-linear models. In this paper, we propose a two-step suboptimal unitary method (SUM) to combine a meta-learning strategy into multi-task models. In the first step, i…

    Submitted 11 October, 2019; originally announced October 2019.

    Comments: 5 pages

  39. arXiv:1909.06769

    cs.LG stat.ML

    VILD: Variational Imitation Learning with Diverse-quality Demonstrations

    Authors: Voot Tangkaratt, Bo Han, Mohammad Emtiyaz Khan, Masashi Sugiyama

    Abstract: The goal of imitation learning (IL) is to learn a good policy from high-quality demonstrations. However, the quality of demonstrations in reality can be diverse, since it is easier and cheaper to collect demonstrations from a mix of experts and amateurs. IL in such situations can be challenging, especially when the level of demonstrators' expertise is unknown. We propose a new IL method called \un…

    Submitted 15 September, 2019; originally announced September 2019.

  40. arXiv:1909.05813

    stat.ME

    Synthetic estimation for the complier average causal effect

    Authors: Denis Agniel, Bing Han, Matthew Cefalu

    Abstract: We propose an improved estimator of the complier average causal effect (CACE). Researchers typically choose a presumably-unbiased estimator for the CACE in studies with noncompliance, when many other lower-variance estimators may be available. We propose a synthetic estimator that combines information across all available estimators, leveraging the efficiency in lower-variance estimators while mai…

    Submitted 12 September, 2019; originally announced September 2019.

  41. arXiv:1908.02984

    cs.LG stat.ML

    Continual Learning by Asymmetric Loss Approximation with Single-Side Overestimation

    Authors: Dongmin Park, Seokil Hong, Bohyung Han, Kyoung Mu Lee

    Abstract: Catastrophic forgetting is a critical challenge in training deep neural networks. Although continual learning has been investigated as a countermeasure to the problem, it often suffers from the requirements of additional network components and the limited scalability to a large number of tasks. We propose a novel approach to continual learning by approximating a true loss function using an asymmet…

    Submitted 21 October, 2019; v1 submitted 8 August, 2019; originally announced August 2019.

    Comments: ICCV 2019

  42. arXiv:1907.04275

    cs.LG stat.ML

    Learning to Optimize Domain Specific Normalization for Domain Generalization

    Authors: Seonguk Seo, Yumin Suh, Dongwan Kim, Geeho Kim, Jongwoo Han, Bohyung Han

    Abstract: We propose a simple but effective multi-source domain generalization technique based on deep neural networks by incorporating optimized normalization layers that are specific to individual domains. Our approach employs multiple normalization methods while learning separate affine parameters per domain. For each domain, the activations are normalized by a weighted average of multiple normalization…

    Submitted 21 July, 2020; v1 submitted 9 July, 2019; originally announced July 2019.

  43. arXiv:1906.03950  [pdf, other]

    cs.LG cs.AI stat.ML

    Domain-Specific Batch Normalization for Unsupervised Domain Adaptation

    Authors: Woong-Gi Chang, Tackgeun You, Seonguk Seo, Suha Kwak, Bohyung Han

    Abstract: We propose a novel unsupervised domain adaptation framework based on domain-specific batch normalization in deep neural networks. We aim to adapt to both domains by specializing batch normalization layers in convolutional neural networks while allowing them to share all other model parameters, which is realized by a two-stage algorithm. In the first stage, we estimate pseudo-labels for the example…

    Submitted 27 May, 2019; originally announced June 2019.

  44. arXiv:1906.00189  [pdf, other]

    cs.LG stat.ML

    Are Anchor Points Really Indispensable in Label-Noise Learning?

    Authors: Xiaobo Xia, Tongliang Liu, Nannan Wang, Bo Han, Chen Gong, Gang Niu, Masashi Sugiyama

    Abstract: In label-noise learning, noise transition matrix, denoting the probabilities that clean labels flip into noisy labels, plays a central role in building statistically consistent classifiers. Existing theories have shown that the transition matrix can be learned by exploiting anchor points (i.e., data points that belong to a specific class almost surely). However, when the…

    Submitted 16 December, 2019; v1 submitted 1 June, 2019; originally announced June 2019.

    Comments: Accepted by NeurIPS 2019

  45. arXiv:1905.07720  [pdf, other]

    cs.LG stat.ML

    Butterfly: One-step Approach towards Wildly Unsupervised Domain Adaptation

    Authors: Feng Liu, Jie Lu, Bo Han, Gang Niu, Guangquan Zhang, Masashi Sugiyama

    Abstract: In unsupervised domain adaptation (UDA), classifiers for the target domain (TD) are trained with clean labeled data from the source domain (SD) and unlabeled data from TD. However, in the wild, it is difficult to acquire a large amount of perfectly clean labeled data in SD given limited budget. Hence, we consider a new, more realistic and more challenging problem setting, where classifiers have to…

    Submitted 17 February, 2021; v1 submitted 19 May, 2019; originally announced May 2019.

    Comments: Previous version of this paper has been accepted by the NeurIPS2019 Workshop on Learning Transferable Skills (https://www.skillsworkshop.ai/schedule.html)

  46. arXiv:1901.10155  [pdf, other]

    cs.LG stat.ML

    Revisiting Sample Selection Approach to Positive-Unlabeled Learning: Turning Unlabeled Data into Positive rather than Negative

    Authors: Miao Xu, Bingcong Li, Gang Niu, Bo Han, Masashi Sugiyama

    Abstract: In the early history of positive-unlabeled (PU) learning, the sample selection approach, which heuristically selects negative (N) data from U data, was explored extensively. However, this approach was later dominated by the importance reweighting approach, which carefully treats all U data as N data. May there be a new sample selection method that can outperform the latest importance reweighting m…

    Submitted 29 January, 2019; originally announced January 2019.

  47. arXiv:1901.04215  [pdf, other]

    cs.LG stat.ML

    How does Disagreement Help Generalization against Label Corruption?

    Authors: Xingrui Yu, Bo Han, Jiangchao Yao, Gang Niu, Ivor W. Tsang, Masashi Sugiyama

    Abstract: Learning with noisy labels is one of the hottest problems in weakly-supervised learning. Based on memorization effects of deep neural networks, training on small-loss instances becomes very promising for handling noisy labels. This fosters the state-of-the-art approach "Co-teaching" that cross-trains two deep neural networks using the small-loss trick. However, with the increase of epochs, two net…

    Submitted 12 May, 2019; v1 submitted 14 January, 2019; originally announced January 2019.

  48. arXiv:1812.06638  [pdf, other]

    eess.SP cs.IT cs.LG stat.ML

    AI-Aided Online Adaptive OFDM Receiver: Design and Experimental Results

    Authors: Peiwen Jiang, Tianqi Wang, Bin Han, Xuanxuan Gao, Jing Zhang, Chao-Kai Wen, Shi Jin, Geoffrey Ye Li

    Abstract: Orthogonal frequency division multiplexing (OFDM) has been widely applied in current communication systems. The artificial intelligence (AI)-aided OFDM receivers are currently brought to the forefront to replace and improve the traditional OFDM receivers. In this study, we first compare two AI-aided OFDM receivers, namely, data-driven fully connected deep neural network and model-driven ComNet, th…

    Submitted 24 December, 2021; v1 submitted 17 December, 2018; originally announced December 2018.

    Comments: 30 pages, 12 figures. Copyright may be transferred without notice, after which this version may no longer be accessible

    Journal ref: IEEE Transactions on Wireless Communications, vol. 20, no. 11, pp. 7655-7668, Nov. 2021

  49. arXiv:1812.05877  [pdf, other]

    cs.LG stat.ML

    DATELINE: Deep Plackett-Luce Model with Uncertainty Measurements

    Authors: Bo Han

    Abstract: The aggregation of k-ary preferences is a historical and important problem, since it has many real-world applications, such as peer grading, presidential elections and restaurant ranking. Meanwhile, variants of the Plackett-Luce model have been applied to aggregate k-ary preferences. However, there are two urgent issues still existing in the current variants. First, most of them ignore feature informat…

    Submitted 14 December, 2018; originally announced December 2018.

  50. arXiv:1810.02358  [pdf, other]

    cs.LG cs.CL cs.CV stat.ML

    Transfer Learning via Unsupervised Task Discovery for Visual Question Answering

    Authors: Hyeonwoo Noh, Taehoon Kim, Jonghwan Mun, Bohyung Han

    Abstract: We study how to leverage off-the-shelf visual and linguistic data to cope with out-of-vocabulary answers in visual question answering task. Existing large-scale visual datasets with annotations such as image class labels, bounding boxes and region descriptions are good sources for learning rich and diverse visual concepts. However, it is not straightforward how the visual concepts can be captured…

    Submitted 7 April, 2019; v1 submitted 3 October, 2018; originally announced October 2018.

    Comments: CVPR 2019