Showing 1–50 of 84 results for author: Song, L

Searching in archive stat.
  1. arXiv:2401.02890  [pdf, other]

    stat.ML cs.LG

    Nonlinear functional regression by functional deep neural network with kernel embedding

    Authors: Zhongjie Shi, Jun Fan, Linhao Song, Ding-Xuan Zhou, Johan A. K. Suykens

    Abstract: With the rapid development of deep learning in various fields of science and technology, such as speech recognition, image classification, and natural language processing, recently it is also widely applied in the functional data analysis (FDA) with some empirical success. However, due to the infinite dimensional input, we need a powerful dimension reduction method for functional learning tasks, e…

    Submitted 5 January, 2024; originally announced January 2024.

  2. arXiv:2312.08583  [pdf, other]

    cs.CL stat.ML

    ZeroQuant(4+2): Redefining LLMs Quantization with a New FP6-Centric Strategy for Diverse Generative Tasks

    Authors: Xiaoxia Wu, Haojun Xia, Stephen Youn, Zhen Zheng, Shiyang Chen, Arash Bakhtiari, Michael Wyatt, Reza Yazdani Aminabadi, Yuxiong He, Olatunji Ruwase, Leon Song, Zhewei Yao

    Abstract: This study examines 4-bit quantization methods like GPTQ in large language models (LLMs), highlighting GPTQ's overfitting and limited enhancement in Zero-Shot tasks. While prior works merely focusing on zero-shot measurement, we extend task scope to more generative categories such as code generation and abstractive summarization, in which we found that INT4 quantization can significantly underperf…

    Submitted 18 December, 2023; v1 submitted 13 December, 2023; originally announced December 2023.

  3. arXiv:2304.04443  [pdf, other]

    stat.ML cs.LG

    Approximation of Nonlinear Functionals Using Deep ReLU Networks

    Authors: Linhao Song, Jun Fan, Di-Rong Chen, Ding-Xuan Zhou

    Abstract: In recent years, functional neural networks have been proposed and studied in order to approximate nonlinear continuous functionals defined on $L^p([-1, 1]^s)$ for integers $s\ge1$ and $1\le p<\infty$. However, their theoretical properties are largely unknown beyond universality of approximation or the existing analysis does not apply to the rectified linear unit (ReLU) activation function. To fil…

    Submitted 10 April, 2023; originally announced April 2023.

  4. arXiv:2212.02125  [pdf, other]

    stat.ML cs.AI cs.LG

    TD3 with Reverse KL Regularizer for Offline Reinforcement Learning from Mixed Datasets

    Authors: Yuanying Cai, Chuheng Zhang, Li Zhao, Wei Shen, Xuyun Zhang, Lei Song, Jiang Bian, Tao Qin, Tieyan Liu

    Abstract: We consider an offline reinforcement learning (RL) setting where the agent need to learn from a dataset collected by rolling out multiple behavior policies. There are two challenges for this setting: 1) The optimal trade-off between optimizing the RL signal and the behavior cloning (BC) signal changes on different states due to the variation of the action coverage induced by different behavior pol…

    Submitted 5 December, 2022; originally announced December 2022.

    Comments: Accepted by ICDM-22 (Best Student Paper Runner-Up Awards)

  5. arXiv:2210.02724  [pdf, other]

    cs.LG stat.ME stat.ML

    Leveraging Instance Features for Label Aggregation in Programmatic Weak Supervision

    Authors: Jieyu Zhang, Linxin Song, Alexander Ratner

    Abstract: Programmatic Weak Supervision (PWS) has emerged as a widespread paradigm to synthesize training labels efficiently. The core component of PWS is the label model, which infers true labels by aggregating the outputs of multiple noisy supervision sources abstracted as labeling functions (LFs). Existing statistical label models typically rely only on the outputs of LF, ignoring the instance features w…

    Submitted 9 October, 2022; v1 submitted 6 October, 2022; originally announced October 2022.

    Comments: 16 pages

  6. arXiv:2111.02545  [pdf, other]

    cs.LG stat.ML

    Multi-task Learning of Order-Consistent Causal Graphs

    Authors: Xinshi Chen, Haoran Sun, Caleb Ellington, Eric Xing, Le Song

    Abstract: We consider the problem of discovering $K$ related Gaussian directed acyclic graphs (DAGs), where the involved graph structures share a consistent causal order and sparse unions of supports. Under the multi-task learning setting, we propose a $l_1/l_2$-regularized maximum likelihood estimator (MLE) for learning $K$ linear structural equation models. We theoretically show that the joint estimator,…

    Submitted 3 November, 2021; originally announced November 2021.

    Comments: 35th Conference on Neural Information Processing Systems (NeurIPS 2021)

  7. arXiv:2006.15820  [pdf, other]

    cs.LG cs.AI stat.ML

    Retro*: Learning Retrosynthetic Planning with Neural Guided A* Search

    Authors: Binghong Chen, Chengtao Li, Hanjun Dai, Le Song

    Abstract: Retrosynthetic planning is a critical task in organic chemistry which identifies a series of reactions that can lead to the synthesis of a target product. The vast number of possible chemical transformations makes the size of the search space very big, and retrosynthetic planning is challenging even for experienced chemists. However, existing methods either require expensive return estimation by r…

    Submitted 29 June, 2020; originally announced June 2020.

    Comments: Presented at ICML 2020

  8. arXiv:2006.13401  [pdf, other]

    cs.LG stat.ML

    Understanding Deep Architectures with Reasoning Layer

    Authors: Xinshi Chen, Yufei Zhang, Christoph Reisinger, Le Song

    Abstract: Recently, there has been a surge of interest in combining deep learning models with reasoning in order to handle more sophisticated learning tasks. In many cases, a reasoning task can be solved by an iterative algorithm. This algorithm is often unrolled, and used as a specialized layer in the deep architecture, which can be trained end-to-end with other neural components. Although such hybrid deep…

    Submitted 29 October, 2020; v1 submitted 23 June, 2020; originally announced June 2020.

    Comments: 34th Conference on Neural Information Processing Systems (NeurIPS 2020)

  9. arXiv:2006.05806  [pdf, other]

    cs.LG stat.ML

    Bandit Samplers for Training Graph Neural Networks

    Authors: Ziqi Liu, Zhengwei Wu, Zhiqiang Zhang, Jun Zhou, Shuang Yang, Le Song, Yuan Qi

    Abstract: Several sampling algorithms with variance reduction have been proposed for accelerating the training of Graph Convolution Networks (GCNs). However, due to the intractable computation of optimal sampling distribution, these sampling algorithms are suboptimal for GCNs and are not applicable to more general graph neural networks (GNNs) where the message aggregator contains learned weights rather than…

    Submitted 11 June, 2020; v1 submitted 10 June, 2020; originally announced June 2020.

  10. arXiv:2006.05082  [pdf, other]

    cs.LG stat.ML

    Learning to Stop While Learning to Predict

    Authors: Xinshi Chen, Hanjun Dai, Yu Li, Xin Gao, Le Song

    Abstract: There is a recent surge of interest in designing deep architectures based on the update steps in traditional algorithms, or learning neural networks to improve and replace traditional algorithms. While traditional algorithms have certain stopping criteria for outputting results at different iterations, many algorithm-inspired deep models are restricted to a ``fixed-depth'' for all inputs. Similar…

    Submitted 9 June, 2020; originally announced June 2020.

    Comments: Proceedings of the 37th International Conference on Machine Learning

  11. arXiv:2004.08883  [pdf, other]

    cs.LG cs.MA stat.ML

    Variational Policy Propagation for Multi-agent Reinforcement Learning

    Authors: Chao Qu, Hui Li, Chang Liu, Junwu Xiong, James Zhang, Wei Chu, Weiqiang Wang, Yuan Qi, Le Song

    Abstract: We propose a \emph{collaborative} multi-agent reinforcement learning algorithm named variational policy propagation (VPP) to learn a \emph{joint} policy through the interactions over agents. We prove that the joint policy is a Markov Random Field under some mild conditions, which in turn reduces the policy space effectively. We integrate the variational inference as special differentiable layers i…

    Submitted 29 January, 2022; v1 submitted 19 April, 2020; originally announced April 2020.

    Comments: The title of previous version was "Intention Propagation for Multi-agent Reinforcement Learning"

  12. arXiv:2004.04690  [pdf, other]

    cs.LG cs.CV stat.ML

    Orthogonal Over-Parameterized Training

    Authors: Weiyang Liu, Rongmei Lin, Zhen Liu, James M. Rehg, Liam Paull, Li Xiong, Le Song, Adrian Weller

    Abstract: The inductive bias of a neural network is largely determined by the architecture and the training algorithm. To achieve good generalization, how to effectively train a neural network is of great importance. We propose a novel orthogonal over-parameterized training (OPT) framework that can provably minimize the hyperspherical energy which characterizes the diversity of neurons on a hypersphere. By…

    Submitted 4 June, 2021; v1 submitted 9 April, 2020; originally announced April 2020.

    Comments: CVPR 2021 Oral (43 Pages, Substantial Update from v3, Typos Fixed from v5)

  13. arXiv:2003.10595  [pdf, other]

    cs.CR cs.LG stat.ML

    Systematic Evaluation of Privacy Risks of Machine Learning Models

    Authors: Liwei Song, Prateek Mittal

    Abstract: Machine learning models are prone to memorizing sensitive data, making them vulnerable to membership inference attacks in which an adversary aims to guess if an input sample was used to train the model. In this paper, we show that prior work on membership inference attacks may severely underestimate the privacy risks by relying solely on training custom neural network classifiers to perform attack…

    Submitted 9 December, 2020; v1 submitted 23 March, 2020; originally announced March 2020.

    Comments: Accepted by USENIX Security 2021, code is available at https://github.com/inspire-group/membership-inference-evaluation

  14. arXiv:2003.04247  [pdf, other]

    cs.CR cs.LG stat.ML

    Towards Probabilistic Verification of Machine Unlearning

    Authors: David Marco Sommer, Liwei Song, Sameer Wagh, Prateek Mittal

    Abstract: The right to be forgotten, also known as the right to erasure, is the right of individuals to have their data erased from an entity storing it. The status of this long held notion was legally solidified recently by the General Data Protection Regulation (GDPR) in the European Union. Consequently, there is a need for mechanisms whereby users can verify if service providers comply with their deletio…

    Submitted 1 December, 2020; v1 submitted 9 March, 2020; originally announced March 2020.

    Comments: code is available at https://github.com/inspire-group/unlearning-verification

  15. arXiv:2002.12307  [pdf, other]

    cs.LG cs.CR stat.ML

    Heterogeneous Graph Neural Networks for Malicious Account Detection

    Authors: Ziqi Liu, Chaochao Chen, Xinxing Yang, Jun Zhou, Xiaolong Li, Le Song

    Abstract: We present, GEM, the first heterogeneous graph neural network approach for detecting malicious accounts at Alipay, one of the world's leading mobile cashless payment platform. Our approach, inspired from a connected subgraph approach, adaptively learns discriminative embeddings from heterogeneous account-device graphs based on two fundamental weaknesses of attackers, i.e. device aggregation and ac…

    Submitted 27 February, 2020; originally announced February 2020.

  16. arXiv:2002.11045  [pdf, ps, other]

    eess.SP cs.LG cs.NI stat.ML

    Deep Learning for Ultra-Reliable and Low-Latency Communications in 6G Networks

    Authors: Changyang She, Rui Dong, Zhouyou Gu, Zhanwei Hou, Yonghui Li, Wibowo Hardjawana, Chenyang Yang, Lingyang Song, Branka Vucetic

    Abstract: In the future 6th generation networks, ultra-reliable and low-latency communications (URLLC) will lay the foundation for emerging mission-critical applications that have stringent requirements on end-to-end delay and reliability. Existing works on URLLC are mainly based on theoretical models and assumptions. The model-based solutions provide useful insights, but cannot be directly implemented in p…

    Submitted 22 February, 2020; originally announced February 2020.

    Comments: The manuscript contains 4 figures 2 tables. It has been submitted to IEEE Network (in the second round of revision)

  17. arXiv:2002.05810  [pdf, other]

    cs.LG stat.ML

    RNA Secondary Structure Prediction By Learning Unrolled Algorithms

    Authors: Xinshi Chen, Yu Li, Ramzan Umarov, Xin Gao, Le Song

    Abstract: In this paper, we propose an end-to-end deep learning model, called E2Efold, for RNA secondary structure prediction which can effectively take into account the inherent constraints in the problem. The key idea of E2Efold is to directly predict the RNA base-pairing matrix, and use an unrolled algorithm for constrained programming as the template for deep architectures to enforce constraints. With c…

    Submitted 13 February, 2020; originally announced February 2020.

    Comments: International Conference on Learning Representations 2020

    Journal ref: International Conference on Learning Representations 2020, https://openreview.net/forum?id=S1eALyrYDH

  18. arXiv:2001.01408  [pdf, other]

    cs.LG stat.ML

    Retrosynthesis Prediction with Conditional Graph Logic Network

    Authors: Hanjun Dai, Chengtao Li, Connor W. Coley, Bo Dai, Le Song

    Abstract: Retrosynthesis is one of the fundamental problems in organic chemistry. The task is to identify reactants that can be used to synthesize a specified product molecule. Recently, computer-aided retrosynthesis is finding renewed interest from both chemistry and computer science communities. Most existing approaches rely on template-based models that define subgraph matching rules, but whether or not…

    Submitted 6 January, 2020; originally announced January 2020.

    Comments: NeurIPS 2019

  19. arXiv:1912.01203  [pdf]

    cs.LG eess.AS stat.ML

    Music Style Classification with Compared Methods in XGB and BPNN

    Authors: Lifeng Tan, Cong Jin, Zhiyuan Cheng, Xin Lv, Leiyu Song

    Abstract: Scientists have used many different classification methods to solve the problem of music classification. But the efficiency of each classification is different. In this paper, we propose two compared methods on the task of music style classification. More specifically, feature extraction for representing timbral texture, rhythmic content and pitch content are proposed. Comparative evaluations on p…

    Submitted 3 December, 2019; originally announced December 2019.

    Comments: 5 pages, 1 figure

  20. arXiv:1910.13003  [pdf, other]

    cs.LG cs.CV stat.ML

    Neural Similarity Learning

    Authors: Weiyang Liu, Zhen Liu, James M. Rehg, Le Song

    Abstract: Inner product-based convolution has been the founding stone of convolutional neural networks (CNNs), enabling end-to-end learning of visual representation. By generalizing inner product with a bilinear matrix, we propose the neural similarity which serves as a learnable parametric similarity measure for CNNs. Neural similarity naturally generalizes the convolution and enhances flexibility. Further…

    Submitted 6 December, 2019; v1 submitted 28 October, 2019; originally announced October 2019.

    Comments: NeurIPS 2019 (v3)

  21. arXiv:1907.07157  [pdf, other]

    cs.LG cs.CR stat.ML

    The Tradeoff Between Privacy and Accuracy in Anomaly Detection Using Federated XGBoost

    Authors: Mengwei Yang, Linqi Song, Jie Xu, Congduan Li, Guozhen Tan

    Abstract: Privacy has raised considerable concerns recently, especially with the advent of information explosion and numerous data mining techniques to explore the information inside large volumes of data. In this context, a new distributed learning paradigm termed federated learning becomes prominent recently to tackle the privacy issues in distributed learning, where only learning models will be transmitt…

    Submitted 14 October, 2019; v1 submitted 16 July, 2019; originally announced July 2019.

  22. arXiv:1906.07172  [pdf, ps, other]

    cs.LG cs.CV stat.ML

    Equivariant neural networks and equivarification

    Authors: Erkao Bao, Linqi Song

    Abstract: We provide a process to modify a neural network to an equivariant one, which we call equivarification. As an illustration, we build an equivariant neural network for image classification by equivarifying a convolutional neural network.

    Submitted 22 March, 2020; v1 submitted 16 June, 2019; originally announced June 2019.

    Comments: More explanations added

  23. arXiv:1906.02111  [pdf, other]

    cs.LG cs.LO stat.ML

    Can Graph Neural Networks Help Logic Reasoning?

    Authors: Yuyu Zhang, Xinshi Chen, Yuan Yang, Arun Ramamurthy, Bo Li, Yuan Qi, Le Song

    Abstract: Effectively combining logic reasoning and probabilistic inference has been a long-standing goal of machine learning: the former has the ability to generalize with small training data, while the latter provides a principled framework for dealing with noisy data. However, existing methods for combining the best of both worlds are typically computationally intensive. In this paper, we focus on Markov…

    Submitted 20 September, 2019; v1 submitted 5 June, 2019; originally announced June 2019.

  24. arXiv:1906.00271  [pdf, other]

    cs.LG stat.ML

    GLAD: Learning Sparse Graph Recovery

    Authors: Harsh Shrivastava, Xinshi Chen, Binghong Chen, Guanghui Lan, Srinivas Aluru, Han Liu, Le Song

    Abstract: Recovering sparse conditional independence graphs from data is a fundamental problem in machine learning with wide applications. A popular formulation of the problem is an $\ell_1$ regularized maximum likelihood estimation. Many convex optimization algorithms have been designed to solve this formulation to recover the graph structure. Recently, there is a surge of interest to learn algorithms dire…

    Submitted 21 December, 2019; v1 submitted 1 June, 2019; originally announced June 2019.

  25. arXiv:1905.10291  [pdf, other]

    stat.ML cs.CR cs.LG

    Privacy Risks of Securing Machine Learning Models against Adversarial Examples

    Authors: Liwei Song, Reza Shokri, Prateek Mittal

    Abstract: The arms race between attacks and defenses for machine learning models has come to a forefront in recent years, in both the security community and the privacy community. However, one big limitation of previous research is that the security domain and the privacy domain have typically been considered separately. It is thus unclear whether the defense methods in one domain will have any unexpected i…

    Submitted 25 August, 2019; v1 submitted 24 May, 2019; originally announced May 2019.

    Comments: ACM CCS 2019, code is available at https://github.com/inspire-group/privacy-vs-robustness

  26. arXiv:1905.01726  [pdf, other]

    cs.LG cs.CR cs.CV stat.ML

    Better the Devil you Know: An Analysis of Evasion Attacks using Out-of-Distribution Adversarial Examples

    Authors: Vikash Sehwag, Arjun Nitin Bhagoji, Liwei Song, Chawin Sitawarin, Daniel Cullina, Mung Chiang, Prateek Mittal

    Abstract: A large body of recent work has investigated the phenomenon of evasion attacks using adversarial examples for deep learning systems, where the addition of norm-bounded perturbations to the test inputs leads to incorrect output classification. Previous work has investigated this phenomenon in closed-world systems where training and test inputs follow a pre-specified distribution. However, real-worl…

    Submitted 5 May, 2019; originally announced May 2019.

    Comments: 18 pages, 5 figures, 9 tables

  27. arXiv:1904.12083  [pdf, other]

    cs.LG stat.CO stat.ML

    Exponential Family Estimation via Adversarial Dynamics Embedding

    Authors: Bo Dai, Zhen Liu, Hanjun Dai, Niao He, Arthur Gretton, Le Song, Dale Schuurmans

    Abstract: We present an efficient algorithm for maximum likelihood estimation (MLE) of exponential family models, with a general parametrization of the energy function that includes neural networks. We exploit the primal-dual view of the MLE with a kinetics augmented model to obtain an estimate associated with an adversarial dual sampler. To represent this sampler, we introduce a novel neural architecture,…

    Submitted 30 March, 2020; v1 submitted 26 April, 2019; originally announced April 2019.

    Comments: Appearing in NeurIPS 2019 Vancouver, Canada; a preliminary version published in NeurIPS2018 Bayesian Deep Learning Workshop

  28. arXiv:1903.00070  [pdf, other]

    cs.LG cs.RO stat.ML

    Learning to Plan in High Dimensions via Neural Exploration-Exploitation Trees

    Authors: Binghong Chen, Bo Dai, Qinjie Lin, Guo Ye, Han Liu, Le Song

    Abstract: We propose a meta path planning algorithm named \emph{Neural Exploration-Exploitation Trees~(NEXT)} for learning from prior experience for solving new path planning problems in high dimensional continuous state and action spaces. Compared to more classical sampling-based methods like RRT, our approach achieves much better sample efficiency in high-dimensions and can benefit from prior experience o…

    Submitted 23 February, 2020; v1 submitted 28 February, 2019; originally announced March 2019.

    Comments: 26 pages, 74 figures, ICLR 2020 spotlight

  29. arXiv:1902.02495  [pdf, other]

    stat.ML cs.LG

    Cost-Effective Incentive Allocation via Structured Counterfactual Inference

    Authors: Romain Lopez, Chenchen Li, Xiang Yan, Junwu Xiong, Michael I. Jordan, Yuan Qi, Le Song

    Abstract: We address a practical problem ubiquitous in modern marketing campaigns, in which a central agent tries to learn a policy for allocating strategic financial incentives to customers and observes only bandit feedback. In contrast to traditional policy optimization frameworks, we take into account the additional reward structure and budget constraints common in this setting, and develop a new two-ste…

    Submitted 11 November, 2019; v1 submitted 7 February, 2019; originally announced February 2019.

    Journal ref: Association for the Advancement of Artificial Intelligence (AAAI) 2020

  30. arXiv:1902.00743  [pdf, other]

    cs.LG eess.SP physics.data-an stat.ML

    Deep Learning for Vertex Reconstruction of Neutrino-Nucleus Interaction Events with Combined Energy and Time Data

    Authors: Linghao Song, Fan Chen, Steven R. Young, Catherine D. Schuman, Gabriel Perdue, Thomas E. Potok

    Abstract: We present a deep learning approach for vertex reconstruction of neutrino-nucleus interaction events, a problem in the domain of high energy physics. In this approach, we combine both energy and timing data that are collected in the MINERvA detector to perform classification and regression tasks. We show that the resulting network achieves higher accuracy than previous results while requiring a sm…

    Submitted 2 February, 2019; originally announced February 2019.

    Comments: To appear in 2019 International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2019)

  31. arXiv:1902.00640  [pdf, other]

    cs.LG stat.ML

    Particle Flow Bayes' Rule

    Authors: Xinshi Chen, Hanjun Dai, Le Song

    Abstract: We present a particle flow realization of Bayes' rule, where an ODE-based neural operator is used to transport particles from a prior to its posterior after a new observation. We prove that such an ODE operator exists. Its neural parameterization can be trained in a meta-learning framework, allowing this operator to reason about the effect of an individual observation on the posterior, and thus ge…

    Submitted 31 December, 2019; v1 submitted 1 February, 2019; originally announced February 2019.

    Journal ref: Proceedings of the 36th International Conference on Machine Learning, PMLR 97:1022-1031, 2019

  32. arXiv:1901.09326  [pdf, other]

    cs.LG stat.ML

    Value Propagation for Decentralized Networked Deep Multi-agent Reinforcement Learning

    Authors: Chao Qu, Shie Mannor, Huan Xu, Yuan Qi, Le Song, Junwu Xiong

    Abstract: We consider the networked multi-agent reinforcement learning (MARL) problem in a fully decentralized setting, where agents learn to coordinate to achieve the joint success. This problem is widely encountered in many areas including traffic control, distributed control, and smart grids. We assume that the reward function for each agent can be different and observed only locally by the agent itself.…

    Submitted 28 September, 2019; v1 submitted 27 January, 2019; originally announced January 2019.

  33. arXiv:1812.10613  [pdf, other]

    cs.LG cs.IR stat.ML

    Generative Adversarial User Model for Reinforcement Learning Based Recommendation System

    Authors: Xinshi Chen, Shuang Li, Hui Li, Shaohua Jiang, Yuan Qi, Le Song

    Abstract: There are great interests as well as many challenges in applying reinforcement learning (RL) to recommendation systems. In this setting, an online user is the environment; neither the reward function nor the environment dynamics are clearly defined, making the application of RL challenging. In this paper, we propose a novel model-based reinforcement learning framework for recommendation systems, w…

    Submitted 31 December, 2019; v1 submitted 26 December, 2018; originally announced December 2018.

    Journal ref: Proceedings of the 36th International Conference on Machine Learning, PMLR 97:1052-1061, 2019

  34. arXiv:1812.09584  [pdf, other]

    cs.LG stat.ML

    Meta Architecture Search

    Authors: Albert Shaw, Wei Wei, Weiyang Liu, Le Song, Bo Dai

    Abstract: Neural Architecture Search (NAS) has been quite successful in constructing state-of-the-art models on a variety of tasks. Unfortunately, the computational cost can make it difficult to scale. In this paper, we make the first attempt to study Meta Architecture Search which aims at learning a task-agnostic representation that can be used to speed up the process of architecture search on a large numb…

    Submitted 15 November, 2019; v1 submitted 22 December, 2018; originally announced December 2018.

    Comments: 11 pages, 4 figures, 4 tables, 4 pages of appendix; NeurIPS 2019

  35. arXiv:1811.10158  [pdf, other]

    cs.LG stat.ML

    Reinforcement Learning for Uplift Modeling

    Authors: Chenchen Li, Xiang Yan, Xiaotie Deng, Yuan Qi, Wei Chu, Le Song, Junlong Qiao, Jianshan He, Junwu Xiong

    Abstract: Uplift modeling aims to directly model the incremental impact of a treatment on an individual response. In this work, we address the problem from a new angle and reformulate it as a Markov Decision Process (MDP). We conducted extensive experiments on both a synthetic dataset and real-world scenarios, and showed that our method can achieve significant improvement over previous methods.

    Submitted 4 February, 2019; v1 submitted 25 November, 2018; originally announced November 2018.

  36. arXiv:1811.05016  [pdf, other]

    cs.LG stat.ML

    Learning Temporal Point Processes via Reinforcement Learning

    Authors: Shuang Li, Shuai Xiao, Shixiang Zhu, Nan Du, Yao Xie, Le Song

    Abstract: Social goods, such as healthcare, smart city, and information networks, often produce ordered event data in continuous time. The generative processes of these event data can be very complex, requiring flexible models to capture their dynamics. Temporal point processes offer an elegant framework for modeling event data without discretizing the time. However, the existing maximum-likelihood-estimati…

    Submitted 26 December, 2020; v1 submitted 12 November, 2018; originally announced November 2018.

    Comments: Add code link

  37. arXiv:1811.02228  [pdf, other]

    cs.LG stat.ML

    Kernel Exponential Family Estimation via Doubly Dual Embedding

    Authors: Bo Dai, Hanjun Dai, Arthur Gretton, Le Song, Dale Schuurmans, Niao He

    Abstract: We investigate penalized maximum log-likelihood estimation for exponential family distributions whose natural parameter resides in a reproducing kernel Hilbert space. Key to our approach is a novel technique, doubly dual embedding, that avoids computation of the partition function. This technique also allows the development of a flexible sampling strategy that amortizes the cost of Monte-Carlo sam…

    Submitted 24 April, 2019; v1 submitted 6 November, 2018; originally announced November 2018.

    Comments: 22 pages, 20 figures; AISTATS 2019

  38. arXiv:1810.06313  [pdf, ps, other]

    cs.IR cs.LG stat.ML

    Regret vs. Bandwidth Trade-off for Recommendation Systems

    Authors: Linqi Song, Christina Fragouli, Devavrat Shah

    Abstract: We consider recommendation systems that need to operate under wireless bandwidth constraints, measured as number of broadcast transmissions, and demonstrate a (tight for some instances) tradeoff between regret and bandwidth for two scenarios: the case of multi-armed bandit with context, and the case where there is a latent structure in the message space that we can exploit to reduce the learning p…

    Submitted 15 October, 2018; originally announced October 2018.

  39. arXiv:1808.02610  [pdf, other]

    cs.LG stat.ML

    L-Shapley and C-Shapley: Efficient Model Interpretation for Structured Data

    Authors: Jianbo Chen, Le Song, Martin J. Wainwright, Michael I. Jordan

    Abstract: We study instancewise feature importance scoring as a method for model interpretation. Any such method yields, for each predicted instance, a vector of importance scores associated with the feature vector. Methods based on the Shapley score have been proposed as a fair way of computing feature attributions of this kind, but incur an exponential complexity in the number of features. This combinator…

    Submitted 7 August, 2018; originally announced August 2018.

  40. arXiv:1806.02371  [pdf, other]

    cs.LG cs.CR cs.SI stat.ML

    Adversarial Attack on Graph Structured Data

    Authors: Hanjun Dai, Hui Li, Tian Tian, Xin Huang, Lin Wang, Jun Zhu, Le Song

    Abstract: Deep learning on graph structures has shown exciting results in various applications. However, few attentions have been paid to the robustness of such models, in contrast to numerous research work for image or text adversarial attack and defense. In this paper, we focus on the adversarial attacks that fool the model by modifying the combinatorial structure of data. We first propose a reinforcement…

    Submitted 6 June, 2018; originally announced June 2018.

    Comments: to appear in ICML 2018

  41. arXiv:1805.12393  [pdf, other]

    cs.LG cs.AI cs.CL stat.ML

    KG^2: Learning to Reason Science Exam Questions with Contextual Knowledge Graph Embeddings

    Authors: Yuyu Zhang, Hanjun Dai, Kamil Toraman, Le Song

    Abstract: The AI2 Reasoning Challenge (ARC), a new benchmark dataset for question answering (QA) has been recently released. ARC only contains natural science questions authored for human exams, which are hard to answer and require advanced logic reasoning. On the ARC Challenge Set, existing state-of-the-art QA systems fail to significantly outperform random baseline, reflecting the difficult nature of this…

    Submitted 31 May, 2018; originally announced May 2018.

  42. arXiv:1805.09298  [pdf, other]

    cs.LG cs.CV stat.ML

    Learning towards Minimum Hyperspherical Energy

    Authors: Weiyang Liu, Rongmei Lin, Zhen Liu, Lixin Liu, Zhiding Yu, Bo Dai, Le Song

    Abstract: Neural networks are a powerful class of nonlinear functions that can be trained end-to-end on various applications. While the over-parametrization nature in many neural networks renders the ability to fit complex functions and the strong representation power to handle challenging tasks, it also leads to highly correlated neurons that can hurt the generalization ability and incur unnecessary comput…

    Submitted 22 July, 2020; v1 submitted 23 May, 2018; originally announced May 2018.

    Comments: NeurIPS 2018

  43. arXiv:1805.08395  [pdf, other]

    cs.LG stat.ML

    Learning to Optimize via Wasserstein Deep Inverse Optimal Control

    Authors: Yichen Wang, Le Song, Hongyuan Zha

    Abstract: We study the inverse optimal control problem in social sciences: we aim at learning a user's true cost function from the observed temporal behavior. In contrast to traditional phenomenological works that aim to learn a generative model to fit the behavioral data, we propose a novel variational principle and treat user as a reinforcement learning algorithm, which acts by optimizing his cost functio…

    Submitted 22 May, 2018; originally announced May 2018.

  44. arXiv:1805.02474  [pdf, other]

    cs.CL cs.LG stat.ML

    Sentence-State LSTM for Text Representation

    Authors: Yue Zhang, Qi Liu, Linfeng Song

    Abstract: Bi-directional LSTMs are a powerful tool for text representation. On the other hand, they have been shown to suffer various limitations due to their sequential nature. We investigate an alternative LSTM structure for encoding text, which consists of a parallel state for each word. Recurrent steps are used to perform local and global information exchange between words simultaneously, rather than in…

    Submitted 7 May, 2018; originally announced May 2018.

    Comments: ACL 18 camera-ready version

  45. arXiv:1804.08071  [pdf, other]

    cs.CV cs.LG stat.ML

    Decoupled Networks

    Authors: Weiyang Liu, Zhen Liu, Zhiding Yu, Bo Dai, Rongmei Lin, Yisen Wang, James M. Rehg, Le Song

    Abstract: Inner product-based convolution has been a central component of convolutional neural networks (CNNs) and the key to learning visual representations. Inspired by the observation that CNN-learned features are naturally decoupled with the norm of features corresponding to the intra-class variation and the angle corresponding to the semantic difference, we propose a generic decoupled learning framewor…

    Submitted 22 April, 2018; originally announced April 2018.

    Comments: CVPR 2018 (Spotlight)

  46. arXiv:1802.07814  [pdf, other]

    cs.LG cs.AI stat.ML

    Learning to Explain: An Information-Theoretic Perspective on Model Interpretation

    Authors: Jianbo Chen, Le Song, Martin J. Wainwright, Michael I. Jordan

    Abstract: We introduce instancewise feature selection as a methodology for model interpretation. Our method is based on learning a function to extract a subset of features that are most informative for each given example. This feature selector is trained to maximize the mutual information between selected features and the response variable, where the conditional distribution of the response variable given t…

    Submitted 13 June, 2018; v1 submitted 21 February, 2018; originally announced February 2018.

    Comments: Accepted to ICML 2018 as a long oral

  47. arXiv:1711.03189  [pdf, other]

    cs.LG cs.CV stat.ML

    Deep Hyperspherical Learning

    Authors: Weiyang Liu, Yan-Ming Zhang, Xingguo Li, Zhiding Yu, Bo Dai, Tuo Zhao, Le Song

    Abstract: Convolution as inner product has been the founding basis of convolutional neural networks (CNNs) and the key to end-to-end visual representation learning. Benefiting from deeper architectures, recent CNNs have demonstrated increasingly strong representation abilities. Despite such improvement, the increased depth and larger parameter space have also led to challenges in properly training a network…

    Submitted 30 January, 2018; v1 submitted 8 November, 2017; originally announced November 2017.

    Comments: NIPS 2017 (Spotlight)

  48. arXiv:1710.10568  [pdf, other]

    stat.ML cs.LG

    Stochastic Training of Graph Convolutional Networks with Variance Reduction

    Authors: Jianfei Chen, Jun Zhu, Le Song

    Abstract: Graph convolutional networks (GCNs) are powerful deep neural networks for graph-structured data. However, GCN computes the representation of a node recursively from its neighbors, making the receptive field size grow exponentially with the number of layers. Previous attempts on reducing the receptive field size by subsampling neighbors do not have a convergence guarantee, and their receptive field…

    Submitted 1 March, 2018; v1 submitted 29 October, 2017; originally announced October 2017.

  49. arXiv:1710.07742  [pdf, other]

    stat.ML cs.LG

    Towards Black-box Iterative Machine Teaching

    Authors: Weiyang Liu, Bo Dai, Xingguo Li, Zhen Liu, James M. Rehg, Le Song

    Abstract: In this paper, we make an important step towards the black-box machine teaching by considering the cross-space machine teaching, where the teacher and the learner use different feature representations and the teacher can not fully observe the learner's model. In such scenario, we study how the teacher is still able to teach the learner to achieve faster convergence rate than the traditional passiv…

    Submitted 5 June, 2018; v1 submitted 20 October, 2017; originally announced October 2017.

    Comments: Published in ICML 2018

  50. arXiv:1705.10470  [pdf, other]

    stat.ML cs.LG

    Iterative Machine Teaching

    Authors: Weiyang Liu, Bo Dai, Ahmad Humayun, Charlene Tay, Chen Yu, Linda B. Smith, James M. Rehg, Le Song

    Abstract: In this paper, we consider the problem of machine teaching, the inverse problem of machine learning. Different from traditional machine teaching which views the learners as batch algorithms, we study a new paradigm where the learner uses an iterative algorithm and a teacher can feed examples sequentially and intelligently based on the current performance of the learner. We show that the teaching c…

    Submitted 17 November, 2017; v1 submitted 30 May, 2017; originally announced May 2017.

    Comments: Published in ICML 2017