Showing 1–50 of 83 results for author: Yang, K

Searching in archive stat.
  1. arXiv:2405.20550  [pdf]

    cs.LG stat.ML

    Uncertainty Quantification for Deep Learning

    Authors: Peter Jan van Leeuwen, J. Christine Chiu, C. Kevin Yang

    Abstract: A complete and statistically consistent uncertainty quantification for deep learning is provided, including the sources of uncertainty arising from (1) the new input data, (2) the training and testing data, (3) the weight vectors of the neural network, and (4) the neural network because it is not a perfect predictor. Using Bayes' theorem and conditional probability densities, we demonstrate how each…

    Submitted 30 May, 2024; originally announced May 2024.

    Comments: 25 pages, 4 figures, submitted to Environmental Data Science

    MSC Class: 62D99; ACM Class: G.3

  2. arXiv:2405.17216  [pdf, other]

    cs.LG cs.AI cs.LO stat.ML

    Autoformalizing Euclidean Geometry

    Authors: Logan Murphy, Kaiyu Yang, Jialiang Sun, Zhaoyu Li, Anima Anandkumar, Xujie Si

    Abstract: Autoformalization involves automatically translating informal math into formal theorems and proofs that are machine-verifiable. Euclidean geometry provides an interesting and controllable domain for studying autoformalization. In this paper, we introduce a neuro-symbolic framework for autoformalizing Euclidean geometry, which combines domain knowledge, SMT solvers, and large language models (LLMs)…

    Submitted 27 May, 2024; originally announced May 2024.

    Comments: Accepted to ICML 2024. The first two authors contributed equally

  3. arXiv:2405.02372  [pdf, ps, other]

    stat.ML cs.AI cs.LG

    Triadic-OCD: Asynchronous Online Change Detection with Provable Robustness, Optimality, and Convergence

    Authors: Yancheng Huang, Kai Yang, Zelin Zhu, Leian Chen

    Abstract: The primary goal of online change detection (OCD) is to promptly identify changes in the data stream. OCD problems find a wide variety of applications in diverse areas, e.g., security detection in smart grids and intrusion detection in communication networks. Prior research usually assumes precise knowledge of the system parameters. Nevertheless, this presumption often proves unattainable in practi…

    Submitted 4 June, 2024; v1 submitted 3 May, 2024; originally announced May 2024.

    Comments: Accepted at ICML 2024

  4. arXiv:2404.12534  [pdf, other]

    cs.AI cs.LG cs.LO stat.ML

    Towards Large Language Models as Copilots for Theorem Proving in Lean

    Authors: Peiyang Song, Kaiyu Yang, Anima Anandkumar

    Abstract: Theorem proving is an important challenge for large language models (LLMs), as formal proofs can be checked rigorously by proof assistants such as Lean, leaving no room for hallucination. Existing LLM-based provers try to prove theorems in a fully autonomous mode without human intervention. In this mode, they struggle with novel and challenging theorems, for which human insights may be critical. I…

    Submitted 18 April, 2024; originally announced April 2024.

    Comments: All code open-sourced at https://github.com/lean-dojo/LeanCopilot

  5. arXiv:2403.16706  [pdf, other]

    stat.ME

    An alternative measure for quantifying the heterogeneity in meta-analysis

    Authors: Ke Yang, Enxuan Lin, Wangli Xu, Liping Zhu, Tiejun Tong

    Abstract: Quantifying the heterogeneity is an important issue in meta-analysis, and among the existing measures, the $I^2$ statistic is most commonly used. In this paper, we first illustrate with a simple example that the $I^2$ statistic is heavily dependent on the study sample sizes, mainly because it is used to quantify the heterogeneity between the observed effect sizes. To reduce the influence of sample…

    Submitted 25 March, 2024; originally announced March 2024.

    Comments: 40 pages, 7 figures and 3 tables

  6. arXiv:2402.13934  [pdf, other]

    cs.LG cs.AI cs.CL stat.ML

    Do Efficient Transformers Really Save Computation?

    Authors: Kai Yang, Jan Ackermann, Zhenyu He, Guhao Feng, Bohang Zhang, Yunzhen Feng, Qiwei Ye, Di He, Liwei Wang

    Abstract: As transformer-based language models are trained on increasingly large datasets and with vast numbers of parameters, finding more efficient alternatives to the standard Transformer has become very valuable. While many efficient Transformers and Transformer alternatives have been proposed, none provide theoretical guarantees that they are a suitable replacement for the standard Transformer. This ma…

    Submitted 21 February, 2024; originally announced February 2024.

  7. arXiv:2402.09723  [pdf, other]

    stat.ML cs.AI cs.CL cs.LG

    Efficient Prompt Optimization Through the Lens of Best Arm Identification

    Authors: Chengshuai Shi, Kun Yang, Zihan Chen, Jundong Li, Jing Yang, Cong Shen

    Abstract: The remarkable instruction-following capability of large language models (LLMs) has sparked a growing interest in automatically finding good prompts, i.e., prompt optimization. Most existing works follow the scheme of selecting from a pre-generated pool of candidate prompts. However, these designs mainly focus on the generation strategy, while limited attention has been paid to the selection metho…

    Submitted 30 May, 2024; v1 submitted 15 February, 2024; originally announced February 2024.

  8. arXiv:2402.07747  [pdf, ps, other]

    math.ST stat.ML

    Optimal score estimation via empirical Bayes smoothing

    Authors: Andre Wibisono, Yihong Wu, Kaylee Yingxi Yang

    Abstract: We study the problem of estimating the score function of an unknown probability distribution $ρ^*$ from $n$ independent and identically distributed observations in $d$ dimensions. Assuming that $ρ^*$ is subgaussian and has a Lipschitz-continuous score function $s^*$, we establish the optimal rate of $\tilde Θ(n^{-\frac{2}{d+4}})$ for this estimation problem under the loss function…

    Submitted 12 June, 2024; v1 submitted 12 February, 2024; originally announced February 2024.

    Comments: COLT 2024; added the new results on extending to beta-Holder scores with beta <= 1

  9. arXiv:2401.16421  [pdf, other]

    cs.LG cs.AI cs.CL stat.ML

    Two Stones Hit One Bird: Bilevel Positional Encoding for Better Length Extrapolation

    Authors: Zhenyu He, Guhao Feng, Shengjie Luo, Kai Yang, Liwei Wang, Jingjing Xu, Zhi Zhang, Hongxia Yang, Di He

    Abstract: In this work, we leverage the intrinsic segmentation of language sequences and design a new positional encoding method called Bilevel Positional Encoding (BiPE). For each position, our BiPE blends an intra-segment encoding and an inter-segment encoding. The intra-segment encoding identifies the locations within a segment and helps the model capture the semantic information therein via absolute pos…

    Submitted 17 June, 2024; v1 submitted 29 January, 2024; originally announced January 2024.

    Comments: 17 pages, 7 figures, 8 tables; ICML 2024 Camera Ready version; Code: https://github.com/zhenyuhe00/BiPE

  10. arXiv:2312.16341  [pdf, other]

    stat.ML cs.IT cs.LG cs.MA

    Harnessing the Power of Federated Learning in Federated Contextual Bandits

    Authors: Chengshuai Shi, Ruida Zhou, Kun Yang, Cong Shen

    Abstract: Federated learning (FL) has demonstrated great potential in revolutionizing distributed machine learning, and tremendous efforts have been made to extend it beyond the original focus on supervised learning. Among many directions, federated contextual bandits (FCB), a pivotal integration of FL and sequential decision-making, has garnered significant attention in recent years. Despite substantial pr…

    Submitted 26 December, 2023; originally announced December 2023.

    Comments: A preliminary version appeared in the Multi-Agent Security Workshop at NeurIPS 2023

  11. arXiv:2310.01184  [pdf, ps, other]

    stat.AP

    Applications of Improvements to the Pythagorean Won-Loss Expectation in Optimizing Rosters

    Authors: Alexander F. Almeida, Kevin Dayaratna, Steven J. Miller, Andrew K. Yang

    Abstract: Bill James' Pythagorean formula has for decades done an excellent job estimating a baseball team's winning percentage from very little data: if the average runs scored and allowed are denoted respectively by ${\rm RS}$ and ${\rm RA}$, there is some $γ$ such that the winning percentage is approximately ${\rm RS}^γ/ ({\rm RS}^γ+ {\rm RA}^γ)$. One important consequence is to determine the value of di…

    Submitted 20 February, 2024; v1 submitted 2 October, 2023; originally announced October 2023.

  12. arXiv:2306.15626  [pdf, other]

    cs.LG cs.AI cs.LO stat.ML

    LeanDojo: Theorem Proving with Retrieval-Augmented Language Models

    Authors: Kaiyu Yang, Aidan M. Swope, Alex Gu, Rahul Chalamala, Peiyang Song, Shixing Yu, Saad Godil, Ryan Prenger, Anima Anandkumar

    Abstract: Large language models (LLMs) have shown promise in proving formal theorems using proof assistants such as Lean. However, existing methods are difficult to reproduce or build on, due to private code, data, and large compute requirements. This has created substantial barriers to research on machine learning methods for theorem proving. This paper removes these barriers by introducing LeanDojo: an op…

    Submitted 27 October, 2023; v1 submitted 27 June, 2023; originally announced June 2023.

    Comments: Accepted to NeurIPS 2023 (Datasets and Benchmarks Track) as an oral presentation. Data, code, and models available at https://leandojo.org/

  13. arXiv:2306.07456  [pdf]

    stat.AP stat.ME

    On the Temporal-spatial Analysis of Estimating Urban Traffic Patterns Via GPS Trace Data of Car-hailing Vehicles

    Authors: Jiannan Mao, Lan Liu, Hao Huang, Weike Lu, Kaiyu Yang, Tianli Tang, Haotian Shi

    Abstract: Car-hailing services have become a prominent data source for urban traffic studies. Extracting useful information from car-hailing trace data is essential for effective traffic management, while discrepancies between car-hailing vehicles and urban traffic should be considered. This paper proposes a generic framework for estimating and analyzing urban traffic patterns using car-hailing trace data.…

    Submitted 12 June, 2023; originally announced June 2023.

  14. arXiv:2306.06895  [pdf, other]

    cs.LG stat.ML

    MPPN: Multi-Resolution Periodic Pattern Network For Long-Term Time Series Forecasting

    Authors: Xing Wang, Zhendong Wang, Kexin Yang, Junlan Feng, Zhiyan Song, Chao Deng, Lin Zhu

    Abstract: Long-term time series forecasting plays an important role in various real-world scenarios. Recent deep learning methods for long-term series forecasting tend to capture the intricate patterns of time series by decomposition-based or sampling-based methods. However, most of the extracted patterns may include unpredictable noise and lack good interpretability. Moreover, the multivariate series forec…

    Submitted 12 June, 2023; originally announced June 2023.

    Comments: 21 pages

  15. arXiv:2306.00638  [pdf, other]

    stat.ML cs.DC cs.LG

    Byzantine-Robust Clustered Federated Learning

    Authors: Zhixu Tao, Kun Yang, Sanjeev R. Kulkarni

    Abstract: This paper focuses on the problem of adversarial attacks from Byzantine machines in a Federated Learning setting where non-Byzantine machines can be partitioned into disjoint clusters. In this setting, non-Byzantine machines in the same cluster have the same underlying data distribution, and different clusters of non-Byzantine machines have different learning tasks. Byzantine machines can adversar…

    Submitted 1 June, 2023; originally announced June 2023.

  16. arXiv:2303.08230  [pdf, other]

    cs.LG stat.ML

    Bayesian Beta-Bernoulli Process Sparse Coding with Deep Neural Networks

    Authors: Arunesh Mittal, Kai Yang, Paul Sajda, John Paisley

    Abstract: Several approximate inference methods have been proposed for deep discrete latent variable models. However, non-parametric methods which have previously been successfully employed for classical sparse coding models have largely been unexplored in the context of deep models. We propose a non-parametric iterative algorithm for learning discrete latent representations in such deep models. Additionall…

    Submitted 14 March, 2023; originally announced March 2023.

  17. arXiv:2211.00716  [pdf, ps, other]

    cs.LG cs.AI math.OC math.ST stat.ML

    Optimal Conservative Offline RL with General Function Approximation via Augmented Lagrangian

    Authors: Paria Rashidinejad, Hanlin Zhu, Kunhe Yang, Stuart Russell, Jiantao Jiao

    Abstract: Offline reinforcement learning (RL), which refers to decision-making from a previously-collected dataset of interactions, has received significant attention over the past years. Much effort has focused on improving offline RL practicality by addressing the prevalent issue of partial data coverage through various forms of conservative policy learning. While the majority of algorithms do not have fi…

    Submitted 1 November, 2022; originally announced November 2022.

    Comments: 49 pages, 1 figure

  18. Greykite: Deploying Flexible Forecasting at Scale at LinkedIn

    Authors: Reza Hosseini, Albert Chen, Kaixu Yang, Sayan Patra, Yi Su, Saad Eddin Al Orjany, Sishi Tang, Parvez Ahammad

    Abstract: Forecasts help businesses allocate resources and achieve objectives. At LinkedIn, product owners use forecasts to set business targets, track outlook, and monitor health. Engineers use forecasts to efficiently provision hardware. Developing a forecasting solution to meet these needs requires accurate and interpretable forecasts on diverse time series with sub-hourly to quarterly frequencies. We pr…

    Submitted 15 July, 2022; originally announced July 2022.

    Comments: In Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD '22), August 14-18, 2022, Washington, DC, USA. ACM, New York, NY, USA, 11 pages

    ACM Class: G.3

  19. arXiv:2205.05505  [pdf, other]

    cs.LG stat.ML

    Probability Distribution of Hypervolume Improvement in Bi-objective Bayesian Optimization

    Authors: Hao Wang, Kaifeng Yang, Michael Affenzeller

    Abstract: Hypervolume improvement (HVI) is commonly employed in multi-objective Bayesian optimization algorithms to define acquisition functions due to its Pareto-compliant property. Rather than focusing on specific statistical moments of HVI, this work aims to provide the exact expression of HVI's probability distribution for bi-objective problems. Considering a bi-variate Gaussian random variable resultin…

    Submitted 6 May, 2024; v1 submitted 11 May, 2022; originally announced May 2022.

  20. arXiv:2202.12472  [pdf, ps, other]

    cs.GT cs.AI cs.IR cs.LG stat.ML

    Bidding Agent Design in the LinkedIn Ad Marketplace

    Authors: Yuan Gao, Kaiyu Yang, Yuanlong Chen, Min Liu, Noureddine El Karoui

    Abstract: We establish a general optimization framework for the design of automated bidding agent in dynamic online marketplaces. It optimizes solely for the buyer's interest and is agnostic to the auction mechanism imposed by the seller. As a result, the framework allows, for instance, the joint optimization of a group of ads across multiple platforms each running its own auction format. Bidding strategy d…

    Submitted 24 February, 2022; originally announced February 2022.

  21. arXiv:2202.08549  [pdf, ps, other]

    cs.LG cs.DS stat.ML

    Oracle-Efficient Online Learning for Beyond Worst-Case Adversaries

    Authors: Nika Haghtalab, Yanjun Han, Abhishek Shetty, Kunhe Yang

    Abstract: In this paper, we study oracle-efficient algorithms for beyond worst-case analysis of online learning. We focus on two settings. First, the smoothed analysis setting of [RST11,HRS22] where an adversary is constrained to generating samples from distributions whose density is upper bounded by $1/σ$ times the uniform density. Second, the setting of $K$-hint transductive learning, where the learner is…

    Submitted 22 November, 2022; v1 submitted 17 February, 2022; originally announced February 2022.

    Comments: An extended abstract of this work was published under the title "Oracle-efficient Online Learning for Smoothed Adversaries" in the Proceedings of the 36th Conference on Neural Information Processing Systems

  22. arXiv:2109.05755   

    stat.ME

    IQ: Intrinsic measure for quantifying the heterogeneity in meta-analysis

    Authors: Ke Yang, Enxuan Lin, Tiejun Tong

    Abstract: Quantifying the heterogeneity is an important issue in meta-analysis, and among the existing measures, the $I^2$ statistic is the most commonly used measure in the literature. In this paper, we show that the $I^2$ statistic was, in fact, defined as problematic or even completely wrong from the very beginning. To confirm this statement, we first present a motivating example to show that the $I^2$ s…

    Submitted 25 March, 2024; v1 submitted 13 September, 2021; originally announced September 2021.

    Comments: With a more comprehensive version with the new title "An alternative measure for quantifying the heterogeneity in meta-analysis", this old version is no longer most suitable to be posted in the arXiv. We hence will submit the new version with a new title as arXiv:2403.16706 and withdraw this outdated version. Thank you very much for your kind consideration

  23. arXiv:2107.03430  [pdf, ps, other]

    stat.ME

    ENNS: Variable Selection, Regression, Classification and Deep Neural Network for High-Dimensional Data

    Authors: Kaixu Yang, Tapabrata Maiti

    Abstract: High-dimensional, low sample-size (HDLSS) data problems have been a topic of immense importance for the last couple of decades. There is a vast literature that proposed a wide variety of approaches to deal with this situation, among which variable selection was a compelling idea. On the other hand, a deep neural network has been used to model complicated relationships and interactions among respon…

    Submitted 7 July, 2021; originally announced July 2021.

  24. arXiv:2105.01098  [pdf, other]

    stat.ME

    A flexible forecasting model for production systems

    Authors: Reza Hosseini, Kaixu Yang, Albert Chen, Sayan Patra

    Abstract: This paper discusses desirable properties of forecasting models in production systems. It then develops a family of models which are designed to satisfy these properties: highly customizable to capture complex patterns; accommodates a large variety of objectives; has interpretable components; produces robust results; has automatic changepoint detection for trend and seasonality; and runs fast -- m…

    Submitted 3 May, 2021; originally announced May 2021.

  25. arXiv:2104.04457  [pdf, other]

    q-bio.QM cs.LG q-bio.BM stat.ML

    Protein sequence design with deep generative models

    Authors: Zachary Wu, Kadina E. Johnston, Frances H. Arnold, Kevin K. Yang

    Abstract: Protein engineering seeks to identify protein sequences with optimized properties. When guided by machine learning, protein sequence generation methods can draw on prior knowledge and experimental efforts to improve this process. In this review, we highlight recent applications of machine learning to generate protein sequences, focusing on the emerging field of deep generative methods.

    Submitted 9 April, 2021; originally announced April 2021.

    Comments: 11 pages, 2 figures

  26. arXiv:2101.07683  [pdf]

    stat.ML cs.LG stat.AP

    Utilizing Import Vector Machines to Identify Dangerous Pro-active Traffic Conditions

    Authors: Kui Yang, Wenjing Zhao, Constantinos Antoniou

    Abstract: Traffic accidents have been a severe issue in metropolises with the development of traffic flow. This paper explores the theory and application of a recently developed machine learning technique, namely Import Vector Machines (IVMs), in real-time crash risk analysis, which is a hot topic to reduce traffic accidents. Historical crash data and corresponding traffic data from Shanghai Urban Expresswa…

    Submitted 19 January, 2021; originally announced January 2021.

    Comments: 6 pages, 3 figures, 2020 IEEE 23rd International Conference on Intelligent Transportation Systems (ITSC)

    Journal ref: In 2020 IEEE 23rd International Conference on Intelligent Transportation Systems (ITSC) (pp. 1-6). IEEE

  27. arXiv:2101.00407  [pdf, other]

    cs.LG cs.AI stat.ML

    ORDisCo: Effective and Efficient Usage of Incremental Unlabeled Data for Semi-supervised Continual Learning

    Authors: Liyuan Wang, Kuo Yang, Chongxuan Li, Lanqing Hong, Zhenguo Li, Jun Zhu

    Abstract: Continual learning usually assumes the incoming data are fully labeled, which might not be applicable in real applications. In this work, we consider semi-supervised continual learning (SSCL) that incrementally learns from partially labeled data. Observing that existing continual learning methods lack the ability to continually exploit the unlabeled data, we propose deep Online Replay with Discrim…

    Submitted 8 April, 2021; v1 submitted 2 January, 2021; originally announced January 2021.

    Journal ref: CVPR 2021

  28. arXiv:2010.01596  [pdf, other]

    cs.LG cs.AI stat.ML

    TimeAutoML: Autonomous Representation Learning for Multivariate Irregularly Sampled Time Series

    Authors: Yang Jiao, Kai Yang, Shaoyu Dou, Pan Luo, Sijia Liu, Dongjin Song

    Abstract: Multivariate time series (MTS) data are becoming increasingly ubiquitous in diverse domains, e.g., IoT systems, health informatics, and 5G networks. To obtain an effective representation of MTS data, it is not only essential to consider unpredictable dynamics and highly variable lengths of these data but also important to address the irregularities in the sampling rates of MTS. Existing parametric…

    Submitted 4 October, 2020; originally announced October 2020.

  29. arXiv:2009.10629  [pdf, ps, other]

    math.OC stat.CO stat.ML

    Accelerated Gradient Methods for Sparse Statistical Learning with Nonconvex Penalties

    Authors: Kai Yang, Masoud Asgharian, Sahir Bhatnagar

    Abstract: Nesterov's accelerated gradient (AG) is a popular technique to optimize objective functions comprising two components: a convex loss and a penalty function. While AG methods perform well for convex penalties, such as the LASSO, convergence issues may arise when it is applied to nonconvex penalties, such as SCAD. A recent proposal generalizes Nesterov's AG method to the nonconvex setting. The propo…

    Submitted 28 November, 2022; v1 submitted 22 September, 2020; originally announced September 2020.

    Comments: 42 pages, 13 figures

    Journal ref: Stat Comput 34, 59 (2024)

  30. Ultra high dimensional generalized additive model: Unified Theory and Methods

    Authors: Kaixu Yang, Tapabrata Maiti

    Abstract: Generalized additive model is a powerful statistical learning and predictive modeling tool that has been applied in a wide range of applications. The need for high-dimensional additive modeling is eminent in the context of dealing with high-throughput data such as genetic data analysis. In this article, we studied a two-step selection and estimation method for ultra high dimensional generalized ad…

    Submitted 15 August, 2020; originally announced August 2020.

  31. arXiv:2007.12098  [pdf, other]

    cs.LG stat.ML

    Optimal Transport using GANs for Lineage Tracing

    Authors: Neha Prasad, Karren Yang, Caroline Uhler

    Abstract: In this paper, we present Super-OT, a novel approach to computational lineage tracing that combines a supervised learning framework with optimal transport based on Generative Adversarial Networks (GANs). Unlike previous approaches to lineage tracing, Super-OT has the flexibility to integrate paired data. We benchmark Super-OT based on single-cell RNA-seq data against Waddington-OT, a popular appro…

    Submitted 5 January, 2022; v1 submitted 23 July, 2020; originally announced July 2020.

    Comments: 4 pages excluding references, 2 figures, 3 tables. Accepted at ICML 2020 Workshop on Computational Biology for Spotlight Presentation. Code can be found here: https://github.com/uhlerlab/superot

  32. arXiv:2007.10333  [pdf, other]

    cs.LG cs.HC stat.ML

    Visualizing Deep Graph Generative Models for Drug Discovery

    Authors: Karan Yang, Chengxi Zang, Fei Wang

    Abstract: Drug discovery aims at designing novel molecules with specific desired properties for clinical trials. Over past decades, drug discovery and development have been a costly and time consuming process. Driven by big chemical data and AI, deep generative models show great potential to accelerate the drug discovery process. Existing works investigate different deep generative frameworks for molecular…

    Submitted 20 July, 2020; originally announced July 2020.

    Comments: 4 pages, 2020 KDD Workshop on Applied Data Science for Healthcare

    ACM Class: I.2.1

  33. arXiv:2006.10559  [pdf, ps, other]

    cs.LG cs.CR stat.ML

    Differentially-private Federated Neural Architecture Search

    Authors: Ishika Singh, Haoyi Zhou, Kunlin Yang, Meng Ding, Bill Lin, Pengtao Xie

    Abstract: Neural architecture search, which aims to automatically search for architectures (e.g., convolution, max pooling) of neural networks that maximize validation performance, has achieved remarkable progress recently. In many application scenarios, several parties would like to collaboratively search for a shared neural architecture by leveraging data from all parties. However, due to privacy concerns…

    Submitted 22 June, 2020; v1 submitted 16 June, 2020; originally announced June 2020.

  34. arXiv:2006.09118  [pdf, ps, other]

    cs.LG math.OC stat.ML

    $Q$-learning with Logarithmic Regret

    Authors: Kunhe Yang, Lin F. Yang, Simon S. Du

    Abstract: This paper presents the first non-asymptotic result showing that a model-free algorithm can achieve a logarithmic cumulative regret for episodic tabular reinforcement learning if there exists a strictly positive sub-optimality gap in the optimal $Q$-function. We prove that the optimistic $Q$-learning studied in [Jin et al. 2018] enjoys a…

    Submitted 23 February, 2021; v1 submitted 16 June, 2020; originally announced June 2020.

    Comments: Accepted by AISTATS 2021

  35. arXiv:2006.08688  [pdf, other]

    cs.LG cs.AI stat.AP stat.ML

    Causal intersectionality for fair ranking

    Authors: Ke Yang, Joshua R. Loftus, Julia Stoyanovich

    Abstract: In this paper we propose a causal modeling approach to intersectional fairness, and a flexible, task-specific method for computing intersectionally fair rankings. Rankings are used in many contexts, ranging from Web search results to college admissions, but causal inference for fair rankings has received limited attention. Additionally, the growing literature on causal fairness has directed little…

    Submitted 15 June, 2020; originally announced June 2020.

  36. arXiv:2006.04941  [pdf, other]

    cs.SI cs.LG stat.ML

    Persona2vec: A Flexible Multi-role Representations Learning Framework for Graphs

    Authors: Jisung Yoon, Kai-Cheng Yang, Woo-Sung Jung, Yong-Yeol Ahn

    Abstract: Graph embedding techniques, which learn low-dimensional representations of a graph, are achieving state-of-the-art performance in many graph mining tasks. Most existing embedding algorithms assign a single vector to each node, implicitly assuming that a single representation is enough to capture all characteristics of the node. However, across many domains, it is common to observe pervasively over…

    Submitted 21 October, 2020; v1 submitted 4 June, 2020; originally announced June 2020.

    Comments: 9 pages, 7 figures

  37. arXiv:2006.03000  [pdf, other]

    cs.LG cs.AI stat.ML

    Differentiable Linear Bandit Algorithm

    Authors: Kaige Yang, Laura Toni

    Abstract: Upper Confidence Bound (UCB) is arguably the most commonly used method for linear multi-arm bandit problems. While conceptually and computationally simple, this method highly relies on the confidence bounds, failing to strike the optimal exploration-exploitation if these bounds are not properly set. In the literature, confidence bounds are typically derived from concentration inequalities based on…

    Submitted 4 June, 2020; originally announced June 2020.

    Comments: 16 pages

  38. arXiv:2005.10036  [pdf, other]

    cs.LG q-bio.QM stat.ML

    Uncertainty Quantification Using Neural Networks for Molecular Property Prediction

    Authors: Lior Hirschfeld, Kyle Swanson, Kevin Yang, Regina Barzilay, Connor W. Coley

    Abstract: Uncertainty quantification (UQ) is an important component of molecular property prediction, particularly for drug discovery applications where model predictions direct experimental design and where unanticipated imprecision wastes valuable time and resources. The need for UQ is especially acute for neural models, which are becoming increasingly standard yet are challenging to interpret. While seve…

    Submitted 20 May, 2020; originally announced May 2020.

  39. arXiv:2002.04720  [pdf, other]

    cs.LG physics.chem-ph stat.ML

    Improving Molecular Design by Stochastic Iterative Target Augmentation

    Authors: Kevin Yang, Wengong Jin, Kyle Swanson, Regina Barzilay, Tommi Jaakkola

    Abstract: Generative models in molecular design tend to be richly parameterized, data-hungry neural models, as they must create complex structured objects as outputs. Estimating such models from data may be challenging due to the lack of sufficient training data. In this paper, we propose a surprisingly effective self-training approach for iteratively creating additional molecular targets. We first pre-trai…

    Submitted 15 August, 2021; v1 submitted 11 February, 2020; originally announced February 2020.

    Comments: ICML 2020

    Journal ref: PMLR 119:10716-10726, 2020

  40. arXiv:2002.03736  [pdf, other]

    cs.CV cs.LG stat.ML

    Universal Semantic Segmentation for Fisheye Urban Driving Images

    Authors: Yaozu Ye, Kailun Yang, Kaite Xiang, Juan Wang, Kaiwei Wang

    Abstract: Semantic segmentation is a critical method in the field of autonomous driving. When performing semantic image segmentation, a wider field of view (FoV) helps to obtain more information about the surrounding environment, making automatic driving safer and more reliable, which could be offered by fisheye cameras. However, large public fisheye datasets are not available, and the fisheye images captur…

    Submitted 24 August, 2020; v1 submitted 31 January, 2020; originally announced February 2020.

    Comments: SMC2020 received

  41. arXiv:2001.09223  [pdf, other]

    cs.LG cs.NI stat.ML

    Stacked Auto Encoder Based Deep Reinforcement Learning for Online Resource Scheduling in Large-Scale MEC Networks

    Authors: Feibo Jiang, Kezhi Wang, Li Dong, Cunhua Pan, Kun Yang

    Abstract: An online resource scheduling framework is proposed for minimizing the sum of weighted task latency for all the Internet of things (IoT) users, by optimizing offloading decision, transmission power and resource allocation in the large-scale mobile edge computing (MEC) system. Towards this end, a deep reinforcement learning (DRL) based solution is proposed, which includes the following components.…

    Submitted 14 April, 2020; v1 submitted 24 January, 2020; originally announced January 2020.

    Comments: Accepted by IEEE Internet of Things Journal

  42. arXiv:1912.05977  [pdf, other]

    cs.LG stat.ML

    Tracing the Propagation Path: A Flow Perspective of Representation Learning on Graphs

    Authors: Menghan Wang, Kun Zhang, Gulin Li, Keping Yang, Luo Si

    Abstract: Graph Convolutional Networks (GCNs) have gained significant developments in representation learning on graphs. However, current GCNs suffer from two common challenges: 1) GCNs are only effective with shallow structures; stacking multiple GCN layers will lead to over-smoothing. 2) GCNs do not scale well with large, dense graphs due to the recursive neighborhood expansion. We generalize the propagat…

    Submitted 12 December, 2019; originally announced December 2019.

  43. arXiv:1912.00513  [pdf, other]

    cs.LG stat.ML

    A Quasi-Newton Method Based Vertical Federated Learning Framework for Logistic Regression

    Authors: Kai Yang, Tao Fan, Tianjian Chen, Yuanming Shi, Qiang Yang

    Abstract: Data privacy and security becomes a major concern in building machine learning models from different data providers. Federated learning shows promise by leaving data at providers locally and exchanging encrypted information. This paper studies the vertical federated learning structure for logistic regression where the data sets at two parties have the same sample IDs but own disjoint subsets of fe…

    Submitted 3 December, 2019; v1 submitted 1 December, 2019; originally announced December 2019.

  44. arXiv:1910.07099  [pdf, other]

    cs.LG cs.IR stat.ML

    Entire Space Multi-Task Modeling via Post-Click Behavior Decomposition for Conversion Rate Prediction

    Authors: Hong Wen, Jing Zhang, Yuan Wang, Fuyu Lv, Wentian Bao, Quan Lin, Keping Yang

    Abstract: Recommender system, as an essential part of modern e-commerce, consists of two fundamental modules, namely Click-Through Rate (CTR) and Conversion Rate (CVR) prediction. While CVR has a direct impact on the purchasing volume, its prediction is well-known challenging due to the Sample Selection Bias (SSB) and Data Sparsity (DS) issues. Although existing methods, typically built on the user sequenti…

    Submitted 9 June, 2020; v1 submitted 15 October, 2019; originally announced October 2019.

    Comments: 10 pages, 7 figures. Accepted by SIGIR 2020. The source code will be released at https://github.com/chaimi2013/ESM2

  45. arXiv:1907.05632  [pdf, other]

    cs.LG stat.ML

    Laplacian-regularized graph bandits: Algorithms and theoretical analysis

    Authors: Kaige Yang, Xiaowen Dong, Laura Toni

    Abstract: We consider a stochastic linear bandit problem with multiple users, where the relationship between users is captured by an underlying graph and user preferences are represented as smooth signals on the graph. We introduce a novel bandit algorithm where the smoothness prior is imposed via the random-walk graph Laplacian, which leads to a single-user cumulative regret scaling as…

    Submitted 10 February, 2020; v1 submitted 12 July, 2019; originally announced July 2019.

  46. arXiv:1906.02818  [pdf, other]

    cs.DC q-fin.CP stat.CO

    Tensor Processing Units for Financial Monte Carlo

    Authors: Francois Belletti, Davis King, Kun Yang, Roland Nelet, Yusef Shafi, Yi-Fan Chen, John Anderson

    Abstract: Monte Carlo methods are critical to many routines in quantitative finance such as derivatives pricing, hedging and risk metrics. Unfortunately, Monte Carlo methods are very computationally expensive when it comes to running simulations in high-dimensional state spaces where they are still a method of choice in the financial industry. Recently, Tensor Processing Units (TPUs) have provided considera…

    Submitted 27 January, 2020; v1 submitted 6 June, 2019; originally announced June 2019.

  47. arXiv:1905.09381  [pdf, other]

    cs.LO cs.AI cs.LG stat.ML

    Learning to Prove Theorems via Interacting with Proof Assistants

    Authors: Kaiyu Yang, Jia Deng

    Abstract: Humans prove theorems by relying on substantial high-level reasoning and problem-specific insights. Proof assistants offer a formalism that resembles human mathematical reasoning, representing theorems in higher-order logic and proofs as high-level tactics. However, human experts have to construct proofs manually by entering tactics into the proof assistant. In this paper, we study the problem of…

    Submitted 21 May, 2019; originally announced May 2019.

    Comments: Accepted to ICML 2019

  48. arXiv:1904.12672  [pdf, other]

    cs.LG stat.ML

    Efficient Computation of Expected Hypervolume Improvement Using Box Decomposition Algorithms

    Authors: Kaifeng Yang, Michael Emmerich, André Deutz, Thomas Bäck

    Abstract: In the field of multi-objective optimization algorithms, multi-objective Bayesian Global Optimization (MOBGO) is an important branch, in addition to evolutionary multi-objective optimization algorithms (EMOAs). MOBGO utilizes Gaussian Process models learned from previous objective function evaluations to decide the next evaluation site by maximizing or minimizing an infill criterion. A common crit…

    Submitted 13 June, 2019; v1 submitted 26 April, 2019; originally announced April 2019.

  49. arXiv:1904.08102  [pdf, other]

    cs.LG q-bio.QM stat.ML

    Batched Stochastic Bayesian Optimization via Combinatorial Constraints Design

    Authors: Kevin K. Yang, Yuxin Chen, Alycia Lee, Yisong Yue

    Abstract: In many high-throughput experimental design settings, such as those common in biochemical engineering, batched queries are more cost effective than one-by-one sequential queries. Furthermore, it is often not possible to directly choose items to query. Instead, the experimenter specifies a set of constraints that generates a library of possible items, which are then selected stochastically. Motivat…

    Submitted 17 April, 2019; originally announced April 2019.

  50. arXiv:1904.01561  [pdf, other]

    cs.LG stat.ML

    Analyzing Learned Molecular Representations for Property Prediction

    Authors: Kevin Yang, Kyle Swanson, Wengong Jin, Connor Coley, Philipp Eiden, Hua Gao, Angel Guzman-Perez, Timothy Hopper, Brian Kelley, Miriam Mathea, Andrew Palmer, Volker Settels, Tommi Jaakkola, Klavs Jensen, Regina Barzilay

    Abstract: Advancements in neural machinery have led to a wide range of algorithmic solutions for molecular property prediction. Two classes of models in particular have yielded promising results: neural networks applied to computed molecular fingerprints or expert-crafted descriptors, and graph convolutional neural networks that construct a learned molecular representation by operating on the graph structur…

    Submitted 20 November, 2019; v1 submitted 2 April, 2019; originally announced April 2019.

    Journal ref: Journal of Chemical Information and Modeling 59.8 (2019): 3370-3388