Showing 1–35 of 35 results for author: Zhong, H

Searching in archive stat.
  1. arXiv:2405.12838  [pdf, ps, other]

    quant-ph stat.CO

    Quantum Non-Identical Mean Estimation: Efficient Algorithms and Fundamental Limits

    Authors: Jiachen Hu, Tongyang Li, Xinzhao Wang, Yecheng Xue, Chenyi Zhang, Han Zhong

    Abstract: We systematically investigate quantum algorithms and lower bounds for mean estimation given query access to non-identically distributed samples. On the one hand, we give quantum mean estimators with quadratic quantum speed-up given samples from different bounded or sub-Gaussian random variables. On the other hand, we prove that, in general, it is impossible for any quantum algorithm to achieve qua…

    Submitted 21 May, 2024; originally announced May 2024.

    Comments: 31 pages, 0 figures. To appear in the 19th Theory of Quantum Computation, Communication and Cryptography (TQC 2024)

  2. arXiv:2404.18922  [pdf, other]

    cs.LG cs.AI cs.CL stat.ML

    DPO Meets PPO: Reinforced Token Optimization for RLHF

    Authors: Han Zhong, Guhao Feng, Wei Xiong, Li Zhao, Di He, Jiang Bian, Liwei Wang

    Abstract: In the classical Reinforcement Learning from Human Feedback (RLHF) framework, Proximal Policy Optimization (PPO) is employed to learn from sparse, sentence-level rewards -- a challenging scenario in traditional deep reinforcement learning. Despite the great successes of PPO in the alignment of state-of-the-art closed-source large language models (LLMs), its open-source implementation is still larg…

    Submitted 29 April, 2024; originally announced April 2024.

  3. arXiv:2404.12648  [pdf, ps, other]

    cs.LG stat.ML

    Sample-efficient Learning of Infinite-horizon Average-reward MDPs with General Function Approximation

    Authors: Jianliang He, Han Zhong, Zhuoran Yang

    Abstract: We study infinite-horizon average-reward Markov decision processes (AMDPs) in the context of general function approximation. Specifically, we propose a novel algorithmic framework named Local-fitted Optimization with OPtimism (LOOP), which incorporates both model-based and value-based incarnations. In particular, LOOP features a novel construction of confidence sets and a low-switching policy upda…

    Submitted 19 April, 2024; originally announced April 2024.

    Comments: ICLR 2024

  4. arXiv:2404.03578  [pdf, ps, other]

    cs.LG stat.ML

    Distributionally Robust Reinforcement Learning with Interactive Data Collection: Fundamental Hardness and Near-Optimal Algorithm

    Authors: Miao Lu, Han Zhong, Tong Zhang, Jose Blanchet

    Abstract: The sim-to-real gap, which represents the disparity between training and testing environments, poses a significant challenge in reinforcement learning (RL). A promising approach to addressing this challenge is distributionally robust RL, often framed as a robust Markov decision process (RMDP). In this framework, the objective is to find a robust policy that achieves good performance under the wors…

    Submitted 4 April, 2024; originally announced April 2024.

  5. arXiv:2403.13027  [pdf, other]

    cs.LG cs.CR cs.IT stat.ML

    Towards Better Statistical Understanding of Watermarking LLMs

    Authors: Zhongze Cai, Shang Liu, Hanzhao Wang, Huaiyang Zhong, Xiaocheng Li

    Abstract: In this paper, we study the problem of watermarking large language models (LLMs). We consider the trade-off between model distortion and detection ability and formulate it as a constrained optimization problem based on the green-red algorithm of Kirchenbauer et al. (2023a). We show that the optimal solution to the optimization problem enjoys a nice analytical property which provides a better under…

    Submitted 18 March, 2024; originally announced March 2024.

  6. arXiv:2403.09907  [pdf, other]

    stat.ME

    Multi-Layer Kernel Machines: Fast and Optimal Nonparametric Regression with Uncertainty Quantification

    Authors: Xiaowu Dai, Huiying Zhong

    Abstract: Kernel ridge regression (KRR) is widely used for nonparametric regression over reproducing kernel Hilbert spaces. It offers powerful modeling capabilities at a significant computational cost, typically requiring $O(n^3)$ computational time and $O(n^2)$ storage space, where $n$ is the sample size. We introduce a novel framework of multi-layer kernel machines that approximate KRR by employi…

    Submitted 14 March, 2024; originally announced March 2024.

  7. arXiv:2403.05006  [pdf, ps, other]

    cs.LG cs.AI stat.ME stat.ML

    Provable Multi-Party Reinforcement Learning with Diverse Human Feedback

    Authors: Huiying Zhong, Zhun Deng, Weijie J. Su, Zhiwei Steven Wu, Linjun Zhang

    Abstract: Reinforcement learning with human feedback (RLHF) is an emerging paradigm to align models with human preferences. Typically, RLHF aggregates preferences from multiple individuals who have diverse viewpoints that may conflict with each other. Our work initiates the theoretical study of multi-party RLHF that explicitly models the diverse preferences of multiple individuals. We show how trad…

    Submitted 7 March, 2024; originally announced March 2024.

  8. arXiv:2312.17248  [pdf, other]

    cs.LG cs.AI cs.CC cs.DS stat.ML

    Rethinking Model-based, Policy-based, and Value-based Reinforcement Learning via the Lens of Representation Complexity

    Authors: Guhao Feng, Han Zhong

    Abstract: Reinforcement Learning (RL) encompasses diverse paradigms, including model-based RL, policy-based RL, and value-based RL, each tailored to approximate the model, optimal policy, and optimal value function, respectively. This work investigates the potential hierarchy of representation complexity -- the complexity of functions to be represented -- among these RL paradigms. We first demonstrate that,…

    Submitted 28 December, 2023; originally announced December 2023.

  9. arXiv:2312.11456  [pdf, other]

    cs.LG cs.AI stat.ML

    Iterative Preference Learning from Human Feedback: Bridging Theory and Practice for RLHF under KL-Constraint

    Authors: Wei Xiong, Hanze Dong, Chenlu Ye, Ziqi Wang, Han Zhong, Heng Ji, Nan Jiang, Tong Zhang

    Abstract: This paper studies the alignment process of generative models with Reinforcement Learning from Human Feedback (RLHF). We first identify the primary challenge of existing popular methods like offline PPO and offline DPO as a lack of strategic exploration of the environment. Then, to understand the mathematical principle of RLHF, we consider a standard mathematical formulation, the reverse-KL re…

    Submitted 1 May, 2024; v1 submitted 18 December, 2023; originally announced December 2023.

    Comments: 53 pages; theoretical study and algorithmic design of iterative RLHF and DPO

  10. arXiv:2312.04464  [pdf, other]

    cs.LG stat.ML

    Horizon-Free and Instance-Dependent Regret Bounds for Reinforcement Learning with General Function Approximation

    Authors: Jiayi Huang, Han Zhong, Liwei Wang, Lin F. Yang

    Abstract: To tackle long planning horizon problems in reinforcement learning with general function approximation, we propose the first algorithm, termed UCRL-WVTR, that achieves a regret bound that is both horizon-free and instance-dependent, since it eliminates the polynomial dependency on the planning horizon. The derived regret bound is deemed sharp, as it matches the minimax lower bound when specialize…

    Submitted 7 December, 2023; originally announced December 2023.

  11. arXiv:2310.19861  [pdf, ps, other]

    cs.LG cs.GT stat.ML

    Posterior Sampling for Competitive RL: Function Approximation and Partial Observation

    Authors: Shuang Qiu, Ziyu Dai, Han Zhong, Zhaoran Wang, Zhuoran Yang, Tong Zhang

    Abstract: This paper investigates posterior sampling algorithms for competitive reinforcement learning (RL) in the context of general function approximations. Focusing on zero-sum Markov games (MGs) under two critical settings, namely self-play and adversarial learning, we first propose the self-play and adversarial generalized eluder coefficient (GEC) as complexity measures for function approximation, capt…

    Submitted 30 October, 2023; originally announced October 2023.

    Comments: NeurIPS 2023

  12. arXiv:2306.07652  [pdf]

    stat.AP q-bio.TO

    Inactivated COVID-19 Vaccination did not affect In vitro fertilization (IVF) / Intra-Cytoplasmic Sperm Injection (ICSI) cycle outcomes

    Authors: Qi Wan, Ying Ling Yao, XingYu Lv, Li Hong Geng, Yue Wang, Enoch Appiah Adu-Gyamfi, Xue Jiao Wang, Yue Qian, Juan Yang, Ming Xing Chend, Zhao Hui Zhong, Yuan Li, Yu Bin Ding

    Abstract: Background: The objective of this study is to evaluate the impact of COVID-19 inactivated vaccine administration on the outcomes of in vitro fertilization (IVF) and intracytoplasmic sperm injection (ICSI) cycles in infertile couples in China. Methods: We collected data from the CYART prospective cohort, which included couples undergoing IVF treatment from January 2021 to September 2022 at Sichuan…

    Submitted 13 June, 2023; originally announced June 2023.

    Comments: 26 pages, 4 figures and 5 tables

  13. arXiv:2306.06836  [pdf, other]

    cs.LG cs.AI stat.ML

    Tackling Heavy-Tailed Rewards in Reinforcement Learning with Function Approximation: Minimax Optimal and Instance-Dependent Regret Bounds

    Authors: Jiayi Huang, Han Zhong, Liwei Wang, Lin F. Yang

    Abstract: While numerous works have focused on devising efficient algorithms for reinforcement learning (RL) with uniformly bounded rewards, it remains an open question whether sample or time-efficient algorithms for RL with large state-action space exist when the rewards are heavy-tailed, i.e., with only finite $(1+ε)$-th moments for some $ε\in(0,1]$. In this work, we address the challenge of such r…

    Submitted 7 March, 2024; v1 submitted 11 June, 2023; originally announced June 2023.

    Comments: NeurIPS 2023

  14. arXiv:2305.18258  [pdf, other]

    cs.LG cs.AI cs.GT math.OC stat.ML

    Maximize to Explore: One Objective Function Fusing Estimation, Planning, and Exploration

    Authors: Zhihan Liu, Miao Lu, Wei Xiong, Han Zhong, Hao Hu, Shenao Zhang, Sirui Zheng, Zhuoran Yang, Zhaoran Wang

    Abstract: In online reinforcement learning (online RL), balancing exploration and exploitation is crucial for finding an optimal policy in a sample-efficient way. To achieve this, existing sample-efficient online RL algorithms typically consist of three components: estimation, planning, and exploration. However, in order to cope with general function approximators, most of them involve impractical algorithm…

    Submitted 25 October, 2023; v1 submitted 29 May, 2023; originally announced May 2023.

  15. arXiv:2305.09659  [pdf, ps, other]

    cs.LG cs.AI math.OC stat.ML

    Double Pessimism is Provably Efficient for Distributionally Robust Offline Reinforcement Learning: Generic Algorithm and Robust Partial Coverage

    Authors: Jose Blanchet, Miao Lu, Tong Zhang, Han Zhong

    Abstract: In this paper, we study distributionally robust offline reinforcement learning (robust offline RL), which seeks to find an optimal policy purely from an offline dataset that can perform well in perturbed environments. Specifically, we propose a generic algorithm framework called Doubly Pessimistic Model-based Policy Optimization ($P^2MPO$), which features a novel combination of a flexible model est…

    Submitted 22 August, 2023; v1 submitted 16 May, 2023; originally announced May 2023.

    Comments: V2 adds results on robust offline Markov games

  16. arXiv:2305.08841  [pdf, other]

    cs.LG cs.AI math.OC stat.ML

    A Theoretical Analysis of Optimistic Proximal Policy Optimization in Linear Markov Decision Processes

    Authors: Han Zhong, Tong Zhang

    Abstract: The proximal policy optimization (PPO) algorithm stands as one of the most successful methods in the field of reinforcement learning (RL). Despite its success, the theoretical understanding of PPO remains deficient. Specifically, it is unclear whether PPO or its optimistic variants can effectively solve linear Markov decision processes (MDPs), which are arguably the simplest models in RL with func…

    Submitted 8 June, 2023; v1 submitted 15 May, 2023; originally announced May 2023.

  17. arXiv:2304.06484  [pdf, other]

    cs.CY cs.LG cs.SI econ.GN stat.AP

    Exploring Gender and Race Biases in the NFT Market

    Authors: Howard Zhong, Mark Hamilton

    Abstract: Non-Fungible Tokens (NFTs) are non-interchangeable assets, usually digital art, which are stored on the blockchain. Preliminary studies find that female and darker-skinned NFTs are valued less than their male and lighter-skinned counterparts. However, these studies analyze only the CryptoPunks collection. We test the statistical significance of race and gender biases in the prices of CryptoPunks a…

    Submitted 29 March, 2023; originally announced April 2023.

  18. arXiv:2302.10796  [pdf, ps, other]

    quant-ph cs.AI cs.LG stat.ML

    Provably Efficient Exploration in Quantum Reinforcement Learning with Logarithmic Worst-Case Regret

    Authors: Han Zhong, Jiachen Hu, Yecheng Xue, Tongyang Li, Liwei Wang

    Abstract: While quantum reinforcement learning (RL) has attracted a surge of attention recently, its theoretical understanding is limited. In particular, it remains elusive how to design provably efficient quantum RL algorithms that can address the exploration-exploitation trade-off. To this end, we propose a novel UCRL-style algorithm that takes advantage of quantum computing for tabular Markov decision pr…

    Submitted 13 June, 2024; v1 submitted 21 February, 2023; originally announced February 2023.

    Comments: ICML 2024

  19. arXiv:2211.01962  [pdf, other]

    cs.LG cs.AI math.OC stat.ML

    GEC: A Unified Framework for Interactive Decision Making in MDP, POMDP, and Beyond

    Authors: Han Zhong, Wei Xiong, Sirui Zheng, Liwei Wang, Zhaoran Wang, Zhuoran Yang, Tong Zhang

    Abstract: We study sample efficient reinforcement learning (RL) under the general framework of interactive decision making, which includes Markov decision process (MDP), partially observable Markov decision process (POMDP), and predictive state representation (PSR) as special cases. Toward finding the minimum assumption that empowers sample efficient learning, we propose a novel complexity measure, generali…

    Submitted 30 June, 2023; v1 submitted 3 November, 2022; originally announced November 2022.

    Comments: We changed the title from the first version. We fixed a technical issue in the first version regarding the $\ell_2$ eluder technique (Lemma D.2)

  20. arXiv:2210.15598  [pdf, ps, other]

    cs.LG cs.AI stat.ML

    Provable Sim-to-real Transfer in Continuous Domain with Partial Observations

    Authors: Jiachen Hu, Han Zhong, Chi Jin, Liwei Wang

    Abstract: Sim-to-real transfer trains RL agents in simulated environments and then deploys them in the real world. Sim-to-real transfer has been widely used in practice because it is often cheaper, safer and much faster to collect samples in simulation than in the real world. Despite the empirical success of sim-to-real transfer, its theoretical foundation is much less understood. In this paper, we…

    Submitted 2 March, 2023; v1 submitted 27 October, 2022; originally announced October 2022.

    Comments: Accepted at ICLR 2023

  21. arXiv:2205.15512  [pdf, ps, other]

    cs.LG cs.GT stat.ML

    Nearly Minimax Optimal Offline Reinforcement Learning with Linear Function Approximation: Single-Agent MDP and Markov Game

    Authors: Wei Xiong, Han Zhong, Chengshuai Shi, Cong Shen, Liwei Wang, Tong Zhang

    Abstract: Offline reinforcement learning (RL) aims at learning an optimal strategy using a pre-collected dataset without further interactions with the environment. While various algorithms have been proposed for offline RL in the previous literature, the minimax optimality has only been (nearly) established for tabular Markov decision processes (MDPs). In this paper, we focus on offline RL with linear funct…

    Submitted 1 March, 2023; v1 submitted 30 May, 2022; originally announced May 2022.

  22. arXiv:2205.13863  [pdf, other]

    cs.LG cs.AI stat.ML

    Why Robust Generalization in Deep Learning is Difficult: Perspective of Expressive Power

    Authors: Binghui Li, Jikai Jin, Han Zhong, John E. Hopcroft, Liwei Wang

    Abstract: It is well-known that modern neural networks are vulnerable to adversarial examples. To mitigate this problem, a series of robust learning algorithms have been proposed. However, although the robust training error can be near zero via some methods, all existing algorithms lead to a high robust generalization error. In this paper, we provide a theoretical understanding of this puzzling phenomenon f…

    Submitted 14 October, 2022; v1 submitted 27 May, 2022; originally announced May 2022.

    Comments: 25 pages; to appear in NeurIPS 2022

  23. arXiv:2202.07511  [pdf, ps, other]

    cs.LG cs.GT stat.ML

    Pessimistic Minimax Value Iteration: Provably Efficient Equilibrium Learning from Offline Datasets

    Authors: Han Zhong, Wei Xiong, Jiyuan Tan, Liwei Wang, Tong Zhang, Zhaoran Wang, Zhuoran Yang

    Abstract: We study episodic two-player zero-sum Markov games (MGs) in the offline setting, where the goal is to find an approximate Nash equilibrium (NE) policy pair based on a dataset collected a priori. When the dataset does not have uniform coverage over all policy pairs, finding an approximate NE involves challenges in three aspects: (i) distributional shift between the behavior policy and the optimal p…

    Submitted 29 December, 2022; v1 submitted 15 February, 2022; originally announced February 2022.

  24. arXiv:2112.13521  [pdf, ps, other]

    cs.LG cs.GT stat.ML

    Can Reinforcement Learning Find Stackelberg-Nash Equilibria in General-Sum Markov Games with Myopic Followers?

    Authors: Han Zhong, Zhuoran Yang, Zhaoran Wang, Michael I. Jordan

    Abstract: We study multi-player general-sum Markov games with one of the players designated as the leader and the other players regarded as followers. In particular, we focus on the class of games where the followers are myopic, i.e., they aim to maximize their instantaneous rewards. For such a game, our goal is to find a Stackelberg-Nash equilibrium (SNE), which is a policy pair $(π^*, ν^*)$ such that (i)…

    Submitted 27 December, 2021; originally announced December 2021.

  25. arXiv:2112.10935  [pdf, ps, other]

    cs.LG stat.ML

    Nearly Optimal Policy Optimization with Stable at Any Time Guarantee

    Authors: Tianhao Wu, Yunchang Yang, Han Zhong, Liwei Wang, Simon S. Du, Jiantao Jiao

    Abstract: Policy optimization methods are one of the most widely used classes of Reinforcement Learning (RL) algorithms. However, theoretical understanding of these methods remains insufficient. Even in the episodic (time-inhomogeneous) tabular setting, the state-of-the-art theoretical result of policy-based method in \citet{shani2020optimistic} is only $\tilde{O}(\sqrt{S^2AH^4K})$ where $S$ is the number o…

    Submitted 3 December, 2022; v1 submitted 20 December, 2021; originally announced December 2021.

    Comments: arXiv admin note: text overlap with arXiv:2002.08243 by other authors

  26. arXiv:2110.08984  [pdf, ps, other]

    cs.LG stat.ML

    Optimistic Policy Optimization is Provably Efficient in Non-stationary MDPs

    Authors: Han Zhong, Zhuoran Yang, Zhaoran Wang, Csaba Szepesvári

    Abstract: We study episodic reinforcement learning (RL) in non-stationary linear kernel Markov decision processes (MDPs). In this setting, both the reward function and the transition kernel are linear with respect to the given feature maps and are allowed to vary over time, as long as their respective parameter variations do not exceed certain variation budgets. We propose the $\underline{\text{p}}$eriodica…

    Submitted 2 October, 2022; v1 submitted 17 October, 2021; originally announced October 2021.

  27. arXiv:2109.01258  [pdf, other]

    cs.LG eess.SY stat.AP

    Estimating Demand Flexibility Using Siamese LSTM Neural Networks

    Authors: Guangchun Ruan, Daniel S. Kirschen, Haiwang Zhong, Qing Xia, Chongqing Kang

    Abstract: There is an opportunity in modern power systems to explore the demand flexibility by incentivizing consumers with dynamic prices. In this paper, we quantify demand flexibility using an efficient tool called time-varying elasticity, whose value may change depending on the prices and decision dynamics. This tool is particularly useful for evaluating the demand response potential and system reliabili…

    Submitted 2 September, 2021; originally announced September 2021.

    Comments: Author copy of the manuscript submitted to IEEE Trans on Power Systems

    Journal ref: IEEE Transactions on Power Systems, 2022

  28. arXiv:2106.11692  [pdf, ps, other]

    cs.LG stat.ML

    A Reduction-Based Framework for Conservative Bandits and Reinforcement Learning

    Authors: Yunchang Yang, Tianhao Wu, Han Zhong, Evrard Garcelon, Matteo Pirotta, Alessandro Lazaric, Liwei Wang, Simon S. Du

    Abstract: In this paper, we present a reduction-based framework for conservative bandits and RL, in which our core technique is to calculate the necessary and sufficient budget obtained from running the baseline policy. For lower bounds, we improve the existing lower bound for conservative multi-armed bandits and obtain new lower bounds for conservative linear bandits, tabular RL and low-rank MDP, through a…

    Submitted 16 March, 2022; v1 submitted 22 June, 2021; originally announced June 2021.

  29. arXiv:2103.12515  [pdf, other]

    physics.soc-ph stat.AP

    Quantitative Assessment of U.S. Bulk Power Systems and Market Operations during COVID-19

    Authors: Guangchun Ruan, Jiahan Wu, Haiwang Zhong, Qing Xia, Le Xie

    Abstract: Starting in early 2020, the novel coronavirus disease (COVID-19) severely affected the U.S., causing substantial changes in the operations of bulk power systems and electricity markets. In this paper, we develop a data-driven analysis to substantiate the pandemic's impacts from the perspectives of power system security, electric power generation, electric power demand and electricity prices. Our r…

    Submitted 30 August, 2020; originally announced March 2021.

    Comments: Journal paper, 19 pages, also available at EnerarXiv

  30. arXiv:2102.03240  [pdf]

    physics.ao-ph econ.GN stat.OT

    De-carbonization of global energy use during the COVID-19 pandemic

    Authors: Zhu Liu, Biqing Zhu, Philippe Ciais, Steven J. Davis, Chenxi Lu, Haiwang Zhong, Piyu Ke, Yanan Cui, Zhu Deng, Duo Cui, Taochun Sun, Xinyu Dou, Jianguang Tan, Rui Guo, Bo Zheng, Katsumasa Tanaka, Wenli Zhao, Pierre Gentine

    Abstract: The COVID-19 pandemic has disrupted human activities, leading to unprecedented decreases in both global energy demand and GHG emissions. Yet it is little known that there was also a low-carbon shift of the global energy system in 2020. Here, using the near-real-time data on energy-related GHG emissions from 30 countries (about 70% of global power generation), we show that the pandemic caused an unprece…

    Submitted 5 February, 2021; originally announced February 2021.

  31. arXiv:2012.14098  [pdf, other]

    cs.LG math.OC stat.ML

    Risk-Sensitive Deep RL: Variance-Constrained Actor-Critic Provably Finds Globally Optimal Policy

    Authors: Han Zhong, Xun Deng, Ethan X. Fang, Zhuoran Yang, Zhaoran Wang, Runze Li

    Abstract: While deep reinforcement learning has achieved tremendous successes in various applications, most existing works only focus on maximizing the expected value of total return and thus ignore its inherent stochasticity. Such stochasticity is also known as the aleatoric uncertainty and is closely related to the notion of risk. In this work, we make the first attempt to study risk-sensitive deep reinfo…

    Submitted 16 September, 2023; v1 submitted 28 December, 2020; originally announced December 2020.

    Comments: 59 pages

  32. arXiv:2009.03509  [pdf, other]

    cs.LG stat.ML

    Masked Label Prediction: Unified Message Passing Model for Semi-Supervised Classification

    Authors: Yunsheng Shi, Zhengjie Huang, Shikun Feng, Hui Zhong, Wenjin Wang, Yu Sun

    Abstract: Graph neural network (GNN) and label propagation algorithm (LPA) are both message passing algorithms, which have achieved superior performance in semi-supervised classification. GNN performs feature propagation by a neural network to make predictions, while LPA uses label propagation across graph adjacency matrix to get results. However, there is still no effective way to directly combine these tw…

    Submitted 9 May, 2021; v1 submitted 8 September, 2020; originally announced September 2020.

    Comments: 7 pages, 3 figures and 8 tables; Accepted by IJCAI 2021

  33. arXiv:2003.12205  [pdf]

    cs.LG cs.AI eess.SP stat.ML

    AirRL: A Reinforcement Learning Approach to Urban Air Quality Inference

    Authors: Huiqiang Zhong, Cunxiang Yin, Xiaohui Wu, Jinchang Luo, JiaWei He

    Abstract: Urban air pollution has become a major environmental problem that threatens public health. It has become increasingly important to infer fine-grained urban air quality based on existing monitoring stations. One of the challenges is how to effectively select some relevant stations for air quality inference. In this paper, we propose a novel model based on reinforcement learning for urban air qualit…

    Submitted 26 March, 2020; originally announced March 2020.

  34. arXiv:1808.10307  [pdf, other]

    cs.CR cs.LG stat.ML

    Backdoor Embedding in Convolutional Neural Network Models via Invisible Perturbation

    Authors: Cong Liao, Haoti Zhong, Anna Squicciarini, Sencun Zhu, David Miller

    Abstract: Deep learning models have consistently outperformed traditional machine learning models in various classification tasks, including image classification. As such, they have become increasingly prevalent in many real world applications including those where security is of great concern. Such popularity, however, may attract attackers to exploit the vulnerabilities of the deployed deep learning model…

    Submitted 30 August, 2018; originally announced August 2018.

  35. arXiv:1711.05809  [pdf, other]

    cs.LG stat.ML

    Hierarchical Modeling of Seed Variety Yields and Decision Making for Future Planting Plans

    Authors: Huaiyang Zhong, Xiaocheng Li, David Lobell, Stefano Ermon, Margaret L. Brandeau

    Abstract: Eradicating hunger and malnutrition is a key development goal of the 21st century. We address the problem of optimally identifying seed varieties to reliably increase crop yield within a risk-sensitive decision-making framework. Specifically, we introduce a novel hierarchical machine learning mechanism for predicting crop yield (the yield of different seed varieties of the same crop). We integrate…

    Submitted 15 November, 2017; originally announced November 2017.