Showing 1–50 of 89 results for author: Tambe, M

Searching in archive cs.
  1. arXiv:2406.11741  [pdf, other]

    cs.LG cs.AI

    Transcendence: Generative Models Can Outperform The Experts That Train Them

    Authors: Edwin Zhang, Vincent Zhu, Naomi Saphra, Anat Kleiman, Benjamin L. Edelman, Milind Tambe, Sham M. Kakade, Eran Malach

    Abstract: Generative models are trained with the simple objective of imitating the conditional probability distribution induced by the data they are trained on. Therefore, when trained on data generated by humans, we may not expect the artificial model to outperform the humans on their original objectives. In this work, we study the phenomenon of transcendence: when a generative model achieves capabilities…

    Submitted 28 June, 2024; v1 submitted 17 June, 2024; originally announced June 2024.

    Comments: Code, models, and data at https://transcendence.eddie.win

  2. arXiv:2403.17381  [pdf, other]

    cs.LG cs.AI

    Application-Driven Innovation in Machine Learning

    Authors: David Rolnick, Alan Aspuru-Guzik, Sara Beery, Bistra Dilkina, Priya L. Donti, Marzyeh Ghassemi, Hannah Kerner, Claire Monteleoni, Esther Rolf, Milind Tambe, Adam White

    Abstract: As applications of machine learning proliferate, innovative algorithms inspired by specific real-world challenges have become increasingly important. Such work offers the potential for significant impact not merely in domains of application but also in machine learning itself. In this paper, we describe the paradigm of application-driven research in machine learning, contrasting it with the more s…

    Submitted 26 March, 2024; originally announced March 2024.

    Comments: 12 pages, 3 figures

  3. arXiv:2403.05683  [pdf, other]

    cs.AI cs.LG

    Efficient Public Health Intervention Planning Using Decomposition-Based Decision-Focused Learning

    Authors: Sanket Shah, Arun Suggala, Milind Tambe, Aparna Taneja

    Abstract: The declining participation of beneficiaries over time is a key concern in public health programs. A popular strategy for improving retention is to have health workers 'intervene' on beneficiaries at risk of dropping out. However, the availability and time of these health workers are limited resources. As a result, there has been a line of research on optimizing these limited intervention resource…

    Submitted 8 March, 2024; originally announced March 2024.

    Comments: 12 pages, 3 figures, 2 tables

  4. arXiv:2402.14807  [pdf, other]

    cs.MA cs.AI cs.LG

    A Decision-Language Model (DLM) for Dynamic Restless Multi-Armed Bandit Tasks in Public Health

    Authors: Nikhil Behari, Edwin Zhang, Yunfan Zhao, Aparna Taneja, Dheeraj Nagaraj, Milind Tambe

    Abstract: Restless multi-armed bandits (RMAB) have demonstrated success in optimizing resource allocation for large beneficiary populations in public health settings. Unfortunately, RMAB models lack flexibility to adapt to evolving public health policy priorities. Concurrently, Large Language Models (LLMs) have emerged as adept automated planners across domains of robotic control and navigation. In this pap…

    Submitted 26 May, 2024; v1 submitted 22 February, 2024; originally announced February 2024.

  5. arXiv:2402.14090  [pdf, other]

    cs.AI econ.GN stat.ML

    Social Environment Design

    Authors: Edwin Zhang, Sadie Zhao, Tonghan Wang, Safwan Hossain, Henry Gasztowtt, Stephan Zheng, David C. Parkes, Milind Tambe, Yiling Chen

    Abstract: Artificial Intelligence (AI) holds promise as a technology that can be used to improve government and economic policy-making. This paper proposes a new research agenda towards this end by introducing Social Environment Design, a general framework for the use of AI for automated policy-making that connects with the Reinforcement Learning, EconCS, and Computational Social Choice communities. The fra…

    Submitted 17 June, 2024; v1 submitted 21 February, 2024; originally announced February 2024.

    Comments: ICML 2024 Position Paper. Website at https://sed.eddie.win

  6. arXiv:2402.11771  [pdf, other]

    cs.LG cs.AI stat.ME stat.ML

    Evaluating the Effectiveness of Index-Based Treatment Allocation

    Authors: Niclas Boehmer, Yash Nair, Sanket Shah, Lucas Janson, Aparna Taneja, Milind Tambe

    Abstract: When resources are scarce, an allocation policy is needed to decide who receives a resource. This problem occurs, for instance, when allocating scarce medical resources and is often solved using modern ML methods. This paper introduces methods to evaluate index-based allocation policies -- that allocate a fixed number of resources to those who need them the most -- by using data from a randomized…

    Submitted 18 February, 2024; originally announced February 2024.

  7. arXiv:2402.04933  [pdf, other]

    cs.LG stat.AP

    A Bayesian Approach to Online Learning for Contextual Restless Bandits with Applications to Public Health

    Authors: Biyonka Liang, Lily Xu, Aparna Taneja, Milind Tambe, Lucas Janson

    Abstract: Public health programs often provide interventions to encourage beneficiary adherence, and effectively allocating interventions is vital for producing the greatest overall health outcomes. Such resource allocation problems are often modeled as restless multi-armed bandits (RMABs) with unknown underlying transition dynamics, hence requiring online reinforcement learning (RL). We present Bayesian Lea…

    Submitted 27 May, 2024; v1 submitted 7 February, 2024; originally announced February 2024.

    Comments: 26 pages, 18 figures

  8. arXiv:2312.09983  [pdf, other]

    cs.LG cs.AI stat.ML

    Toward Computationally Efficient Inverse Reinforcement Learning via Reward Shaping

    Authors: Lauren H. Cooke, Harvey Klyne, Edwin Zhang, Cassidy Laidlaw, Milind Tambe, Finale Doshi-Velez

    Abstract: Inverse reinforcement learning (IRL) is computationally challenging, with common approaches requiring the solution of multiple reinforcement learning (RL) sub-problems. This work motivates the use of potential-based reward shaping to reduce the computational burden of each RL sub-problem. This work serves as a proof-of-concept and we hope will inspire future developments towards computationally ef…

    Submitted 18 December, 2023; v1 submitted 15 December, 2023; originally announced December 2023.

  9. arXiv:2311.07139  [pdf, other]

    cs.LG cs.AI cs.MA

    Analyzing and Predicting Low-Listenership Trends in a Large-Scale Mobile Health Program: A Preliminary Investigation

    Authors: Arshika Lalan, Shresth Verma, Kumar Madhu Sudan, Amrita Mahale, Aparna Hegde, Milind Tambe, Aparna Taneja

    Abstract: Mobile health programs are becoming an increasingly popular medium for dissemination of health information among beneficiaries in less privileged communities. Kilkari is one of the world's largest mobile health programs which delivers time sensitive audio-messages to pregnant women and new mothers. We have been collaborating with ARMMAN, a non-profit in India which operates the Kilkari program, to…

    Submitted 13 November, 2023; originally announced November 2023.

    Comments: Accepted to Data Science for Social Good Workshop, KDD 2023

  10. arXiv:2310.14526  [pdf, other]

    cs.LG cs.AI

    Towards a Pretrained Model for Restless Bandits via Multi-arm Generalization

    Authors: Yunfan Zhao, Nikhil Behari, Edward Hughes, Edwin Zhang, Dheeraj Nagaraj, Karl Tuyls, Aparna Taneja, Milind Tambe

    Abstract: Restless multi-arm bandits (RMABs), a class of resource allocation problems with broad application in areas such as healthcare, online advertising, and anti-poaching, have recently been studied from a multi-agent reinforcement learning perspective. Prior RMAB research suffers from several limitations, e.g., it fails to adequately address continuous states, and requires retraining from scratch when…

    Submitted 29 January, 2024; v1 submitted 22 October, 2023; originally announced October 2023.

  11. arXiv:2308.09726  [pdf, other]

    cs.LG cs.AI cs.CY cs.MA

    Equitable Restless Multi-Armed Bandits: A General Framework Inspired By Digital Health

    Authors: Jackson A. Killian, Manish Jain, Yugang Jia, Jonathan Amar, Erich Huang, Milind Tambe

    Abstract: Restless multi-armed bandits (RMABs) are a popular framework for algorithmic decision making in sequential settings with limited resources. RMABs are increasingly being used for sensitive decisions such as in public health, treatment scheduling, anti-poaching, and -- the motivation for this work -- digital health. For such high stakes settings, decisions must both improve outcomes and prevent disp…

    Submitted 17 August, 2023; originally announced August 2023.

    Comments: 16 pages, 8 figures, 2 tables

  12. arXiv:2307.08774  [pdf, other]

    cs.AI

    Reflections from the Workshop on AI-Assisted Decision Making for Conservation

    Authors: Lily Xu, Esther Rolf, Sara Beery, Joseph R. Bennett, Tanya Berger-Wolf, Tanya Birch, Elizabeth Bondi-Kelly, Justin Brashares, Melissa Chapman, Anthony Corso, Andrew Davies, Nikhil Garg, Angela Gaylard, Robert Heilmayr, Hannah Kerner, Konstantin Klemmer, Vipin Kumar, Lester Mackey, Claire Monteleoni, Paul Moorcroft, Jonathan Palmer, Andrew Perrault, David Thau, Milind Tambe

    Abstract: In this white paper, we synthesize key points made during presentations and discussions from the AI-Assisted Decision Making for Conservation workshop, hosted by the Center for Research on Computation and Society at Harvard University on October 20-21, 2022. We identify key open research questions in resource allocation, planning, and interventions for biodiversity conservation, highlighting conse…

    Submitted 17 July, 2023; originally announced July 2023.

    Comments: Co-authored by participants from the October 2022 workshop: https://crcs.seas.harvard.edu/conservation-workshop

  13. arXiv:2305.16830  [pdf, other]

    cs.LG cs.AI

    Leaving the Nest: Going Beyond Local Loss Functions for Predict-Then-Optimize

    Authors: Sanket Shah, Andrew Perrault, Bryan Wilder, Milind Tambe

    Abstract: Predict-then-Optimize is a framework for using machine learning to perform decision-making under uncertainty. The central research question it asks is, "How can the structure of a decision-making task be used to tailor ML models for that specific task?" To this end, recent work has proposed learning task-specific loss functions that capture this underlying structure. However, current approaches ma…

    Submitted 18 February, 2024; v1 submitted 26 May, 2023; originally announced May 2023.

    Comments: 10 pages, 2 figures

  14. arXiv:2305.12640  [pdf, other]

    cs.AI cs.LG stat.ML

    Limited Resource Allocation in a Non-Markovian World: The Case of Maternal and Child Healthcare

    Authors: Panayiotis Danassis, Shresth Verma, Jackson A. Killian, Aparna Taneja, Milind Tambe

    Abstract: The success of many healthcare programs depends on participants' adherence. We consider the problem of scheduling interventions in low resource settings (e.g., placing timely support calls from health workers) to increase adherence and/or engagement. Past works have successfully developed several classes of Restless Multi-armed Bandit (RMAB) based solutions for this problem. Nevertheless, all past…

    Submitted 21 May, 2023; originally announced May 2023.

    Comments: Proceedings of the 32nd International Joint Conference on Artificial Intelligence (IJCAI 2023)

  15. arXiv:2303.00799  [pdf, other]

    cs.AI cs.LG cs.MA

    Fairness for Workers Who Pull the Arms: An Index Based Policy for Allocation of Restless Bandit Tasks

    Authors: Arpita Biswas, Jackson A. Killian, Paula Rodriguez Diaz, Susobhan Ghosh, Milind Tambe

    Abstract: Motivated by applications such as machine repair, project monitoring, and anti-poaching patrol scheduling, we study intervention planning of stochastic processes under resource constraints. This planning problem has previously been modeled as restless multi-armed bandits (RMAB), where each arm is an intervention-dependent Markov Decision Process. However, the existing literature assumes all interv…

    Submitted 1 March, 2023; originally announced March 2023.

    Comments: 22nd International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2023), 10 pages

  16. arXiv:2302.02570  [pdf, other]

    cs.AI cs.LG stat.ME stat.ML

    Improved Policy Evaluation for Randomized Trials of Algorithmic Resource Allocation

    Authors: Aditya Mate, Bryan Wilder, Aparna Taneja, Milind Tambe

    Abstract: We consider the task of evaluating policies of algorithmic resource allocation through randomized controlled trials (RCTs). Such policies are tasked with optimizing the utilization of limited intervention resources, with the goal of maximizing the benefits derived. Evaluation of such allocation policies through RCTs proves difficult, notwithstanding the scale of the trial, because the individuals'…

    Submitted 6 February, 2023; originally announced February 2023.

  17. arXiv:2301.07835  [pdf, other]

    cs.AI

    Decision-Focused Evaluation: Analyzing Performance of Deployed Restless Multi-Arm Bandits

    Authors: Paritosh Verma, Shresth Verma, Aditya Mate, Aparna Taneja, Milind Tambe

    Abstract: Restless multi-arm bandits (RMABs) are a popular decision-theoretic framework that has been used to model real-world sequential decision making problems in public health, wildlife conservation, communication systems, and beyond. Deployed RMAB systems typically operate in two stages: the first predicts the unknown parameters defining the RMAB instance, and the second employs an optimization algorith…

    Submitted 18 January, 2023; originally announced January 2023.

    Comments: 11 pages, 3 figures, AI for Social Good Workshop (AAAI'23)

  18. arXiv:2211.06318  [pdf]

    cs.CY cs.AI cs.LG

    Artificial Intelligence and Life in 2030: The One Hundred Year Study on Artificial Intelligence

    Authors: Peter Stone, Rodney Brooks, Erik Brynjolfsson, Ryan Calo, Oren Etzioni, Greg Hager, Julia Hirschberg, Shivaram Kalyanakrishnan, Ece Kamar, Sarit Kraus, Kevin Leyton-Brown, David Parkes, William Press, AnnaLee Saxenian, Julie Shah, Milind Tambe, Astro Teller

    Abstract: In September 2016, Stanford's "One Hundred Year Study on Artificial Intelligence" project (AI100) issued the first report of its planned long-term periodic assessment of artificial intelligence (AI) and its impact on society. It was written by a panel of 17 study authors, each of whom is deeply rooted in AI research, chaired by Peter Stone of the University of Texas at Austin. The report, entitled…

    Submitted 31 October, 2022; originally announced November 2022.

    Comments: 52 pages, https://ai100.stanford.edu/2016-report

  19. arXiv:2211.00112  [pdf, other]

    cs.MA cs.AI cs.LG math.OC

    Indexability is Not Enough for Whittle: Improved, Near-Optimal Algorithms for Restless Bandits

    Authors: Abheek Ghosh, Dheeraj Nagaraj, Manish Jain, Milind Tambe

    Abstract: We study the problem of planning restless multi-armed bandits (RMABs) with multiple actions. This is a popular model for multi-agent systems with applications like multi-channel communication, monitoring and machine maintenance tasks, and healthcare. Whittle index policies, which are based on Lagrangian relaxations, are widely used in these settings due to their simplicity and near-optimality unde…

    Submitted 28 February, 2023; v1 submitted 31 October, 2022; originally announced November 2022.

    Comments: 21 pages; AAMAS'23 version with appendix

  20. arXiv:2210.00025  [pdf, other]

    cs.LG stat.ML

    Artificial Replay: A Meta-Algorithm for Harnessing Historical Data in Bandits

    Authors: Siddhartha Banerjee, Sean R. Sinclair, Milind Tambe, Lily Xu, Christina Lee Yu

    Abstract: How best to incorporate historical data to "warm start" bandit algorithms is an open question: naively initializing reward estimates using all historical samples can suffer from spurious data and imbalanced data coverage, leading to computational and storage issues, particularly salient in continuous action spaces. We propose Artificial Replay, a meta-algorithm for incorporating h…

    Submitted 26 January, 2023; v1 submitted 30 September, 2022; originally announced October 2022.

    Comments: 36 pages (14 pages main paper), 9 figures

  21. arXiv:2205.15372  [pdf, ps, other]

    cs.LG

    Optimistic Whittle Index Policy: Online Learning for Restless Bandits

    Authors: Kai Wang, Lily Xu, Aparna Taneja, Milind Tambe

    Abstract: Restless multi-armed bandits (RMABs) extend multi-armed bandits to allow for stateful arms, where the state of each arm evolves restlessly with different transitions depending on whether that arm is pulled. Solving RMABs requires information on transition dynamics, which are often unknown upfront. To plan in RMAB settings with unknown transitions, we propose the first online learning algorithm bas…

    Submitted 8 March, 2023; v1 submitted 30 May, 2022; originally announced May 2022.

    Comments: Accepted at AAAI 2023. 7 page paper, 2 page references, 9 page appendix. Code available. Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence (AAAI 2023)

  22. arXiv:2205.05659  [pdf, other]

    cs.AI cs.CY cs.LG

    Ranked Prioritization of Groups in Combinatorial Bandit Allocation

    Authors: Lily Xu, Arpita Biswas, Fei Fang, Milind Tambe

    Abstract: Preventing poaching through ranger patrols protects endangered wildlife, directly contributing to the UN Sustainable Development Goal 15 of life on land. Combinatorial bandits have been used to allocate limited patrol resources, but existing approaches overlook the fact that each location is home to multiple species in varying proportions, so a patrol benefits each species to differing degrees. Wh…

    Submitted 11 May, 2022; originally announced May 2022.

    Comments: Accepted at IJCAI 2022, AI for Good track. 7 pages + 2 pages appendix. Code is available at https://github.com/lily-x/rankedCUCB

  23. Evolutionary Approach to Security Games with Signaling

    Authors: Adam Żychowski, Jacek Mańdziuk, Elizabeth Bondi, Aravind Venugopal, Milind Tambe, Balaraman Ravindran

    Abstract: Green Security Games have become a popular way to model scenarios involving the protection of natural resources, such as wildlife. Sensors (e.g. drones equipped with cameras) have also begun to play a role in these scenarios by providing real-time information. Incorporating both human and sensor defender resources strategically is the subject of recent work on Security Games with Signaling (SGS).…

    Submitted 29 April, 2022; originally announced April 2022.

    Journal ref: Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI 2022, 620-627

  24. arXiv:2204.13663  [pdf, other]

    cs.AI cs.CY

    ADVISER: AI-Driven Vaccination Intervention Optimiser for Increasing Vaccine Uptake in Nigeria

    Authors: Vineet Nair, Kritika Prakash, Michael Wilbur, Aparna Taneja, Corinne Namblard, Oyindamola Adeyemo, Abhishek Dubey, Abiodun Adereni, Milind Tambe, Ayan Mukhopadhyay

    Abstract: More than 5 million children under five years die from largely preventable or treatable medical conditions every year, with an overwhelmingly large proportion of deaths occurring in under-developed countries with low vaccination uptake. One of the United Nations' sustainable development goals (SDG 3) aims to end preventable deaths of newborns and children under five years of age. We focus on Niger…

    Submitted 5 July, 2022; v1 submitted 28 April, 2022; originally announced April 2022.

    Comments: Accepted for publication at International Joint Conference on Artificial Intelligence 2022, AI for Good Track (IJCAI-22)

  25. arXiv:2203.16067  [pdf, other]

    cs.LG cs.AI

    Decision-Focused Learning without Differentiable Optimization: Learning Locally Optimized Decision Losses

    Authors: Sanket Shah, Kai Wang, Bryan Wilder, Andrew Perrault, Milind Tambe

    Abstract: Decision-Focused Learning (DFL) is a paradigm for tailoring a predictive model to a downstream optimization task that uses its predictions in order to perform better on that specific task. The main technical challenge associated with DFL is that it requires being able to differentiate through the optimization problem, which is difficult due to discontinuous solutions and other challenges. Past wor…

    Submitted 8 November, 2022; v1 submitted 30 March, 2022; originally announced March 2022.

    Comments: 16 pages, 5 figures, 3 tables

  26. arXiv:2202.14010   

    cs.CR cs.AI cs.GT cs.LG

    Proceedings of the Artificial Intelligence for Cyber Security (AICS) Workshop at AAAI 2022

    Authors: James Holt, Edward Raff, Ahmad Ridley, Dennis Ross, Arunesh Sinha, Diane Staheli, William Streilen, Milind Tambe, Yevgeniy Vorobeychik, Allan Wollaber

    Abstract: The workshop will focus on the application of AI to problems in cyber security. Cyber systems generate large volumes of data; utilizing this effectively is beyond human capabilities. Additionally, adversaries continue to develop new attacks. Hence, AI methods are required to understand and protect the cyber domain. These challenges are widely studied in enterprise networks, but there are many gaps…

    Submitted 1 March, 2022; v1 submitted 28 February, 2022; originally announced February 2022.

  27. arXiv:2202.00916  [pdf, other]

    cs.LG cs.AI

    Scalable Decision-Focused Learning in Restless Multi-Armed Bandits with Application to Maternal and Child Health

    Authors: Kai Wang, Shresth Verma, Aditya Mate, Sanket Shah, Aparna Taneja, Neha Madhiwalla, Aparna Hegde, Milind Tambe

    Abstract: This paper studies restless multi-armed bandit (RMAB) problems with unknown arm transition dynamics but with known correlated arm features. The goal is to learn a model to predict transition dynamics given features, where the Whittle index policy solves the RMAB problems using predicted transitions. However, prior works often learn the model by maximizing the predictive accuracy instead of final R…

    Submitted 13 August, 2023; v1 submitted 2 February, 2022; originally announced February 2022.

  28. arXiv:2201.12408  [pdf, other]

    cs.LG

    Networked Restless Multi-Armed Bandits for Mobile Interventions

    Authors: Han-Ching Ou, Christoph Siebenbrunner, Jackson Killian, Meredith B Brooks, David Kempe, Yevgeniy Vorobeychik, Milind Tambe

    Abstract: Motivated by a broad class of mobile intervention problems, we propose and study restless multi-armed bandits (RMABs) with network effects. In our model, arms are partially recharging and connected through a graph, so that pulling one arm also improves the state of neighboring arms, significantly extending the previously studied setting of fully recharging bandits with no network effects. In mobil…

    Submitted 28 January, 2022; originally announced January 2022.

  29. arXiv:2109.10637  [pdf, other]

    cs.AI

    Facilitating human-wildlife cohabitation through conflict prediction

    Authors: Susobhan Ghosh, Pradeep Varakantham, Aniket Bhatkhande, Tamanna Ahmad, Anish Andheria, Wenjun Li, Aparna Taneja, Divy Thakkar, Milind Tambe

    Abstract: With increasing world population and expanded use of forests as cohabited regions, interactions and conflicts with wildlife are increasing, leading to large-scale loss of lives (animal and human) and livelihoods (economic). While community knowledge is valuable, forest officials and conservation organisations can greatly benefit from predictive analysis of human-wildlife conflict, leading to targe…

    Submitted 22 September, 2021; originally announced September 2021.

    Comments: 7 pages, 4 figures

  30. arXiv:2109.08075  [pdf, other]

    cs.LG cs.AI

    Field Study in Deploying Restless Multi-Armed Bandits: Assisting Non-Profits in Improving Maternal and Child Health

    Authors: Aditya Mate, Lovish Madaan, Aparna Taneja, Neha Madhiwalla, Shresth Verma, Gargi Singh, Aparna Hegde, Pradeep Varakantham, Milind Tambe

    Abstract: The widespread availability of cell phones has enabled non-profits to deliver critical health information to their beneficiaries in a timely manner. This paper describes our work to assist non-profits that employ automated messaging programs to deliver timely preventive care information to beneficiaries (new and expecting mothers) during pregnancy and after delivery. Unfortunately, a key challenge…

    Submitted 27 October, 2021; v1 submitted 16 September, 2021; originally announced September 2021.

  31. arXiv:2107.03003  [pdf, other]

    cs.LG cs.AI stat.ML

    Harnessing Heterogeneity: Learning from Decomposed Feedback in Bayesian Modeling

    Authors: Kai Wang, Bryan Wilder, Sze-chuan Suen, Bistra Dilkina, Milind Tambe

    Abstract: There is significant interest in learning and optimizing a complex system composed of multiple sub-components, where these components may be agents or autonomous sensors. Among the rich literature on this topic, agent-based and domain-specific simulations can capture complex dynamics and subgroup interaction, but optimizing over such simulations can be computationally and algorithmically challengi…

    Submitted 6 July, 2021; originally announced July 2021.

  32. arXiv:2107.01689  [pdf, other]

    cs.LG

    Restless and Uncertain: Robust Policies for Restless Bandits via Deep Multi-Agent Reinforcement Learning

    Authors: Jackson A. Killian, Lily Xu, Arpita Biswas, Milind Tambe

    Abstract: We introduce robustness in restless multi-armed bandits (RMABs), a popular model for constrained resource allocation among independent stochastic processes (arms). Nearly all RMAB techniques assume stochastic dynamics are precisely known. However, in many real-world settings, dynamics are estimated with significant uncertainty, e.g., via historical data, which can lead to bad outco…

    Submitted 21 June, 2022; v1 submitted 4 July, 2021; originally announced July 2021.

    Comments: 16 pages, 4 figures

  33. Q-Learning Lagrange Policies for Multi-Action Restless Bandits

    Authors: Jackson A. Killian, Arpita Biswas, Sanket Shah, Milind Tambe

    Abstract: Multi-action restless multi-armed bandits (RMABs) are a powerful framework for constrained resource allocation in which $N$ independent processes are managed. However, previous work studies only the offline setting where problem dynamics are known. We address this restrictive assumption, designing the first algorithms for learning good policies for Multi-action RMABs online using combinations of Lag…

    Submitted 22 June, 2021; originally announced June 2021.

    Comments: 13 pages, 6 figures, to be published in Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery and Data

  34. arXiv:2106.08413  [pdf, other]

    cs.LG cs.AI cs.MA

    Robust Reinforcement Learning Under Minimax Regret for Green Security

    Authors: Lily Xu, Andrew Perrault, Fei Fang, Haipeng Chen, Milind Tambe

    Abstract: Green security domains feature defenders who plan patrols in the face of uncertainty about the adversarial behavior of poachers, illegal loggers, and illegal fishers. Importantly, the deterrence effect of patrols on adversaries' future behavior makes patrol planning a sequential decision-making problem. Therefore, we focus on robust sequential patrol planning for green security following the minim…

    Submitted 15 June, 2021; originally announced June 2021.

    Comments: Accepted at the Conference on Uncertainty in Artificial Intelligence (UAI) 2021. 11 pages, 5 figures

  35. arXiv:2106.07039  [pdf, other]

    cs.SI cs.AI cs.LG

    Contingency-Aware Influence Maximization: A Reinforcement Learning Approach

    Authors: Haipeng Chen, Wei Qiu, Han-Ching Ou, Bo An, Milind Tambe

    Abstract: The influence maximization (IM) problem aims at finding a subset of seed nodes in a social network that maximize the spread of influence. In this study, we focus on a sub-class of IM problems, where whether the nodes are willing to be the seeds when being invited is uncertain, called contingency-aware IM. Such contingency aware IM is critical for applications for non-profit organizations in low re…

    Submitted 13 June, 2021; originally announced June 2021.

    Comments: 11 pages; accepted for publication at UAI 2021

  36. arXiv:2106.06060  [pdf, other]

    cs.MA cs.AI cs.GT

    AI-driven Prices for Externalities and Sustainability in Production Markets

    Authors: Panayiotis Danassis, Aris Filos-Ratsikas, Haipeng Chen, Milind Tambe, Boi Faltings

    Abstract: Traditional competitive markets do not account for negative externalities; indirect costs that some participants impose on others, such as the cost of over-appropriating a common-pool resource (which diminishes future stock, and thus harvest, for everyone). Quantifying appropriate interventions to market prices has proven to be quite challenging. We propose a practical approach to computing market…

    Submitted 12 January, 2023; v1 submitted 10 June, 2021; originally announced June 2021.

    Comments: Accepted to the 22nd International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2023)

  37. arXiv:2106.04663  [pdf, other]

    cs.GT

    Solving Structured Hierarchical Games Using Differential Backward Induction

    Authors: Zun Li, Feiran Jia, Aditya Mate, Shahin Jabbari, Mithun Chakraborty, Milind Tambe, Yevgeniy Vorobeychik

    Abstract: From large-scale organizations to decentralized political systems, hierarchical strategic decision making is commonplace. We introduce a novel class of structured hierarchical games (SHGs) that formally capture such hierarchical strategic interactions. In an SHG, each player is a node in a tree, and strategic choices of players are sequenced from root to leaves, with root moving first, followed by…

    Submitted 27 June, 2022; v1 submitted 8 June, 2021; originally announced June 2021.

    Comments: The short version of this paper appears in the proceedings of UAI-22

  38. arXiv:2106.03279  [pdf, other]

    cs.LG

    Learning MDPs from Features: Predict-Then-Optimize for Sequential Decision Problems by Reinforcement Learning

    Authors: Kai Wang, Sanket Shah, Haipeng Chen, Andrew Perrault, Finale Doshi-Velez, Milind Tambe

    Abstract: In the predict-then-optimize framework, the objective is to train a predictive model, mapping from environment features to parameters of an optimization problem, which maximizes decision quality when the optimization is subsequently solved. Recent work on decision-focused learning shows that embedding the optimization problem in the training pipeline can improve decision quality and help generaliz…

    Submitted 16 July, 2022; v1 submitted 6 June, 2021; originally announced June 2021.

  39. arXiv:2106.03278  [pdf, other]

    cs.GT

    Coordinating Followers to Reach Better Equilibria: End-to-End Gradient Descent for Stackelberg Games

    Authors: Kai Wang, Lily Xu, Andrew Perrault, Michael K. Reiter, Milind Tambe

    Abstract: A growing body of work in game theory extends the traditional Stackelberg game to settings with one leader and multiple followers who play a Nash equilibrium. Standard approaches for computing equilibria in these games reformulate the followers' best response as constraints in the leader's optimization problem. These reformulation approaches can sometimes be effective, but often get trapped in low…

    Submitted 3 December, 2021; v1 submitted 6 June, 2021; originally announced June 2021.

  40. arXiv:2105.07965  [pdf, other

    cs.LG cs.AI

    Learn to Intervene: An Adaptive Learning Policy for Restless Bandits in Application to Preventive Healthcare

    Authors: Arpita Biswas, Gaurav Aggarwal, Pradeep Varakantham, Milind Tambe

    Abstract: In many public health settings, it is important for patients to adhere to health programs, such as taking medications and periodic health checks. Unfortunately, beneficiaries may gradually disengage from such programs, which is detrimental to their health. A concrete example of gradual disengagement has been observed by an organization that carries out a free automated call-based program for sprea…

    Submitted 22 July, 2021; v1 submitted 17 May, 2021; originally announced May 2021.

    Comments: To appear in the 30th International Joint Conference on Artificial Intelligence (IJCAI 2021)

  41. arXiv:2103.09052  [pdf, other

    cs.LG

    Selective Intervention Planning using Restless Multi-Armed Bandits to Improve Maternal and Child Health Outcomes

    Authors: Siddharth Nishtala, Lovish Madaan, Aditya Mate, Harshavardhan Kamarthi, Anirudh Grama, Divy Thakkar, Dhyanesh Narayanan, Suresh Chaudhary, Neha Madhiwalla, Ramesh Padmanabhan, Aparna Hegde, Pradeep Varakantham, Balaraman Ravindran, Milind Tambe

    Abstract: India has a maternal mortality ratio of 113 and child mortality ratio of 2830 per 100,000 live births. Lack of access to preventive care information is a major contributing factor for these deaths, especially in low resource households. We partner with ARMMAN, a non-profit based in India employing a call-based information program to disseminate health-related information to pregnant women and wome…

    Submitted 18 October, 2021; v1 submitted 7 March, 2021; originally announced March 2021.

    Comments: 7 pages. Camera-ready version for AASG 2021 Workshop

  42. arXiv:2103.04730  [pdf, other

    cs.LG cs.AI

    Efficient Algorithms for Finite Horizon and Streaming Restless Multi-Armed Bandit Problems

    Authors: Aditya Mate, Arpita Biswas, Christoph Siebenbrunner, Susobhan Ghosh, Milind Tambe

    Abstract: We propose Streaming Bandits, a Restless Multi Armed Bandit (RMAB) framework in which heterogeneous arms may arrive and leave the system after staying on for a finite lifetime. Streaming Bandits naturally capture the health intervention planning problem, where health workers must manage the health outcomes of a patient cohort while new patients join and existing patients leave the cohort each day.…

    Submitted 15 February, 2022; v1 submitted 8 March, 2021; originally announced March 2021.

  43. arXiv:2102.10646  [pdf, other

    cs.GT cs.MA

    A Game-Theoretic Approach for Hierarchical Epidemic Control

    Authors: Feiran Jia, Aditya Mate, Zun Li, Shahin Jabbari, Mithun Chakraborty, Milind Tambe, Michael Wellman, Yevgeniy Vorobeychik

    Abstract: We design and analyze a multi-level game-theoretic model of hierarchical policy interventions for epidemic control, such as those in response to the COVID-19 pandemic. Our model captures the potentially mismatched priorities among a hierarchy of policy-makers (e.g., federal, state, and local governments) with respect to two cost components that have opposite dependence on the policy strength -- po…

    Submitted 3 August, 2022; v1 submitted 21 February, 2021; originally announced February 2021.

  44. arXiv:2101.02766  [pdf, other

    cs.LG cs.MA

    Active Screening for Recurrent Diseases: A Reinforcement Learning Approach

    Authors: Han-Ching Ou, Haipeng Chen, Shahin Jabbari, Milind Tambe

    Abstract: Active screening is a common approach in controlling the spread of recurring infectious diseases such as tuberculosis and influenza. In this approach, health workers periodically select a subset of population for screening. However, given the limited number of health workers, only a small subset of the population can be visited in any given time period. Given the recurrent nature of the disease an…

    Submitted 19 April, 2021; v1 submitted 7 January, 2021; originally announced January 2021.

    Comments: The short version of this paper appears in the proceedings of AAMAS-21

  45. arXiv:2012.12839  [pdf, other

    cs.MA

    Cohorting to isolate asymptomatic spreaders: An agent-based simulation study on the Mumbai Suburban Railway

    Authors: Alok Talekar, Sharad Shriram, Nidhin Vaidhiyan, Gaurav Aggarwal, Jiangzhuo Chen, Srini Venkatramanan, Lijing Wang, Aniruddha Adiga, Adam Sadilek, Ashish Tendulkar, Madhav Marathe, Rajesh Sundaresan, Milind Tambe

    Abstract: The Mumbai Suburban Railways, \emph{locals}, are a key transit infrastructure of the city and is crucial for resuming normal economic activity. To reduce disease transmission, policymakers can enforce reduced crowding and mandate wearing of masks. \emph{Cohorting} -- forming groups of travelers that always travel together, is an additional policy to reduce disease transmission on \textit{locals} w…

    Submitted 24 December, 2020; v1 submitted 23 December, 2020; originally announced December 2020.

    Comments: Will be presented at AAMAS 2021. Minor edits to styling (per conf guidelines) and acknowledgement

  46. Reinforcement Learning for Unified Allocation and Patrolling in Signaling Games with Uncertainty

    Authors: Aravind Venugopal, Elizabeth Bondi, Harshavardhan Kamarthi, Keval Dholakia, Balaraman Ravindran, Milind Tambe

    Abstract: Green Security Games (GSGs) have been successfully used in the protection of valuable resources such as fisheries, forests and wildlife. While real-world deployment involves both resource allocation and subsequent coordinated patrolling with communication and real-time, uncertain information, previous game models do not fully address both of these stages simultaneously. Furthermore, adopting exist…

    Submitted 18 December, 2020; originally announced December 2020.

    Comments: Accepted at AAMAS 2021

    Report number: page 1353-1361

  47. arXiv:2012.03822  [pdf, other

    cs.LG

    Efficient Reservoir Management through Deep Reinforcement Learning

    Authors: Xinrun Wang, Tarun Nair, Haoyang Li, Yuh Sheng Reuben Wong, Nachiket Kelkar, Srinivas Vaidyanathan, Rajat Nayak, Bo An, Jagdish Krishnaswamy, Milind Tambe

    Abstract: Dams impact downstream river dynamics through flow regulation and disruption of upstream-downstream linkages. However, current dam operation is far from satisfactory due to the inability to respond the complicated and uncertain dynamics of the upstream-downstream system and various usages of the reservoir. Even further, the unsatisfactory dam operation can cause floods in downstream areas. Therefo…

    Submitted 7 December, 2020; originally announced December 2020.

    Comments: 5 pages, 4 figures, Workshop paper

  48. arXiv:2011.10666  [pdf, other

    cs.LG cs.CY

    Enhancing Poaching Predictions for Under-Resourced Wildlife Conservation Parks Using Remote Sensing Imagery

    Authors: Rachel Guo, Lily Xu, Drew Cronin, Francis Okeke, Andrew Plumptre, Milind Tambe

    Abstract: Illegal wildlife poaching is driving the loss of biodiversity. To combat poaching, rangers patrol expansive protected areas for illegal poaching activity. However, rangers often cannot comprehensively search such large parks. Thus, the Protection Assistant for Wildlife Security (PAWS) was introduced as a machine learning approach to help identify the areas with highest poaching risk. As PAWS is de…

    Submitted 20 November, 2020; originally announced November 2020.

    Comments: Presented at NeurIPS 2020 Workshop on Machine Learning for the Developing World. 4 pages, 1 page references. 4 figures, 1 table

  49. arXiv:2011.02962  [pdf, other

    cs.LG cs.AI cs.CY

    Measuring Data Collection Diligence for Community Healthcare

    Authors: Ramesha Karunasena, Mohammad Sarparajul Ambiya, Arunesh Sinha, Ruchit Nagar, Saachi Dalal, Divy Thakkar, Dhyanesh Narayanan, Milind Tambe

    Abstract: Data analytics has tremendous potential to provide targeted benefit in low-resource communities, however the availability of high-quality public health data is a significant challenge in developing countries primarily due to non-diligent data collection by community health workers (CHWs). In this work, we define and test a data collection diligence score. This challenging unlabeled data problem is…

    Submitted 7 April, 2021; v1 submitted 5 November, 2020; originally announced November 2020.

    ACM Class: I.2.1

  50. arXiv:2009.09559  [pdf, other

    cs.SI

    Clinical trial of an AI-augmented intervention for HIV prevention in youth experiencing homelessness

    Authors: Bryan Wilder, Laura Onasch-Vera, Graham Diguiseppi, Robin Petering, Chyna Hill, Amulya Yadav, Eric Rice, Milind Tambe

    Abstract: Youth experiencing homelessness (YEH) are subject to substantially greater risk of HIV infection, compounded both by their lack of access to stable housing and the disproportionate representation of youth of marginalized racial, ethnic, and gender identity groups among YEH. A key goal for health equity is to improve adoption of protective behaviors in this population. One promising strategy for in…

    Submitted 6 November, 2020; v1 submitted 20 September, 2020; originally announced September 2020.

    Report number: Accepted at AAAI 2021