Showing 1–47 of 47 results for author: Tulabandhula, T

  1. arXiv:2406.07833  [pdf, other]

    cs.CV cs.AI

    Sense Less, Generate More: Pre-training LiDAR Perception with Masked Autoencoders for Ultra-Efficient 3D Sensing

    Authors: Sina Tayebati, Theja Tulabandhula, Amit R. Trivedi

    Abstract: In this work, we propose a disruptively frugal LiDAR perception dataflow that generates rather than senses parts of the environment that are either predictable based on the extensive training of the environment or have limited consequence to the overall prediction accuracy. Therefore, the proposed methodology trades off sensing energy with training data for low-power robotics and autonomous naviga… ▽ More

    Submitted 11 June, 2024; originally announced June 2024.

  2. arXiv:2403.02745  [pdf, other]

    cs.AI cs.CL

    CURATRON: Complete Robust Preference Data for Robust Alignment of Large Language Models

    Authors: Son The Nguyen, Niranjan Uma Naresh, Theja Tulabandhula

    Abstract: This paper addresses the challenges of aligning large language models (LLMs) with human values via preference learning (PL), with a focus on the issues of incomplete and corrupted data in preference datasets. We propose a novel method for robustly and completely recalibrating values within these datasets to enhance LLMs' resilience against the issues. In particular, we devise a guaranteed polynomia…

    Submitted 5 March, 2024; originally announced March 2024.

  3. arXiv:2403.00822  [pdf, other]

    cs.IR cs.AI

    InteraRec: Screenshot Based Recommendations Using Multimodal Large Language Models

    Authors: Saketh Reddy Karra, Theja Tulabandhula

    Abstract: Weblogs, comprised of records detailing user activities on any website, offer valuable insights into user preferences, behavior, and interests. Numerous recommendation algorithms, employing strategies such as collaborative filtering, content-based filtering, and hybrid methods, leverage the data mined through these weblogs to provide personalized recommendations to users. Despite the abundance of… ▽ More

    Submitted 15 June, 2024; v1 submitted 26 February, 2024; originally announced March 2024.

  4. arXiv:2402.07107  [pdf, other]

    cs.LG cs.AI

    Echoes of Socratic Doubt: Embracing Uncertainty in Calibrated Evidential Reinforcement Learning

    Authors: Alex Christopher Stutts, Danilo Erricolo, Theja Tulabandhula, Amit Ranjan Trivedi

    Abstract: We present a novel statistical approach to incorporating uncertainty awareness in model-free distributional reinforcement learning involving quantile regression-based deep Q networks. The proposed algorithm, $\textit{Calibrated Evidential Quantile Regression in Deep Q Networks (CEQR-DQN)}$, aims to address key challenges associated with separately estimating aleatoric and epistemic uncertainty in… ▽ More

    Submitted 3 June, 2024; v1 submitted 11 February, 2024; originally announced February 2024.

  5. arXiv:2312.06826  [pdf, other]

    cs.AI cs.HC

    User Friendly and Adaptable Discriminative AI: Using the Lessons from the Success of LLMs and Image Generation Models

    Authors: Son The Nguyen, Theja Tulabandhula, Mary Beth Watson-Manheim

    Abstract: While there is significant interest in using generative AI tools as general-purpose models for specific ML applications, discriminative models are much more widely deployed currently. One of the key shortcomings of these discriminative AI tools that have already been deployed is that they are not adaptable and user-friendly compared to generative AI tools (e.g., GPT4, Stable Diffusion, Bard, etc.)…

    Submitted 11 December, 2023; originally announced December 2023.

  6. arXiv:2311.12241  [pdf, other]

    cs.AI

    InteraSSort: Interactive Assortment Planning Using Large Language Models

    Authors: Saketh Reddy Karra, Theja Tulabandhula

    Abstract: Assortment planning, integral to multiple commercial offerings, is a key problem studied in e-commerce and retail settings. Numerous variants of the problem along with their integration into business solutions have been thoroughly investigated in the existing literature. However, the nuanced complexities of in-store planning and a lack of optimization proficiency among store planners with strong d… ▽ More

    Submitted 9 January, 2024; v1 submitted 20 November, 2023; originally announced November 2023.

  7. arXiv:2309.11069  [pdf, other]

    cs.CV cs.AI

    Dynamic Tiling: A Model-Agnostic, Adaptive, Scalable, and Inference-Data-Centric Approach for Efficient and Accurate Small Object Detection

    Authors: Son The Nguyen, Theja Tulabandhula, Duy Nguyen

    Abstract: We introduce Dynamic Tiling, a model-agnostic, adaptive, and scalable approach for small object detection, anchored in our inference-data-centric philosophy. Dynamic Tiling starts with non-overlapping tiles for initial detections and utilizes dynamic overlapping rates along with a tile minimizer. This dual approach effectively resolves fragmented objects, improves detection accuracy, and minimizes…

    Submitted 20 September, 2023; originally announced September 2023.

  8. arXiv:2309.11018  [pdf, other]

    cs.LG cs.CV cs.RO

    Conformalized Multimodal Uncertainty Regression and Reasoning

    Authors: Domenico Parente, Nastaran Darabi, Alex C. Stutts, Theja Tulabandhula, Amit Ranjan Trivedi

    Abstract: This paper introduces a lightweight uncertainty estimator capable of predicting multimodal (disjoint) uncertainty bounds by integrating conformal prediction with a deep-learning regressor. We specifically discuss its application for visual odometry (VO), where environmental features such as flying domain symmetries and sensor measurements under ambiguities and occlusion can result in multimodal un… ▽ More

    Submitted 19 September, 2023; originally announced September 2023.

  9. arXiv:2309.11006  [pdf, other]

    cs.RO cs.CV

    STARNet: Sensor Trustworthiness and Anomaly Recognition via Approximated Likelihood Regret for Robust Edge Autonomy

    Authors: Nastaran Darabi, Sina Tayebati, Sureshkumar S., Sathya Ravi, Theja Tulabandhula, Amit R. Trivedi

    Abstract: Complex sensors such as LiDAR, RADAR, and event cameras have proliferated in autonomous robotics to enhance perception and understanding of the environment. Meanwhile, these sensors are also vulnerable to diverse failure mechanisms that can intricately interact with their operation environment. In parallel, the limited availability of training data on complex sensors also affects the reliability o… ▽ More

    Submitted 19 September, 2023; originally announced September 2023.

  10. arXiv:2309.09593  [pdf, other]

    cs.CV cs.IT cs.RO

    Mutual Information-calibrated Conformal Feature Fusion for Uncertainty-Aware Multimodal 3D Object Detection at the Edge

    Authors: Alex C. Stutts, Danilo Erricolo, Sathya Ravi, Theja Tulabandhula, Amit Ranjan Trivedi

    Abstract: In the expanding landscape of AI-enabled robotics, robust quantification of predictive uncertainties is of great importance. Three-dimensional (3D) object detection, a critical robotics operation, has seen significant advancements; however, the majority of current works focus only on accuracy and ignore uncertainty quantification. Addressing this gap, our novel study integrates the principles of c… ▽ More

    Submitted 18 September, 2023; originally announced September 2023.

  11. arXiv:2308.14182  [pdf, other]

    cs.CL

    Generative AI for Business Strategy: Using Foundation Models to Create Business Strategy Tools

    Authors: Son The Nguyen, Theja Tulabandhula

    Abstract: Generative models (foundation models) such as LLMs (large language models) are having a large impact on multiple fields. In this work, we propose the use of such models for business decision making. In particular, we combine unstructured textual data sources (e.g., news data) with multiple foundation models (namely, GPT4, transformer-based Named Entity Recognition (NER) models and Entailment-based… ▽ More

    Submitted 27 August, 2023; originally announced August 2023.

  12. arXiv:2303.02207  [pdf, other]

    cs.CV cs.AI cs.LG cs.RO eess.IV

    Lightweight, Uncertainty-Aware Conformalized Visual Odometry

    Authors: Alex C. Stutts, Danilo Erricolo, Theja Tulabandhula, Amit Ranjan Trivedi

    Abstract: Data-driven visual odometry (VO) is a critical subroutine for autonomous edge robotics, and recent progress in the field has produced highly accurate point predictions in complex environments. However, emerging autonomous edge robotics devices like insect-scale drones and surgical robots lack a computationally efficient framework to estimate VO's predictive uncertainties. Meanwhile, as edge roboti… ▽ More

    Submitted 3 March, 2023; originally announced March 2023.

  13. arXiv:2210.15559  [pdf, other]

    cs.CV cs.AI cs.LG cs.RO eess.IV

    Robust Monocular Localization of Drones by Adapting Domain Maps to Depth Prediction Inaccuracies

    Authors: Priyesh Shukla, Sureshkumar S., Alex C. Stutts, Sathya Ravi, Theja Tulabandhula, Amit R. Trivedi

    Abstract: We present a novel monocular localization framework by jointly training deep learning-based depth prediction and Bayesian filtering-based pose reasoning. The proposed cross-modal framework significantly outperforms deep learning-only predictions with respect to model scalability and tolerance to environmental variations. Specifically, we show little-to-no degradation of pose accuracy even with ext… ▽ More

    Submitted 27 October, 2022; originally announced October 2022.

  14. arXiv:2207.02328  [pdf, other]

    q-bio.NC cs.LG

    Unified Embeddings of Structural and Functional Connectome via a Function-Constrained Structural Graph Variational Auto-Encoder

    Authors: Carlo Amodeo, Igor Fortel, Olusola Ajilore, Liang Zhan, Alex Leow, Theja Tulabandhula

    Abstract: Graph theoretical analyses have become standard tools in modeling functional and anatomical connectivity in the brain. With the advent of connectomics, the primary graphs or networks of interest are structural connectome (derived from DTI tractography) and functional connectome (derived from resting-state fMRI). However, most published connectome studies have focused on either structural or functi… ▽ More

    Submitted 5 July, 2022; originally announced July 2022.

  15. arXiv:2204.12000  [pdf, other]

    cs.CL cs.AI

    Estimating the Personality of White-Box Language Models

    Authors: Saketh Reddy Karra, Son The Nguyen, Theja Tulabandhula

    Abstract: Technology for open-ended language generation, a key application of artificial intelligence, has advanced to a great extent in recent years. Large-scale language models, which are trained on large corpora of text, are being used in a wide range of applications everywhere, from virtual assistants to conversational bots. While these language models output fluent text, existing research shows that th… ▽ More

    Submitted 10 May, 2023; v1 submitted 25 April, 2022; originally announced April 2022.

  16. arXiv:2104.05217  [pdf, other]

    eess.SY cs.LG

    ENOS: Energy-Aware Network Operator Search for Hybrid Digital and Compute-in-Memory DNN Accelerators

    Authors: Shamma Nasrin, Ahish Shylendra, Yuti Kadakia, Nick Iliev, Wilfred Gomes, Theja Tulabandhula, Amit Ranjan Trivedi

    Abstract: This work proposes a novel Energy-Aware Network Operator Search (ENOS) approach to address the energy-accuracy trade-offs of a deep neural network (DNN) accelerator. In recent years, novel inference operators have been proposed to improve the computational efficiency of a DNN. Augmenting the operators, their corresponding novel computing modes have also been explored. However, simplification of DN… ▽ More

    Submitted 12 April, 2021; originally announced April 2021.

  17. arXiv:2104.00801  [pdf, other]

    cs.SI cs.LG

    Choice-Aware User Engagement Modeling and Optimization on Social Media

    Authors: Saketh Reddy Karra, Theja Tulabandhula

    Abstract: We address the problem of maximizing user engagement with content (in the form of like, reply, retweet, and retweet with comments) on the Twitter platform. We formulate the engagement forecasting task as a multi-label classification problem that captures choice behavior on an unsupervised clustering of tweet-topics. We propose a neural network architecture that incorporates user engagement history…

    Submitted 1 April, 2021; originally announced April 2021.

    Comments: 11 pages, 1 figure

  18. arXiv:2102.08247   

    cs.RO cs.AR eess.IV

    Probabilistic Localization of Insect-Scale Drones on Floating-Gate Inverter Arrays

    Authors: Priyesh Shukla, Ankith Muralidhar, Nick Iliev, Theja Tulabandhula, Sawyer B. Fuller, Amit Ranjan Trivedi

    Abstract: We propose a novel compute-in-memory (CIM)-based ultra-low-power framework for probabilistic localization of insect-scale drones. The conventional probabilistic localization approaches rely on the three-dimensional (3D) Gaussian Mixture Model (GMM)-based representation of a 3D map. A GMM model with hundreds of mixture functions is typically needed to adequately learn and represent the intricacies… ▽ More

    Submitted 24 May, 2021; v1 submitted 16 February, 2021; originally announced February 2021.

    Comments: Will submit the revised article.

    ACM Class: B.7; I.2.9

  19. arXiv:2012.11715  [pdf, other]

    cs.AI q-fin.PM

    Off-Policy Optimization of Portfolio Allocation Policies under Constraints

    Authors: Nymisha Bandi, Theja Tulabandhula

    Abstract: The dynamic portfolio optimization problem in finance frequently requires learning policies that adhere to various constraints, driven by investor preferences and risk. We motivate this problem of finding an allocation policy within a sequential decision making framework and study the effects of: (a) using data collected under previously employed policies, which may be sub-optimal and constraint-v… ▽ More

    Submitted 21 December, 2020; originally announced December 2020.

  20. arXiv:2012.03323  [pdf, other]

    cs.IR

    KATRec: Knowledge Aware aTtentive Sequential Recommendations

    Authors: Mehrnaz Amjadi, Seyed Danial Mohseni Taheri, Theja Tulabandhula

    Abstract: Sequential recommendation systems model dynamic preferences of users based on their historical interactions with platforms. Despite recent progress, modeling short-term and long-term behavior of users in such systems is nontrivial and challenging. To address this, we present a solution enhanced by a knowledge graph called KATRec (Knowledge Aware aTtentive sequential Recommendations). KATRec learns… ▽ More

    Submitted 5 July, 2021; v1 submitted 6 December, 2020; originally announced December 2020.

  21. arXiv:2011.14430  [pdf]

    cs.AI cs.LG

    Deep Reinforcement Learning for Crowdsourced Urban Delivery: System States Characterization, Heuristics-guided Action Choice, and Rule-Interposing Integration

    Authors: Tanvir Ahamed, Bo Zou, Nahid Parvez Farazi, Theja Tulabandhula

    Abstract: This paper investigates the problem of assigning shipping requests to ad hoc couriers in the context of crowdsourced urban delivery. The shipping requests are spatially distributed each with a limited time window between the earliest time for pickup and latest time for delivery. The ad hoc couriers, termed crowdsourcees, also have limited time availability and carrying capacity. We propose a new d…

    Submitted 29 November, 2020; originally announced November 2020.

    Comments: 50 pages, 17 figures

  22. arXiv:2011.14033  [pdf, other]

    cs.LG stat.ML

    A Tractable Online Learning Algorithm for the Multinomial Logit Contextual Bandit

    Authors: Priyank Agrawal, Theja Tulabandhula, Vashist Avadhanula

    Abstract: In this paper, we consider the contextual variant of the MNL-Bandit problem. More specifically, we consider a dynamic set optimization problem, where a decision-maker offers a subset (assortment) of products to a consumer and observes the response in every round. Consumers purchase products to maximize their utility. We assume that a set of attributes describe the products, and the mean utility of… ▽ More

    Submitted 14 April, 2024; v1 submitted 27 November, 2020; originally announced November 2020.

    Comments: Bug fixed

  23. arXiv:2006.10356  [pdf, other]

    cs.LG cs.AI stat.ML

    Learning by Repetition: Stochastic Multi-armed Bandits under Priming Effect

    Authors: Priyank Agrawal, Theja Tulabandhula

    Abstract: We study the effect of persistence of engagement on learning in a stochastic multi-armed bandit setting. In advertising and recommendation systems, repetition effect includes a wear-in period, where the user's propensity to reward the platform via a click or purchase depends on how frequently they see the recommendation in the recent past. It also includes a counteracting wear-out period, where th… ▽ More

    Submitted 18 June, 2020; originally announced June 2020.

    Comments: Appears in the 36th Conference on Uncertainty in Artificial Intelligence (UAI 2020)

  24. arXiv:2006.08055  [pdf, other]

    cs.IR cs.AI econ.TH

    Multi-Purchase Behavior: Modeling, Estimation and Optimization

    Authors: Theja Tulabandhula, Deeksha Sinha, Saketh Reddy Karra, Prasoon Patidar

    Abstract: We study the problem of modeling purchase of multiple products and utilizing it to display optimized recommendations for online retailers and e-commerce platforms. We present a parsimonious multi-purchase family of choice models called the Bundle-MVL-K family, and develop a binary search based iterative strategy that efficiently computes optimized recommendations for this model. We establish the… ▽ More

    Submitted 5 August, 2023; v1 submitted 14 June, 2020; originally announced June 2020.

    Comments: 48 pages. Published in Manufacturing & Service Operations Management 2023

  25. arXiv:2003.04736  [pdf, other]

    cs.AI cs.CE stat.AP

    Optimizing Revenue while showing Relevant Assortments at Scale

    Authors: Theja Tulabandhula, Deeksha Sinha, Saketh Karra

    Abstract: Scalable real-time assortment optimization has become essential in e-commerce operations due to the need for personalization and the availability of a large variety of items. While this can be done when there are simplistic assortment choices to be made, the optimization process becomes difficult when imposing constraints on the collection of relevant assortments based on insights by store-manager… ▽ More

    Submitted 1 March, 2021; v1 submitted 6 March, 2020; originally announced March 2020.

    Comments: 53 pages, 10 figures

  26. arXiv:2003.02629  [pdf, other]

    eess.SP cs.LG eess.IV

    $MC^2RAM$: Markov Chain Monte Carlo Sampling in SRAM for Fast Bayesian Inference

    Authors: Priyesh Shukla, Ahish Shylendra, Theja Tulabandhula, Amit Ranjan Trivedi

    Abstract: This work discusses the implementation of Markov Chain Monte Carlo (MCMC) sampling from an arbitrary Gaussian mixture model (GMM) within SRAM. We show a novel architecture of SRAM by embedding it with random number generators (RNGs), digital-to-analog converters (DACs), and analog-to-digital converters (ADCs) so that SRAM arrays can be used for high performance Metropolis-Hastings (MH) algorithm-b… ▽ More

    Submitted 28 February, 2020; originally announced March 2020.

    Comments: This paper has been accepted at the IEEE International Symposium on Circuits and Systems (ISCAS) to be held in May 2020 at Seville, Spain

  27. arXiv:2001.07853  [pdf, other]

    cs.LG cs.IR stat.ML

    Incentivising Exploration and Recommendations for Contextual Bandits with Payments

    Authors: Priyank Agrawal, Theja Tulabandhula

    Abstract: We propose a contextual bandit based model to capture the learning and social welfare goals of a web platform in the presence of myopic users. By using payments to incentivize these agents to explore different items/recommendations, we show how the platform can learn the inherent attributes of items and achieve a sublinear regret while maximizing cumulative social welfare. We also calculate theore… ▽ More

    Submitted 21 January, 2020; originally announced January 2020.

    Comments: 11 pages, 4 figures

  28. arXiv:1911.08518  [pdf, other]

    cs.NE eess.SP

    Supported-BinaryNet: Bitcell Array-based Weight Supports for Dynamic Accuracy-Latency Trade-offs in SRAM-based Binarized Neural Network

    Authors: Shamma Nasrin, Srikanth Ramakrishna, Theja Tulabandhula, Amit Ranjan Trivedi

    Abstract: In this work, we introduce bitcell array-based support parameters to improve the prediction accuracy of SRAM-based binarized neural network (SRAM-BNN). Our approach enhances the training weight space of SRAM-BNN while requiring minimal overheads to a typical design. More flexibility of the weight space leads to higher prediction accuracy in our design. We adapt row digital-to-analog (DAC) converte… ▽ More

    Submitted 26 November, 2019; v1 submitted 19 November, 2019; originally announced November 2019.

  29. arXiv:1901.07734  [pdf, ps, other]

    cs.LG cs.IR stat.ML

    Thompson Sampling for a Fatigue-aware Online Recommendation System

    Authors: Yunjuan Wang, Theja Tulabandhula

    Abstract: In this paper we consider an online recommendation setting, where a platform recommends a sequence of items to its users at every time period. The users respond by selecting one of the items recommended or abandon the platform due to fatigue from seeing less useful items. Assuming a parametric stochastic model of user behavior, which captures positional effects of these items as well as the abando… ▽ More

    Submitted 14 April, 2019; v1 submitted 23 January, 2019; originally announced January 2019.

  30. arXiv:1811.09026  [pdf, other]

    cs.LG cs.AI stat.ML

    Bandits with Temporal Stochastic Constraints

    Authors: Priyank Agrawal, Theja Tulabandhula

    Abstract: We study the effect of impairment on stochastic multi-armed bandits and develop new ways to mitigate it. Impairment effect is the phenomenon where an agent only accrues reward for an action if they have played it at least a few times in the recent past. It is practically motivated by repetition and recency effects in domains such as advertising (here consumer behavior may require repeat actions by…

    Submitted 20 June, 2020; v1 submitted 22 November, 2018; originally announced November 2018.

    Comments: An extended abstract appeared in the 4th Multi-disciplinary Conference on Reinforcement Learning and Decision Making (RLDM 2019)

  31. arXiv:1804.08796  [pdf, other]

    stat.ML cs.LG stat.AP

    Block-Structure Based Time-Series Models For Graph Sequences

    Authors: Mehrnaz Amjadi, Theja Tulabandhula

    Abstract: Although the computational and statistical trade-off for modeling single graphs, for instance, using block models is relatively well understood, extending such results to sequences of graphs has proven to be difficult. In this work, we take a step in this direction by proposing two models for graph sequences that capture: (a) link persistence between nodes across time, and (b) community persistenc… ▽ More

    Submitted 18 September, 2018; v1 submitted 23 April, 2018; originally announced April 2018.

    Comments: 40 pages, 10 figures

  32. arXiv:1803.01968  [pdf, ps, other]

    stat.ML cs.LG econ.EM math.OC

    An Online Algorithm for Learning Buyer Behavior under Realistic Pricing Restrictions

    Authors: Debjyoti Saharoy, Theja Tulabandhula

    Abstract: We propose a new efficient online algorithm to learn the parameters governing the purchasing behavior of a utility maximizing buyer, who responds to prices, in a repeated interaction setting. The key feature of our algorithm is that it can learn even non-linear buyer utility while working with arbitrary price constraints that the seller may impose. This overcomes a major shortcoming of previous ap… ▽ More

    Submitted 5 March, 2018; originally announced March 2018.

  33. arXiv:1710.03275  [pdf, ps, other]

    cs.IR cs.CR

    Privacy-preserving Targeted Advertising

    Authors: Theja Tulabandhula, Shailesh Vaya, Aritra Dhar

    Abstract: Recommendation systems form the center piece of a rapidly growing trillion dollar online advertisement industry. Even with numerous optimizations and approximations, collaborative filtering (CF) based approaches require real-time computations involving very large vectors. Curating and storing such related profile information vectors on web portals seriously breaches the user's privacy. Modifying s… ▽ More

    Submitted 17 June, 2018; v1 submitted 9 October, 2017; originally announced October 2017.

    Comments: A preliminary version was presented at the 11th INFORMS Workshop on Data Mining and Decision Analytics (2016)

  34. arXiv:1708.05510  [pdf, other]

    math.OC cs.DS

    Optimizing Revenue over Data-driven Assortments

    Authors: Deeksha Sinha, Theja Tulabandhula

    Abstract: We revisit the problem of large-scale assortment optimization under the multinomial logit choice model without any assumptions on the structure of the feasible assortments. Scalable real-time assortment optimization has become essential in e-commerce operations due to the need for personalization and the availability of a large variety of items. While this can be done when there are simplistic ass… ▽ More

    Submitted 1 May, 2018; v1 submitted 18 August, 2017; originally announced August 2017.

    Comments: 28 pages, 4 figures

  35. arXiv:1706.02999  [pdf, other]

    stat.ML cs.AI cs.LG

    Symmetry Learning for Function Approximation in Reinforcement Learning

    Authors: Anuj Mahajan, Theja Tulabandhula

    Abstract: In this paper we explore methods to exploit symmetries for ensuring sample efficiency in reinforcement learning (RL). This problem deserves ever-increasing attention given the recent advances in the use of deep networks for complex RL tasks, which require large amounts of training data. We introduce a novel method to detect symmetries using reward trails observed during episodic experience and prove…

    Submitted 9 June, 2017; originally announced June 2017.

    Comments: 12 pages, 3 figures. A preliminary version appears in AAMAS 2017. Also presented at the 3rd Multidisciplinary Conference on Reinforcement Learning and Decision Making

  36. arXiv:1706.02682  [pdf, other]

    math.OC cs.DS cs.GT

    Impact of Detour-Aware Policies on Maximizing Profit in Ridesharing

    Authors: Arpita Biswas, Ragavendran Gopalakrishnan, Theja Tulabandhula, Asmita Metrewar, Koyel Mukherjee, Raja Subramaniam Thangaraj

    Abstract: This paper provides efficient solutions to maximize profit for commercial ridesharing services, under a pricing model with detour-based discounts for passengers. We propose greedy heuristics for real-time ride matching that offer different trade-offs between optimality and speed. Simulations on New York City (NYC) taxi trip data show that our heuristics are up to 90% optimal and 10^5 times faster… ▽ More

    Submitted 8 June, 2017; originally announced June 2017.

    Comments: 18 pages, 10 figures

  37. arXiv:1706.02237  [pdf, other]

    cs.LG stat.ML

    Efficient Reinforcement Learning via Initial Pure Exploration

    Authors: Sudeep Raja Putta, Theja Tulabandhula

    Abstract: In several realistic situations, an interactive learning agent can practice and refine its strategy before going on to be evaluated. For instance, consider a student preparing for a series of tests. She would typically take a few practice tests to know which areas she needs to improve upon. Based on the scores she obtains in these practice tests, she would formulate a strategy for maximizing her s…

    Submitted 7 June, 2017; originally announced June 2017.

    Comments: 4 pages, 3 figures, Presented at Reinforcement Learning and Decision Making 2017

  38. arXiv:1704.00367  [pdf, other]

    cs.LG cs.IT stat.ML

    Provable Inductive Robust PCA via Iterative Hard Thresholding

    Authors: U. N. Niranjan, Arun Rajkumar, Theja Tulabandhula

    Abstract: The robust PCA problem, wherein, given an input data matrix that is the superposition of a low-rank matrix and a sparse matrix, we aim to separate out the low-rank and sparse components, is a well-studied problem in machine learning. One natural question that arises is that, as in the inductive setting, if features are provided as input as well, can we hope to do better? Answering this in the affi… ▽ More

    Submitted 4 July, 2017; v1 submitted 2 April, 2017; originally announced April 2017.

  39. arXiv:1703.07853  [pdf, other]

    cs.LG

    Faster Reinforcement Learning Using Active Simulators

    Authors: Vikas Jain, Theja Tulabandhula

    Abstract: In this work, we propose several online methods to build a \emph{learning curriculum} from a given set of target-task-specific training tasks in order to speed up reinforcement learning (RL). These methods can decrease the total training time needed by an RL agent compared to training on the target task from scratch. Unlike traditional transfer learning, we consider creating a sequence from severa… ▽ More

    Submitted 21 November, 2017; v1 submitted 22 March, 2017; originally announced March 2017.

    Comments: 12 pages and 4 figures. More experiments added to the previous version

  40. arXiv:1703.07807  [pdf, other]

    cs.LG

    Learning to Partition using Score Based Compatibilities

    Authors: Arun Rajkumar, Koyel Mukherjee, Theja Tulabandhula

    Abstract: We study the problem of learning to partition users into groups, where one must learn the compatibilities between the users to achieve optimal groupings. We define four natural objectives that optimize for average and worst case compatibilities and propose new algorithms for adaptively learning optimal groupings. When we do not impose any structure on the compatibilities, we show that the group fo…

    Submitted 22 March, 2017; originally announced March 2017.

    Comments: Appears in the Proceedings of the 16th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2017)

  41. arXiv:1609.05536  [pdf, other]

    cs.LG stat.ML

    Learning Personalized Optimal Control for Repeatedly Operated Systems

    Authors: Theja Tulabandhula

    Abstract: We consider the problem of online learning of optimal control for repeatedly operated systems in the presence of parametric uncertainty. During each round of operation, environment selects system parameters according to a fixed but unknown probability distribution. These parameters govern the dynamics of a plant. An agent chooses a control input to the plant and is then revealed the cost of the ch… ▽ More

    Submitted 18 September, 2016; originally announced September 2016.

    Comments: This work was presented at the NIPS 2015 Workshop: Machine Learning From and For Adaptive User Technologies: From Active Learning & Experimentation to Optimization & Personalization (ref. https://sites.google.com/site/mlaihci)

  42. arXiv:1608.04929  [pdf, other]

    cs.LG

    Reinforcement Learning algorithms for regret minimization in structured Markov Decision Processes

    Authors: K J Prabuchandran, Tejas Bodas, Theja Tulabandhula

    Abstract: A recent goal in the Reinforcement Learning (RL) framework is to choose a sequence of actions or a policy to maximize the reward collected or minimize the regret incurred in a finite time horizon. For several RL problems in operations research and optimal control, the optimal policy of the underlying Markov Decision Process (MDP) is characterized by a known structure. The current state of the art a…

    Submitted 17 August, 2016; originally announced August 2016.

    Comments: An extended abstract appears in AAMAS 2016

  43. arXiv:1607.07306  [pdf, other]

    cs.GT cs.CC cs.DS

    The Costs and Benefits of Sharing: Sequential Individual Rationality and Sequential Fairness

    Authors: Ragavendran Gopalakrishnan, Koyel Mukherjee, Theja Tulabandhula

    Abstract: In designing dynamic shared service systems that incentivize customers to opt for shared rather than exclusive service, the traditional notion of individual rationality may be insufficient, as a customer's estimated utility could fluctuate arbitrarily during their time in the shared system, as long as their realized utility at service completion is not worse than that for exclusive service. In thi…

    Submitted 20 June, 2017; v1 submitted 25 July, 2016; originally announced July 2016.

    Comments: Presented as a poster at EC 2016. Presented as an invited talk (sponsored session) at INFORMS Annual Meeting 2016. Presented at MSOM Service Operations SIG 2017. Currently under review at Management Science

  44. arXiv:1407.1097  [pdf, other]

    math.OC cs.LG stat.ML

    Robust Optimization using Machine Learning for Uncertainty Sets

    Authors: Theja Tulabandhula, Cynthia Rudin

    Abstract: Our goal is to build robust optimization problems for making decisions based on complex data from the past. In robust optimization (RO) generally, the goal is to create a policy for decision-making that is robust to our uncertainty about the future. In particular, we want our policy to best handle the worst possible situation that could arise, out of an uncertainty set of possible situations.…

    Submitted 3 July, 2014; originally announced July 2014.

    Comments: 28 pages, 2 figures; a shorter preliminary version appeared in ISAIM 2014

  45. arXiv:1405.7764  [pdf, other]

    stat.ML cs.LG

    Generalization Bounds for Learning with Linear, Polygonal, Quadratic and Conic Side Knowledge

    Authors: Theja Tulabandhula, Cynthia Rudin

    Abstract: In this paper, we consider a supervised learning setting where side knowledge is provided about the labels of unlabeled examples. The side knowledge has the effect of reducing the hypothesis space, leading to tighter generalization bounds, and thus possibly better generalization. We consider several types of side knowledge, the first leading to linear and polygonal constraints on the hypothesis sp…

    Submitted 7 October, 2014; v1 submitted 29 May, 2014; originally announced May 2014.

    Comments: 37 pages, 3 figures, a shorter version appeared in ISAIM 2014 (new additions include a reference change and a new figure)

  46. arXiv:1112.0698  [pdf, other]

    stat.ML cs.AI math.OC

    Machine Learning with Operational Costs

    Authors: Theja Tulabandhula, Cynthia Rudin

    Abstract: This work proposes a way to align statistical modeling with decision making. We provide a method that propagates the uncertainty in predictive modeling to the uncertainty in operational cost, where operational cost is the amount spent by the practitioner in solving the problem. The method allows us to explore the range of operational costs associated with the set of reasonable statistical models,…

    Submitted 18 June, 2013; v1 submitted 3 December, 2011; originally announced December 2011.

    Comments: Current version: Final version appearing in JMLR 2013. v2: Many parts have been rewritten, including the introduction; minor correction of Theorem 6. 38 pages. Previously: v1: 36 pages, 8 figures. Short version appears in Proceedings of the International Symposium on Artificial Intelligence and Mathematics, 2012

  47. arXiv:1104.5061  [pdf, other]

    math.OC cs.LG stat.ML

    On Combining Machine Learning with Decision Making

    Authors: Theja Tulabandhula, Cynthia Rudin

    Abstract: We present a new application and covering number bound for the framework of "Machine Learning with Operational Costs (MLOC)," which is an exploratory form of decision theory. The MLOC framework incorporates knowledge about how a predictive model will be used for a subsequent task, thus combining machine learning with the decision that is made afterwards. In this work, we use the MLOC framework to…

    Submitted 12 March, 2014; v1 submitted 26 April, 2011; originally announced April 2011.

    Comments: 35 pages, 16 figures, longer version of a paper appearing in Algorithmic Decision Theory 2011