Showing 1–50 of 76 results for author: Gupta, T

Searching in archive cs.

  1. arXiv:2406.12276  [pdf, other]

    cs.AI cs.CL cs.SE

    CodeNav: Beyond tool-use to using real-world codebases with LLM agents

    Authors: Tanmay Gupta, Luca Weihs, Aniruddha Kembhavi

    Abstract: We present CodeNav, an LLM agent that navigates and leverages previously unseen code repositories to solve user queries. In contrast to tool-use LLM agents that require "registration" of all relevant tools via manual descriptions within the LLM context, CodeNav automatically indexes and searches over code blocks in the target codebase, finds relevant code snippets, imports them, and uses them to…

    Submitted 18 June, 2024; originally announced June 2024.

  2. arXiv:2406.11775  [pdf, other]

    cs.CV cs.AI

    Task Me Anything

    Authors: Jieyu Zhang, Weikai Huang, Zixian Ma, Oscar Michel, Dong He, Tanmay Gupta, Wei-Chiu Ma, Ali Farhadi, Aniruddha Kembhavi, Ranjay Krishna

    Abstract: Benchmarks for large multimodal language models (MLMs) now serve to simultaneously assess the general capabilities of models instead of evaluating for a specific capability. As a result, when a developer wants to identify which models to use for their application, they are overwhelmed by the number of benchmarks and remain uncertain about which benchmark's results are most reflective of their spec…

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: website: https://www.task-me-anything.org

  3. arXiv:2404.05366  [pdf, other]

    cs.CV

    CDAD-Net: Bridging Domain Gaps in Generalized Category Discovery

    Authors: Sai Bhargav Rongali, Sarthak Mehrotra, Ankit Jha, Mohamad Hassan N C, Shirsha Bose, Tanisha Gupta, Mainak Singha, Biplab Banerjee

    Abstract: In Generalized Category Discovery (GCD), we cluster unlabeled samples of known and novel classes, leveraging a training dataset of known classes. A salient challenge arises due to domain shifts between these datasets. To address this, we present a novel setting: Across Domain Generalized Category Discovery (AD-GCD) and bring forth CDAD-NET (Class Discoverer Across Domains) as a remedy. CDAD-NET is…

    Submitted 8 April, 2024; originally announced April 2024.

    Comments: Accepted in L3D-IVU, CVPR Workshop, 2024

  4. arXiv:2404.01619  [pdf, other]

    cs.CR cs.SI

    Making Privacy-preserving Federated Graph Analytics with Strong Guarantees Practical (for Certain Queries)

    Authors: Kunlong Liu, Trinabh Gupta

    Abstract: Privacy-preserving federated graph analytics is an emerging area of research. The goal is to run graph analytics queries over a set of devices that are organized as a graph while keeping the raw data on the devices rather than centralizing it. Further, no entity may learn any new information except for the final query result. For instance, a device may not learn a neighbor's data. The state-of-the…

    Submitted 2 April, 2024; originally announced April 2024.

    Comments: to be published in SACMAT 2024

  5. arXiv:2404.01475  [pdf, other]

    cs.LG cond-mat.mtrl-sci cs.AI physics.chem-ph

    Are large language models superhuman chemists?

    Authors: Adrian Mirza, Nawaf Alampara, Sreekanth Kunchapu, Benedict Emoekabu, Aswanth Krishnan, Mara Wilhelmi, Macjonathan Okereke, Juliane Eberhardt, Amir Mohammad Elahi, Maximilian Greiner, Caroline T. Holick, Tanya Gupta, Mehrdad Asgari, Christina Glaubitz, Lea C. Klepsch, Yannik Köster, Jakob Meyer, Santiago Miret, Tim Hoffmann, Fabian Alexander Kreth, Michael Ringleb, Nicole Roesner, Ulrich S. Schubert, Leanne M. Stafast, Dinga Wonanke, et al. (3 additional authors not shown)

    Abstract: Large language models (LLMs) have gained widespread interest due to their ability to process human language and perform tasks on which they have not been explicitly trained. This is relevant for the chemical sciences, which face the problem of small and diverse datasets that are frequently in the form of text. LLMs have shown promise in addressing these issues and are increasingly being harnessed…

    Submitted 1 April, 2024; originally announced April 2024.

  6. arXiv:2403.11085  [pdf, other]

    cs.CV cs.CL

    m&m's: A Benchmark to Evaluate Tool-Use for multi-step multi-modal Tasks

    Authors: Zixian Ma, Weikai Huang, Jieyu Zhang, Tanmay Gupta, Ranjay Krishna

    Abstract: Real-world multi-modal problems are rarely solved by a single machine learning model, and often require multi-step computational plans that involve stitching several models. Tool-augmented LLMs hold tremendous promise for automating the generation of such computational plans. However, the lack of standardized benchmarks for evaluating LLMs as planners for multi-step multi-modal tasks has prevented…

    Submitted 21 March, 2024; v1 submitted 17 March, 2024; originally announced March 2024.

  7. arXiv:2402.15610  [pdf, other]

    cs.CL

    Selective "Selective Prediction": Reducing Unnecessary Abstention in Vision-Language Reasoning

    Authors: Tejas Srinivasan, Jack Hessel, Tanmay Gupta, Bill Yuchen Lin, Yejin Choi, Jesse Thomason, Khyathi Raghavi Chandu

    Abstract: Selective prediction minimizes incorrect predictions from vision-language models (VLMs) by allowing them to abstain from answering when uncertain. However, when deploying a vision-language system with low tolerance for inaccurate predictions, selective prediction may be over-cautious and abstain too frequently, even on many correct predictions. We introduce ReCoVERR, an inference-time algorithm to…

    Submitted 12 June, 2024; v1 submitted 23 February, 2024; originally announced February 2024.

    Comments: Accepted to ACL Findings 2024

  8. arXiv:2402.06665  [pdf, other]

    cs.AI cs.CL cs.LG cs.RO

    The Essential Role of Causality in Foundation World Models for Embodied AI

    Authors: Tarun Gupta, Wenbo Gong, Chao Ma, Nick Pawlowski, Agrin Hilmkil, Meyer Scetbon, Marc Rigter, Ade Famoti, Ashley Juan Llorens, Jianfeng Gao, Stefan Bauer, Danica Kragic, Bernhard Schölkopf, Cheng Zhang

    Abstract: Recent advances in foundation models, especially in large multi-modal models and conversational agents, have ignited interest in the potential of generally capable embodied agents. Such agents will require the ability to perform new tasks in many different real-world environments. However, current foundation models fail to accurately model physical interactions and are therefore insufficient for E…

    Submitted 29 April, 2024; v1 submitted 6 February, 2024; originally announced February 2024.

  9. arXiv:2401.10601  [pdf, other]

    cs.DS cs.DB

    Influential Slot and Tag Selection in Billboard Advertisement

    Authors: Dildar Ali, Tejash Gupta, Suman Banerjee, Yamuna Prasad

    Abstract: The selection of influential billboard slots remains an important problem in billboard advertisements. Existing studies on this problem have not considered the case of context-specific influence probability. To bridge this gap, in this paper, we introduce the Context Dependent Influential Billboard Slot Selection Problem. First, we show that the problem is NP-hard. We also show that the influence…

    Submitted 19 January, 2024; originally announced January 2024.

    Comments: 15 pages

  10. Federated learning with differential privacy and an untrusted aggregator

    Authors: Kunlong Liu, Trinabh Gupta

    Abstract: Federated learning for training models over mobile devices is gaining popularity. Current systems for this task exhibit significant trade-offs between model accuracy, privacy guarantee, and device efficiency. For instance, Oort (OSDI 2021) provides excellent accuracy and efficiency but requires a trusted central server. On the other hand, Orchard (OSDI 2020) provides good accuracy and the rigorous…

    Submitted 17 December, 2023; originally announced December 2023.

    Comments: 22 pages, 10 figures, to be published in ICISSP 2024

    Journal ref: Proceedings of the 10th International Conference on Information Systems Security and Privacy ICISSP - Volume 1, 379-389, 2024

  11. arXiv:2312.07979  [pdf]

    cs.CL cs.LG

    SLJP: Semantic Extraction based Legal Judgment Prediction

    Authors: Prameela Madambakam, Shathanaa Rajmohan, Himangshu Sharma, Tummepalli Anka Chandrahas Purushotham Gupta

    Abstract: Legal Judgment Prediction (LJP) is a judicial assistance system that recommends legal components such as applicable statutes, prison term, and penalty term by analyzing the given input case document. The Indian legal system needs technical assistance such as artificial intelligence to resolve the crores of cases that have been pending in various courts for years, a backlog that grows day by day. Most o…

    Submitted 13 December, 2023; originally announced December 2023.

  12. arXiv:2312.02976  [pdf, other]

    cs.RO cs.AI cs.CV

    Imitating Shortest Paths in Simulation Enables Effective Navigation and Manipulation in the Real World

    Authors: Kiana Ehsani, Tanmay Gupta, Rose Hendrix, Jordi Salvador, Luca Weihs, Kuo-Hao Zeng, Kunal Pratap Singh, Yejin Kim, Winson Han, Alvaro Herrasti, Ranjay Krishna, Dustin Schwenk, Eli VanderBilt, Aniruddha Kembhavi

    Abstract: Reinforcement learning (RL) with dense rewards and imitation learning (IL) with human-generated trajectories are the most widely used approaches for training modern embodied agents. RL requires extensive reward shaping and auxiliary losses and is often too slow and ineffective for long-horizon tasks. While IL with human supervision is effective, collecting human trajectories at scale is extremely…

    Submitted 5 December, 2023; originally announced December 2023.

    Comments: First six authors contributed equally. Project page: https://spoc-robot.github.io/

  13. arXiv:2311.09760  [pdf, other]

    cs.DC cs.DS

    Eventually Lattice-Linear Algorithms

    Authors: Arya Tanmay Gupta, Sandeep S Kulkarni

    Abstract: Lattice-linear systems allow nodes to execute asynchronously. We introduce eventually lattice-linear algorithms, where lattices are induced only among the states in a subset of the state space. The algorithm guarantees that the system transitions to a state in one of the lattices. Then, the algorithm behaves lattice linearly while traversing to an optimal state through that lattice. We present a…

    Submitted 13 January, 2024; v1 submitted 16 November, 2023; originally announced November 2023.

    Comments: arXiv admin note: text overlap with arXiv:2109.13216

  14. arXiv:2310.08864  [pdf, other]

    cs.RO

    Open X-Embodiment: Robotic Learning Datasets and RT-X Models

    Authors: Open X-Embodiment Collaboration, Abby O'Neill, Abdul Rehman, Abhinav Gupta, Abhiram Maddukuri, Abhishek Gupta, Abhishek Padalkar, Abraham Lee, Acorn Pooley, Agrim Gupta, Ajay Mandlekar, Ajinkya Jain, Albert Tung, Alex Bewley, Alex Herzog, Alex Irpan, Alexander Khazatsky, Anant Rai, Anchit Gupta, Andrew Wang, Andrey Kolobov, Anikait Singh, Animesh Garg, Aniruddha Kembhavi, Annie Xie, et al. (267 additional authors not shown)

    Abstract: Large, high-capacity models trained on diverse datasets have shown remarkable successes on efficiently tackling downstream applications. In domains from NLP to Computer Vision, this has led to a consolidation of pretrained models, with general pretrained backbones serving as a starting point for many applications. Can such a consolidation happen in robotics? Conventionally, robotic learning method…

    Submitted 1 June, 2024; v1 submitted 13 October, 2023; originally announced October 2023.

    Comments: Project website: https://robotics-transformer-x.github.io

  15. arXiv:2309.14003  [pdf, other]

    cs.LG cs.RO

    Hierarchical Imitation Learning for Stochastic Environments

    Authors: Maximilian Igl, Punit Shah, Paul Mougin, Sirish Srinivasan, Tarun Gupta, Brandyn White, Kyriacos Shiarlis, Shimon Whiteson

    Abstract: Many applications of imitation learning require the agent to generate the full distribution of behaviour observed in the training data. For example, to evaluate the safety of autonomous vehicles in simulation, accurate and diverse behaviour models of other road users are paramount. Existing methods that improve this distributional realism typically rely on hierarchical policies. These condition th…

    Submitted 25 September, 2023; originally announced September 2023.

    Comments: Published at IROS'23

  16. arXiv:2309.01618  [pdf, other]

    cs.CL

    Critical Behavioral Traits Foster Peer Engagement in Online Mental Health Communities

    Authors: Aseem Srivastava, Tanya Gupta, Alison Cerezo, Sarah Peregrine Lord, Md Shad Akhtar, Tanmoy Chakraborty

    Abstract: Online Mental Health Communities (OMHCs), such as Reddit, have witnessed a surge in popularity as go-to platforms for seeking information and support in managing mental health needs. Platforms like Reddit offer immediate interactions with peers, granting users a vital space for seeking mental health assistance. However, the largely unregulated nature of these platforms introduces intricate challen…

    Submitted 4 September, 2023; originally announced September 2023.

  17. arXiv:2307.13080  [pdf, ps, other]

    cs.DC

    Tolerance to Asynchrony of an Algorithm for Gathering Myopic Robots on an Infinite Triangular Grid

    Authors: Arya Tanmay Gupta, Sandeep S Kulkarni

    Abstract: In this paper, we study the problem of gathering distance-1 myopic robots on an infinite triangular grid. We show that the algorithm developed by Goswami et al. (SSS, 2022) is lattice-linear (cf. Gupta and Kulkarni, SRDS 2023). This implies that a distributed scheduler, assumed therein, is not required for this algorithm: it runs correctly in asynchrony. It also implies that the algorithm works co…

    Submitted 10 January, 2024; v1 submitted 24 July, 2023; originally announced July 2023.

  18. arXiv:2307.11073  [pdf, other]

    cs.CV cs.AI cs.GR

    OBJECT 3DIT: Language-guided 3D-aware Image Editing

    Authors: Oscar Michel, Anand Bhattad, Eli VanderBilt, Ranjay Krishna, Aniruddha Kembhavi, Tanmay Gupta

    Abstract: Existing image editing tools, while powerful, typically disregard the underlying 3D geometry from which the image is projected. As a result, edits made using these tools may become detached from the geometry and lighting conditions that are at the foundation of the image formation process. In this work, we formulate the new task of language-guided 3D-aware editing, where objects in an image should…

    Submitted 20 July, 2023; originally announced July 2023.

  19. arXiv:2304.02075  [pdf, other]

    cs.RO cs.AI

    GUTS: Generalized Uncertainty-Aware Thompson Sampling for Multi-Agent Active Search

    Authors: Nikhil Angad Bakshi, Tejus Gupta, Ramina Ghods, Jeff Schneider

    Abstract: Robotic solutions for quick disaster response are essential to ensure minimal loss of life, especially when the search area is too dangerous or too vast for human rescuers. We model this problem as an asynchronous multi-agent active-search task where each robot aims to efficiently seek objects of interest (OOIs) in an unknown environment. This formulation addresses the requirement that search miss…

    Submitted 4 April, 2023; originally announced April 2023.

    Comments: 7 pages, 5 figures, 1 table, for associated video see: https://youtu.be/K0jkzdQ_j2E , to appear in International Conference on Robotics and Automation (ICRA) 2023

  20. arXiv:2302.14834  [pdf, ps, other]

    cs.DS cs.DC

    DAG-Inducing Problems and Algorithms

    Authors: Arya Tanmay Gupta, Sandeep S Kulkarni

    Abstract: Consider the execution of a sequential algorithm that requires the program to converge to an optimal state, and then terminate/stutter. To design such an algorithm, we need to ensure that the state space that it traverses forms a directed acyclic graph (DAG) and its sink nodes are optimal states. However, if we run the same algorithm on multiple computing nodes running in parallel, and without syn…

    Submitted 10 April, 2024; v1 submitted 28 February, 2023; originally announced February 2023.

  21. arXiv:2302.14139  [pdf, other]

    cs.LG cs.AI cs.SE

    Scalable End-to-End ML Platforms: from AutoML to Self-serve

    Authors: Igor L. Markov, Pavlos A. Apostolopoulos, Mia R. Garrard, Tanya Qie, Yin Huang, Tanvi Gupta, Anika Li, Cesar Cardoso, George Han, Ryan Maghsoudian, Norm Zhou

    Abstract: ML platforms help enable intelligent data-driven applications and maintain them with limited engineering effort. Upon sufficiently broad adoption, such platforms reach economies of scale that bring greater component reuse while improving efficiency of system development and maintenance. For an end-to-end ML platform with broad adoption, scaling relies on pervasive ML automation and system integrat…

    Submitted 3 March, 2023; v1 submitted 27 February, 2023; originally announced February 2023.

    Comments: 10 pages, 1 figure, 2 tables

  22. arXiv:2302.08229  [pdf, other]

    cs.LG cs.CL

    Improving Spoken Language Identification with Map-Mix

    Authors: Shangeth Rajaa, Kriti Anandan, Swaraj Dalmia, Tarun Gupta, Eng Siong Chng

    Abstract: The pre-trained multi-lingual XLSR model generalizes well for language identification after fine-tuning on unseen languages. However, the performance significantly degrades when the languages are not very distinct from each other, for example, in the case of dialects. Low resource dialect classification remains a challenging problem to solve. We present a new data augmentation method that leverage…

    Submitted 16 February, 2023; originally announced February 2023.

    Comments: Accepted at ICASSP 2023

  23. arXiv:2302.07207  [pdf, ps, other]

    cs.DC

    Lattice Linearity of Multiplication and Modulo

    Authors: Arya Tanmay Gupta, Sandeep S Kulkarni

    Abstract: In this paper, we study the lattice linearity of multiplication and modulo operations. We demonstrate that these operations are lattice linear and the parallel processing algorithms that we study for both these operations are able to exploit the lattice linearity of their respective problems. This implies that these algorithms can be implemented in asynchronous environments, where the nodes are al…

    Submitted 24 July, 2023; v1 submitted 14 February, 2023; originally announced February 2023.

  24. arXiv:2211.13793  [pdf, other]

    eess.SP cs.LG stat.AP

    Tensor Decomposition of Large-scale Clinical EEGs Reveals Interpretable Patterns of Brain Physiology

    Authors: Teja Gupta, Neeraj Wagh, Samarth Rawal, Brent Berry, Gregory Worrell, Yogatheesan Varatharajah

    Abstract: Identifying abnormal patterns in electroencephalography (EEG) remains the cornerstone of diagnosing several neurological diseases. The current clinical EEG review process relies heavily on expert visual review, which is unscalable and error-prone. In an effort to augment the expert review process, there is a significant interest in mining population-level EEG patterns using unsupervised approaches…

    Submitted 4 February, 2023; v1 submitted 24 November, 2022; originally announced November 2022.

    Comments: 4 pages, 3 Figures, 2 Tables; Accepted at IEEE NER 2023

  25. arXiv:2211.13769  [pdf, other]

    cs.CV cs.AI cs.LG

    On Designing Light-Weight Object Trackers through Network Pruning: Use CNNs or Transformers?

    Authors: Saksham Aggarwal, Taneesh Gupta, Pawan Kumar Sahu, Arnav Chavan, Rishabh Tiwari, Dilip K. Prasad, Deepak K. Gupta

    Abstract: Object trackers deployed on low-power devices need to be light-weight, however, most of the current state-of-the-art (SOTA) methods rely on using compute-heavy backbones built using CNNs or transformers. Large sizes of such models do not allow their deployment in low-power conditions and designing compressed variants of large tracking models is of great importance. This paper demonstrates how high…

    Submitted 26 March, 2023; v1 submitted 24 November, 2022; originally announced November 2022.

    Comments: Accepted at IEEE ICASSP 2023

  26. arXiv:2211.11559  [pdf, other]

    cs.CV cs.AI cs.CL

    Visual Programming: Compositional visual reasoning without training

    Authors: Tanmay Gupta, Aniruddha Kembhavi

    Abstract: We present VISPROG, a neuro-symbolic approach to solving complex and compositional visual tasks given natural language instructions. VISPROG avoids the need for any task-specific training. Instead, it uses the in-context learning ability of large language models to generate python-like modular programs, which are then executed to get both the solution and a comprehensive and interpretable rational…

    Submitted 18 November, 2022; originally announced November 2022.

  27. arXiv:2211.04878  [pdf, other]

    cs.LG cs.AI

    Foundation Models for Semantic Novelty in Reinforcement Learning

    Authors: Tarun Gupta, Peter Karkus, Tong Che, Danfei Xu, Marco Pavone

    Abstract: Effectively exploring the environment is a key challenge in reinforcement learning (RL). We address this challenge by defining a novel intrinsic reward based on a foundation model, such as contrastive language image pretraining (CLIP), which can encode a wealth of domain-independent semantic visual-language knowledge about the world. Specifically, our intrinsic reward is defined based on pre-train…

    Submitted 9 November, 2022; originally announced November 2022.

    Comments: Foundation Models for Decision Making Workshop at Neural Information Processing Systems, 2022

  28. arXiv:2210.03055  [pdf, other]

    cs.DC

    Inducing Lattices in Non-Lattice-Linear Problems

    Authors: Arya Tanmay Gupta, Sandeep S Kulkarni

    Abstract: Lattice-linearity was introduced as modelling problems using predicates that induce a lattice among the global states (Garg, SPAA 2020). Such modelling enables permitting asynchronous execution in multiprocessor systems. A key property of the predicate representing such problems is that it induces one lattice in the state space. Such representation guarantees the execution to be…

    Submitted 29 July, 2023; v1 submitted 6 October, 2022; originally announced October 2022.

    Comments: arXiv admin note: text overlap with arXiv:2209.14703

  29. arXiv:2209.14703  [pdf, ps, other]

    cs.DC cs.DS

    Fully Lattice Linear Algorithms

    Authors: Arya Tanmay Gupta, Sandeep S Kulkarni

    Abstract: This paper focuses on analyzing and differentiating between lattice linear problems and algorithms. It introduces a new class of algorithms called (fully) lattice linear algorithms. A property of these algorithms is that they induce a partial order among all states and form multiple lattices. An initial state locks in one of these lattices. We present a lattice linear self-stabil…

    Submitted 10 November, 2022; v1 submitted 29 September, 2022; originally announced September 2022.

  30. arXiv:2207.13666  [pdf, other]

    cs.CR

    SAC-AP: Soft Actor Critic based Deep Reinforcement Learning for Alert Prioritization

    Authors: Lalitha Chavali, Tanay Gupta, Paresh Saxena

    Abstract: Intrusion detection systems (IDS) generate a large number of false alerts which makes it difficult to inspect true positives. Hence, alert prioritization plays a crucial role in deciding which alerts to investigate from an enormous number of alerts that are generated by IDS. Recently, deep reinforcement learning (DRL) based deep deterministic policy gradient (DDPG) off-policy method has shown to a…

    Submitted 3 August, 2022; v1 submitted 27 July, 2022; originally announced July 2022.

    Comments: 8 pages, 8 figures, IEEE WORLD CONGRESS ON COMPUTATIONAL INTELLIGENCE 2022

  31. arXiv:2207.01079  [pdf, other]

    cs.CL cond-mat.mtrl-sci cs.IR

    DiSCoMaT: Distantly Supervised Composition Extraction from Tables in Materials Science Articles

    Authors: Tanishq Gupta, Mohd Zaki, Devanshi Khatsuriya, Kausik Hira, N. M. Anoop Krishnan, Mausam

    Abstract: A crucial component in the curation of KB for a scientific domain (e.g., materials science, foods & nutrition, fuels) is information extraction from tables in the domain's published research articles. To facilitate research in this direction, we define a novel NLP task of extracting compositions of materials (e.g., glasses) from tables in materials science papers. The task involves solving several…

    Submitted 28 January, 2024; v1 submitted 3 July, 2022; originally announced July 2022.

    Comments: Accepted long paper at ACL 2023 (https://2023.aclweb.org/program/accepted_main_conference/)

  32. arXiv:2206.14913  [pdf, other]

    cs.CL

    GPTs at Factify 2022: Prompt Aided Fact-Verification

    Authors: Pawan Kumar Sahu, Saksham Aggarwal, Taneesh Gupta, Gyanendra Das

    Abstract: One of the most pressing societal issues is the fight against false news. The false claims, as difficult as they are to expose, create a lot of damage. To tackle the problem, fact verification becomes crucial and thus has been a topic of interest among diverse research communities. Using only the textual form of data we propose our solution to the problem and achieve competitive results with other…

    Submitted 29 June, 2022; originally announced June 2022.

    Comments: Accepted in AAAI'22: First Workshop on Multimodal Fact-Checking and Hate Speech Detection, February 22 - March 1, 2022, Vancouver, BC, Canada

  33. arXiv:2204.13653  [pdf, other]

    cs.CV

    GRIT: General Robust Image Task Benchmark

    Authors: Tanmay Gupta, Ryan Marten, Aniruddha Kembhavi, Derek Hoiem

    Abstract: Computer vision models excel at making predictions when the test distribution closely resembles the training distribution. Such models have yet to match the ability of biological vision to learn from multiple sources and generalize to new data sources and tasks. To facilitate the development and evaluation of more general vision systems, we introduce the General Robust Image Task (GRIT) benchmark.…

    Submitted 2 May, 2022; v1 submitted 28 April, 2022; originally announced April 2022.

  34. arXiv:2203.11774  [pdf, other]

    cs.SD cs.LG eess.AS

    Estimation of speaker age and height from speech signal using bi-encoder transformer mixture model

    Authors: Tarun Gupta, Duc-Tuan Truong, Tran The Anh, Chng Eng Siong

    Abstract: The estimation of speaker characteristics such as age and height is a challenging task, having numerous applications in voice forensic analysis. In this work, we propose a bi-encoder transformer mixture model for speaker age and height estimation. Considering the wide differences in male and female voice characteristics such as differences in formant and fundamental frequencies, we propose the use…

    Submitted 22 March, 2022; originally announced March 2022.

    Comments: Submitted to Interspeech 2022

  35. arXiv:2202.02317  [pdf, other]

    cs.CV cs.CL

    Webly Supervised Concept Expansion for General Purpose Vision Models

    Authors: Amita Kamath, Christopher Clark, Tanmay Gupta, Eric Kolve, Derek Hoiem, Aniruddha Kembhavi

    Abstract: General Purpose Vision (GPV) systems are models that are designed to solve a wide array of visual tasks without requiring architectural changes. Today, GPVs primarily learn both skills and concepts from large fully supervised datasets. Scaling GPVs to tens of thousands of concepts by acquiring data to learn each concept for every skill quickly becomes prohibitive. This work presents an effective a…

    Submitted 20 July, 2022; v1 submitted 4 February, 2022; originally announced February 2022.

    Comments: ECCV 2022

  36. arXiv:2202.00104  [pdf, other]

    cs.LG cs.AI cs.MA

    Generalization in Cooperative Multi-Agent Systems

    Authors: Anuj Mahajan, Mikayel Samvelyan, Tarun Gupta, Benjamin Ellis, Mingfei Sun, Tim Rocktäschel, Shimon Whiteson

    Abstract: Collective intelligence is a fundamental trait shared by several species of living organisms. It has allowed them to thrive in the diverse environmental conditions that exist on our planet. From simple organisations in an ant colony to complex systems in human groups, collective intelligence is vital for solving complex survival tasks. As is commonly observed, such natural systems are flexible to…

    Submitted 21 February, 2022; v1 submitted 31 January, 2022; originally announced February 2022.

  37. arXiv:2110.08963  [pdf, other]

    cs.AI

    SS-MAIL: Self-Supervised Multi-Agent Imitation Learning

    Authors: Akshay Dharmavaram, Tejus Gupta, Jiachen Li, Katia P. Sycara

    Abstract: The current landscape of multi-agent expert imitation is broadly dominated by two families of algorithms - Behavioral Cloning (BC) and Adversarial Imitation Learning (AIL). BC approaches suffer from compounding errors, as they ignore the sequential decision-making nature of the trajectory generation problem. Furthermore, they cannot effectively model multi-modal behaviors. While AIL methods solve…

    Submitted 17 October, 2021; originally announced October 2021.

    Comments: Pre-Print

  38. arXiv:2110.07554  [pdf, other]

    cs.LG cs.AI cs.SE

    Looper: An end-to-end ML platform for product decisions

    Authors: Igor L. Markov, Hanson Wang, Nitya Kasturi, Shaun Singh, Sze Wai Yuen, Mia Garrard, Sarah Tran, Yin Huang, Zehui Wang, Igor Glotov, Tanvi Gupta, Boshuang Huang, Peng Chen, Xiaowen Xie, Michael Belkin, Sal Uryasev, Sam Howie, Eytan Bakshy, Norm Zhou

    Abstract: Modern software systems and products increasingly rely on machine learning models to make data-driven decisions based on interactions with users, infrastructure and other systems. For broader adoption, this practice must (i) accommodate product engineers without ML backgrounds, (ii) support finegrain product-metric evaluation and (iii) optimize for product goals. To address shortcomings of prior p…

    Submitted 21 June, 2022; v1 submitted 14 October, 2021; originally announced October 2021.

    Comments: 11 pages + references, 7 figures; to appear in KDD 2022

  39. arXiv:2109.15290  [pdf]

    cs.CL cond-mat.mtrl-sci

    MatSciBERT: A Materials Domain Language Model for Text Mining and Information Extraction

    Authors: Tanishq Gupta, Mohd Zaki, N. M. Anoop Krishnan, Mausam

    Abstract: An overwhelmingly large amount of knowledge in the materials domain is generated and stored as text published in peer-reviewed scientific literature. Recent developments in natural language processing, such as bidirectional encoder representations from transformers (BERT) models, provide promising tools to extract information from these texts. However, direct application of these models in the mat…

    Submitted 30 September, 2021; originally announced September 2021.

  40. arXiv:2109.14579  [pdf]

    cs.RO

    A secure home automation prototype built on raspberry-pi

    Authors: Arya Tanmay Gupta, Himani Gupta, Muskan Sharma, Priyanka Khanna

    Abstract: With the development of sensors, wireless mobile communication, embedded system, the technologies of the Internet of Things have been widely used in SmartMeter, public security, intelligent building and so on. Because of its huge market prospects, the Internet of Things has been paid close attention by several governments all over the world. IoT facilitates the seamless integration of wireless sen…

    Submitted 8 October, 2021; v1 submitted 29 September, 2021; originally announced September 2021.

  41. arXiv:2109.13216  [pdf, ps, other]

    cs.DC

    Extending Lattice linearity for Self-Stabilizing Algorithms

    Authors: Arya Tanmay Gupta, Sandeep S Kulkarni

    Abstract: In this article, we focus on extending the notion of lattice linearity to self-stabilizing programs. Lattice linearity allows a node to execute its actions with old information about the state of other nodes and still preserve correctness. It increases the concurrency of the program execution by eliminating the need for synchronization among its nodes. The extension -- denoted as eventually lattic…

    Submitted 18 October, 2021; v1 submitted 27 September, 2021; originally announced September 2021.

  42. arXiv:2108.09418  [pdf, other]

    cs.DC

    Technical Report: Using Static Analysis to Compute Benefit of Tolerating Consistency

    Authors: Duong Nguyen, Arya Tanmay Gupta, Sandeep S. Kulkarni

    Abstract: Synchronization is the Achilles heel of concurrent programs. Synchronization requirement is often used to ensure that the execution of the concurrent program can be serialized. Without synchronization requirement, a program suffers from consistency violations. Recently, it was shown that if programs are designed to tolerate such consistency violation faults (cvfs) then one can obtain substantia…

    Submitted 7 October, 2022; v1 submitted 20 August, 2021; originally announced August 2021.

  43. Foveal-pit inspired filtering of DVS spike response

    Authors: Shriya T. P. Gupta, Pablo Linares-Serrano, Basabdatta Sen Bhattacharya, Teresa Serrano-Gotarredona

    Abstract: In this paper, we present results of processing Dynamic Vision Sensor (DVS) recordings of visual patterns with a retinal model based on foveal-pit inspired Difference of Gaussian (DoG) filters. A DVS sensor was stimulated with varying number of vertical white and black bars of different spatial frequencies moving horizontally at a constant velocity. The output spikes generated by the DVS sensor we…

    Submitted 29 May, 2021; originally announced May 2021.

    Comments: 6 pages, 4 figures, 2 tables. 2021 55th Annual Conference on Information Sciences and Systems (CISS), 2021

    ACM Class: I.2.10; I.4.5; I.4.10

  44. Implementing a foveal-pit inspired filter in a Spiking Convolutional Neural Network: a preliminary study

    Authors: Shriya T. P. Gupta, Basabdatta Sen Bhattacharya

    Abstract: We have presented a Spiking Convolutional Neural Network (SCNN) that incorporates retinal foveal-pit inspired Difference of Gaussian filters and rank-order encoding. The model is trained using a variant of the backpropagation algorithm adapted to work with spiking neurons, as implemented in the Nengo library. We have evaluated the performance of our model on two publicly available datasets - one f…

    Submitted 29 May, 2021; originally announced May 2021.

    Comments: 8 pages, 8 figures, 4 tables. 2020 International Joint Conference on Neural Networks (IJCNN)

    ACM Class: I.2.10; I.4.5; I.4.10

  45. arXiv:2104.13446  [pdf, other]

    cs.LG cs.MA

    Semi-On-Policy Training for Sample Efficient Multi-Agent Policy Gradients

    Authors: Bozhidar Vasilev, Tarun Gupta, Bei Peng, Shimon Whiteson

    Abstract: Policy gradient methods are an attractive approach to multi-agent reinforcement learning problems due to their convergence properties and robustness in partially observable scenarios. However, there is a significant performance gap between state-of-the-art policy gradient and value-based methods on the popular StarCraft Multi-Agent Challenge (SMAC) benchmark. In this paper, we introduce semi-on-po…

    Submitted 6 May, 2021; v1 submitted 27 April, 2021; originally announced April 2021.

    Comments: AAMAS Adaptive and Learning Agents Workshop. 20th International Conference on Autonomous Agents and Multiagent Systems

  46. arXiv:2104.08793  [pdf, other]

    cs.CL cs.AI cs.LG

    SalKG: Learning From Knowledge Graph Explanations for Commonsense Reasoning

    Authors: Aaron Chan, Jiashu Xu, Boyuan Long, Soumya Sanyal, Tanishq Gupta, Xiang Ren

    Abstract: Augmenting pre-trained language models with knowledge graphs (KGs) has achieved success on various commonsense reasoning tasks. However, for a given task instance, the KG, or certain parts of the KG, may not be useful. Although KG-augmented models often use attention to focus on specific KG components, the KG is still always used, and the attention mechanism is never explicitly taught which KG com…

    Submitted 20 March, 2022; v1 submitted 18 April, 2021; originally announced April 2021.

    Comments: NeurIPS 2021

  47. arXiv:2104.00990  [pdf, other]

    cs.CV cs.CL

    Visual Semantic Role Labeling for Video Understanding

    Authors: Arka Sadhu, Tanmay Gupta, Mark Yatskar, Ram Nevatia, Aniruddha Kembhavi

    Abstract: We propose a new framework for understanding and representing related salient events in a video using visual semantic role labeling. We represent videos as a set of related events, wherein each event consists of a verb and multiple entities that fulfill various roles relevant to that event. To study the challenging task of semantic role labeling in videos or VidSRL, we introduce the VidSitu benchm…

    Submitted 2 April, 2021; originally announced April 2021.

    Comments: CVPR21 camera-ready including appendix. Project Page at https://vidsitu.org/

  48. arXiv:2104.00743  [pdf, other]

    cs.CV cs.AI cs.CL cs.LG

    Towards General Purpose Vision Systems

    Authors: Tanmay Gupta, Amita Kamath, Aniruddha Kembhavi, Derek Hoiem

    Abstract: Computer vision systems today are primarily N-purpose systems, designed and trained for a predefined set of tasks. Adapting such systems to new tasks is challenging and often requires non-trivial modifications to the network architecture (e.g. adding new output heads) or training process (e.g. adding new losses). To reduce the time and expertise required to develop new applications, we would like…

    Submitted 19 April, 2022; v1 submitted 1 April, 2021; originally announced April 2021.

    Comments: CVPR 2022 Oral; Project page: https://prior.allenai.org/projects/gpv

  49. arXiv:2101.10814   

    physics.soc-ph cs.DM math.PR

    Spread and defend infection in graphs

    Authors: Arya Tanmay Gupta

    Abstract: The spread of an infection, a contagion, meme, emotion, message and various other spreadable objects has been discussed in several works. Burning and firefighting have been discussed in particular on static graphs. Graph burning simulates the notion of the spread of "fire" throughout a graph (plus, one unburned node burned at each time-step); graph firefighting simulates the defending of nodes by…

    Submitted 16 November, 2023; v1 submitted 5 January, 2021; originally announced January 2021.

    Comments: incomplete work. major revision required

  50. arXiv:2011.09533  [pdf, other]

    cs.AI

    Is Independent Learning All You Need in the StarCraft Multi-Agent Challenge?

    Authors: Christian Schroeder de Witt, Tarun Gupta, Denys Makoviichuk, Viktor Makoviychuk, Philip H. S. Torr, Mingfei Sun, Shimon Whiteson

    Abstract: Most recently developed approaches to cooperative multi-agent reinforcement learning in the centralized training with decentralized execution setting involve estimating a centralized, joint value function. In this paper, we demonstrate that, despite its various theoretical shortcomings, Independent PPO (IPPO), a form of independent learning in which each agent simply estimates its local val…

    Submitted 18 November, 2020; originally announced November 2020.