Showing 1–50 of 51 results for author: Ramesh, A

Searching in archive cs.
  1. arXiv:2406.17296  [pdf, other]

    cs.LG

    BlockLLM: Memory-Efficient Adaptation of LLMs by Selecting and Optimizing the Right Coordinate Blocks

    Authors: Amrutha Varshini Ramesh, Vignesh Ganapathiraman, Issam H. Laradji, Mark Schmidt

    Abstract: Training large language models (LLMs) for pretraining or adapting to new tasks and domains has become increasingly critical as their applications expand. However, as the model and the data sizes grow, the training process presents significant memory challenges, often requiring a prohibitive amount of GPU memory that may not be readily available. Existing methods such as low-rank adaptation (LoRA)…

    Submitted 25 June, 2024; originally announced June 2024.

    Comments: 16 pages, 7 figures

  2. arXiv:2405.17030  [pdf]

    cs.CV cs.LG

    SCaRL- A Synthetic Multi-Modal Dataset for Autonomous Driving

    Authors: Avinash Nittur Ramesh, Aitor Correas-Serrano, María González-Huici

    Abstract: We present a novel synthetically generated multi-modal dataset, SCaRL, to enable the training and validation of autonomous driving solutions. Multi-modal datasets are essential to attain the robustness and high accuracy required by autonomous systems in applications such as autonomous driving. As deep learning-based solutions are becoming more prevalent for object detection, classification, and tr…

    Submitted 27 May, 2024; originally announced May 2024.

    Comments: Accepted in International Conference on Microwaves for Intelligent Mobility - 16.&17. April 2024 - Boppard near Koblenz, Germany

  3. arXiv:2405.03878  [pdf, other]

    cs.LG cs.AI

    Sequence Compression Speeds Up Credit Assignment in Reinforcement Learning

    Authors: Aditya A. Ramesh, Kenny Young, Louis Kirsch, Jürgen Schmidhuber

    Abstract: Temporal credit assignment in reinforcement learning is challenging due to delayed and stochastic outcomes. Monte Carlo targets can bridge long delays between action and consequence but lead to high-variance targets due to stochasticity. Temporal difference (TD) learning uses bootstrapping to overcome variance but introduces a bias that can only be corrected through many iterations. TD($λ$) provid…

    Submitted 4 June, 2024; v1 submitted 6 May, 2024; originally announced May 2024.

    Comments: ICML 2024 version

  4. arXiv:2312.03858  [pdf, other]

    cs.OS cs.SE

    Stop Hiding The Sharp Knives: The WebAssembly Linux Interface

    Authors: Arjun Ramesh, Tianshu Huang, Ben L. Titzer, Anthony Rowe

    Abstract: WebAssembly is gaining popularity as a portable binary format targetable from many programming languages. With a well-specified low-level virtual instruction set, minimal memory footprint and many high-performance implementations, it has been successfully adopted for lightweight in-process memory sandboxing in many contexts. Despite these advantages, WebAssembly lacks many standard system interfac…

    Submitted 6 December, 2023; originally announced December 2023.

    Comments: 12 pages, 8 figures

  5. arXiv:2309.11820  [pdf, other]

    eess.IV cs.CV

    Automatic Endoscopic Ultrasound Station Recognition with Limited Data

    Authors: Abhijit Ramesh, Anantha Nandanan, Nikhil Boggavarapu, Priya Nair MD, Gilad Gressel

    Abstract: Pancreatic cancer is a lethal form of cancer that significantly contributes to cancer-related deaths worldwide. Early detection is essential to improve patient prognosis and survival rates. Despite advances in medical imaging techniques, pancreatic cancer remains a challenging disease to detect. Endoscopic ultrasound (EUS) is the most effective diagnostic tool for detecting pancreatic cancer. Howe…

    Submitted 28 December, 2023; v1 submitted 21 September, 2023; originally announced September 2023.

  6. arXiv:2307.01169  [pdf, other]

    math.OC cs.LG stat.ML

    Analyzing and Improving Greedy 2-Coordinate Updates for Equality-Constrained Optimization via Steepest Descent in the 1-Norm

    Authors: Amrutha Varshini Ramesh, Aaron Mishkin, Mark Schmidt, Yihan Zhou, Jonathan Wilder Lavington, Jennifer She

    Abstract: We consider minimizing a smooth function subject to a summation constraint over its variables. By exploiting a connection between the greedy 2-coordinate update for this problem and equality-constrained steepest descent in the 1-norm, we give a convergence rate for greedy selection under a proximal Polyak-Lojasiewicz assumption that is faster than random selection and independent of the problem di…

    Submitted 3 July, 2023; originally announced July 2023.

  7. arXiv:2306.10142  [pdf, other]

    cs.CV cs.AI cs.RO

    Enhancing Visual Domain Adaptation with Source Preparation

    Authors: Anirudha Ramesh, Anurag Ghosh, Christoph Mertz, Jeff Schneider

    Abstract: Robotic Perception in diverse domains such as low-light scenarios, where new modalities like thermal imaging and specialized night-vision sensors are increasingly employed, remains a challenge. Largely, this is due to the limited availability of labeled data. Existing Domain Adaptation (DA) techniques, while promising to leverage labels from existing well-lit RGB images, fail to consider the chara…

    Submitted 16 June, 2023; originally announced June 2023.

    ACM Class: I.4; I.5; I.2

  8. arXiv:2306.01570  [pdf]

    cs.LG eess.SY math.OC

    Spatio-Temporal Deep Learning-Assisted Reduced Security-Constrained Unit Commitment

    Authors: Arun Venkatesh Ramesh, Xingpeng Li

    Abstract: Security-constrained unit commitment (SCUC) is a computationally complex process utilized in power system day-ahead scheduling and market clearing. SCUC is run daily and requires state-of-the-art algorithms to speed up the process. The constraints and data associated with SCUC are both geographically and temporally correlated to ensure the reliability of the solution, which further increases the c…

    Submitted 2 June, 2023; originally announced June 2023.

    Comments: 8 Figures, 5 Tables, 1 Algorithm

  9. arXiv:2305.17066  [pdf, other]

    cs.AI cs.CL cs.CV cs.LG cs.MA

    Mindstorms in Natural Language-Based Societies of Mind

    Authors: Mingchen Zhuge, Haozhe Liu, Francesco Faccio, Dylan R. Ashley, Róbert Csordás, Anand Gopalakrishnan, Abdullah Hamdi, Hasan Abed Al Kader Hammoud, Vincent Herrmann, Kazuki Irie, Louis Kirsch, Bing Li, Guohao Li, Shuming Liu, Jinjie Mai, Piotr Piękos, Aditya Ramesh, Imanol Schlag, Weimin Shi, Aleksandar Stanić, Wenyi Wang, Yuhui Wang, Mengmeng Xu, Deng-Ping Fan, Bernard Ghanem, et al. (1 additional authors not shown)

    Abstract: Both Minsky's "society of mind" and Schmidhuber's "learning to think" inspire diverse societies of large multimodal neural networks (NNs) that solve problems by interviewing each other in a "mindstorm." Recent implementations of NN-based societies of minds consist of large language models (LLMs) and other NN-based experts communicating through a natural language interface. In doing so, they overco…

    Submitted 26 May, 2023; originally announced May 2023.

    Comments: 9 pages in main text + 7 pages of references + 38 pages of appendices, 14 figures in main text + 13 in appendices, 7 tables in appendices

    MSC Class: 68T07 ACM Class: I.2.6; I.2.11

  10. arXiv:2303.08774  [pdf, other]

    cs.CL cs.AI

    GPT-4 Technical Report

    Authors: OpenAI, Josh Achiam, Steven Adler, Sandhini Agarwal, Lama Ahmad, Ilge Akkaya, Florencia Leoni Aleman, Diogo Almeida, Janko Altenschmidt, Sam Altman, Shyamal Anadkat, Red Avila, Igor Babuschkin, Suchir Balaji, Valerie Balcom, Paul Baltescu, Haiming Bao, Mohammad Bavarian, Jeff Belgum, Irwan Bello, Jake Berdine, Gabriel Bernadett-Shapiro, Christopher Berner, Lenny Bogdonoff, Oleg Boiko, et al. (256 additional authors not shown)

    Abstract: We report the development of GPT-4, a large-scale, multimodal model which can accept image and text inputs and produce text outputs. While less capable than humans in many real-world scenarios, GPT-4 exhibits human-level performance on various professional and academic benchmarks, including passing a simulated bar exam with a score around the top 10% of test takers. GPT-4 is a Transformer-based mo…

    Submitted 4 March, 2024; v1 submitted 15 March, 2023; originally announced March 2023.

    Comments: 100 pages; updated authors list; fixed author names and added citation

  11. arXiv:2303.06776  [pdf, other]

    cs.RO cs.HC

    Robot Health Indicator: A Visual Cue to Improve Level of Autonomy Switching Systems

    Authors: Aniketh Ramesh, Madeleine Englund, Andreas Theodorou, Rustam Stolkin, Manolis Chiou

    Abstract: Using different Levels of Autonomy (LoA), a human operator can vary the extent of control they have over a robot's actions. LoAs enable operators to mitigate a robot's performance degradation or limitations in its autonomous capabilities. However, LoA regulation and other tasks may often overload an operator's cognitive abilities. Inspired by video game user interfaces, we study if adding a 'R…

    Submitted 12 March, 2023; originally announced March 2023.

    Comments: Accepted for Variable Autonomy for human-robot Teaming (VAT) workshop at ACM/IEEE HRI 2023

    ACM Class: I.2.9

  12. arXiv:2212.02179  [pdf, other]

    cs.LG cs.RO

    Physics-Informed Model-Based Reinforcement Learning

    Authors: Adithya Ramesh, Balaraman Ravindran

    Abstract: We apply reinforcement learning (RL) to robotics tasks. One of the drawbacks of traditional RL algorithms has been their poor sample efficiency. One approach to improve the sample efficiency is model-based RL. In our model-based RL algorithm, we learn a model of the environment, essentially its transition dynamics and reward function, use it to generate imaginary trajectories and backpropagate thr…

    Submitted 14 May, 2023; v1 submitted 5 December, 2022; originally announced December 2022.

  13. arXiv:2211.14095  [pdf, other]

    cs.RO cs.AI cs.HC cs.MA

    A Hierarchical Variable Autonomy Mixed-Initiative Framework for Human-Robot Teaming in Mobile Robotics

    Authors: Dimitris Panagopoulos, Giannis Petousakis, Aniketh Ramesh, Tianshu Ruan, Grigoris Nikolaou, Rustam Stolkin, Manolis Chiou

    Abstract: This paper presents a Mixed-Initiative (MI) framework for addressing the problem of control authority transfer between a remote human operator and an AI agent when cooperatively controlling a mobile robot. Our Hierarchical Expert-guided Mixed-Initiative Control Switcher (HierEMICS) leverages information on the human operator's state and intent. The control switching policies are based on a critica…

    Submitted 25 November, 2022; originally announced November 2022.

    Comments: 6 pages, 4 figures, ICHMS 2022, First two Authors contributed equally

  14. arXiv:2211.10282  [pdf, other]

    cs.LG

    Exploring through Random Curiosity with General Value Functions

    Authors: Aditya Ramesh, Louis Kirsch, Sjoerd van Steenkiste, Jürgen Schmidhuber

    Abstract: Efficient exploration in reinforcement learning is a challenging problem commonly addressed through intrinsic rewards. Recent prominent approaches are based on state novelty or variants of artificial curiosity. However, directly applying them to partially observable environments can be ineffective and lead to premature dissipation of intrinsic rewards. Here we propose random curiosity with general…

    Submitted 18 November, 2022; originally announced November 2022.

    Comments: Accepted to NeurIPS 2022

  15. arXiv:2211.02222  [pdf, other]

    cs.LG

    The Benefits of Model-Based Generalization in Reinforcement Learning

    Authors: Kenny Young, Aditya Ramesh, Louis Kirsch, Jürgen Schmidhuber

    Abstract: Model-Based Reinforcement Learning (RL) is widely believed to have the potential to improve sample efficiency by allowing an agent to synthesize large amounts of imagined experience. Experience Replay (ER) can be considered a simple kind of model, which has proved effective at improving the stability and efficiency of deep RL. In principle, a learned parametric model could improve on ER by general…

    Submitted 10 July, 2023; v1 submitted 3 November, 2022; originally announced November 2022.

    Comments: Update to ICML version

  16. arXiv:2208.06742  [pdf]

    eess.SY cs.LG

    Feasibility Layer Aided Machine Learning Approach for Day-Ahead Operations

    Authors: Arun Venkatesh Ramesh, Xingpeng Li

    Abstract: Day-ahead operations involve a complex and computationally intensive optimization process to determine the generator commitment schedule and dispatch. The optimization process is a mixed-integer linear program (MILP) also known as security-constrained unit commitment (SCUC). Independent system operators (ISOs) run SCUC daily and require state-of-the-art algorithms to speed up the process. Existin…

    Submitted 13 August, 2022; originally announced August 2022.

    Comments: 10 pages, 9 figures, 8 tables

  17. arXiv:2207.01684  [pdf, other]

    cs.RO cs.AI cs.HC

    Robot Vitals and Robot Health: Towards Systematically Quantifying Runtime Performance Degradation in Robots Under Adverse Conditions

    Authors: Aniketh Ramesh, Rustam Stolkin, Manolis Chiou

    Abstract: This paper addresses the problem of automatically detecting and quantifying performance degradation in remote mobile robots during task execution. A robot may encounter a variety of uncertainties and adversities during task execution, which can impair its ability to carry out tasks effectively and cause its performance to degrade. Such situations can be mitigated or averted by timely detection and…

    Submitted 4 July, 2022; originally announced July 2022.

    Comments: 8 Pages

    MSC Class: 68T40

  18. arXiv:2207.01570  [pdf, other]

    cs.LG stat.ML

    Goal-Conditioned Generators of Deep Policies

    Authors: Francesco Faccio, Vincent Herrmann, Aditya Ramesh, Louis Kirsch, Jürgen Schmidhuber

    Abstract: Goal-conditioned Reinforcement Learning (RL) aims at learning optimal policies, given goals encoded in special command inputs. Here we study goal-conditioned neural nets (NNs) that learn to generate deep NN policies in form of context-specific weight matrices, similar to Fast Weight Programmers and other methods from the 1990s. Using context commands of the form "generate a policy that achieves a…

    Submitted 4 July, 2022; originally announced July 2022.

    Comments: Preprint. Under Review

  19. arXiv:2207.01566  [pdf, other]

    cs.LG stat.ML

    General Policy Evaluation and Improvement by Learning to Identify Few But Crucial States

    Authors: Francesco Faccio, Aditya Ramesh, Vincent Herrmann, Jean Harb, Jürgen Schmidhuber

    Abstract: Learning to evaluate and improve policies is a core problem of Reinforcement Learning (RL). Traditional RL algorithms learn a value function defined for a single policy. A recently explored competitive alternative is to learn a single value function for many policies. Here we combine the actor-critic architecture of Parameter-Based Value Functions and the policy embedding of Policy Evaluation Netw…

    Submitted 4 July, 2022; originally announced July 2022.

    Comments: Preprint. Under review

  20. arXiv:2204.06125  [pdf, other]

    cs.CV

    Hierarchical Text-Conditional Image Generation with CLIP Latents

    Authors: Aditya Ramesh, Prafulla Dhariwal, Alex Nichol, Casey Chu, Mark Chen

    Abstract: Contrastive models like CLIP have been shown to learn robust representations of images that capture both semantics and style. To leverage these representations for image generation, we propose a two-stage model: a prior that generates a CLIP image embedding given a text caption, and a decoder that generates an image conditioned on the image embedding. We show that explicitly generating image repre…

    Submitted 12 April, 2022; originally announced April 2022.

  21. arXiv:2112.10741  [pdf, other]

    cs.CV cs.GR cs.LG

    GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models

    Authors: Alex Nichol, Prafulla Dhariwal, Aditya Ramesh, Pranav Shyam, Pamela Mishkin, Bob McGrew, Ilya Sutskever, Mark Chen

    Abstract: Diffusion models have recently been shown to generate high-quality synthetic images, especially when paired with a guidance technique to trade off diversity for fidelity. We explore diffusion models for the problem of text-conditional image synthesis and compare two different guidance strategies: CLIP guidance and classifier-free guidance. We find that the latter is preferred by human evaluators f…

    Submitted 8 March, 2022; v1 submitted 20 December, 2021; originally announced December 2021.

    Comments: 20 pages, 18 figures

  22. arXiv:2111.09824  [pdf]

    eess.SY cs.LG math.OC

    Machine Learning Assisted Approach for Security-Constrained Unit Commitment

    Authors: Arun Venkatesh Ramesh, Xingpeng Li

    Abstract: Security-constrained unit commitment (SCUC) is solved for power system day-ahead generation scheduling, which is a large-scale mixed-integer linear programming problem and is very computationally intensive. Model reduction of SCUC may bring significant time savings. In this work, a novel approach is proposed to effectively utilize machine learning (ML) to reduce the problem size of SCUC. An ML mod…

    Submitted 12 July, 2022; v1 submitted 16 November, 2021; originally announced November 2021.

    Comments: 6 Pages, 5 Figures, 3 tables, 1 algorithm

  23. arXiv:2110.05448  [pdf, other]

    cs.CL cs.AI

    Unsupervised Neural Machine Translation with Generative Language Models Only

    Authors: Jesse Michael Han, Igor Babuschkin, Harrison Edwards, Arvind Neelakantan, Tao Xu, Stanislas Polu, Alex Ray, Pranav Shyam, Aditya Ramesh, Alec Radford, Ilya Sutskever

    Abstract: We show how to derive state-of-the-art unsupervised neural machine translation systems from generatively pre-trained language models. Our method consists of three steps: few-shot amplification, distillation, and backtranslation. We first use the zero-shot translation ability of large pre-trained language models to generate translations for a small set of unlabeled sentences. We then amplify these…

    Submitted 11 October, 2021; originally announced October 2021.

    Comments: 10 pages

  24. arXiv:2103.13321  [pdf]

    cs.NI

    Network Reconfiguration Impact on Renewable Energy System and Energy Storage System in Day-Ahead Scheduling

    Authors: Arun Venkatesh Ramesh, Xingpeng Li

    Abstract: Renewable energy sources (RES) have gained significant interest in recent years. However, due to favourable weather conditions, the RES is installed in remote locations with limited transmission capacity. As a result, it can lead to major curtailments of the free resource when the network is congested. Therefore, energy storage system (ESS) is considered as a viable solution to store energy and add…

    Submitted 11 January, 2021; originally announced March 2021.

  25. arXiv:2103.00020  [pdf, other]

    cs.CV cs.LG

    Learning Transferable Visual Models From Natural Language Supervision

    Authors: Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, Gretchen Krueger, Ilya Sutskever

    Abstract: State-of-the-art computer vision systems are trained to predict a fixed set of predetermined object categories. This restricted form of supervision limits their generality and usability since additional labeled data is needed to specify any other visual concept. Learning directly from raw text about images is a promising alternative which leverages a much broader source of supervision. We demonstr…

    Submitted 26 February, 2021; originally announced March 2021.

  26. arXiv:2102.12092  [pdf, other]

    cs.CV cs.LG

    Zero-Shot Text-to-Image Generation

    Authors: Aditya Ramesh, Mikhail Pavlov, Gabriel Goh, Scott Gray, Chelsea Voss, Alec Radford, Mark Chen, Ilya Sutskever

    Abstract: Text-to-image generation has traditionally focused on finding better modeling assumptions for training on a fixed dataset. These assumptions might involve complex architectures, auxiliary losses, or side information such as object part labels or segmentation masks supplied during training. We describe a simple approach for this task based on a transformer that autoregressively models the text and…

    Submitted 26 February, 2021; v1 submitted 24 February, 2021; originally announced February 2021.

  27. arXiv:2011.07613  [pdf, other]

    cs.RO cs.AI cs.CV cs.LG

    BirdSLAM: Monocular Multibody SLAM in Bird's-Eye View

    Authors: Swapnil Daga, Gokul B. Nair, Anirudha Ramesh, Rahul Sajnani, Junaid Ahmed Ansari, K. Madhava Krishna

    Abstract: In this paper, we present BirdSLAM, a novel simultaneous localization and mapping (SLAM) system for the challenging scenario of autonomous driving platforms equipped with only a monocular camera. BirdSLAM tackles challenges faced by other monocular SLAM systems (such as scale ambiguity in monocular reconstruction, dynamic object localization, and uncertainty in feature representation) by using an…

    Submitted 15 November, 2020; originally announced November 2020.

    Comments: Accepted in VISIGRAPP (VISAPP) 2021

  28. arXiv:2010.15674  [pdf, other]

    cs.SI

    Analyzing Societal Impact of COVID-19: A Study During the Early Days of the Pandemic

    Authors: Swaroop Gowdra Shanthakumar, Anand Seetharam, Arti Ramesh

    Abstract: In this paper, we collect and study Twitter communications to understand the societal impact of COVID-19 in the United States during the early days of the pandemic. With infections soaring rapidly, users took to Twitter asking people to self isolate and quarantine themselves. Users also demanded closure of schools, bars, and restaurants as well as lockdown of cities and states. We methodically col…

    Submitted 27 October, 2020; originally announced October 2020.

    Comments: Accepted for publication in IEEE SocialCom 2020. arXiv admin note: substantial text overlap with arXiv:2004.05451

  29. arXiv:2010.14701  [pdf, other]

    cs.LG cs.CL cs.CV

    Scaling Laws for Autoregressive Generative Modeling

    Authors: Tom Henighan, Jared Kaplan, Mor Katz, Mark Chen, Christopher Hesse, Jacob Jackson, Heewoo Jun, Tom B. Brown, Prafulla Dhariwal, Scott Gray, Chris Hallacy, Benjamin Mann, Alec Radford, Aditya Ramesh, Nick Ryder, Daniel M. Ziegler, John Schulman, Dario Amodei, Sam McCandlish

    Abstract: We identify empirical scaling laws for the cross-entropy loss in four domains: generative image modeling, video modeling, multimodal image$\leftrightarrow$text models, and mathematical problem solving. In all cases autoregressive Transformers smoothly improve in performance as model size and compute budgets increase, following a power-law plus constant scaling law. The optimal model size also depe…

    Submitted 5 November, 2020; v1 submitted 27 October, 2020; originally announced October 2020.

    Comments: 20+17 pages, 33 figures; added appendix with additional language results

  30. arXiv:2010.14558  [pdf, other]

    cs.SI

    Characterizing Human Mobility Patterns During COVID-19 using Cellular Network Data

    Authors: Necati A. Ayan, Nilson L. Damasceno, Sushil Chaskar, Peron R. de Sousa, Arti Ramesh, Anand Seetharam, Antonio A. de A. Rocha

    Abstract: In this paper, our goal is to analyze and compare cellular network usage data from pre-lockdown, during lockdown, and post-lockdown phases surrounding the COVID-19 pandemic to understand and model human mobility patterns during the pandemic, and evaluate the effect of lockdowns on mobility. To this end, we collaborate with one of the main cellular network providers in Brazil, and collect and analy…

    Submitted 27 October, 2020; originally announced October 2020.

    Comments: 12 pages

  31. arXiv:2010.02013  [pdf]

    cs.SE cs.LG

    Towards ML Engineering: A Brief History Of TensorFlow Extended (TFX)

    Authors: Konstantinos Katsiapis, Abhijit Karmarkar, Ahmet Altay, Aleksandr Zaks, Neoklis Polyzotis, Anusha Ramesh, Ben Mathes, Gautam Vasudevan, Irene Giannoumis, Jarek Wilkiewicz, Jiri Simsa, Justin Hong, Mitch Trott, Noé Lutz, Pavel A. Dournov, Robert Crowe, Sarah Sirajuddin, Tris Brian Warkentin, Zhitao Li

    Abstract: Software Engineering, as a discipline, has matured over the past 5+ decades. The modern world heavily depends on it, so the increased maturity of Software Engineering was an eventuality. Practices like testing and reliable technologies help make Software Engineering reliable enough to build industries upon. Meanwhile, Machine Learning (ML) has also grown over the past 2+ decades. ML is used more a…

    Submitted 7 October, 2020; v1 submitted 28 September, 2020; originally announced October 2020.

    Comments: 16 pages

  32. Recurrent Neural-Linear Posterior Sampling for Nonstationary Contextual Bandits

    Authors: Aditya Ramesh, Paulo Rauber, Michelangelo Conserva, Jürgen Schmidhuber

    Abstract: An agent in a nonstationary contextual bandit problem should balance between exploration and the exploitation of (periodic or structured) patterns present in its previous experiences. Handcrafting an appropriate historical context is an attractive alternative to transform a nonstationary problem into a stationary problem that can be solved efficiently. However, even a carefully designed historical…

    Submitted 3 November, 2023; v1 submitted 9 July, 2020; originally announced July 2020.

    Journal ref: Neural Computation. 2022 Oct 7;34(11):2232-72

  33. arXiv:2006.16621  [pdf, other]

    cs.CV

    A Simple Domain Shifting Network for Generating Low Quality Images

    Authors: Guruprasad Hegde, Avinash Nittur Ramesh, Kanchana Vaishnavi Gandikota, Roman Obermaisser, Michael Moeller

    Abstract: Deep Learning systems have proven to be extremely successful for image recognition tasks for which significant amounts of training data is available, e.g., on the famous ImageNet dataset. We demonstrate that for robotics applications with cheap camera equipment, the low image quality, however, influences the classification accuracy, and freely available databases cannot be exploited in a straight f…

    Submitted 30 June, 2020; originally announced June 2020.

    Comments: accepted ICPR 2020

  34. arXiv:2006.08003  [pdf, other]

    eess.IV cs.CV cs.LG

    CompressNet: Generative Compression at Extremely Low Bitrates

    Authors: Suraj Kiran Raman, Aditya Ramesh, Vijayakrishna Naganoor, Shubham Dash, Giridharan Kumaravelu, Honglak Lee

    Abstract: Compressing images at extremely low bitrates (< 0.1 bpp) has always been a challenging task since the quality of reconstruction significantly reduces due to the strong imposed constraint on the number of bits allocated for the compressed data. With the increasing need to transfer large amounts of images with limited bandwidth, compressing images to very low sizes is a crucial task. However, the ex…

    Submitted 14 June, 2020; originally announced June 2020.

  35. arXiv:2006.00305  [pdf, other]

    cs.LG stat.ML

    RelEx: A Model-Agnostic Relational Model Explainer

    Authors: Yue Zhang, David Defazio, Arti Ramesh

    Abstract: In recent years, considerable progress has been made on improving the interpretability of machine learning models. This is essential, as complex deep learning models with millions of parameters produce state of the art results, but it can be nearly impossible to explain their predictions. While various explainability techniques have achieved impressive results, nearly all of them assume each data…

    Submitted 30 May, 2020; originally announced June 2020.

  36. arXiv:2005.14165  [pdf, other]

    cs.CL

    Language Models are Few-Shot Learners

    Authors: Tom B. Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, Sandhini Agarwal, Ariel Herbert-Voss, Gretchen Krueger, Tom Henighan, Rewon Child, Aditya Ramesh, Daniel M. Ziegler, Jeffrey Wu, Clemens Winter, Christopher Hesse, Mark Chen, Eric Sigler, Mateusz Litwin, Scott Gray, Benjamin Chess, et al. (6 additional authors not shown)

    Abstract: Recent work has demonstrated substantial gains on many NLP tasks and benchmarks by pre-training on a large corpus of text followed by fine-tuning on a specific task. While typically task-agnostic in architecture, this method still requires task-specific fine-tuning datasets of thousands or tens of thousands of examples. By contrast, humans can generally perform a new language task from only a few…

    Submitted 22 July, 2020; v1 submitted 28 May, 2020; originally announced May 2020.

    Comments: 40+32 pages

  37. arXiv:2004.05451  [pdf, other]

    cs.SI

    Understanding the Socio-Economic Disruption in the United States during COVID-19's Early Days

    Authors: Swaroop Gowdra Shanthakumar, Anand Seetharam, Arti Ramesh

    Abstract: In this paper, we collect and study Twitter communications to understand the socio-economic impact of COVID-19 in the United States during the early days of the pandemic. Our analysis reveals that COVID-19 gripped the nation during this time as is evidenced by the significant number of trending hashtags. With infections soaring rapidly, users took to Twitter asking people to self isolate and quara…

    Submitted 11 April, 2020; originally announced April 2020.

  38. arXiv:2002.09523  [pdf, other]

    cs.LG cs.SI stat.ML

    Struct-MMSB: Mixed Membership Stochastic Blockmodels with Interpretable Structured Priors

    Authors: Yue Zhang, Arti Ramesh

    Abstract: The mixed membership stochastic blockmodel (MMSB) is a popular framework for community detection and network generation. It learns a low-rank mixed membership representation for each node across communities by exploiting the underlying graph structure. MMSB assumes that the membership distributions of the nodes are independently drawn from a Dirichlet distribution, which limits its capability to m…

    Submitted 21 February, 2020; originally announced February 2020.

    Comments: ECAI 2020

  39. arXiv:2002.09471  [pdf, other]

    cs.LG cs.AI stat.ML

    Learning Fairness-aware Relational Structures

    Authors: Yue Zhang, Arti Ramesh

    Abstract: The development of fair machine learning models that effectively avert bias and discrimination is an important problem that has garnered attention in recent years. The necessity of encoding complex relational dependencies among the features and variables for competent predictions requires the development of fair, yet expressive relational models. In this work, we introduce Fair-A3SL, a fairness-awa…

    Submitted 21 February, 2020; originally announced February 2020.

    Comments: Accepted for publication in ECAI 2020

  40. arXiv:2002.03528  [pdf, other]

    cs.RO cs.CV

    Multi-object Monocular SLAM for Dynamic Environments

    Authors: Gokul B. Nair, Swapnil Daga, Rahul Sajnani, Anirudha Ramesh, Junaid Ahmed Ansari, Krishna Murthy Jatavallabhula, K. Madhava Krishna

    Abstract: In this paper, we tackle the problem of multibody SLAM from a monocular camera. The term multibody implies that we track the motion of the camera, as well as that of other dynamic participants in the scene. The quintessential challenge in dynamic scenes is unobservability: it is not possible to unambiguously triangulate a moving object from a moving monocular camera. Existing approaches solve res…

    Submitted 11 May, 2020; v1 submitted 9 February, 2020; originally announced February 2020.

    Comments: Accepted to IEEE Intelligent Vehicles Symposium 2020 (IV2020)

  41. arXiv:1912.07721  [pdf, ps, other

    cs.LG stat.ML

    Adversarial Model Extraction on Graph Neural Networks

    Authors: David DeFazio, Arti Ramesh

    Abstract: Along with the advent of deep neural networks came various methods of exploitation, such as fooling the classifier or contaminating its training data. Another such attack is known as model extraction, where provided API access to some black box neural network, the adversary extracts the underlying model. This is done by querying the model in such a way that the underlying neural network provides e… ▽ More

    Submitted 16 December, 2019; originally announced December 2019.

    Comments: AAAI Workshop on Deep Learning on Graphs: Methodologies and Applications (DLGMA), 2020

  42. arXiv:1908.02282  [pdf, other

    cs.CL cs.AI cs.HC cs.LG

    A Weakly-Supervised Attention-based Visualization Tool for Assessing Political Affiliation

    Authors: Srijith Rajamohan, Alana Romanella, Amit Ramesh

    Abstract: In this work, we seek to finetune a weakly-supervised expert-guided Deep Neural Network (DNN) for the purpose of determining political affiliations. In this context, stance detection is used for determining political affiliation or ideology which is framed in the form of relative proximities between entities in a low-dimensional space. An attention-based mechanism is used to provide model interpre… ▽ More

    Submitted 5 August, 2019; originally announced August 2019.

    Comments: 8 pages

  43. arXiv:1906.08873  [pdf, other

    cs.SD cs.LG eess.AS stat.ML

    Learning Discriminative features using Center Loss and Reconstruction as Regularizer for Speech Emotion Recognition

    Authors: Suraj Tripathi, Abhiram Ramesh, Abhay Kumar, Chirag Singh, Promod Yenigalla

    Abstract: This paper proposes a Convolutional Neural Network (CNN) inspired by Multitask Learning (MTL) and based on speech features trained under the joint supervision of softmax loss and center loss, a powerful metric learning strategy, for the recognition of emotion in speech. Speech features such as Spectrograms and Mel-frequency Cepstral Coefficients (MFCCs) help retain emotion-related low-level chara… ▽ More

    Submitted 31 August, 2019; v1 submitted 18 June, 2019; originally announced June 2019.

    Comments: 10 pages, Accepted in IJCAI Affective Computing Workshop 2019

  44. arXiv:1906.05682  [pdf

    eess.AS cs.LG cs.SD stat.ML

    Focal Loss based Residual Convolutional Neural Network for Speech Emotion Recognition

    Authors: Suraj Tripathi, Abhay Kumar, Abhiram Ramesh, Chirag Singh, Promod Yenigalla

    Abstract: This paper proposes a Residual Convolutional Neural Network (ResNet) based on speech features and trained under Focal Loss to recognize emotion in speech. Speech features such as Spectrogram and Mel-frequency Cepstral Coefficients (MFCCs) have shown the ability to characterize emotion better than just plain text. Further Focal Loss, first used in One-Stage Object Detectors, has shown the ability t… ▽ More

    Submitted 11 June, 2019; originally announced June 2019.

    Comments: Accepted in CICLing 2019

  45. arXiv:1906.05681  [pdf

    eess.AS cs.CL cs.LG cs.SD stat.ML

    Deep Learning based Emotion Recognition System Using Speech Features and Transcriptions

    Authors: Suraj Tripathi, Abhay Kumar, Abhiram Ramesh, Chirag Singh, Promod Yenigalla

    Abstract: This paper proposes a speech emotion recognition method based on speech features and speech transcriptions (text). Speech features such as Spectrogram and Mel-frequency Cepstral Coefficients (MFCC) help retain emotion-related low-level characteristics in speech whereas text helps capture semantic meaning, both of which help in different aspects of emotion detection. We experimented with several De… ▽ More

    Submitted 11 June, 2019; originally announced June 2019.

    Comments: Accepted in CICLing 2019

  46. arXiv:1812.01161  [pdf, other

    stat.ML cs.AI cs.LG

    A Spectral Regularizer for Unsupervised Disentanglement

    Authors: Aditya Ramesh, Youngduck Choi, Yann LeCun

    Abstract: A generative model with a disentangled representation allows for independent control over different aspects of the output. Learning disentangled representations has been a recent topic of great interest, but it remains poorly understood. We show that even for GANs that do not possess disentangled representations, one can find curved trajectories in latent space over which local disentanglement occ… ▽ More

    Submitted 5 February, 2019; v1 submitted 3 December, 2018; originally announced December 2018.

  47. Clustering of Driving Encounter Scenarios Using Connected Vehicle Trajectories

    Authors: Wenshuo Wang, Aditya Ramesh, Ding Zhao

    Abstract: Multi-vehicle interaction behavior classification and analysis offer in-depth knowledge to make an efficient decision for autonomous vehicles. This paper aims to cluster a wide range of driving encounter scenarios based only on multi-vehicle GPS trajectories. Towards this end, we propose a generic unsupervised learning framework comprising two layers: feature representation layer and clustering la… ▽ More

    Submitted 15 March, 2019; v1 submitted 22 July, 2018; originally announced July 2018.

    Comments: 12 pages, 11 figures

  48. arXiv:1806.00499  [pdf, other

    cs.LG cs.AI stat.ML

    Backpropagation for Implicit Spectral Densities

    Authors: Aditya Ramesh, Yann LeCun

    Abstract: Most successful machine intelligence systems rely on gradient-based learning, which is made possible by backpropagation. Some systems are designed to aid us in interpreting data when explicit goals cannot be provided. These unsupervised systems are commonly trained by backpropagating through a likelihood function. We introduce a tool that allows us to do this even when the likelihood is not explic… ▽ More

    Submitted 1 June, 2018; originally announced June 2018.

  49. arXiv:1611.03383  [pdf, other

    cs.LG stat.ML

    Disentangling factors of variation in deep representations using adversarial training

    Authors: Michael Mathieu, Junbo Zhao, Pablo Sprechmann, Aditya Ramesh, Yann LeCun

    Abstract: We introduce a conditional generative model for learning to disentangle the hidden factors of variation within a set of labeled observations, and separate them into complementary codes. One code summarizes the specified factors of variation associated with the labels. The other summarizes the remaining unspecified variability. During training, the only available source of supervision comes from ou… ▽ More

    Submitted 10 November, 2016; originally announced November 2016.

    Comments: Conference paper in NIPS 2016

  50. arXiv:1204.4015  [pdf, other

    physics.soc-ph cs.HC cs.SI

    Human Navigational Performance in a Complex Network with Progressive Disruptions

    Authors: Amitash Ramesh, Soumya Ramesh, Sudarshan Iyengar, Vinod Sekhar

    Abstract: The current paper is an investigation towards understanding the navigational performance of humans on a network when the "landmark" nodes are blocked. We observe that humans learn to cope up, despite the continued introduction of blockages in the network. The experiment proposed involves the task of navigating on a word network based on a puzzle called the wordmorph. We introduce blockages in the… ▽ More

    Submitted 18 April, 2012; originally announced April 2012.