Showing 1–50 of 115 results for author: Subramanian, A

  1. arXiv:2406.16252  [pdf, other]

    cs.LG cs.AI

    Graph-Augmented LLMs for Personalized Health Insights: A Case Study in Sleep Analysis

    Authors: Ajan Subramanian, Zhongqi Yang, Iman Azimi, Amir M. Rahmani

    Abstract: Health monitoring systems have revolutionized modern healthcare by enabling the continuous capture of physiological and behavioral data, essential for preventive measures and early health intervention. While integrating this data with Large Language Models (LLMs) has shown promise in delivering interactive health advice, traditional methods like Retrieval-Augmented Generation (RAG) and fine-tuning…

    Submitted 24 June, 2024; v1 submitted 23 June, 2024; originally announced June 2024.

  2. arXiv:2406.10276  [pdf, other]

    cs.CL cs.SD eess.AS

    Soft Language Identification for Language-Agnostic Many-to-One End-to-End Speech Translation

    Authors: Peidong Wang, Jian Xue, Jinyu Li, Junkun Chen, Aswin Shanmugam Subramanian

    Abstract: Language-agnostic many-to-one end-to-end speech translation models can convert audio signals from different source languages into text in a target language. These models do not need source language identification, which improves user experience. In some cases, the input language can be given or estimated. Our goal is to use this additional language information while preserving the quality of the o…

    Submitted 11 June, 2024; originally announced June 2024.

  3. arXiv:2406.03699  [pdf, other]

    cs.CL

    M-QALM: A Benchmark to Assess Clinical Reading Comprehension and Knowledge Recall in Large Language Models via Question Answering

    Authors: Anand Subramanian, Viktor Schlegel, Abhinav Ramesh Kashyap, Thanh-Tung Nguyen, Vijay Prakash Dwivedi, Stefan Winkler

    Abstract: There is vivid research on adapting Large Language Models (LLMs) to perform a variety of tasks in high-stakes domains such as healthcare. Despite their popularity, there is a lack of understanding of the extent and contributing factors that allow LLMs to recall relevant knowledge and combine it with presented information in the clinical and biomedical domain: a fundamental pre-requisite for succes…

    Submitted 5 June, 2024; originally announced June 2024.

    Comments: Accepted at ACL 2024 (Findings)

  4. arXiv:2402.05592  [pdf, other]

    cs.HC

    MERP: Metaverse Extended Reality Portal

    Authors: Anisha Ghosh, Aditya Mitra, Anik Saha, Sibi Chakkaravarthy Sethuraman, Anitha Subramanian

    Abstract: A standardized control system called Metaverse Extended Reality Portal (MERP) is presented as a solution to the issues with conventional VR eyewear. The MERP system improves user awareness of the physical world while offering an immersive 3D view of the metaverse by using a shoulder-mounted projector to display a Heads-Up Display (HUD) in a designated Metaverse Experience Room. To provide natural a…

    Submitted 8 February, 2024; originally announced February 2024.

  5. arXiv:2402.02412  [pdf, other]

    cs.CG

    On Approximation Schemes for Stabbing Rectilinear Polygons

    Authors: Arindam Khan, Aditya Subramanian, Tobias Widmann, Andreas Wiese

    Abstract: We study the problem of stabbing rectilinear polygons, where we are given $n$ rectilinear polygons in the plane that we want to stab, i.e., we want to select horizontal line segments such that for each given rectilinear polygon there is a line segment that intersects two opposite (parallel) edges of it. Our goal is to find a set of line segments of minimum total length such that all polygons are s…

    Submitted 4 February, 2024; originally announced February 2024.

  6. arXiv:2310.15179  [pdf, other]

    physics.ao-ph cs.AI cs.LG math.DS stat.OT

    Reducing Uncertainty in Sea-level Rise Prediction: A Spatial-variability-aware Approach

    Authors: Subhankar Ghosh, Shuai An, Arun Sharma, Jayant Gupta, Shashi Shekhar, Aneesh Subramanian

    Abstract: Given multi-model ensemble climate projections, the goal is to accurately and reliably predict future sea-level rise while lowering the uncertainty. This problem is important because sea-level rise affects millions of people in coastal communities and beyond due to climate change's impacts on polar ice sheets and the ocean. This problem is challenging due to spatial variability and unknowns such a…

    Submitted 18 October, 2023; originally announced October 2023.

    Comments: 6 pages, 5 figures, I-GUIDE 2023 conference

    ACM Class: J.2; I.2.m; I.2.6; I.2.1; I.2

  7. arXiv:2309.13190  [pdf, other]

    cs.LG cs.CV eess.IV

    Spatial-frequency channels, shape bias, and adversarial robustness

    Authors: Ajay Subramanian, Elena Sizikova, Najib J. Majaj, Denis G. Pelli

    Abstract: What spatial frequency information do humans and neural networks use to recognize objects? In neuroscience, critical band masking is an established tool that can reveal the frequency-selective filters used for object recognition. Critical band masking measures the sensitivity of recognition performance to noise added at each spatial frequency. Existing critical band masking studies show that human…

    Submitted 5 November, 2023; v1 submitted 22 September, 2023; originally announced September 2023.

    Comments: Neural Information Processing Systems (NeurIPS) 2023 (Oral Presentation). Camera-ready version

  8. arXiv:2309.00059  [pdf, other]

    cs.CV eess.IV

    STint: Self-supervised Temporal Interpolation for Geospatial Data

    Authors: Nidhin Harilal, Bri-Mathias Hodge, Aneesh Subramanian, Claire Monteleoni

    Abstract: Supervised and unsupervised techniques have demonstrated the potential for temporal interpolation of video data. Nevertheless, most prevailing temporal interpolation techniques hinge on optical flow, which encodes the motion of pixels between video frames. On the other hand, geospatial data exhibits lower temporal resolution while encompassing a spectrum of movements and deformations that challeng…

    Submitted 31 August, 2023; originally announced September 2023.

  9. arXiv:2308.16435  [pdf]

    cs.CV

    Njobvu-AI: An open-source tool for collaborative image labeling and implementation of computer vision models

    Authors: Jonathan S. Koning, Ashwin Subramanian, Mazen Alotaibi, Cara L. Appel, Christopher M. Sullivan, Thon Chao, Lisa Truong, Robyn L. Tanguay, Pankaj Jaiswal, Taal Levi, Damon B. Lesmeister

    Abstract: Practitioners interested in using computer vision models lack user-friendly and open-source software that combines features to label training data, allow multiple users, train new algorithms, review output, and implement new models. Labeling training data, such as images, is a key step to developing accurate object detection algorithms using computer vision. This step is often not compatible with…

    Submitted 30 August, 2023; originally announced August 2023.

    Comments: 13 pages, 6 figures. For code and documentation, see https://github.com/sullichrosu/Njobvu-AI/

  10. arXiv:2308.10499  [pdf, ps, other]

    cs.DS

    Fair Rank Aggregation

    Authors: Diptarka Chakraborty, Syamantak Das, Arindam Khan, Aditya Subramanian

    Abstract: Ranking algorithms find extensive usage in diverse areas such as web search, employment, college admission, voting, etc. The related rank aggregation problem deals with combining multiple rankings into a single aggregate ranking. However, algorithms for both these problems might be biased against some individuals or groups due to implicit prejudice or marginalization in the historical data. We stu…

    Submitted 21 August, 2023; originally announced August 2023.

    Comments: A preliminary version of this paper appeared in NeurIPS 2022

  11. arXiv:2307.02006  [pdf, other]

    cs.CL

    PULSAR at MEDIQA-Sum 2023: Large Language Models Augmented by Synthetic Dialogue Convert Patient Dialogues to Medical Records

    Authors: Viktor Schlegel, Hao Li, Yuping Wu, Anand Subramanian, Thanh-Tung Nguyen, Abhinav Ramesh Kashyap, Daniel Beck, Xiaojun Zeng, Riza Theresa Batista-Navarro, Stefan Winkler, Goran Nenadic

    Abstract: This paper describes PULSAR, our system submission at the ImageClef 2023 MediQA-Sum task on summarising patient-doctor dialogues into clinical records. The proposed framework relies on domain-specific pre-training, to produce a specialised language model which is trained on task-specific natural data augmented by synthetic data generated by a black-box LLM. We find limited evidence towards the eff…

    Submitted 4 July, 2023; originally announced July 2023.

    Comments: 8 pages. ImageClef 2023 MediQA-Sum

  12. arXiv:2304.14499  [pdf]

    cs.CV eess.IV

    Human activity recognition using deep learning approaches and single frame cnn and convolutional lstm

    Authors: Sheryl Mathew, Annapoorani Subramanian, Pooja, Balamurugan MS, Manoj Kumar Rajagopal

    Abstract: Human activity recognition is one of the most important tasks in computer vision and has proved useful in different fields such as healthcare, sports training and security. There are a number of approaches that have been explored to solve this task, some of them involving sensor data, and some involving video data. In this paper, we aim to explore two deep learning-based approaches, namely single…

    Submitted 17 April, 2023; originally announced April 2023.

    Comments: Sixteen pages and five figures

  13. arXiv:2304.11377  [pdf, other]

    cs.HC cs.AI

    SimplyMime: A Control at Our Fingertips

    Authors: Sibi Chakkaravarthy Sethuraman, Gaurav Reddy Tadkapally, Athresh Kiran, Saraju P. Mohanty, Anitha Subramanian

    Abstract: The utilization of consumer electronics, such as televisions, set-top boxes, home theaters, and air conditioners, has become increasingly prevalent in modern society as technology continues to evolve. As new devices enter our homes each year, the accumulation of multiple infrared remote controls to operate them not only results in a waste of energy and resources, but also creates a cumbersome and…

    Submitted 22 April, 2023; originally announced April 2023.

  14. arXiv:2303.13863  [pdf, other]

    cs.HC cs.AI

    MagicEye: An Intelligent Wearable Towards Independent Living of Visually Impaired

    Authors: Sibi C. Sethuraman, Gaurav R. Tadkapally, Saraju P. Mohanty, Gautam Galada, Anitha Subramanian

    Abstract: Individuals with visual impairments often face a multitude of challenging obstacles in their daily lives. Vision impairment can severely impair a person's ability to work, navigate, and retain independence. This can result in educational limits, a higher risk of accidents, and a plethora of other issues. To address these challenges, we present MagicEye, a state-of-the-art intelligent wearable devi…

    Submitted 24 March, 2023; originally announced March 2023.

  15. arXiv:2303.09524  [pdf, other]

    cs.CG cs.DS

    Online and Dynamic Algorithms for Geometric Set Cover and Hitting Set

    Authors: Arindam Khan, Aditya Lonkar, Saladi Rahul, Aditya Subramanian, Andreas Wiese

    Abstract: Set cover and hitting set are fundamental problems in combinatorial optimization which are well-studied in the offline, online, and dynamic settings. We study the geometric versions of these problems and present new online and dynamic algorithms for them. In the online version of set cover (resp. hitting set), $m$ sets (resp.~$n$ points) are given and $n$ points (resp.~$m$ sets) arrive online, one-by-o…

    Submitted 16 March, 2023; originally announced March 2023.

  16. arXiv:2303.08272  [pdf, other]

    physics.chem-ph cs.LG

    Automated patent extraction powers generative modeling in focused chemical spaces

    Authors: Akshay Subramanian, Kevin P. Greenman, Alexis Gervaix, Tzuhsiung Yang, Rafael Gómez-Bombarelli

    Abstract: Deep generative models have emerged as an exciting avenue for inverse molecular design, with progress coming from the interplay between training algorithms and molecular representations. One of the key challenges in their applicability to materials science and chemistry has been the lack of access to sizeable training datasets with property labels. Published patents contain the first disclosure of…

    Submitted 24 July, 2023; v1 submitted 14 March, 2023; originally announced March 2023.

    Comments: Digital Discovery (2023)

  17. arXiv:2303.07122  [pdf, other]

    cs.AI cs.LG physics.ao-ph stat.ME

    Quantifying Causes of Arctic Amplification via Deep Learning based Time-series Causal Inference

    Authors: Sahara Ali, Omar Faruque, Yiyi Huang, Md. Osman Gani, Aneesh Subramanian, Nicole-Jienne Shchlegel, Jianwu Wang

    Abstract: The warming of the Arctic, also known as Arctic amplification, is led by several atmospheric and oceanic drivers. However, the details of its underlying thermodynamic causes are still unknown. Inferring the causal effects of atmospheric processes on sea ice melt using fixed treatment effect strategies leads to unrealistic counterfactual estimations. Such models are also prone to bias due to time-v…

    Submitted 25 September, 2023; v1 submitted 22 February, 2023; originally announced March 2023.

    Comments: Accepted by ICMLA 2023

  18. arXiv:2303.03849  [pdf, other]

    eess.AS cs.SD

    TS-SEP: Joint Diarization and Separation Conditioned on Estimated Speaker Embeddings

    Authors: Christoph Boeddeker, Aswin Shanmugam Subramanian, Gordon Wichern, Reinhold Haeb-Umbach, Jonathan Le Roux

    Abstract: Since diarization and source separation of meeting data are closely related tasks, we here propose an approach to perform the two objectives jointly. It builds upon the target-speaker voice activity detection (TS-VAD) diarization approach, which assumes that initial speaker embeddings are available. We replace the final combined speaker activity estimation network of TS-VAD with a network that pro…

    Submitted 1 January, 2024; v1 submitted 7 March, 2023; originally announced March 2023.

    Comments: Submitted to IEEE/ACM TASLP

  19. arXiv:2303.02641  [pdf, other]

    cs.CV cs.AI

    CueCAn: Cue Driven Contextual Attention For Identifying Missing Traffic Signs on Unconstrained Roads

    Authors: Varun Gupta, Anbumani Subramanian, C. V. Jawahar, Rohit Saluja

    Abstract: Unconstrained Asian roads often involve poor infrastructure, affecting overall road safety. Missing traffic signs are a regular part of such roads. Missing or non-existing object detection has been studied for locating missing curbs and estimating reasonable regions for pedestrians on road scene images. Such methods involve analyzing task-specific single object cues. In this paper, we present the…

    Submitted 5 March, 2023; originally announced March 2023.

    Comments: International Conference on Robotics and Automation (ICRA'23)

  20. arXiv:2301.01770  [pdf, other]

    cs.CR

    MetaSecure: A Passwordless Authentication for the Metaverse

    Authors: Sibi Chakkaravarthy Sethuraman, Aditya Mitra, Anisha Ghosh, Gautam Galada, Anitha Subramanian

    Abstract: Metaverse in general holds a potential future for cyberspace. At the beginning of Web 2.0, it was witnessed that people were signing in with various pseudonyms or 'nyms', risking their online identities by increasing presence of fake accounts leading to difficulty in unique identification for different roles. However, in Web 3.0, the metaverse, a user's identity is tied to their original identity,…

    Submitted 4 January, 2023; originally announced January 2023.

  21. arXiv:2212.07327  [pdf, other]

    eess.AS cs.SD

    Tackling the Cocktail Fork Problem for Separation and Transcription of Real-World Soundtracks

    Authors: Darius Petermann, Gordon Wichern, Aswin Shanmugam Subramanian, Zhong-Qiu Wang, Jonathan Le Roux

    Abstract: Emulating the human ability to solve the cocktail party problem, i.e., focus on a source of interest in a complex acoustic scene, is a long standing goal of audio source separation research. Much of this research investigates separating speech from noise, speech from speech, musical instruments from each other, or sound events from each other. In this paper, we focus on the cocktail fork problem,…

    Submitted 14 December, 2022; originally announced December 2022.

    Comments: Submitted to IEEE TASLP (In review), 13 pages, 6 figures

  22. arXiv:2212.05008  [pdf, other]

    eess.AS cs.SD

    Hyperbolic Audio Source Separation

    Authors: Darius Petermann, Gordon Wichern, Aswin Subramanian, Jonathan Le Roux

    Abstract: We introduce a framework for audio source separation using embeddings on a hyperbolic manifold that compactly represent the hierarchical relationship between sound sources and time-frequency features. Inspired by recent successes modeling hierarchical relationships in text and images with hyperbolic embeddings, our algorithm obtains a hyperbolic embedding for each time-frequency bin of a mixture s…

    Submitted 9 December, 2022; originally announced December 2022.

    Comments: Submitted to ICASSP 2023, Demo page: https://darius522.github.io/hyperbolic-audio-sep/

  23. arXiv:2211.15653  [pdf, other]

    physics.space-ph physics.plasm-ph

    Energetic electron precipitation driven by electromagnetic ion cyclotron waves from ELFIN's low altitude perspective

    Authors: V. Angelopoulos, X. -J. Zhang, A. V. Artemyev, D. Mourenas, E. Tsai, C. Wilkins, A. Runov, J. Liu, D. L. Turner, W. Li, K. Khurana, R. E. Wirz, V. A. Sergeev, X. Meng, J. Wu, M. D. Hartinger, T. Raita, Y. Shen, X. An, X. Shi, M. F. Bashir, X. Shen, L. Gan, M. Qin, L. Capannolo, et al. (61 additional authors not shown)

    Abstract: We review comprehensive observations of electromagnetic ion cyclotron (EMIC) wave-driven energetic electron precipitation using data from the energetic electron detector on the Electron Losses and Fields InvestigatioN (ELFIN) mission, two polar-orbiting low-altitude spinning CubeSats, measuring 50-5000 keV electrons with good pitch-angle and energy resolution. EMIC wave-driven precipitation exhibi…

    Submitted 28 November, 2022; originally announced November 2022.

  24. arXiv:2211.08303  [pdf, other]

    eess.AS cs.AI cs.LG cs.SD stat.ML

    Reverberation as Supervision for Speech Separation

    Authors: Rohith Aralikatti, Christoph Boeddeker, Gordon Wichern, Aswin Shanmugam Subramanian, Jonathan Le Roux

    Abstract: This paper proposes reverberation as supervision (RAS), a novel unsupervised loss function for single-channel reverberant speech separation. Prior methods for unsupervised separation required the synthesis of mixtures of mixtures or assumed the existence of a teacher model, making them difficult to consider as potential methods explaining the emergence of separation abilities in an animal's audito…

    Submitted 15 November, 2022; originally announced November 2022.

    Comments: 5 pages, 2 figures, 4 tables. Submitted to ICASSP 2023

  25. arXiv:2211.01299  [pdf, other]

    eess.AS cs.CL cs.SD

    Late Audio-Visual Fusion for In-The-Wild Speaker Diarization

    Authors: Zexu Pan, Gordon Wichern, François G. Germain, Aswin Subramanian, Jonathan Le Roux

    Abstract: Speaker diarization is well studied for constrained audios but little explored for challenging in-the-wild videos, which have more speakers, shorter utterances, and inconsistent on-screen speakers. We address this gap by proposing an audio-visual diarization model which combines audio-only and visual-centric sub-systems via late fusion. For audio, we show that an attractor-based end-to-end system…

    Submitted 27 September, 2023; v1 submitted 2 November, 2022; originally announced November 2022.

  26. arXiv:2210.12878  [pdf, other]

    cs.CV

    IDD-3D: Indian Driving Dataset for 3D Unstructured Road Scenes

    Authors: Shubham Dokania, A. H. Abdul Hafez, Anbumani Subramanian, Manmohan Chandraker, C. V. Jawahar

    Abstract: Autonomous driving and assistance systems rely on annotated data from traffic and road scenarios to model and learn the various object relations in complex real-world scenarios. Preparation and training of deploy-able deep learning architectures require the models to be suited to different traffic scenarios and adapt to different situations. Currently, existing datasets, while large-scale, lack su…

    Submitted 23 October, 2022; originally announced October 2022.

    Comments: 10 pages, 8 figures, 5 tables, Accepted in Winter Conference on Applications of Computer Vision (WACV 2023)

  27. arXiv:2208.07943  [pdf, other]

    cs.CV

    TRoVE: Transforming Road Scene Datasets into Photorealistic Virtual Environments

    Authors: Shubham Dokania, Anbumani Subramanian, Manmohan Chandraker, C. V. Jawahar

    Abstract: High-quality structured data with rich annotations are critical components in intelligent vehicle systems dealing with road scenes. However, data curation and annotation require intensive investments and yield low-diversity scenarios. The recently growing interest in synthetic data raises questions about the scope of improvement in such systems and the amount of manual work still required to produ…

    Submitted 16 August, 2022; originally announced August 2022.

    Comments: 18 pages, 5 figures, Accepted in European Conference on Computer Vision (ECCV 2022)

  28. arXiv:2207.04008  [pdf, other]

    cs.CL

    ABB-BERT: A BERT model for disambiguating abbreviations and contractions

    Authors: Prateek Kacker, Andi Cupallari, Aswin Gridhar Subramanian, Nimit Jain

    Abstract: Abbreviations and contractions are commonly found in text across different domains. For example, doctors' notes contain many contractions that can be personalized based on their choices. Existing spelling correction models are not suitable to handle expansions because of many reductions of characters in words. In this work, we propose ABB-BERT, a BERT-based model, which deals with an ambiguous lan…

    Submitted 8 July, 2022; originally announced July 2022.

    Journal ref: Proceedings of the 18th International Conference on Natural Language Processing, pages 289–297, Silchar, India, 2021

  29. arXiv:2206.12179  [pdf]

    stat.AP physics.ao-ph q-bio.QM

    How is model-related uncertainty quantified and reported in different disciplines?

    Authors: Emily G. Simmonds, Kwaku Peprah Adjei, Christoffer Wold Andersen, Janne Cathrin Hetle Aspheim, Claudia Battistin, Nicola Bulso, Hannah Christensen, Benjamin Cretois, Ryan Cubero, Ivan A. Davidovich, Lisa Dickel, Benjamin Dunn, Etienne Dunn-Sigouin, Karin Dyrstad, Sigurd Einum, Donata Giglio, Haakon Gjerlow, Amelie Godefroidt, Ricardo Gonzalez-Gil, Soledad Gonzalo Cogno, Fabian Grosse, Paul Halloran, Mari F. Jensen, John James Kennedy, Peter Egge Langsaether, et al. (18 additional authors not shown)

    Abstract: How do we know how much we know? Quantifying uncertainty associated with our modelling work is the only way we can answer how much we know about any phenomenon. With quantitative science now highly influential in the public sphere and the results from models translating into action, we must support our conclusions with sufficient rigour to produce useful, reproducible results. Incomplete considera…

    Submitted 1 July, 2022; v1 submitted 24 June, 2022; originally announced June 2022.

    Comments: 40 Pages (including supporting information), 3 Figures, 2 Boxes, 1 Table

  30. arXiv:2206.08427  [pdf, other]

    cs.CV cs.LG

    SATBench: Benchmarking the speed-accuracy tradeoff in object recognition by humans and dynamic neural networks

    Authors: Ajay Subramanian, Sara Price, Omkar Kumbhar, Elena Sizikova, Najib J. Majaj, Denis G. Pelli

    Abstract: The core of everyday tasks like reading and driving is active object recognition. Attempts to model such tasks are currently stymied by the inability to incorporate time. People show a flexible tradeoff between speed and accuracy and this tradeoff is a crucial human skill. Deep neural networks have emerged as promising candidates for predicting peak human object recognition performance and neural…

    Submitted 16 June, 2022; originally announced June 2022.

    Comments: 19 pages, 12 figures. Under Review at NeurIPS Datasets and Benchmarks Track 2022

  31. arXiv:2204.08364  [pdf, other]

    cs.CV

    Detecting, Tracking and Counting Motorcycle Rider Traffic Violations on Unconstrained Roads

    Authors: Aman Goyal, Dev Agarwal, Anbumani Subramanian, C. V. Jawahar, Ravi Kiran Sarvadevabhatla, Rohit Saluja

    Abstract: In many Asian countries with unconstrained road traffic conditions, driving violations such as not wearing helmets and triple-riding are a significant source of fatalities involving motorcycles. Identifying and penalizing such riders is vital in curbing road accidents and improving citizens' safety. With this motivation, we propose an approach for detecting, tracking, and counting motorcycle ridin…

    Submitted 18 April, 2022; originally announced April 2022.

    Comments: 10 pages, 9 figures, Accepted at The 5th Workshop and Prize Challenge: Bridging the Gap between Computational Photography and Visual Recognition (UG2+) in conjunction with IEEE CVPR 2022

  32. Heterogeneous Target Speech Separation

    Authors: Efthymios Tzinis, Gordon Wichern, Aswin Subramanian, Paris Smaragdis, Jonathan Le Roux

    Abstract: We introduce a new paradigm for single-channel target source separation where the sources of interest can be distinguished using non-mutually exclusive concepts (e.g., loudness, gender, language, spatial location, etc). Our proposed heterogeneous separation framework can seamlessly leverage datasets with large distribution shifts and learn cross-domain representations under a variety of concepts u…

    Submitted 7 April, 2022; originally announced April 2022.

    Comments: Submitted to Interspeech 2022

    Journal ref: Interspeech 2022

  33. Magnetization Reversal Across Multiple Serial Barriers in a Single Fe$_3$O$_4$ Nanoparticle

    Authors: Sagar Paul, Ganesh Kotagiri, Rini Ganguly, Annapoorni Subramanian, Hervé Courtois, Clemens B. Winkelmann, Anjan K. Gupta

    Abstract: Depinning of nanoscale magnetic textures, such as domain walls, vortices and skyrmions, is of paramount importance for magnetic storage and information processing. We measure time-resolved magnetic switching statistics of an individual, non-single-domain Fe$_3$O$_4$ nanoparticle using a micrometer-scale superconducting quantum interference device. Surprisingly, a strong narrowing of the waiting-ti…

    Submitted 22 January, 2022; originally announced January 2022.

    Comments: 5 pages and suppl information available on request

    Journal ref: Phys. Rev. B 105, L180410 (2022)

  34. arXiv:2201.06569  [pdf, other]

    cs.CV

    Automatic Quantification and Visualization of Street Trees

    Authors: Arpit Bahety, Rohit Saluja, Ravi Kiran Sarvadevabhatla, Anbumani Subramanian, C. V. Jawahar

    Abstract: Assessing the number of street trees is essential for evaluating urban greenery and can help municipalities employ solutions to identify tree-starved streets. It can also help identify roads with different levels of deforestation and afforestation over time. Yet, there has been little work in the area of street trees quantification. This work first explains a data collection setup carefully design…

    Submitted 17 January, 2022; originally announced January 2022.

    Comments: Accepted at ICVGIP 2021

  35. arXiv:2112.01689  [pdf, other]

    cond-mat.mtrl-sci

    Dataset of gold nanoparticle sizes and morphologies extracted from literature-mined microscopy images

    Authors: Akshay Subramanian, Kevin Cruse, Amalie Trewartha, Xingzhi Wang, A. Paul Alivisatos, Gerbrand Ceder

    Abstract: The factors controlling the size and morphology of nanoparticles have so far been poorly understood. Data-driven techniques are an exciting avenue to explore this field through the identification of trends and correlations in data. However, for these techniques to be utilized, large datasets annotated with the structural attributes of nanoparticles are required. While experimental SEM/TEM images c…

    Submitted 6 January, 2022; v1 submitted 2 December, 2021; originally announced December 2021.

  36. arXiv:2111.06639  [pdf, other]

    cs.CV cs.AI

    Attention Guided Cosine Margin For Overcoming Class-Imbalance in Few-Shot Road Object Detection

    Authors: Ashutosh Agarwal, Anay Majee, Anbumani Subramanian, Chetan Arora

    Abstract: Few-shot object detection (FSOD) localizes and classifies objects in an image given only a few data samples. Recent trends in FSOD research show the adoption of metric and meta-learning techniques, which are prone to catastrophic forgetting and class confusion. To overcome these pitfalls in metric learning based FSOD techniques, we introduce Attention Guided Cosine Margin (AGCM) that facilitates t…

    Submitted 12 November, 2021; originally announced November 2021.

    Comments: 8 pages, 4 figures

  37. arXiv:2111.05972  [pdf, other]

    cs.LG cs.AI cs.DC

    Amazon SageMaker Model Parallelism: A General and Flexible Framework for Large Model Training

    Authors: Can Karakus, Rahul Huilgol, Fei Wu, Anirudh Subramanian, Cade Daniel, Derya Cavdar, Teng Xu, Haohan Chen, Arash Rahnama, Luis Quintela

    Abstract: With deep learning models rapidly growing in size, systems-level solutions for large-model training are required. We present Amazon SageMaker model parallelism, a software library that integrates with PyTorch, and enables easy training of large models using model parallelism and other memory-saving features. In contrast to existing solutions, the implementation of the SageMaker library is much mor…

    Submitted 10 November, 2021; originally announced November 2021.

    Comments: 24 pages. Submitted for review

  38. arXiv:2111.05197  [pdf, other]

    cs.CG cs.DS

    A PTAS for the horizontal rectangle stabbing problem

    Authors: Arindam Khan, Aditya Subramanian, Andreas Wiese

    Abstract: We study rectangle stabbing problems in which we are given $n$ axis-aligned rectangles in the plane that we want to stab, i.e., we want to select line segments such that for each given rectangle there is a line segment that intersects two opposite edges of it. In the horizontal rectangle stabbing problem (STABBING), the goal is to find a set of horizontal line segments of minimum total length su…

    Submitted 9 November, 2021; originally announced November 2021.

    Comments: 15 pages, 3 figures

  39. arXiv:2110.15152  [pdf, other]

    cs.DS math.OC

    An efficient bundle-based approach for the share-a-ride problem

    Authors: Ana Beatriz Herthel, Richard Hartl, Anand Subramanian, Thibaut Vidal

    Abstract: Some of today's most significant challenges in urban environments concern individual mobility and rapid parcel delivery. With the surge of e-commerce and the ever-increasing volume of goods to be handled, new logistic solutions are in high demand. The share-a-ride problem (SARP) was proposed as one such solution, combining people and parcel transportation in taxis. This is an NP-hard problem and t…

    Submitted 21 February, 2023; v1 submitted 28 October, 2021; originally announced October 2021.

  40. arXiv:2110.15074  [pdf, other]

    cs.CV

    Meta Guided Metric Learner for Overcoming Class Confusion in Few-Shot Road Object Detection

    Authors: Anay Majee, Anbumani Subramanian, Kshitij Agrawal

    Abstract: Localization and recognition of less-occurring road objects have been a challenge in autonomous driving applications due to the scarcity of data samples. Few-Shot Object Detection techniques extend the knowledge from existing base object classes to learn novel road objects given few training examples. Popular techniques in FSOD adopt either meta or metric learning techniques which are prone to cla…

    Submitted 28 October, 2021; originally announced October 2021.

    Comments: Accepted to NeurIPS 2021 Workshop on Machine Learning For Autonomous Driving, 12 pages, 6 figures

  41. arXiv:2110.12205  [pdf, other]

    cs.CV

    Multi-Domain Incremental Learning for Semantic Segmentation

    Authors: Prachi Garg, Rohit Saluja, Vineeth N Balasubramanian, Chetan Arora, Anbumani Subramanian, C. V. Jawahar

    Abstract: Recent efforts in multi-domain learning for semantic segmentation attempt to learn multiple geographical datasets in a universal, joint model. A simple fine-tuning experiment performed sequentially on three popular road scene segmentation datasets demonstrates that existing segmentation frameworks fail at incrementally learning on a series of visually disparate geographical domains. When learning…

    Submitted 23 October, 2021; originally announced October 2021.

    Comments: 11 pages, 5 figures, Accepted in WACV 2022

  42. arXiv:2110.04590  [pdf, other]

    cs.CL cs.SD eess.AS

    An Exploration of Self-Supervised Pretrained Representations for End-to-End Speech Recognition

    Authors: Xuankai Chang, Takashi Maekaku, Pengcheng Guo, Jing Shi, Yen-Ju Lu, Aswin Shanmugam Subramanian, Tianzi Wang, Shu-wen Yang, Yu Tsao, Hung-yi Lee, Shinji Watanabe

    Abstract: Self-supervised pretraining on speech data has achieved a lot of progress. High-fidelity representation of the speech signal is learned from a lot of untranscribed data and shows promising performance. Recently, there are several works focusing on evaluating the quality of self-supervised pretrained representations on various tasks without domain restriction, e.g. SUPERB. However, such evaluations…

    Submitted 9 October, 2021; originally announced October 2021.

    Comments: To appear in ASRU2021

  43. arXiv:2108.08048  [pdf, other]

    cs.CV cs.AI

    Few-Shot Batch Incremental Road Object Detection via Detector Fusion

    Authors: Anuj Tambwekar, Kshitij Agrawal, Anay Majee, Anbumani Subramanian

    Abstract: Incremental few-shot learning has emerged as a new and challenging area in deep learning, whose objective is to train deep learning models using very few samples of new class data, and none of the old class data. In this work we tackle the problem of batch incremental few-shot road object detection using data from the India Driving Dataset (IDD). Our approach, DualFusion, combines object detectors…

    Submitted 18 August, 2021; originally announced August 2021.

    Comments: accepted in 2nd Autonomous Vehicle Vision Workshop, ICCV2021

  44. arXiv:2107.05189  [pdf, other]

    math.OC

    Exponential-Size Neighborhoods for the Pickup-and-Delivery Traveling Salesman Problem

    Authors: Toni Pacheco, Rafael Martinelli, Anand Subramanian, Túlio A. M. Toffolo, Thibaut Vidal

    Abstract: Neighborhood search is a cornerstone of state-of-the-art traveling salesman and vehicle routing metaheuristics. While neighborhood exploration procedures are well developed for problems with individual services, their counterparts for one-to-one pickup-and-delivery problems have been more scarcely studied. A direct extension of classic neighborhoods is often inefficient or complex due to the neces…

    Submitted 20 August, 2022; v1 submitted 12 July, 2021; originally announced July 2021.

  45. arXiv:2102.07955  [pdf, other]

    eess.AS cs.SD

    Deep Learning based Multi-Source Localization with Source Splitting and its Effectiveness in Multi-Talker Speech Recognition

    Authors: Aswin Shanmugam Subramanian, Chao Weng, Shinji Watanabe, Meng Yu, Dong Yu

    Abstract: Multi-source localization is an important and challenging technique for multi-talker conversation analysis. This paper proposes a novel supervised learning method using deep neural networks to estimate the direction of arrival (DOA) of all the speakers simultaneously from the audio mixture. At the heart of the proposal is a source splitting mechanism that creates source-specific intermediate repre…

    Submitted 28 November, 2021; v1 submitted 15 February, 2021; originally announced February 2021.

    Comments: Submitted to Computer Speech & Language

  46. arXiv:2101.12543  [pdf, other]

    cs.CV

    Few-Shot Learning for Road Object Detection

    Authors: Anay Majee, Kshitij Agrawal, Anbumani Subramanian

    Abstract: Few-shot learning is a problem of high interest in the evolution of deep learning. In this work, we consider the problem of few-shot object detection (FSOD) in a real-world, class-imbalanced scenario. For our experiments, we utilize the India Driving Dataset (IDD), as it includes a class of less-occurring road objects in the image dataset and hence provides a setup suitable for few-shot learning.…

    Submitted 17 March, 2021; v1 submitted 29 January, 2021; originally announced January 2021.

    Comments: Accepted to AAAI 2021 Workshop on Meta-Learning

  47. arXiv:2012.13006  [pdf, other]

    eess.AS cs.SD

    The 2020 ESPnet update: new features, broadened applications, performance improvements, and future plans

    Authors: Shinji Watanabe, Florian Boyer, Xuankai Chang, Pengcheng Guo, Tomoki Hayashi, Yosuke Higuchi, Takaaki Hori, Wen-Chin Huang, Hirofumi Inaguma, Naoyuki Kamo, Shigeki Karita, Chenda Li, Jing Shi, Aswin Shanmugam Subramanian, Wangyou Zhang

    Abstract: This paper describes the recent development of ESPnet (https://github.com/espnet/espnet), an end-to-end speech processing toolkit. This project was initiated in December 2017 to mainly deal with end-to-end speech recognition experiments based on sequence-to-sequence modeling. The project has grown rapidly and now covers a wide range of speech processing applications. Now ESPnet also includes text…

    Submitted 23 December, 2020; originally announced December 2020.

  48. arXiv:2012.03891  [pdf, other]

    cs.DL cs.LG

    COVIDScholar: An automated COVID-19 research aggregation and analysis platform

    Authors: Amalie Trewartha, John Dagdelen, Haoyan Huo, Kevin Cruse, Zheren Wang, Tanjin He, Akshay Subramanian, Yuxing Fei, Benjamin Justus, Kristin Persson, Gerbrand Ceder

    Abstract: The ongoing COVID-19 pandemic has had far-reaching effects throughout society, and science is no exception. The scale, speed, and breadth of the scientific community's COVID-19 response has led to the emergence of new research literature on a remarkable scale -- as of October 2020, over 81,000 COVID-19 related scientific papers have been released, at a rate of over 250 per day. This has created a…

    Submitted 7 December, 2020; originally announced December 2020.

  49. ESPnet-se: end-to-end speech enhancement and separation toolkit designed for asr integration

    Authors: Chenda Li, Jing Shi, Wangyou Zhang, Aswin Shanmugam Subramanian, Xuankai Chang, Naoyuki Kamo, Moto Hira, Tomoki Hayashi, Christoph Boeddeker, Zhuo Chen, Shinji Watanabe

    Abstract: We present ESPnet-SE, which is designed for the quick development of speech enhancement and speech separation systems in a single framework, along with the optional downstream speech recognition module. ESPnet-SE is a new project which integrates rich automatic speech recognition related models, resources and systems to support and validate the proposed front-end implementation (i.e. speech enhanc…

    Submitted 7 November, 2020; originally announced November 2020.

    Comments: Accepted by SLT 2021

  50. arXiv:2011.00091  [pdf, other]

    eess.AS cs.CL cs.SD

    Directional ASR: A New Paradigm for E2E Multi-Speaker Speech Recognition with Source Localization

    Authors: Aswin Shanmugam Subramanian, Chao Weng, Shinji Watanabe, Meng Yu, Yong Xu, Shi-Xiong Zhang, Dong Yu

    Abstract: This paper proposes a new paradigm for handling far-field multi-speaker data in an end-to-end neural network manner, called directional automatic speech recognition (D-ASR), which explicitly models source speaker locations. In D-ASR, the azimuth angle of the sources with respect to the microphone array is defined as a latent variable. This angle controls the quality of separation, which in turn de…

    Submitted 30 October, 2020; originally announced November 2020.

    Comments: submitted to ICASSP 2021