
Showing 1–39 of 39 results for author: Sriram, A

  1. arXiv:2406.08904  [pdf, other]

    cs.LG cs.SD eess.AS

    AdaPTwin: Low-Cost Adaptive Compression of Product Twins in Transformers

    Authors: Emil Biju, Anirudh Sriram, Mert Pilanci

    Abstract: While large transformer-based models have exhibited remarkable performance in speaker-independent speech recognition, their large size and computational requirements make them expensive or impractical to use in resource-constrained settings. In this work, we propose a low-rank adaptive compression technique called AdaPTwin that jointly compresses product-dependent pairs of weight matrices in the t…

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: 12 pages, 3 figures, submitted to NeurIPS 2024

  2. arXiv:2406.04713  [pdf, other]

    cs.LG cond-mat.mtrl-sci cs.AI physics.comp-ph stat.ML

    FlowMM: Generating Materials with Riemannian Flow Matching

    Authors: Benjamin Kurt Miller, Ricky T. Q. Chen, Anuroop Sriram, Brandon M Wood

    Abstract: Crystalline materials are a fundamental component in next-generation technologies, yet modeling their distribution presents unique computational challenges. Of the plausible arrangements of atoms in a periodic lattice only a vanishingly small percentage are thermodynamically stable, which is a key indicator of the materials that can be experimentally realized. Two fundamental tasks in this area ar…

    Submitted 7 June, 2024; originally announced June 2024.

    Comments: https://github.com/facebookresearch/flowmm

    Journal ref: ICML 2024

  3. arXiv:2402.04379  [pdf, other]

    cs.LG cond-mat.mtrl-sci

    Fine-Tuned Language Models Generate Stable Inorganic Materials as Text

    Authors: Nate Gruver, Anuroop Sriram, Andrea Madotto, Andrew Gordon Wilson, C. Lawrence Zitnick, Zachary Ulissi

    Abstract: We propose fine-tuning large language models for generation of stable materials. While unorthodox, fine-tuning large language models on text-encoded atomistic data is simple to implement yet reliable, with around 90% of sampled structures obeying physical constraints on atom positions and charges. Using energy above hull calculations from both learned ML potentials and gold-standard DFT calculatio…

    Submitted 6 February, 2024; originally announced February 2024.

    Comments: ICLR 2024. Code available at: https://github.com/facebookresearch/crystal-llm

  4. arXiv:2311.00341  [pdf, other]

    cond-mat.mtrl-sci cs.LG

    The Open DAC 2023 Dataset and Challenges for Sorbent Discovery in Direct Air Capture

    Authors: Anuroop Sriram, Sihoon Choi, Xiaohan Yu, Logan M. Brabson, Abhishek Das, Zachary Ulissi, Matt Uyttendaele, Andrew J. Medford, David S. Sholl

    Abstract: New methods for carbon dioxide removal are urgently needed to combat global climate change. Direct air capture (DAC) is an emerging technology to capture carbon dioxide directly from ambient air. Metal-organic frameworks (MOFs) have been widely studied as potentially customizable adsorbents for DAC. However, discovering promising MOF sorbents for DAC is challenging because of the vast chemical spa…

    Submitted 27 November, 2023; v1 submitted 1 November, 2023; originally announced November 2023.

  5. arXiv:2306.01623  [pdf, other]

    cs.CV cs.AI cs.LG

    HomE: Homography-Equivariant Video Representation Learning

    Authors: Anirudh Sriram, Adrien Gaidon, Jiajun Wu, Juan Carlos Niebles, Li Fei-Fei, Ehsan Adeli

    Abstract: Recent advances in self-supervised representation learning have enabled more efficient and robust model performance without relying on extensive labeled data. However, most works are still focused on images, with few working on videos and even fewer on multi-view videos, where more powerful inductive biases can be leveraged for self-supervision. In this work, we propose a novel method for represen…

    Submitted 2 June, 2023; originally announced June 2023.

    Comments: 10 pages, 4 figures, 4 tables

  6. arXiv:2305.11859  [pdf, other]

    cs.CL

    Complex Claim Verification with Evidence Retrieved in the Wild

    Authors: Jifan Chen, Grace Kim, Aniruddh Sriram, Greg Durrett, Eunsol Choi

    Abstract: Evidence retrieval is a core part of automatic fact-checking. Prior work makes simplifying assumptions in retrieval that depart from real-world use cases: either no access to evidence, access to evidence curated by a human fact-checker, or access to evidence available long after the claim has been made. In this work, we present the first fully automated pipeline to check real-world claims by retri…

    Submitted 15 June, 2024; v1 submitted 19 May, 2023; originally announced May 2023.

    Comments: NAACL 2024

  7. arXiv:2206.14331  [pdf, other]

    physics.chem-ph cs.CE cs.LG physics.comp-ph

    Spherical Channels for Modeling Atomic Interactions

    Authors: C. Lawrence Zitnick, Abhishek Das, Adeesh Kolluru, Janice Lan, Muhammed Shuaibi, Anuroop Sriram, Zachary Ulissi, Brandon Wood

    Abstract: Modeling the energy and forces of atomic systems is a fundamental problem in computational chemistry with the potential to help address many of the world's most pressing problems, including those related to energy scarcity and climate change. These calculations are traditionally performed using Density Functional Theory, which is computationally very expensive. Machine learning has the potential t…

    Submitted 13 October, 2022; v1 submitted 28 June, 2022; originally announced June 2022.

    Comments: 19 pages, accepted NeurIPS 2022

    ACM Class: I.2.6; J.2

  8. arXiv:2206.13654  [pdf, other]

    cs.CL

    Wav2Vec-Aug: Improved self-supervised training with limited data

    Authors: Anuroop Sriram, Michael Auli, Alexei Baevski

    Abstract: Self-supervised learning (SSL) of speech representations has received much attention over the last few years but most work has focused on languages and domains with an abundance of unlabeled data. However, for many languages there is a shortage even in the unlabeled data which limits the effectiveness of SSL. In this work, we focus on the problem of applying SSL to domains with limited available d…

    Submitted 27 June, 2022; originally announced June 2022.

  9. arXiv:2206.08917  [pdf, other]

    cond-mat.mtrl-sci cs.LG physics.comp-ph

    The Open Catalyst 2022 (OC22) Dataset and Challenges for Oxide Electrocatalysts

    Authors: Richard Tran, Janice Lan, Muhammed Shuaibi, Brandon M. Wood, Siddharth Goyal, Abhishek Das, Javier Heras-Domingo, Adeesh Kolluru, Ammar Rizvi, Nima Shoghi, Anuroop Sriram, Felix Therrien, Jehad Abed, Oleksandr Voznyy, Edward H. Sargent, Zachary Ulissi, C. Lawrence Zitnick

    Abstract: The development of machine learning models for electrocatalysts requires a broad set of training data to enable their use across a wide variety of materials. One class of materials that currently lacks sufficient training data is oxides, which are critical for the development of OER catalysts. To address this, we developed the OC22 dataset, consisting of 62,331 DFT relaxations (~9,854,504 single p…

    Submitted 7 March, 2023; v1 submitted 17 June, 2022; originally announced June 2022.

    Comments: 50 pages, 14 figures

  10. arXiv:2205.06938  [pdf, other]

    cs.CL

    Generating Literal and Implied Subquestions to Fact-check Complex Claims

    Authors: Jifan Chen, Aniruddh Sriram, Eunsol Choi, Greg Durrett

    Abstract: Verifying complex political claims is a challenging task, especially when politicians use various tactics to subtly misrepresent the facts. Automatic fact-checking systems fall short here, and their predictions like "half-true" are not very useful in isolation, since we have no idea which parts of the claim are true and which are not. In this work, we focus on decomposing a complex claim into a co…

    Submitted 31 October, 2022; v1 submitted 13 May, 2022; originally announced May 2022.

    Journal ref: EMNLP 2022

  11. arXiv:2204.02782  [pdf, other]

    cs.LG cond-mat.mtrl-sci physics.chem-ph physics.comp-ph

    GemNet-OC: Developing Graph Neural Networks for Large and Diverse Molecular Simulation Datasets

    Authors: Johannes Gasteiger, Muhammed Shuaibi, Anuroop Sriram, Stephan Günnemann, Zachary Ulissi, C. Lawrence Zitnick, Abhishek Das

    Abstract: Recent years have seen the advent of molecular simulation datasets that are orders of magnitude larger and more diverse. These new datasets differ substantially in four aspects of complexity: 1. Chemical diversity (number of different elements), 2. system size (number of atoms per sample), 3. dataset size (number of data samples), and 4. domain shift (similarity of the training and test set). Desp…

    Submitted 30 September, 2022; v1 submitted 6 April, 2022; originally announced April 2022.

  12. arXiv:2203.14049  [pdf, other]

    cs.LG cs.CL cs.HC

    Joint Transformer/RNN Architecture for Gesture Typing in Indic Languages

    Authors: Emil Biju, Anirudh Sriram, Mitesh M. Khapra, Pratyush Kumar

    Abstract: Gesture typing is a method of typing words on a touch-based keyboard by creating a continuous trace passing through the relevant keys. This work is aimed at developing a keyboard that supports gesture typing in Indic languages. We begin by noting that when dealing with Indic languages, one needs to cater to two different sets of users: (i) users who prefer to type in the native Indic script (Devan…

    Submitted 26 March, 2022; originally announced March 2022.

    Comments: Published at COLING 2020, 12 pages, 4 Tables and 5 Figures

  13. arXiv:2203.12298  [pdf, other]

    cs.CL cs.CR cs.LG

    Input-specific Attention Subnetworks for Adversarial Detection

    Authors: Emil Biju, Anirudh Sriram, Pratyush Kumar, Mitesh M Khapra

    Abstract: Self-attention heads are characteristic of Transformer models and have been well studied for interpretability and pruning. In this work, we demonstrate an altogether different utility of attention heads, namely for adversarial detection. Specifically, we propose a method to construct input-specific attention subnetworks (IAS) from which we extract three features to discriminate between authentic a…

    Submitted 23 March, 2022; originally announced March 2022.

    Comments: Accepted at Findings of ACL 2022, 14 pages, 6 Tables and 9 Figures

  14. arXiv:2203.09697  [pdf, other]

    cs.LG physics.comp-ph stat.ML

    Towards Training Billion Parameter Graph Neural Networks for Atomic Simulations

    Authors: Anuroop Sriram, Abhishek Das, Brandon M. Wood, Siddharth Goyal, C. Lawrence Zitnick

    Abstract: Recent progress in Graph Neural Networks (GNNs) for modeling atomic simulations has the potential to revolutionize catalyst discovery, which is a key step in making progress towards the energy breakthroughs needed to combat climate change. However, the GNNs that have proven most effective for this task are memory intensive as they model higher-order interactions in the graphs such as those between…

    Submitted 17 March, 2022; originally announced March 2022.

    Comments: ICLR 2022

  15. arXiv:2106.09575  [pdf, other]

    cs.LG cs.CE

    Rotation Invariant Graph Neural Networks using Spin Convolutions

    Authors: Muhammed Shuaibi, Adeesh Kolluru, Abhishek Das, Aditya Grover, Anuroop Sriram, Zachary Ulissi, C. Lawrence Zitnick

    Abstract: Progress towards the energy breakthroughs needed to combat climate change can be significantly accelerated through the efficient simulation of atomic systems. Simulation techniques based on first principles, such as Density Functional Theory (DFT), are limited in their practical use due to their high computational expense. Machine learning approaches have the potential to approximate DFT in a comp…

    Submitted 17 June, 2021; originally announced June 2021.

    Comments: 13 pages

    ACM Class: I.2.6; J.2

  16. arXiv:2104.01027  [pdf, other]

    cs.SD cs.CL cs.LG eess.AS

    Robust wav2vec 2.0: Analyzing Domain Shift in Self-Supervised Pre-Training

    Authors: Wei-Ning Hsu, Anuroop Sriram, Alexei Baevski, Tatiana Likhomanenko, Qiantong Xu, Vineel Pratap, Jacob Kahn, Ann Lee, Ronan Collobert, Gabriel Synnaeve, Michael Auli

    Abstract: Self-supervised learning of speech representations has been a very active research area but most work is focused on a single domain such as read audio books for which there exist large quantities of labeled and unlabeled data. In this paper, we explore more general setups where the domain of the unlabeled data for pre-training data differs from the domain of the labeled data for fine-tuning, which…

    Submitted 8 September, 2021; v1 submitted 2 April, 2021; originally announced April 2021.

  17. arXiv:2103.01436  [pdf, other]

    cs.LG

    ForceNet: A Graph Neural Network for Large-Scale Quantum Calculations

    Authors: Weihua Hu, Muhammed Shuaibi, Abhishek Das, Siddharth Goyal, Anuroop Sriram, Jure Leskovec, Devi Parikh, C. Lawrence Zitnick

    Abstract: With massive amounts of atomic simulation data available, there is a huge opportunity to develop fast and accurate machine learning models to approximate expensive physics-based calculations. The key quantity to estimate is atomic forces, where the state-of-the-art Graph Neural Networks (GNNs) explicitly enforce basic physical constraints such as rotation-covariance. However, to strictly satisfy t…

    Submitted 1 March, 2021; originally announced March 2021.

  18. arXiv:2101.04909  [pdf, other]

    cs.CV cs.LG

    COVID-19 Prognosis via Self-Supervised Representation Learning and Multi-Image Prediction

    Authors: Anuroop Sriram, Matthew Muckley, Koustuv Sinha, Farah Shamout, Joelle Pineau, Krzysztof J. Geras, Lea Azour, Yindalon Aphinyanaphongs, Nafissa Yakubova, William Moore

    Abstract: The rapid spread of COVID-19 cases in recent months has strained hospital resources, making rapid and accurate triage of patients presenting to emergency departments a necessity. Machine learning techniques using clinical data such as chest X-rays have been used to predict which patients are most at risk of deterioration. We consider the task of predicting two types of patient deterioration based…

    Submitted 24 January, 2021; v1 submitted 13 January, 2021; originally announced January 2021.

  19. Results of the 2020 fastMRI Challenge for Machine Learning MR Image Reconstruction

    Authors: Matthew J. Muckley, Bruno Riemenschneider, Alireza Radmanesh, Sunwoo Kim, Geunu Jeong, Jingyu Ko, Yohan Jun, Hyungseob Shin, Dosik Hwang, Mahmoud Mostapha, Simon Arberet, Dominik Nickel, Zaccharie Ramzi, Philippe Ciuciu, Jean-Luc Starck, Jonas Teuwen, Dimitrios Karkalousos, Chaoping Zhang, Anuroop Sriram, Zhengnan Huang, Nafissa Yakubova, Yvonne Lui, Florian Knoll

    Abstract: Accelerating MRI scans is one of the principal outstanding problems in the MRI research community. Towards this goal, we hosted the second fastMRI competition targeted towards reconstructing MR images with subsampled k-space data. We provided participants with data from 7,299 clinical brain scans (de-identified via a HIPAA-compliant procedure by NYU Langone Health), holding back the fully-sampled…

    Submitted 3 May, 2021; v1 submitted 9 December, 2020; originally announced December 2020.

    Comments: M. J. Muckley and B. Riemenschneider contributed equally to this work. This updates to version accepted in IEEE Transactions on Medical Imaging. It includes a rewrite of Section II.E as well as minor changes and corrections

  20. MLS: A Large-Scale Multilingual Dataset for Speech Research

    Authors: Vineel Pratap, Qiantong Xu, Anuroop Sriram, Gabriel Synnaeve, Ronan Collobert

    Abstract: This paper introduces Multilingual LibriSpeech (MLS) dataset, a large multilingual corpus suitable for speech research. The dataset is derived from read audiobooks from LibriVox and consists of 8 languages, including about 44.5K hours of English and a total of about 6K hours for other languages. Additionally, we provide Language Models (LM) and baseline Automatic Speech Recognition (ASR) models an…

    Submitted 19 December, 2020; v1 submitted 6 December, 2020; originally announced December 2020.

    Journal ref: Interspeech 2020

  21. arXiv:2010.09990  [pdf, other]

    cond-mat.mtrl-sci cs.LG

    The Open Catalyst 2020 (OC20) Dataset and Community Challenges

    Authors: Lowik Chanussot, Abhishek Das, Siddharth Goyal, Thibaut Lavril, Muhammed Shuaibi, Morgane Riviere, Kevin Tran, Javier Heras-Domingo, Caleb Ho, Weihua Hu, Aini Palizhati, Anuroop Sriram, Brandon Wood, Junwoong Yoon, Devi Parikh, C. Lawrence Zitnick, Zachary Ulissi

    Abstract: Catalyst discovery and optimization is key to solving many societal and energy challenges including solar fuels synthesis, long-term energy storage, and renewable fertilizer production. Despite considerable effort by the catalysis community to apply machine learning models to the computational catalyst discovery process, it remains an open challenge to build models that can generalize across both…

    Submitted 24 September, 2021; v1 submitted 19 October, 2020; originally announced October 2020.

    Comments: 37 pages, 11 figures, submitted to ACS Catalysis

  22. arXiv:2010.09435  [pdf, other]

    cond-mat.mtrl-sci cs.CE cs.LG

    An Introduction to Electrocatalyst Design using Machine Learning for Renewable Energy Storage

    Authors: C. Lawrence Zitnick, Lowik Chanussot, Abhishek Das, Siddharth Goyal, Javier Heras-Domingo, Caleb Ho, Weihua Hu, Thibaut Lavril, Aini Palizhati, Morgane Riviere, Muhammed Shuaibi, Anuroop Sriram, Kevin Tran, Brandon Wood, Junwoong Yoon, Devi Parikh, Zachary Ulissi

    Abstract: Scalable and cost-effective solutions to renewable energy storage are essential to addressing the world's rising energy needs while reducing climate change. As we increase our reliance on renewable energy sources such as wind and solar, which produce intermittent power, storage is needed to transfer power from times of peak generation to peak demand. This may require the storage of power for hours…

    Submitted 14 October, 2020; originally announced October 2020.

    Comments: 27 pages

    ACM Class: I.2.6; J.2

  23. arXiv:2008.03553  [pdf, other]

    cs.CV

    Forming Local Intersections of Projections for Classifying and Searching Histopathology Images

    Authors: Aditya Sriram, Shivam Kalra, Morteza Babaie, Brady Kieffer, Waddah Al Drobi, Shahryar Rahnamayan, Hany Kashani, Hamid R. Tizhoosh

    Abstract: In this paper, we propose a novel image descriptor called Forming Local Intersections of Projections (FLIP) and its multi-resolution version (mFLIP) for representing histopathology images. The descriptor is based on the Radon transform wherein we apply parallel projections in small local neighborhoods of gray-level images. Using equidistant projection directions in each window, we extract unique a…

    Submitted 8 August, 2020; originally announced August 2020.

    Comments: To appear in International Conference on AI in Medicine (AIME 2020)

  24. arXiv:2007.03001  [pdf, other]

    eess.AS cs.CL cs.SD

    Massively Multilingual ASR: 50 Languages, 1 Model, 1 Billion Parameters

    Authors: Vineel Pratap, Anuroop Sriram, Paden Tomasello, Awni Hannun, Vitaliy Liptchinsky, Gabriel Synnaeve, Ronan Collobert

    Abstract: We study training a single acoustic model for multiple languages with the aim of improving automatic speech recognition (ASR) performance on low-resource languages, and overall simplifying deployment of ASR systems that support diverse languages. We perform an extensive benchmark on 51 languages, with varying amount of training data by language (from 100 hours to 1100 hours). We compare three vari…

    Submitted 7 July, 2020; v1 submitted 6 July, 2020; originally announced July 2020.

  25. arXiv:2004.06688  [pdf, other]

    eess.IV cs.CV

    End-to-End Variational Networks for Accelerated MRI Reconstruction

    Authors: Anuroop Sriram, Jure Zbontar, Tullie Murrell, Aaron Defazio, C. Lawrence Zitnick, Nafissa Yakubova, Florian Knoll, Patricia Johnson

    Abstract: The slow acquisition speed of magnetic resonance imaging (MRI) has led to the development of two complementary methods: acquiring multiple views of the anatomy simultaneously (parallel imaging) and acquiring fewer samples than necessary for traditional signal processing methods (compressed sensing). While the combination of these methods has the potential to allow much faster scan times, reconstru…

    Submitted 15 April, 2020; v1 submitted 14 April, 2020; originally announced April 2020.

  26. arXiv:2001.02518  [pdf, other]

    eess.IV cs.CV

    Advancing machine learning for MR image reconstruction with an open competition: Overview of the 2019 fastMRI challenge

    Authors: Florian Knoll, Tullie Murrell, Anuroop Sriram, Nafissa Yakubova, Jure Zbontar, Michael Rabbat, Aaron Defazio, Matthew J. Muckley, Daniel K. Sodickson, C. Lawrence Zitnick, Michael P. Recht

    Abstract: Purpose: To advance research in the field of machine learning for MR image reconstruction with an open challenge. Methods: We provided participants with a dataset of raw k-space data from 1,594 consecutive clinical exams of the knee. The goal of the challenge was to reconstruct images from these data. In order to strike a balance between realistic data and a shallow learning curve for those not al…

    Submitted 6 January, 2020; originally announced January 2020.

  27. arXiv:1911.08460  [pdf, ps, other]

    cs.CL cs.SD eess.AS

    End-to-end ASR: from Supervised to Semi-Supervised Learning with Modern Architectures

    Authors: Gabriel Synnaeve, Qiantong Xu, Jacob Kahn, Tatiana Likhomanenko, Edouard Grave, Vineel Pratap, Anuroop Sriram, Vitaliy Liptchinsky, Ronan Collobert

    Abstract: We study pseudo-labeling for the semi-supervised training of ResNet, Time-Depth Separable ConvNets, and Transformers for speech recognition, with either CTC or Seq2Seq loss functions. We perform experiments on the standard LibriSpeech dataset, and leverage additional unlabeled data from LibriVox through pseudo-labeling. We show that while Transformer-based acoustic models have superior performance…

    Submitted 14 July, 2020; v1 submitted 19 November, 2019; originally announced November 2019.

    Comments: Published at the workshop on Self-supervision in Audio and Speech (SAS) at the 37th International Conference on Machine Learning (ICML 2020), Vienna, Austria

  28. arXiv:1911.01629  [pdf, other]

    cs.CL cs.LG eess.AS

    RNN-T For Latency Controlled ASR With Improved Beam Search

    Authors: Mahaveer Jain, Kjell Schubert, Jay Mahadeokar, Ching-Feng Yeh, Kaustubh Kalgaonkar, Anuroop Sriram, Christian Fuegen, Michael L. Seltzer

    Abstract: Neural transducer-based systems such as RNN Transducers (RNN-T) for automatic speech recognition (ASR) blend the individual components of a traditional hybrid ASR system (acoustic model, language model, punctuation model, inverse text normalization) into one single model. This greatly simplifies training and inference and hence makes RNN-T a desirable choice for ASR systems. In this work, we inve…

    Submitted 16 January, 2020; v1 submitted 5 November, 2019; originally announced November 2019.

  29. arXiv:1910.12325  [pdf, other]

    eess.IV cs.CV

    GrappaNet: Combining Parallel Imaging with Deep Learning for Multi-Coil MRI Reconstruction

    Authors: Anuroop Sriram, Jure Zbontar, Tullie Murrell, C. Lawrence Zitnick, Aaron Defazio, Daniel K. Sodickson

    Abstract: Magnetic Resonance Image (MRI) acquisition is an inherently slow process which has spurred the development of two different acceleration methods: acquiring multiple correlated samples simultaneously (parallel imaging) and acquiring fewer samples than necessary for traditional signal processing methods (compressed sensing). Both methods provide complementary approaches to accelerating the speed of…

    Submitted 30 March, 2020; v1 submitted 27 October, 2019; originally announced October 2019.

  30. arXiv:1907.00242  [pdf, other]

    cs.NI eess.SP

    Joint Functional Splitting and Content Placement for Green Hybrid CRAN

    Authors: Ajay Sriram, Meysam Masoudi, Abdulrahman Alabbasi, Cicek Cavdar

    Abstract: A hybrid cloud radio access network (H-CRAN) architecture has been proposed to alleviate the midhaul capacity limitation in C-RAN. In this architecture, functional splitting is utilized to distribute the processing functions between a central cloud and edge clouds. The flexibility of selecting specific split point enables the H-CRAN designer to reduce midhaul bandwidth, or reduce latency, or save…

    Submitted 29 June, 2019; originally announced July 2019.

  31. arXiv:1904.00740  [pdf, other]

    cs.CV

    Projectron -- A Shallow and Interpretable Network for Classifying Medical Images

    Authors: Aditya Sriram, Shivam Kalra, H. R. Tizhoosh

    Abstract: This paper introduces the 'Projectron' as a new neural network architecture that uses Radon projections to both classify and represent medical images. The motivation is to build shallow networks which are more interpretable in the medical imaging domain. Radon transform is an established technique that can reconstruct images from parallel projections. The Projectron first applies global Radon tran…

    Submitted 15 March, 2019; originally announced April 2019.

    Comments: Accepted for publication in the 2019 International Joint Conference on Neural Networks (IJCNN), Budapest, Hungary

  32. arXiv:1811.08839  [pdf, other]

    cs.CV cs.LG eess.SP physics.med-ph stat.ML

    fastMRI: An Open Dataset and Benchmarks for Accelerated MRI

    Authors: Jure Zbontar, Florian Knoll, Anuroop Sriram, Tullie Murrell, Zhengnan Huang, Matthew J. Muckley, Aaron Defazio, Ruben Stern, Patricia Johnson, Mary Bruno, Marc Parente, Krzysztof J. Geras, Joe Katsnelson, Hersh Chandarana, Zizhao Zhang, Michal Drozdzal, Adriana Romero, Michael Rabbat, Pascal Vincent, Nafissa Yakubova, James Pinkerton, Duo Wang, Erich Owens, C. Lawrence Zitnick, Michael P. Recht, et al. (2 additional authors not shown)

    Abstract: Accelerating Magnetic Resonance Imaging (MRI) by taking fewer measurements has the potential to reduce medical costs, minimize stress to patients and make MRI possible in applications where it is currently prohibitively slow or expensive. We introduce the fastMRI dataset, a large-scale collection of both raw MR measurements and clinical MR images, that can be used for training and evaluation of ma…

    Submitted 11 December, 2019; v1 submitted 21 November, 2018; originally announced November 2018.

    Comments: 35 pages, 10 figures

  33. arXiv:1711.01567  [pdf, other]

    cs.CL cs.LG

    Robust Speech Recognition Using Generative Adversarial Networks

    Authors: Anuroop Sriram, Heewoo Jun, Yashesh Gaur, Sanjeev Satheesh

    Abstract: This paper describes a general, scalable, end-to-end framework that uses the generative adversarial network (GAN) objective to enable robust speech recognition. Encoders trained with the proposed approach enjoy improved invariance by learning to map noisy audio to the same embedding space as that of clean audio. Unlike previous methods, the new framework does not rely on domain expertise or simpli…

    Submitted 5 November, 2017; originally announced November 2017.

  34. arXiv:1710.01247  [pdf, other]

    cs.CV

    Learning Autoencoded Radon Projections

    Authors: Aditya Sriram, Shivam Kalra, H. R. Tizhoosh, Shahryar Rahnamayan

    Abstract: Autoencoders have been recently used for encoding medical images. In this study, we design and validate a new framework for retrieving medical images by classifying Radon projections, compressed in the deepest layer of an autoencoder. As the autoencoder reduces the dimensionality, a multilayer perceptron (MLP) can be employed to classify the images. The integration of MLP promotes a rather shallow…

    Submitted 27 September, 2017; originally announced October 2017.

    Comments: To appear in proceedings of The IEEE Symposium Series on Computational Intelligence (IEEE SSCI 2017), Honolulu, Hawaii, USA, Nov. 27 -- Dec 1, 2017

  35. arXiv:1708.06426  [pdf, other]

    cs.CL

    Cold Fusion: Training Seq2Seq Models Together with Language Models

    Authors: Anuroop Sriram, Heewoo Jun, Sanjeev Satheesh, Adam Coates

    Abstract: Sequence-to-sequence (Seq2Seq) models with attention have excelled at tasks which involve generating natural language sentences such as machine translation, image captioning and speech recognition. Performance has further been improved by leveraging unlabeled data, often in the form of a language model. In this work, we present the Cold Fusion method, which leverages a pre-trained language model d…

    Submitted 21 August, 2017; originally announced August 2017.

  36. arXiv:1707.07413  [pdf, other]

    cs.CL cs.NE

    Exploring Neural Transducers for End-to-End Speech Recognition

    Authors: Eric Battenberg, Jitong Chen, Rewon Child, Adam Coates, Yashesh Gaur, Yi Li, Hairong Liu, Sanjeev Satheesh, David Seetapun, Anuroop Sriram, Zhenyao Zhu

    Abstract: In this work, we perform an empirical comparison among the CTC, RNN-Transducer, and attention-based Seq2Seq models for end-to-end speech recognition. We show that, without any language model, Seq2Seq and RNN-Transducer models both outperform the best reported CTC models with a language model, on the popular Hub5'00 benchmark. On our internal diverse dataset, these trends continue - RNNTransducer m…

    Submitted 24 July, 2017; originally announced July 2017.

  37. arXiv:1705.07522  [pdf, other]

    cs.CV

    Classification and Retrieval of Digital Pathology Scans: A New Dataset

    Authors: Morteza Babaie, Shivam Kalra, Aditya Sriram, Christopher Mitcheltree, Shujin Zhu, Amin Khatami, Shahryar Rahnamayan, H. R. Tizhoosh

    Abstract: In this paper, we introduce a new dataset, Kimia Path24, for image classification and retrieval in digital pathology. We use the whole scan images of 24 different tissue textures to generate 1,325 test patches of size 1000×1000 (0.5mm×0.5mm). Training data can be generated according to preferences of algorithm designer and can range from approximately 27,000 to over 50,000 p…

    Submitted 21 May, 2017; originally announced May 2017.

    Comments: Accepted for presentation at Workshop for Computer Vision for Microscopy Image Analysis (CVMI 2017) @ CVPR 2017, Honolulu, Hawaii

  38. arXiv:1705.04400  [pdf, other]

    cs.CL

    Reducing Bias in Production Speech Models

    Authors: Eric Battenberg, Rewon Child, Adam Coates, Christopher Fougner, Yashesh Gaur, Jiaji Huang, Heewoo Jun, Ajay Kannan, Markus Kliegl, Atul Kumar, Hairong Liu, Vinay Rao, Sanjeev Satheesh, David Seetapun, Anuroop Sriram, Zhenyao Zhu

    Abstract: Replacing hand-engineered pipelines with end-to-end deep learning systems has enabled strong results in applications like speech and object recognition. However, the causality and latency constraints of production systems put end-to-end speech models back into the underfitting regime and expose biases in the model that we show cannot be overcome by "scaling up", i.e., training bigger models on mor…

    Submitted 11 May, 2017; originally announced May 2017.

  39. arXiv:1609.05123  [pdf, other]

    cs.LG cs.NE

    Learning Opposites Using Neural Networks

    Authors: Shivam Kalra, Aditya Sriram, Shahryar Rahnamayan, H. R. Tizhoosh

    Abstract: Many research works have successfully extended algorithms such as evolutionary algorithms, reinforcement agents and neural networks using "opposition-based learning" (OBL). Two types of the "opposites" have been defined in the literature, namely type-I and type-II. The former are linear in nature and applicable to the variable space, hence easy to calculate. On the other hand, ty…

    Submitted 16 September, 2016; originally announced September 2016.

    Comments: To appear in proceedings of the 23rd International Conference on Pattern Recognition (ICPR 2016), Cancun, Mexico, December 2016