Showing 1–43 of 43 results for author: Chopra, S

Searching in archive cs.
  1. arXiv:2406.04318  [pdf, other]

    cs.LG cs.AI cs.CV

    Adaptive Sampling of k-Space in Magnetic Resonance for Rapid Pathology Prediction

    Authors: Chen-Yu Yen, Raghav Singhal, Umang Sharma, Rajesh Ranganath, Sumit Chopra, Lerrel Pinto

    Abstract: Magnetic Resonance (MR) imaging, despite its proven diagnostic utility, remains an inaccessible imaging modality for disease surveillance at the population level. A major factor rendering MR inaccessible is lengthy scan times. An MR scanner collects measurements associated with the underlying anatomy in the Fourier space, also known as the k-space. Creating a high-fidelity image requires collectin…

    Submitted 6 June, 2024; originally announced June 2024.

    Comments: ICML 2024. Project website at https://adaptive-sampling-mr.github.io

  2. arXiv:2405.17613  [pdf, other]

    cs.CV cs.CL cs.LG

    A Framework for Multi-modal Learning: Jointly Modeling Inter- & Intra-Modality Dependencies

    Authors: Divyam Madaan, Taro Makino, Sumit Chopra, Kyunghyun Cho

    Abstract: Supervised multi-modal learning involves mapping multiple modalities to a target label. Previous studies in this field have concentrated on capturing in isolation either the inter-modality dependencies (the relationships between different modalities and the label) or the intra-modality dependencies (the relationships within a single modality and the label). We argue that these conventional approac…

    Submitted 27 May, 2024; originally announced May 2024.

  3. arXiv:2405.09010  [pdf, ps, other]

    cs.IT

    On Low Field Size Constructions of Access-Optimal Convertible Codes

    Authors: Saransh Chopra, Francisco Maturana, K. V. Rashmi

    Abstract: Most large-scale storage systems employ erasure coding to provide resilience against disk failures. Recent work has shown that tuning this redundancy to changes in disk failure rates leads to substantial storage savings. This process requires code conversion, wherein data encoded using an $[n^{I\mskip-2mu},k^{I\mskip-2mu}]$ initial code has to be transformed into data encoded using an…

    Submitted 14 May, 2024; originally announced May 2024.

    Comments: This is an extended version of an IEEE ISIT 2024 paper with the same title

  4. arXiv:2404.16478  [pdf, other]

    cs.CL cs.AI

    Evaluating Consistency and Reasoning Capabilities of Large Language Models

    Authors: Yash Saxena, Sarthak Chopra, Arunendra Mani Tripathi

    Abstract: Large Language Models (LLMs) are extensively used today across various sectors, including academia, research, business, and finance, for tasks such as text generation, summarization, and translation. Despite their widespread adoption, these models often produce incorrect and misleading information, exhibiting a tendency to hallucinate. This behavior can be attributed to several factors, with consi…

    Submitted 25 April, 2024; originally announced April 2024.

  5. arXiv:2403.16422  [pdf, other]

    cs.CV cs.AI

    Refining Text-to-Image Generation: Towards Accurate Training-Free Glyph-Enhanced Image Generation

    Authors: Sanyam Lakhanpal, Shivang Chopra, Vinija Jain, Aman Chadha, Man Luo

    Abstract: Over the past few years, Text-to-Image (T2I) generation approaches based on diffusion models have gained significant attention. However, vanilla diffusion models often suffer from spelling inaccuracies in the text displayed within the generated images. The capability to generate visual text is crucial, offering both academic interest and a wide range of practical applications. To produce accurate…

    Submitted 25 March, 2024; originally announced March 2024.

  6. Charting the COVID Long Haul Experience -- A Longitudinal Exploration of Symptoms, Activity, and Clinical Adherence

    Authors: Jessica Pater, Shaan Chopra, Juliette Zaccour, Jeanne Carroll, Fayika Farhat Nova, Tammy Toscos, Shion Guha, Fen Lei Chang

    Abstract: COVID Long Haul (CLH) is an emerging chronic illness with varied patient experiences. Our understanding of CLH is often limited to data from electronic health records (EHRs), such as diagnoses or problem lists, which do not capture the volatility and severity of symptoms or their impact. To better understand the unique presentation of CLH, we conducted a 3-month long cohort study with 14 CLH patie…

    Submitted 7 February, 2024; originally announced February 2024.

    Comments: 21 pages, 4 figures, 7 tables, ACM Conference CHI Conference on Human Factors in Computing Systems

    ACM Class: K.4

  7. arXiv:2402.04929  [pdf, other]

    cs.CV cs.AI cs.LG

    Source-Free Domain Adaptation with Diffusion-Guided Source Data Generation

    Authors: Shivang Chopra, Suraj Kothawade, Houda Aynaou, Aman Chadha

    Abstract: This paper introduces a novel approach to leverage the generalizability of Diffusion Models for Source-Free Domain Adaptation (DM-SFDA). Our proposed DMSFDA method involves fine-tuning a pre-trained text-to-image diffusion model to generate source domain images using features from the target images to guide the diffusion process. Specifically, the pre-trained diffusion model is fine-tuned to gener…

    Submitted 26 June, 2024; v1 submitted 7 February, 2024; originally announced February 2024.

    Comments: arXiv admin note: substantial text overlap with arXiv:2310.01701

  8. arXiv:2401.05727  [pdf, other]

    cs.CL

    Zero Resource Cross-Lingual Part Of Speech Tagging

    Authors: Sahil Chopra

    Abstract: Part of speech tagging in zero-resource settings can be an effective approach for low-resource languages when no labeled training data is available. Existing systems use two main techniques for POS tagging i.e. pretrained multilingual large language models (LLM) or project the source language labels into the zero resource target language and train a sequence labeling model on it. We explore the lat…

    Submitted 11 January, 2024; originally announced January 2024.

  9. arXiv:2310.14196  [pdf, other]

    cs.RO cs.AI

    Learning to Discern: Imitating Heterogeneous Human Demonstrations with Preference and Representation Learning

    Authors: Sachit Kuhar, Shuo Cheng, Shivang Chopra, Matthew Bronars, Danfei Xu

    Abstract: Practical Imitation Learning (IL) systems rely on large human demonstration datasets for successful policy learning. However, challenges lie in maintaining the quality of collected data and addressing the suboptimal nature of some demonstrations, which can compromise the overall dataset quality and hence the learning outcome. Furthermore, the intrinsic heterogeneity in human behavior can produce e…

    Submitted 22 October, 2023; originally announced October 2023.

    Comments: To appear at the 7th Annual Conference on Robot Learning (CoRL) 2023

  10. arXiv:2310.05592  [pdf, other]

    cs.CL cs.AI cs.HC

    InterroLang: Exploring NLP Models and Datasets through Dialogue-based Explanations

    Authors: Nils Feldhus, Qianli Wang, Tatiana Anikina, Sahil Chopra, Cennet Oguz, Sebastian Möller

    Abstract: While recently developed NLP explainability methods let us open the black box in various ways (Madsen et al., 2022), a missing ingredient in this endeavor is an interactive tool offering a conversational interface. Such a dialogue system can help users explore datasets and models with explanations in a contextualized manner, e.g. via clarification or follow-up questions, and through a natural lang…

    Submitted 23 October, 2023; v1 submitted 9 October, 2023; originally announced October 2023.

    Comments: EMNLP 2023 Findings. Camera-ready version

  11. arXiv:2310.01701   

    cs.CV cs.AI

    Transcending Domains through Text-to-Image Diffusion: A Source-Free Approach to Domain Adaptation

    Authors: Shivang Chopra, Suraj Kothawade, Houda Aynaou, Aman Chadha

    Abstract: Domain Adaptation (DA) is a method for enhancing a model's performance on a target domain with inadequate annotated data by applying the information the model has acquired from a related source domain with sufficient labeled data. The escalating enforcement of data-privacy regulations like HIPAA, COPPA, FERPA, etc. have sparked a heightened interest in adapting models to novel domains while circum…

    Submitted 6 February, 2024; v1 submitted 2 October, 2023; originally announced October 2023.

    Comments: Revamped the whole paper; new version will be re-submitted

  12. arXiv:2306.13276  [pdf, other]

    eess.IV cs.CV cs.LG

    On Sensitivity and Robustness of Normalization Schemes to Input Distribution Shifts in Automatic MR Image Diagnosis

    Authors: Divyam Madaan, Daniel Sodickson, Kyunghyun Cho, Sumit Chopra

    Abstract: Magnetic Resonance Imaging (MRI) is considered the gold standard of medical imaging because of the excellent soft-tissue contrast exhibited in the images reconstructed by the MRI pipeline, which in-turn enables the human radiologist to discern many pathologies easily. More recently, Deep Learning (DL) models have also achieved state-of-the-art performance in diagnosing multiple diseases using thes…

    Submitted 22 June, 2023; originally announced June 2023.

    Comments: Accepted at MIDL 2023

  13. arXiv:2304.09254  [pdf]

    physics.med-ph cs.LG eess.IV

    FastMRI Prostate: A Publicly Available, Biparametric MRI Dataset to Advance Machine Learning for Prostate Cancer Imaging

    Authors: Radhika Tibrewala, Tarun Dutt, Angela Tong, Luke Ginocchio, Mahesh B Keerthivasan, Steven H Baete, Sumit Chopra, Yvonne W Lui, Daniel K Sodickson, Hersh Chandarana, Patricia M Johnson

    Abstract: The fastMRI brain and knee dataset has enabled significant advances in exploring reconstruction methods for improving speed and image quality for Magnetic Resonance Imaging (MRI) via novel, clinically relevant reconstruction approaches. In this study, we describe the April 2023 expansion of the fastMRI dataset to include biparametric prostate MRI data acquired on a clinical population. The dataset…

    Submitted 18 April, 2023; originally announced April 2023.

    Comments: 4 pages, 1 figure

  14. arXiv:2301.11962  [pdf, other]

    cs.LG

    On the Feasibility of Machine Learning Augmented Magnetic Resonance for Point-of-Care Identification of Disease

    Authors: Raghav Singhal, Mukund Sudarshan, Anish Mahishi, Sri Kaushik, Luke Ginocchio, Angela Tong, Hersh Chandarana, Daniel K. Sodickson, Rajesh Ranganath, Sumit Chopra

    Abstract: Early detection of many life-threatening diseases (e.g., prostate and breast cancer) within at-risk population can improve clinical outcomes and reduce cost of care. While numerous disease-specific "screening" tests that are closer to Point-of-Care (POC) are in use for this task, their low specificity results in unnecessary biopsies, leading to avoidable patient trauma and wasteful healthcare spen…

    Submitted 2 February, 2023; v1 submitted 27 January, 2023; originally announced January 2023.

  15. xURLCC in 6g with meshed RAN

    Authors: Mohammad Ali Khoshkholghi, Toktam Mahmoodi, Subhankar Pal, Subhash Chopra, Mayuri Tendulkar, Sandip Sarkar

    Abstract: 5G Ultra-Reliable Low Latency Communications Technology (URLLC) will not be able to provide extremely reliable low latency services to the complex networks in 6G. Moreover, URLLC that began with 5G has to be refined and improved in 6G to provide xURLCC (extreme URLCC) with sub-millisecond latency, for supporting diverse mission-critical applications. This paper aims to highlight the importance of…

    Submitted 19 January, 2023; originally announced January 2023.

    Journal ref: ITU Journal on Future and Evolving Technologies, Volume 3 (2022), Issue 3, Pages 612-622

  16. arXiv:2206.08566  [pdf, other]

    cs.CV

    Active Data Discovery: Mining Unknown Data using Submodular Information Measures

    Authors: Suraj Kothawade, Shivang Chopra, Saikat Ghosh, Rishabh Iyer

    Abstract: Active Learning is a very common yet powerful framework for iteratively and adaptively sampling subsets of the unlabeled sets with a human in the loop with the goal of achieving labeling efficiency. Most real world datasets have imbalance either in classes and slices, and correspondingly, parts of the dataset are rare. As a result, there has been a lot of work in designing active learning approach…

    Submitted 17 June, 2022; originally announced June 2022.

  17. arXiv:2204.01837  [pdf, other]

    cs.CE cs.DM

    Parallel Power System Restoration

    Authors: Sunil Chopra, Feng Qiu, Sangho Shim

    Abstract: Power system restoration is an essential activity for grid resilience, where grid operators restart generators, re-establish transmission paths, and restore loads after a blackout event. With a goal of restoring electric service in the shortest time, the core decisions in restoration planning are to partition the grid into sub-networks, each of which has an initial power source for black-start (ca…

    Submitted 18 August, 2022; v1 submitted 4 April, 2022; originally announced April 2022.

    Comments: 30 pages, working paper

  18. arXiv:2007.04297  [pdf, other]

    cs.CL

    Open Domain Suggestion Mining Leveraging Fine-Grained Analysis

    Authors: Shreya Singal, Tanishq Goel, Shivang Chopra, Sonika Dahiya

    Abstract: Suggestion mining tasks are often semantically complex and lack sophisticated methodologies that can be applied to real-world data. The presence of suggestions across a large diversity of domains and the absence of large labelled and balanced datasets render this task particularly challenging to deal with. In an attempt to overcome these challenges, we propose a two-tier pipeline that leverages Di…

    Submitted 11 July, 2020; v1 submitted 27 June, 2020; originally announced July 2020.

  19. arXiv:1904.12258  [pdf, other]

    cs.DS math.OC

    Generalizing the Covering Path Problem on a Grid

    Authors: Liwei Zeng, Karen Smilowitz, Sunil Chopra

    Abstract: We study the covering path problem on a grid of R^{2}. We generalize earlier results on a rectangular grid and prove that the covering path cost can be bounded by the area and perimeter of the grid. We provide (2+ε) and (1+ε)-approximations for the problem on a general grid and on a convex grid, respectively.

    Submitted 28 April, 2019; originally announced April 2019.

  20. arXiv:1903.04879  [pdf, other]

    cs.SI cs.CY cs.LG

    What sets Verified Users apart? Insights, Analysis and Prediction of Verified Users on Twitter

    Authors: Indraneil Paul, Abhinav Khattar, Shaan Chopra, Ponnurangam Kumaraguru, Manish Gupta

    Abstract: Social network and publishing platforms, such as Twitter, support the concept of a secret proprietary verification process, for handles they deem worthy of platform-wide public interest. In line with significant prior work which suggests that possessing such a status symbolizes enhanced credibility in the eyes of the platform audience, a verified badge is clearly coveted among public figures and b…

    Submitted 12 March, 2019; originally announced March 2019.

  21. arXiv:1902.02248  [pdf, other]

    cs.LG cs.CV stat.ML

    Generative Image Translation for Data Augmentation of Bone Lesion Pathology

    Authors: Anant Gupta, Srivas Venkatesh, Sumit Chopra, Christian Ledig

    Abstract: Insufficient training data and severe class imbalance are often limiting factors when developing machine learning models for the classification of rare diseases. In this work, we address the problem of classifying bone lesions from X-ray images by increasing the small number of positive samples in the training set. We propose a generative data augmentation approach based on a cycle-consistent gene…

    Submitted 6 February, 2019; originally announced February 2019.

  22. arXiv:1812.09710  [pdf, other]

    cs.SI cs.CY

    Elites Tweet? Characterizing the Twitter Verified User Network

    Authors: Indraneil Paul, Abhinav Khattar, Ponnurangam Kumaraguru, Manish Gupta, Shaan Chopra

    Abstract: Social network and publishing platforms, such as Twitter, support the concept of verification. Verified accounts are deemed worthy of platform-wide public interest and are separately authenticated by the platform itself. There have been repeated assertions by these platforms about verification not being tantamount to endorsement. However, a significant body of prior work suggests that possessing a…

    Submitted 12 March, 2019; v1 submitted 23 December, 2018; originally announced December 2018.

  23. arXiv:1804.02063  [pdf, ps, other]

    cs.CL

    Few-Shot Text Classification with Pre-Trained Word Embeddings and a Human in the Loop

    Authors: Katherine Bailey, Sunny Chopra

    Abstract: Most of the literature around text classification treats it as a supervised learning problem: given a corpus of labeled documents, train a classifier such that it can accurately predict the classes of unseen documents. In industry, however, it is not uncommon for a business to have entire corpora of documents where few or none have been classified, or where existing classifications have become mea…

    Submitted 5 April, 2018; originally announced April 2018.

    Comments: 8 pages

  24. arXiv:1803.09040  [pdf, other]

    math.OC cs.DS

    A Bounded Formulation for The School Bus Scheduling Problem

    Authors: Liwei Zeng, Sunil Chopra, Karen Smilowitz

    Abstract: This paper proposes a new formulation for the school bus scheduling problem (SBSP) which optimizes school start times and bus operation times to minimize transportation cost. Our goal is to minimize the number of buses to serve all bus routes such that each route arrives in a time window before school starts. We present a new time-indexed integer linear programming (ILP) formulation for this probl…

    Submitted 1 August, 2020; v1 submitted 23 March, 2018; originally announced March 2018.

  25. arXiv:1801.05588  [pdf, other]

    cs.SI

    White or Blue, the Whale gets its Vengeance: A Social Media Analysis of the Blue Whale Challenge

    Authors: Abhinav Khattar, Karan Dabas, Kshitij Gupta, Shaan Chopra, Ponnurangam Kumaraguru

    Abstract: The Blue Whale Challenge is a series of self-harm causing tasks that are propagated via online social media under the disguise of a "game." The list of tasks must be completed in a duration of 50 days and they cause both physical and mental harm to the player. The final task is to commit suicide. The game is supposed to be administered by people called "curators" who incite others to cause self-mu…

    Submitted 17 January, 2018; originally announced January 2018.

    Comments: 18 pages

  26. arXiv:1709.03856  [pdf, ps, other]

    cs.CL

    StarSpace: Embed All The Things!

    Authors: Ledell Wu, Adam Fisch, Sumit Chopra, Keith Adams, Antoine Bordes, Jason Weston

    Abstract: We present StarSpace, a general-purpose neural embedding model that can solve a wide variety of problems: labeling tasks such as text classification, ranking tasks such as information retrieval/web search, collaborative filtering-based or content-based recommendation, embedding of multi-relational graphs, and learning word, sentence or document level embeddings. In each case the model works by emb…

    Submitted 20 November, 2017; v1 submitted 12 September, 2017; originally announced September 2017.

  27. arXiv:1702.04770  [pdf, other]

    cs.CL cs.LG cs.NE

    Training Language Models Using Target-Propagation

    Authors: Sam Wiseman, Sumit Chopra, Marc'Aurelio Ranzato, Arthur Szlam, Ruoyu Sun, Soumith Chintala, Nicolas Vasilache

    Abstract: While Truncated Back-Propagation through Time (BPTT) is the most popular approach to training Recurrent Neural Networks (RNNs), it suffers from being inherently sequential (making parallelization difficult) and from truncating gradient flow between distant time-steps. We investigate whether Target Propagation (TPROP) style approaches can address these shortcomings. Unfortunately, extensive experim…

    Submitted 15 February, 2017; originally announced February 2017.

  28. arXiv:1612.04936  [pdf, other]

    cs.CL cs.AI

    Learning through Dialogue Interactions by Asking Questions

    Authors: Jiwei Li, Alexander H. Miller, Sumit Chopra, Marc'Aurelio Ranzato, Jason Weston

    Abstract: A good dialogue agent should have the ability to interact with users by both responding to questions and by asking questions, and importantly to learn from both types of interaction. In this work, we explore this direction by designing a simulator and a set of synthetic tasks in the movie domain that allow such interactions between a learner and a teacher. We investigate how a learner can benefit…

    Submitted 13 February, 2017; v1 submitted 15 December, 2016; originally announced December 2016.

  29. arXiv:1611.09823  [pdf, other]

    cs.AI cs.CL

    Dialogue Learning With Human-In-The-Loop

    Authors: Jiwei Li, Alexander H. Miller, Sumit Chopra, Marc'Aurelio Ranzato, Jason Weston

    Abstract: An important aspect of developing conversational agents is to give a bot the ability to improve through communicating with humans and to learn from the mistakes that it makes. Most research has focused on learning from fixed training sets of labeled data rather than interacting with a dialogue partner in an online fashion. In this paper we explore this direction in a reinforcement learning setting…

    Submitted 13 January, 2017; v1 submitted 29 November, 2016; originally announced November 2016.

  30. arXiv:1604.08667  [pdf, other]

    cs.RO

    A Bio-Inspired Tensegrity Manipulator with Multi-DOF, Structurally Compliant Joints

    Authors: Steven Lessard, Dennis Castro, William Asper, Shaurya Deep Chopra, Leya Breanna Baltaxe-Admony, Mircea Teodorescu, Vytas SunSpiral, Adrian Agogino

    Abstract: Most traditional robotic mechanisms feature inelastic joints that are unable to robustly handle large deformations and off-axis moments. As a result, the applied loads are transferred rigidly throughout the entire structure. The disadvantage of this approach is that the exerted leverage is magnified at each subsequent joint possibly damaging the mechanism. In this paper, we present two lightweight…

    Submitted 1 September, 2016; v1 submitted 28 April, 2016; originally announced April 2016.

    Comments: IROS 2016

  31. arXiv:1511.06931  [pdf, ps, other]

    cs.CL cs.LG

    Evaluating Prerequisite Qualities for Learning End-to-End Dialog Systems

    Authors: Jesse Dodge, Andreea Gane, Xiang Zhang, Antoine Bordes, Sumit Chopra, Alexander Miller, Arthur Szlam, Jason Weston

    Abstract: A long-term goal of machine learning is to build intelligent conversational agents. One recent popular approach is to train end-to-end models on a large amount of real dialog transcripts between humans (Sordoni et al., 2015; Vinyals & Le, 2015; Shang et al., 2015). However, this approach leaves many questions unanswered as an understanding of the precise successes and shortcomings of each model is…

    Submitted 19 April, 2016; v1 submitted 21 November, 2015; originally announced November 2015.

  32. arXiv:1511.06732  [pdf, other]

    cs.LG cs.CL

    Sequence Level Training with Recurrent Neural Networks

    Authors: Marc'Aurelio Ranzato, Sumit Chopra, Michael Auli, Wojciech Zaremba

    Abstract: Many natural language processing applications use language models to generate text. These models are typically trained to predict the next word in a sequence, given the previous words and some context such as an image. However, at test time the model is expected to generate the entire sequence from scratch. This discrepancy makes generation brittle, as errors may accumulate along the way. We addre…

    Submitted 6 May, 2016; v1 submitted 20 November, 2015; originally announced November 2015.

  33. arXiv:1511.02301  [pdf, other]

    cs.CL

    The Goldilocks Principle: Reading Children's Books with Explicit Memory Representations

    Authors: Felix Hill, Antoine Bordes, Sumit Chopra, Jason Weston

    Abstract: We introduce a new test of how well language models capture meaning in children's books. Unlike standard language modelling benchmarks, it distinguishes the task of predicting syntactic function words from that of predicting lower-frequency words, which carry greater semantic content. We compare a range of state-of-the-art models, each with a different way of encoding what has been previously read…

    Submitted 1 April, 2016; v1 submitted 6 November, 2015; originally announced November 2015.

  34. arXiv:1509.00685  [pdf, other]

    cs.CL cs.AI

    A Neural Attention Model for Abstractive Sentence Summarization

    Authors: Alexander M. Rush, Sumit Chopra, Jason Weston

    Abstract: Summarization based on text extraction is inherently limited, but generation-style abstractive methods have proven challenging to build. In this work, we propose a fully data-driven approach to abstractive sentence summarization. Our method utilizes a local attention-based model that generates each word of the summary conditioned on the input sentence. While the model is structurally simple, it ca…

    Submitted 3 September, 2015; v1 submitted 2 September, 2015; originally announced September 2015.

    Comments: Proceedings of EMNLP 2015

  35. arXiv:1506.02075  [pdf, ps, other]

    cs.LG cs.CL

    Large-scale Simple Question Answering with Memory Networks

    Authors: Antoine Bordes, Nicolas Usunier, Sumit Chopra, Jason Weston

    Abstract: Training large-scale question answering systems is complicated because training sources usually cover a small portion of the range of possible questions. This paper studies the impact of multitask and transfer learning for simple question answering; a setting for which the reasoning required to answer is quite easy, as long as one can retrieve the correct evidence given a question, which can be di…

    Submitted 5 June, 2015; originally announced June 2015.

  36. arXiv:1502.05698  [pdf, ps, other]

    cs.AI cs.CL stat.ML

    Towards AI-Complete Question Answering: A Set of Prerequisite Toy Tasks

    Authors: Jason Weston, Antoine Bordes, Sumit Chopra, Alexander M. Rush, Bart van Merriënboer, Armand Joulin, Tomas Mikolov

    Abstract: One long-term goal of machine learning research is to produce methods that are applicable to reasoning and natural language, in particular building an intelligent dialogue agent. To measure progress towards that goal, we argue for the usefulness of a set of proxy tasks that evaluate reading comprehension via question answering. Our tasks measure understanding in several ways: whether a system is a…

    Submitted 31 December, 2015; v1 submitted 19 February, 2015; originally announced February 2015.

  37. arXiv:1412.7753  [pdf, other]

    cs.NE cs.LG

    Learning Longer Memory in Recurrent Neural Networks

    Authors: Tomas Mikolov, Armand Joulin, Sumit Chopra, Michael Mathieu, Marc'Aurelio Ranzato

    Abstract: Recurrent neural network is a powerful model that learns temporal patterns in sequential data. For a long time, it was believed that recurrent networks are difficult to train using simple optimizers, such as stochastic gradient descent, due to the so-called vanishing gradient problem. In this paper, we show that learning longer term patterns in real data, such as in natural language, is perfectly…

    Submitted 16 April, 2015; v1 submitted 24 December, 2014; originally announced December 2014.

  38. arXiv:1412.6604  [pdf, ps, other]

    cs.LG cs.CV

    Video (language) modeling: a baseline for generative models of natural videos

    Authors: MarcAurelio Ranzato, Arthur Szlam, Joan Bruna, Michael Mathieu, Ronan Collobert, Sumit Chopra

    Abstract: We propose a strong baseline model for unsupervised feature learning using video data. By learning to predict missing frames or extrapolate future frames from an input video sequence, the model discovers both spatial and temporal correlations which are useful to represent complex deformations and motion patterns. The models we propose are largely borrowed from the language modeling literature, and…

    Submitted 4 May, 2016; v1 submitted 20 December, 2014; originally announced December 2014.

  39. arXiv:1410.3916  [pdf, ps, other]

    cs.AI cs.CL stat.ML

    Memory Networks

    Authors: Jason Weston, Sumit Chopra, Antoine Bordes

    Abstract: We describe a new class of learning models called memory networks. Memory networks reason with inference components combined with a long-term memory component; they learn how to use these jointly. The long-term memory can be read and written to, with the goal of using it for prediction. We investigate these models in the context of question answering (QA) where the long-term memory effectively act…

    Submitted 29 November, 2015; v1 submitted 14 October, 2014; originally announced October 2014.

  40. arXiv:1406.3676  [pdf, other]

    cs.CL

    Question Answering with Subgraph Embeddings

    Authors: Antoine Bordes, Sumit Chopra, Jason Weston

    Abstract: This paper presents a system which learns to answer questions on a broad range of topics from a knowledge base using few hand-crafted features. Our model learns low-dimensional embeddings of words and knowledge base constituents; these representations are used to score natural language questions against candidate answers. Training our system using pairs of questions and structured representations…

    Submitted 3 September, 2014; v1 submitted 13 June, 2014; originally announced June 2014.

  41. arXiv:cs/0207038  [pdf, ps, other]

    cs.AI cs.LO

    Iterated revision and the axiom of recovery: a unified treatment via epistemic states

    Authors: Samir Chopra, Aditya Ghose, Thomas Meyer

    Abstract: The axiom of recovery, while capturing a central intuition regarding belief change, has been the source of much controversy. We argue briefly against putative counterexamples to the axiom--while agreeing that some of their insight deserves to be preserved--and present additional recovery-like axioms in a framework that uses epistemic states, which encode preferences, as the object of revisions.…

    Submitted 9 July, 2002; originally announced July 2002.

    ACM Class: I.2.3

  42. arXiv:cs/0207037  [pdf, ps, other]

    cs.AI cs.LO

    Some logics of belief and disbelief

    Authors: Samir Chopra, Johannes Heidema, Thomas Meyer

    Abstract: The introduction of explicit notions of rejection, or disbelief, into logics for knowledge representation can be justified in a number of ways. Motivations range from the need for versions of negation weaker than classical negation, to the explicit recording of classic belief contraction operations in the area of belief change, and the additional levels of expressivity obtained from an extended…

    Submitted 9 July, 2002; originally announced July 2002.

    ACM Class: I.2.3

  43. arXiv:cs/0003021  [pdf, ps, other]

    cs.AI

    Relevance Sensitive Non-Monotonic Inference on Belief Sequences

    Authors: Samir Chopra, Konstantinos Georgatos, Rohit Parikh

    Abstract: We present a method for relevance sensitive non-monotonic inference from belief sequences which incorporates insights pertaining to prioritized inference and relevance sensitive, inconsistency tolerant belief revision. Our model uses a finite, logically open sequence of propositional formulas as a representation for beliefs and defines a notion of inference from maxiconsistent subsets of formul…

    Submitted 7 March, 2000; originally announced March 2000.

    ACM Class: I.2.3