
Showing 51–87 of 87 results for author: Belinkov, Y

  1. arXiv:2006.08331  [pdf, other]

    cs.CL cs.AI cs.LG stat.ML

    Probing Neural Dialog Models for Conversational Understanding

    Authors: Abdelrhman Saleh, Tovly Deutsch, Stephen Casper, Yonatan Belinkov, Stuart Shieber

    Abstract: The predominant approach to open-domain dialog generation relies on end-to-end training of neural models on chat datasets. However, this approach provides little insight as to what these models learn (or do not learn) about engaging in dialog. In this study, we analyze the internal representations learned by neural open-domain dialog systems and evaluate the quality of these representations for le…

    Submitted 7 June, 2020; originally announced June 2020.

  2. arXiv:2005.01348  [pdf, other]

    cs.CL cs.LG

    The Sensitivity of Language Models and Humans to Winograd Schema Perturbations

    Authors: Mostafa Abdou, Vinit Ravishankar, Maria Barrett, Yonatan Belinkov, Desmond Elliott, Anders Søgaard

    Abstract: Large-scale pretrained language models are the major driving force behind recent improvements in performance on the Winograd Schema Challenge, a widely employed test of common sense reasoning ability. We show, however, with a new diagnostic dataset, that these models are sensitive to linguistic perturbations of the Winograd examples that minimally affect human understanding. Our results highlight…

    Submitted 7 May, 2020; v1 submitted 4 May, 2020; originally announced May 2020.

    Comments: ACL 2020

  3. arXiv:2005.01172  [pdf, other]

    cs.CL

    Similarity Analysis of Contextual Word Representation Models

    Authors: John M. Wu, Yonatan Belinkov, Hassan Sajjad, Nadir Durrani, Fahim Dalvi, James Glass

    Abstract: This paper investigates contextual word representation models from the lens of similarity analysis. Given a collection of trained models, we measure the similarity of their internal representations and attention. Critically, these models come from vastly different architectures. We use existing and novel similarity measures that aim to gauge the level of localization of information in the deep mod…

    Submitted 3 May, 2020; originally announced May 2020.

    Comments: Accepted to ACL 2020

    MSC Class: 68T50 ACM Class: I.2.7

  4. arXiv:2005.00719  [pdf, other]

    cs.CL

    Probing the Probing Paradigm: Does Probing Accuracy Entail Task Relevance?

    Authors: Abhilasha Ravichander, Yonatan Belinkov, Eduard Hovy

    Abstract: Although neural models have achieved impressive results on several NLP benchmarks, little is understood about the mechanisms they use to perform language tasks. Thus, much recent attention has been devoted to analyzing the sentence representations learned by neural encoders, through the lens of 'probing' tasks. However, to what extent was the information encoded in sentence representations, as dis…

    Submitted 7 March, 2021; v1 submitted 2 May, 2020; originally announced May 2020.

    Comments: EACL 2021

  5. arXiv:2004.12265  [pdf, other]

    cs.CL

    Causal Mediation Analysis for Interpreting Neural NLP: The Case of Gender Bias

    Authors: Jesse Vig, Sebastian Gehrmann, Yonatan Belinkov, Sharon Qian, Daniel Nevo, Simas Sakenis, Jason Huang, Yaron Singer, Stuart Shieber

    Abstract: Common methods for interpreting neural models in natural language processing typically examine either their structure or their behavior, but not both. We propose a methodology grounded in the theory of causal mediation analysis for interpreting which parts of a model are causally implicated in its behavior. It enables us to analyze the mechanisms by which information flows from input to output thr…

    Submitted 22 November, 2020; v1 submitted 25 April, 2020; originally announced April 2020.

    Comments: Expanded version

    MSC Class: 68T50 ACM Class: I.2.7

  6. arXiv:2004.04010  [pdf, other]

    cs.CL cs.LG

    Analyzing Redundancy in Pretrained Transformer Models

    Authors: Fahim Dalvi, Hassan Sajjad, Nadir Durrani, Yonatan Belinkov

    Abstract: Transformer-based deep NLP models are trained using hundreds of millions of parameters, limiting their applicability in computationally constrained environments. In this paper, we study the cause of these limitations by defining a notion of Redundancy, which we categorize into two classes: General Redundancy and Task-specific Redundancy. We dissect two popular pretrained models, BERT and XLNet, st…

    Submitted 6 October, 2020; v1 submitted 8 April, 2020; originally announced April 2020.

    Comments: 19 pages, 14 figures, EMNLP 2020

  7. arXiv:1911.03329  [pdf, other]

    cs.CL cs.LG cs.NE

    Memory-Augmented Recurrent Neural Networks Can Learn Generalized Dyck Languages

    Authors: Mirac Suzgun, Sebastian Gehrmann, Yonatan Belinkov, Stuart M. Shieber

    Abstract: We introduce three memory-augmented Recurrent Neural Networks (MARNNs) and explore their capabilities on a series of simple language modeling tasks whose solutions require stack-based mechanisms. We provide the first demonstration of neural networks recognizing the generalized Dyck languages, which express the core of what it means to be a language with hierarchical structure. Our memory-augmented…

    Submitted 8 November, 2019; originally announced November 2019.

  8. arXiv:1911.00317  [pdf, other]

    cs.CL

    On the Linguistic Representational Power of Neural Machine Translation Models

    Authors: Yonatan Belinkov, Nadir Durrani, Fahim Dalvi, Hassan Sajjad, James Glass

    Abstract: Despite the recent success of deep neural networks in natural language processing (NLP), their interpretability remains a challenge. We analyze the representations learned by neural machine translation models at various levels of granularity and evaluate their quality through relevant extrinsic properties. In particular, we seek answers to the following questions: (i) How accurately is word-struct…

    Submitted 1 November, 2019; originally announced November 2019.

    Comments: Accepted to appear in the Journal of Computational Linguistics

  9. arXiv:1909.12673  [pdf, other]

    cs.LG cs.CL cs.CV stat.ML

    A Constructive Prediction of the Generalization Error Across Scales

    Authors: Jonathan S. Rosenfeld, Amir Rosenfeld, Yonatan Belinkov, Nir Shavit

    Abstract: The dependency of the generalization error of neural networks on model and dataset size is of critical importance both in practice and for understanding the theory of neural networks. Nevertheless, the functional form of this dependency remains elusive. In this work, we present a functional form which approximates well the generalization error in practice. Capitalizing on the successful concept of…

    Submitted 20 December, 2019; v1 submitted 27 September, 2019; originally announced September 2019.

    Comments: ICLR 2020

  10. arXiv:1909.06321  [pdf, other]

    cs.CL

    End-to-End Bias Mitigation by Modelling Biases in Corpora

    Authors: Rabeeh Karimi Mahabadi, Yonatan Belinkov, James Henderson

    Abstract: Several recent studies have shown that strong natural language understanding (NLU) models are prone to relying on unwanted dataset biases without learning the underlying task, resulting in models that fail to generalize to out-of-domain datasets and are likely to perform poorly in real-world scenarios. We propose two learning strategies to train neural models, which are more robust to such biases…

    Submitted 23 April, 2020; v1 submitted 13 September, 2019; originally announced September 2019.

    Comments: Accepted in ACL 2020 as a long paper

  11. arXiv:1907.04389  [pdf, other]

    cs.CL

    On Adversarial Removal of Hypothesis-only Bias in Natural Language Inference

    Authors: Yonatan Belinkov, Adam Poliak, Stuart M. Shieber, Benjamin Van Durme, Alexander M. Rush

    Abstract: Popular Natural Language Inference (NLI) datasets have been shown to be tainted by hypothesis-only biases. Adversarial learning may help models ignore sensitive biases and spurious correlations in data. We evaluate whether adversarial learning can be used in NLI to encourage models to learn representations free of hypothesis-only biases. Our analyses indicate that the representations learned via a…

    Submitted 9 July, 2019; originally announced July 2019.

    Comments: StarSem 2019 - The Eighth Joint Conference on Lexical and Computational Semantics

  12. arXiv:1907.04380  [pdf, other]

    cs.CL

    Don't Take the Premise for Granted: Mitigating Artifacts in Natural Language Inference

    Authors: Yonatan Belinkov, Adam Poliak, Stuart M. Shieber, Benjamin Van Durme, Alexander M. Rush

    Abstract: Natural Language Inference (NLI) datasets often contain hypothesis-only biases: artifacts that allow models to achieve non-trivial performance without learning whether a premise entails a hypothesis. We propose two probabilistic methods to build models that are more robust to such biases and better transfer across datasets. In contrast to standard approaches to NLI, our methods predict the probab…

    Submitted 9 July, 2019; originally announced July 2019.

    Comments: ACL 2019

  13. arXiv:1907.04224  [pdf, other]

    cs.CL cs.SD eess.AS

    Analyzing Phonetic and Graphemic Representations in End-to-End Automatic Speech Recognition

    Authors: Yonatan Belinkov, Ahmed Ali, James Glass

    Abstract: End-to-end neural network systems for automatic speech recognition (ASR) are trained from acoustic features to text transcriptions. In contrast to modular ASR systems, which contain separately-trained components for acoustic modeling, pronunciation lexicon, and language modeling, the end-to-end paradigm is both conceptually simpler and has the potential benefit of training the entire system on the…

    Submitted 19 April, 2020; v1 submitted 9 July, 2019; originally announced July 2019.

    Comments: Corrected dataset statistics

    ACM Class: I.2.7

  14. arXiv:1906.11943  [pdf, other]

    cs.CL

    Findings of the First Shared Task on Machine Translation Robustness

    Authors: Xian Li, Paul Michel, Antonios Anastasopoulos, Yonatan Belinkov, Nadir Durrani, Orhan Firat, Philipp Koehn, Graham Neubig, Juan Pino, Hassan Sajjad

    Abstract: We share the findings of the first shared task on improving robustness of Machine Translation (MT). The task provides a testbed representing challenges facing MT models deployed in the real world, and facilitates new approaches to improve models' robustness to noisy input and domain mismatch. We focus on two language pairs (English-French and English-Japanese), and the submitted systems are evalua…

    Submitted 3 July, 2019; v1 submitted 27 June, 2019; originally announced June 2019.

  15. arXiv:1906.08430  [pdf, other]

    cs.LG cs.CL cs.CV stat.ML

    Adversarial Regularization for Visual Question Answering: Strengths, Shortcomings, and Side Effects

    Authors: Gabriel Grand, Yonatan Belinkov

    Abstract: Visual question answering (VQA) models have been shown to over-rely on linguistic biases in VQA datasets, answering questions "blindly" without considering visual context. Adversarial regularization (AdvReg) aims to address this issue via an adversary sub-network that encourages the main model to learn a bias-free representation of the question. In this work, we investigate the strengths and short…

    Submitted 19 June, 2019; originally announced June 2019.

    Comments: In Proceedings of the 2nd Workshop on Shortcomings in Vision and Language (SiVL) at NAACL-HLT 2019

  16. arXiv:1906.04284  [pdf, other]

    cs.CL cs.LG stat.ML

    Analyzing the Structure of Attention in a Transformer Language Model

    Authors: Jesse Vig, Yonatan Belinkov

    Abstract: The Transformer is a fully attention-based alternative to recurrent networks that has achieved state-of-the-art results across a range of NLP tasks. In this paper, we analyze the structure of attention in a Transformer language model, the GPT-2 small pretrained model. We visualize attention for individual instances and analyze the interaction between attention and syntax over a large corpus. We fi…

    Submitted 18 June, 2019; v1 submitted 7 June, 2019; originally announced June 2019.

    Comments: To appear in ACL BlackboxNLP workshop

  17. arXiv:1906.03648  [pdf, other]

    cs.CL cs.FL cs.LG

    LSTM Networks Can Perform Dynamic Counting

    Authors: Mirac Suzgun, Sebastian Gehrmann, Yonatan Belinkov, Stuart M. Shieber

    Abstract: In this paper, we systematically assess the ability of standard recurrent networks to perform dynamic counting and to encode hierarchical representations. All the neural models in our experiments are designed to be small-sized networks both to prevent them from memorizing the training sets and to visualize and interpret their behaviour at test time. Our results demonstrate that the Long Short-Term…

    Submitted 9 June, 2019; originally announced June 2019.

    Comments: ACL 2019 Workshop on Deep Learning and Formal Languages

    ACM Class: F.4.3; I.2.6; I.2.7

  18. arXiv:1906.01702  [pdf, other]

    cs.CL cs.LG

    Improving Neural Language Models by Segmenting, Attending, and Predicting the Future

    Authors: Hongyin Luo, Lan Jiang, Yonatan Belinkov, James Glass

    Abstract: Common language models typically predict the next word given the context. In this work, we propose a method that improves language modeling by learning to align the given context and the following phrase. The model does not require any linguistic annotation of phrase segmentation. Instead, we define syntactic heights and phrase segmentation rules, enabling the model to automatically induce phrases…

    Submitted 4 June, 2019; originally announced June 2019.

    Comments: Accepted by ACL 2019

  19. arXiv:1903.08855  [pdf, other]

    cs.CL

    Linguistic Knowledge and Transferability of Contextual Representations

    Authors: Nelson F. Liu, Matt Gardner, Yonatan Belinkov, Matthew E. Peters, Noah A. Smith

    Abstract: Contextual word representations derived from large-scale neural language models are successful across a diverse set of NLP tasks, suggesting that they encode useful and transferable features of language. To shed light on the linguistic knowledge they capture, we study the representations produced by several recent pretrained contextualizers (variants of ELMo, the OpenAI transformer language model,…

    Submitted 25 April, 2019; v1 submitted 21 March, 2019; originally announced March 2019.

    Comments: 22 pages, 4 figures; to appear at NAACL 2019

  20. arXiv:1902.00595  [pdf, other]

    cs.CL

    Character-based Surprisal as a Model of Reading Difficulty in the Presence of Error

    Authors: Michael Hahn, Frank Keller, Yonatan Bisk, Yonatan Belinkov

    Abstract: Intuitively, human readers cope easily with errors in text; typos, misspelling, word substitutions, etc. do not unduly disrupt natural reading. Previous work indicates that letter transpositions result in increased reading times, but it is unclear if this effect generalizes to more natural errors. In this paper, we report an eye-tracking study that compares two error types (letter transpositions a…

    Submitted 19 May, 2019; v1 submitted 1 February, 2019; originally announced February 2019.

    Comments: Published in Proceedings of CogSci 2019

  21. arXiv:1812.09359  [pdf, other]

    cs.CL

    NeuroX: A Toolkit for Analyzing Individual Neurons in Neural Networks

    Authors: Fahim Dalvi, Avery Nortonsmith, D. Anthony Bau, Yonatan Belinkov, Hassan Sajjad, Nadir Durrani, James Glass

    Abstract: We present a toolkit to facilitate the interpretation and understanding of neural network models. The toolkit provides several methods to identify salient neurons with respect to the model itself or an external task. A user can visualize selected neurons, ablate them to measure their effect on the model accuracy, and manipulate them to control the behavior of the model at the test time. Such an an…

    Submitted 21 December, 2018; originally announced December 2018.

    Comments: AAAI Conference on Artificial Intelligence (AAAI 2019), Demonstration track, 2 pages

  22. arXiv:1812.09355  [pdf, other]

    cs.CL

    What Is One Grain of Sand in the Desert? Analyzing Individual Neurons in Deep NLP Models

    Authors: Fahim Dalvi, Nadir Durrani, Hassan Sajjad, Yonatan Belinkov, Anthony Bau, James Glass

    Abstract: Despite the remarkable evolution of deep neural networks in natural language processing (NLP), their interpretability remains a challenge. Previous work largely focused on what these models learn at the representation level. We break this analysis down further and study individual dimensions (neurons) in the vector representation learned by end-to-end neural models in NLP tasks. We propose two met…

    Submitted 21 December, 2018; originally announced December 2018.

    Comments: AAAI Conference on Artificial Intelligence (AAAI 2019), 10 pages

  23. arXiv:1812.08951  [pdf, other]

    cs.CL cs.LG cs.NE

    Analysis Methods in Neural Language Processing: A Survey

    Authors: Yonatan Belinkov, James Glass

    Abstract: The field of natural language processing has seen impressive progress in recent years, with neural network models replacing many of the traditional systems. A plethora of new models have been proposed, many of which are thought to be opaque compared to their feature-rich counterparts. This has led researchers to analyze, interpret, and evaluate neural networks in novel and more fine-grained ways.…

    Submitted 14 January, 2019; v1 submitted 21 December, 2018; originally announced December 2018.

    Comments: Version including the supplementary materials (3 tables), also available at https://boknilev.github.io/nlp-analysis-methods

    MSC Class: 68T50 ACM Class: I.2.7

  24. arXiv:1811.01157  [pdf, other]

    cs.CL

    Identifying and Controlling Important Neurons in Neural Machine Translation

    Authors: Anthony Bau, Yonatan Belinkov, Hassan Sajjad, Nadir Durrani, Fahim Dalvi, James Glass

    Abstract: Neural machine translation (NMT) models learn representations containing substantial linguistic information. However, it is not clear if such information is fully distributed or if some of it can be attributed to individual neurons. We develop unsupervised methods for discovering important neurons in NMT models. Our methods rely on the intuition that different models learn similar properties, and…

    Submitted 3 November, 2018; originally announced November 2018.

    ACM Class: I.2.7

  25. arXiv:1811.01001  [pdf, other]

    cs.CL cs.AI cs.LG

    On Evaluating the Generalization of LSTM Models in Formal Languages

    Authors: Mirac Suzgun, Yonatan Belinkov, Stuart M. Shieber

    Abstract: Recurrent Neural Networks (RNNs) are theoretically Turing-complete and established themselves as a dominant model for language processing. Yet, there still remains an uncertainty regarding their language learning capabilities. In this paper, we empirically evaluate the inductive learning capabilities of Long Short-Term Memory networks, a popular extension of simple RNNs, to learn simple formal lan…

    Submitted 2 November, 2018; originally announced November 2018.

    Comments: Proceedings of the Society for Computation in Linguistics (SCiL) 2019

    ACM Class: I.2.7; I.2.6; F.4.3

  26. arXiv:1809.03891  [pdf, other]

    cs.CL

    Studying the History of the Arabic Language: Language Technology and a Large-Scale Historical Corpus

    Authors: Yonatan Belinkov, Alexander Magidow, Alberto Barrón-Cedeño, Avi Shmidman, Maxim Romanov

    Abstract: Arabic is a widely-spoken language with a long and rich history, but existing corpora and language technology focus mostly on modern Arabic and its varieties. Therefore, studying the history of the language has so far been mostly limited to manual analyses on a small scale. In this work, we present a large-scale historical corpus of the written Arabic language, spanning 1400 years. We describe our…

    Submitted 11 September, 2018; originally announced September 2018.

    ACM Class: I.2.7

  27. arXiv:1804.09779  [pdf, ps, other]

    cs.CL

    On the Evaluation of Semantic Phenomena in Neural Machine Translation Using Natural Language Inference

    Authors: Adam Poliak, Yonatan Belinkov, James Glass, Benjamin Van Durme

    Abstract: We propose a process for investigating the extent to which sentence representations arising from neural machine translation (NMT) systems encode distinct semantic phenomena. We use these representations as features to train a natural language inference (NLI) classifier based on datasets recast from existing semantic annotations. In applying this process to a representative NMT system, we find its…

    Submitted 6 May, 2018; v1 submitted 25 April, 2018; originally announced April 2018.

    Comments: To be presented at NAACL 2018; 11 pages

  28. arXiv:1801.07772  [pdf, other]

    cs.CL

    Evaluating Layers of Representation in Neural Machine Translation on Part-of-Speech and Semantic Tagging Tasks

    Authors: Yonatan Belinkov, Lluís Màrquez, Hassan Sajjad, Nadir Durrani, Fahim Dalvi, James Glass

    Abstract: While neural machine translation (NMT) models provide improved translation quality in an elegant, end-to-end framework, it is less clear what they learn about language. Recent work has started evaluating the quality of vector representations learned by NMT models on morphological and syntactic tasks. In this paper, we investigate the representations learned at different layers of NMT encoders. We…

    Submitted 23 January, 2018; originally announced January 2018.

    Comments: IJCNLP 2017

    ACM Class: I.2.7

    Journal ref: IJCNLP 8 (2017), volume 1, 1-10

  29. arXiv:1711.02173  [pdf, other]

    cs.CL cs.LG

    Synthetic and Natural Noise Both Break Neural Machine Translation

    Authors: Yonatan Belinkov, Yonatan Bisk

    Abstract: Character-based neural machine translation (NMT) models alleviate out-of-vocabulary issues, learn morphology, and move us closer to completely end-to-end translation systems. Unfortunately, they are also very brittle and easily falter when presented with noisy data. In this paper, we confront NMT models with synthetic and natural sources of noise. We find that state-of-the-art models fail to trans…

    Submitted 24 February, 2018; v1 submitted 6 November, 2017; originally announced November 2017.

    Comments: ICLR 2018 camera-ready

    ACM Class: I.2.7

  30. arXiv:1709.04482  [pdf, other]

    cs.CL cs.NE cs.SD

    Analyzing Hidden Representations in End-to-End Automatic Speech Recognition Systems

    Authors: Yonatan Belinkov, James Glass

    Abstract: Neural models have become ubiquitous in automatic speech recognition systems. While neural networks are typically used as acoustic models in more complex systems, recent studies have explored end-to-end speech recognition systems based on neural networks, which can be trained to directly predict text from input acoustic features. Although such systems are conceptually elegant and simpler than trad…

    Submitted 13 September, 2017; originally announced September 2017.

    Comments: NIPS 2017

    ACM Class: I.2.7

  31. arXiv:1709.00616  [pdf, other]

    cs.CL

    Challenging Language-Dependent Segmentation for Arabic: An Application to Machine Translation and Part-of-Speech Tagging

    Authors: Hassan Sajjad, Fahim Dalvi, Nadir Durrani, Ahmed Abdelali, Yonatan Belinkov, Stephan Vogel

    Abstract: Word segmentation plays a pivotal role in improving any Arabic NLP application. Therefore, a lot of research has been spent in improving its accuracy. Off-the-shelf tools, however, are: i) complicated to use and ii) domain/dialect dependent. We explore three language-independent alternatives to morphological segmentation using: i) data-driven sub-word units, ii) characters as a unit of learning, a…

    Submitted 2 September, 2017; originally announced September 2017.

    Comments: ACL 2017, 7 pages

  32. arXiv:1708.08712  [pdf, other]

    cs.CL

    Neural Machine Translation Training in a Multi-Domain Scenario

    Authors: Hassan Sajjad, Nadir Durrani, Fahim Dalvi, Yonatan Belinkov, Stephan Vogel

    Abstract: In this paper, we explore alternative ways to train a neural machine translation system in a multi-domain scenario. We investigate data concatenation (with fine tuning), model stacking (multi-level fine tuning), data selection and multi-model ensemble. Our findings show that the best translation quality can be achieved by building an initial system on a concatenation of available out-of-domain dat…

    Submitted 20 November, 2018; v1 submitted 29 August, 2017; originally announced August 2017.

    Comments: 8 pages

    Journal ref: Proceedings of the 14th International Workshop on Spoken Language Translation (IWSLT), 2017

  33. What do Neural Machine Translation Models Learn about Morphology?

    Authors: Yonatan Belinkov, Nadir Durrani, Fahim Dalvi, Hassan Sajjad, James Glass

    Abstract: Neural machine translation (MT) models obtain state-of-the-art performance while maintaining a simple, end-to-end architecture. However, little is known about what these models learn about source and target languages during the training process. In this work, we analyze the representations learned by neural MT models at various levels of granularity and empirically evaluate the quality of the repr…

    Submitted 22 October, 2018; v1 submitted 11 April, 2017; originally announced April 2017.

    Comments: Updated decoder experiments

    ACM Class: I.2.7

    Journal ref: ACL 55 (2017), volume 1, 861-872

  34. arXiv:1612.08989  [pdf, other]

    cs.CL

    Shamela: A Large-Scale Historical Arabic Corpus

    Authors: Yonatan Belinkov, Alexander Magidow, Maxim Romanov, Avi Shmidman, Moshe Koppel

    Abstract: Arabic is a widely-spoken language with a rich and long history spanning more than fourteen centuries. Yet existing Arabic corpora largely focus on the modern period or lack sufficient diachronic information. We develop a large-scale, historical corpus of Arabic of about 1 billion words from diverse periods of time. We clean this corpus, process it with a morphological analyzer, and enhance it by…

    Submitted 28 December, 2016; originally announced December 2016.

    Comments: Slightly expanded version of Coling LT4DH workshop paper

    ACM Class: I.2.7

  35. arXiv:1609.07701  [pdf, ps, other]

    cs.CL

    Large-Scale Machine Translation between Arabic and Hebrew: Available Corpora and Initial Results

    Authors: Yonatan Belinkov, James Glass

    Abstract: Machine translation between Arabic and Hebrew has so far been limited by a lack of parallel corpora, despite the political and cultural importance of this language pair. Previous work relied on manually-crafted grammars or pivoting via English, both of which are unsatisfactory for building a scalable and accurate MT system. In this work, we compare standard phrase-based and neural systems on Arabi…

    Submitted 25 September, 2016; originally announced September 2016.

    Comments: SeMaT 2016

    ACM Class: I.2.7

  36. arXiv:1609.07568  [pdf, other]

    cs.CL

    A Character-level Convolutional Neural Network for Distinguishing Similar Languages and Dialects

    Authors: Yonatan Belinkov, James Glass

    Abstract: Discriminating between closely-related language varieties is considered a challenging and important task. This paper describes our submission to the DSL 2016 shared-task, which included two sub-tasks: one on discriminating similar languages and one on identifying Arabic dialects. We developed a character-level neural network for this task. Given a sequence of characters, our model embeds each char…

    Submitted 24 September, 2016; originally announced September 2016.

    Comments: DSL 2016

    ACM Class: I.2.7

  37. arXiv:1608.04207  [pdf, other]

    cs.CL

    Fine-grained Analysis of Sentence Embeddings Using Auxiliary Prediction Tasks

    Authors: Yossi Adi, Einat Kermany, Yonatan Belinkov, Ofer Lavi, Yoav Goldberg

    Abstract: There is a lot of research interest in encoding variable length sentences into fixed length vectors, in a way that preserves the sentence meanings. Two common methods include representations based on averaging word vectors, and representations based on the hidden states of recurrent neural networks such as LSTMs. The sentence vectors are used as features for subsequent machine learning tasks or fo…

    Submitted 9 February, 2017; v1 submitted 15 August, 2016; originally announced August 2016.