Showing 1–50 of 77 results for author: Aharoni, R

  1. arXiv:2406.13632  [pdf, other]

    cs.CL

    Can Few-shot Work in Long-Context? Recycling the Context to Generate Demonstrations

    Authors: Arie Cattan, Alon Jacovi, Alex Fabrikant, Jonathan Herzig, Roee Aharoni, Hannah Rashkin, Dror Marcus, Avinatan Hassidim, Yossi Matias, Idan Szpektor, Avi Caciularu

    Abstract: Despite recent advancements in Large Language Models (LLMs), their performance on tasks involving long contexts remains sub-optimal. In-Context Learning (ICL) with few-shot examples may be an appealing solution to enhance LLM performance in this scenario; however, naively adding ICL examples with long context introduces challenges, including substantial token overhead added for each few-shot examp…

    Submitted 23 June, 2024; v1 submitted 19 June, 2024; originally announced June 2024.

  2. arXiv:2405.16908  [pdf, other]

    cs.CL

    Can Large Language Models Faithfully Express Their Intrinsic Uncertainty in Words?

    Authors: Gal Yona, Roee Aharoni, Mor Geva

    Abstract: We posit that large language models (LLMs) should be capable of expressing their intrinsic uncertainty in natural language. For example, if the LLM is equally likely to output two contradicting answers to the same question, then its generated response should reflect this uncertainty by hedging its answer (e.g., "I'm not sure, but I think..."). We formalize faithful response uncertainty based on th…

    Submitted 27 May, 2024; originally announced May 2024.

  3. arXiv:2405.05904  [pdf, other]

    cs.CL

    Does Fine-Tuning LLMs on New Knowledge Encourage Hallucinations?

    Authors: Zorik Gekhman, Gal Yona, Roee Aharoni, Matan Eyal, Amir Feder, Roi Reichart, Jonathan Herzig

    Abstract: When large language models are aligned via supervised fine-tuning, they may encounter new factual information that was not acquired through pre-training. It is often conjectured that this can teach the model the behavior of hallucinating factually incorrect responses, as the model is trained to generate facts that are not grounded in its pre-existing knowledge. In this work, we study the impact of…

    Submitted 13 May, 2024; v1 submitted 9 May, 2024; originally announced May 2024.

  4. arXiv:2402.09631  [pdf, other]

    cs.LG cs.CL cs.CY

    Representation Surgery: Theory and Practice of Affine Steering

    Authors: Shashwat Singh, Shauli Ravfogel, Jonathan Herzig, Roee Aharoni, Ryan Cotterell, Ponnurangam Kumaraguru

    Abstract: Language models often exhibit undesirable behavior, e.g., generating toxic or gender-biased text. In the case of neural language models, an encoding of the undesirable behavior is often present in the model's representations. Thus, one natural (and common) approach to prevent the model from exhibiting undesirable behavior is to steer the model's representations in a manner that reduces the probabi…

    Submitted 5 July, 2024; v1 submitted 14 February, 2024; originally announced February 2024.

    Comments: Accepted in ICML 2024

  5. arXiv:2402.00559  [pdf, other]

    cs.CL

    A Chain-of-Thought Is as Strong as Its Weakest Link: A Benchmark for Verifiers of Reasoning Chains

    Authors: Alon Jacovi, Yonatan Bitton, Bernd Bohnet, Jonathan Herzig, Or Honovich, Michael Tseng, Michael Collins, Roee Aharoni, Mor Geva

    Abstract: Prompting language models to provide step-by-step answers (e.g., "Chain-of-Thought") is the prominent approach for complex reasoning tasks, where more accurate reasoning chains typically improve downstream task performance. Recent literature discusses automatic methods to verify reasoning to evaluate and improve their correctness. However, no fine-grained step-level datasets are available to enabl…

    Submitted 21 May, 2024; v1 submitted 1 February, 2024; originally announced February 2024.

    Comments: Accepted to ACL 2024

  6. arXiv:2401.04695  [pdf, other]

    cs.CL

    Narrowing the Knowledge Evaluation Gap: Open-Domain Question Answering with Multi-Granularity Answers

    Authors: Gal Yona, Roee Aharoni, Mor Geva

    Abstract: Factual questions typically can be answered correctly at different levels of granularity. For example, both "August 4, 1961" and "1961" are correct answers to the question "When was Barack Obama born?". Standard question answering (QA) evaluation protocols, however, do not explicitly take this into account and compare a predicted answer against answers of a single granularity level. In this…

    Submitted 9 January, 2024; originally announced January 2024.

  7. arXiv:2401.01854  [pdf, other]

    cs.CL cs.AI cs.LG

    Multilingual Instruction Tuning With Just a Pinch of Multilinguality

    Authors: Uri Shaham, Jonathan Herzig, Roee Aharoni, Idan Szpektor, Reut Tsarfaty, Matan Eyal

    Abstract: As instruction-tuned large language models (LLMs) gain global adoption, their ability to follow instructions in multiple languages becomes increasingly crucial. In this work, we investigate how multilinguality during instruction tuning of a multilingual LLM affects instruction-following across languages from the pre-training corpus. We first show that many languages transfer some instruction-follo…

    Submitted 21 May, 2024; v1 submitted 3 January, 2024; originally announced January 2024.

    Comments: Findings of ACL 2024

  8. arXiv:2311.17670  [pdf, ps, other]

    math.CO cs.DM

    2-covers of wide Young diagrams

    Authors: Ron Aharoni, Eli Berger, He Guo, Daniel Kotlar

    Abstract: A Young diagram $Y$ is called wide if every sub-diagram $Z$ formed by a subset of the rows of $Y$ dominates $Z'$, the conjugate of $Z$. A Young diagram $Y$ is called Latin if its squares can be assigned numbers so that for each $i$, the $i$th row is filled injectively with the numbers $1, \ldots ,a_i$, where $a_i$ is the length of $i$th row of $Y$, and every column is also filled injectively. A co…

    Submitted 11 December, 2023; v1 submitted 29 November, 2023; originally announced November 2023.

    Comments: 17 pages; Added a few more questions and a reference

    MSC Class: 05A17; 05C65; 05C70; 05D15

  9. arXiv:2310.10062  [pdf, other]

    cs.CL cs.AI

    A Comprehensive Evaluation of Tool-Assisted Generation Strategies

    Authors: Alon Jacovi, Avi Caciularu, Jonathan Herzig, Roee Aharoni, Bernd Bohnet, Mor Geva

    Abstract: A growing area of research investigates augmenting language models with tools (e.g., search engines, calculators) to overcome their shortcomings (e.g., missing or incorrect knowledge, incorrect logical inferences). Various few-shot tool-usage strategies have been proposed. However, there is no systematic and fair comparison across different strategies, or between these strategies and strong baseli…

    Submitted 28 December, 2023; v1 submitted 16 October, 2023; originally announced October 2023.

    Comments: Accepted to EMNLP 2023 Findings

  10. arXiv:2309.03735  [pdf, ps, other]

    math.CO cs.DM

    Looms

    Authors: Ron Aharoni, Eli Berger, Joseph Briggs, He Guo

    Abstract: A pair $(A,B)$ of hypergraphs is called orthogonal if $|a \cap b|=1$ for every pair of edges $a \in A$ and $b \in B$. An orthogonal pair of hypergraphs is called a loom if each of its two members is the set of minimum covers of the other. Looms appear naturally in the context of a conjecture of Gyárfás and Lehel on the covering number of cross-intersecting hypergraphs. We study their properties an…

    Submitted 7 September, 2023; originally announced September 2023.

    Comments: 19 pages

    MSC Class: 05C65; 05C35; 05C72; 05C76; 05D15

  11. arXiv:2306.00186  [pdf, other]

    cs.CL

    Factually Consistent Summarization via Reinforcement Learning with Textual Entailment Feedback

    Authors: Paul Roit, Johan Ferret, Lior Shani, Roee Aharoni, Geoffrey Cideron, Robert Dadashi, Matthieu Geist, Sertan Girgin, Léonard Hussenot, Orgad Keller, Nikola Momchev, Sabela Ramos, Piotr Stanczyk, Nino Vieillard, Olivier Bachem, Gal Elidan, Avinatan Hassidim, Olivier Pietquin, Idan Szpektor

    Abstract: Despite the seeming success of contemporary grounded text generation systems, they often tend to generate factually inconsistent text with respect to their input. This phenomenon is emphasized in tasks like summarization, in which the generated summaries should be corroborated by their source article. In this work, we leverage recent progress on textual entailment models to directly address this p…

    Submitted 31 May, 2023; originally announced June 2023.

    Comments: ACL 2023

  12. arXiv:2305.14332  [pdf, other]

    cs.CL

    Evaluating and Modeling Attribution for Cross-Lingual Question Answering

    Authors: Benjamin Muller, John Wieting, Jonathan H. Clark, Tom Kwiatkowski, Sebastian Ruder, Livio Baldini Soares, Roee Aharoni, Jonathan Herzig, Xinyi Wang

    Abstract: Trustworthy answer content is abundant in many high-resource languages and is instantly accessible through question answering systems, yet this content can be hard to access for those that do not speak these languages. The leap forward in cross-lingual modeling quality offered by generative language models offers much promise, yet their raw generations often fall short in factuality. To improve tr…

    Submitted 15 November, 2023; v1 submitted 23 May, 2023; originally announced May 2023.

    Comments: Published as a long paper at EMNLP 2023

  13. arXiv:2305.13194  [pdf, other]

    cs.CL

    SEAHORSE: A Multilingual, Multifaceted Dataset for Summarization Evaluation

    Authors: Elizabeth Clark, Shruti Rijhwani, Sebastian Gehrmann, Joshua Maynez, Roee Aharoni, Vitaly Nikolaev, Thibault Sellam, Aditya Siddhant, Dipanjan Das, Ankur P. Parikh

    Abstract: Reliable automatic evaluation of summarization systems is challenging due to the multifaceted and subjective nature of the task. This is especially the case for languages other than English, where human evaluations are scarce. In this work, we introduce SEAHORSE, a dataset for multilingual, multifaceted summarization evaluation. SEAHORSE consists of 96K summaries with human ratings along 6 dimensi…

    Submitted 1 November, 2023; v1 submitted 22 May, 2023; originally announced May 2023.

  14. arXiv:2305.11171  [pdf, other]

    cs.CL

    TrueTeacher: Learning Factual Consistency Evaluation with Large Language Models

    Authors: Zorik Gekhman, Jonathan Herzig, Roee Aharoni, Chen Elkind, Idan Szpektor

    Abstract: Factual consistency evaluation is often conducted using Natural Language Inference (NLI) models, yet these models exhibit limited success in evaluating summaries. Previous work improved such models with synthetic training data. However, the data is typically based on perturbed human-written summaries, which often differ in their characteristics from real model-generated summaries and have limited…

    Submitted 18 October, 2023; v1 submitted 18 May, 2023; originally announced May 2023.

    Comments: Accepted as a long paper in EMNLP 2023

  15. arXiv:2305.10400  [pdf, other]

    cs.CL cs.CV

    What You See is What You Read? Improving Text-Image Alignment Evaluation

    Authors: Michal Yarom, Yonatan Bitton, Soravit Changpinyo, Roee Aharoni, Jonathan Herzig, Oran Lang, Eran Ofek, Idan Szpektor

    Abstract: Automatically determining whether a text and a corresponding image are semantically aligned is a significant challenge for vision-language models, with applications in generative text-to-image and image-to-text tasks. In this work, we study methods for automatic text-image alignment evaluation. We first introduce SeeTRUE: a comprehensive evaluation set, spanning multiple datasets from both text-to…

    Submitted 26 December, 2023; v1 submitted 17 May, 2023; originally announced May 2023.

    Comments: Accepted to NeurIPS 2023. Website: https://wysiwyr-itm.github.io/

  16. arXiv:2305.07378  [pdf, other]

    cs.CL cs.CY cs.LG

    Surfacing Biases in Large Language Models using Contrastive Input Decoding

    Authors: Gal Yona, Or Honovich, Itay Laish, Roee Aharoni

    Abstract: Ensuring that large language models (LMs) are fair, robust and useful requires an understanding of how different modifications to their inputs impact the model's behaviour. In the context of open-text generation tasks, however, such an evaluation is not trivial. For example, when introducing a model with an input text and a perturbed, "contrastive" version of it, meaningful differences in the next…

    Submitted 12 May, 2023; originally announced May 2023.

  17. arXiv:2304.14318  [pdf, other]

    cs.CL

    q2d: Turning Questions into Dialogs to Teach Models How to Search

    Authors: Yonatan Bitton, Shlomi Cohen-Ganor, Ido Hakimi, Yoad Lewenberg, Roee Aharoni, Enav Weinreb

    Abstract: One of the exciting capabilities of recent language models for dialog is their ability to independently search for relevant information to ground a given dialog response. However, obtaining training data to teach models how to issue search queries is time and resource consuming. In this work, we propose q2d: an automatic data generation pipeline that generates information-seeking dialogs from ques…

    Submitted 26 December, 2023; v1 submitted 27 April, 2023; originally announced April 2023.

    Comments: Accepted to EMNLP 2023. Website: https://question2dialog.github.io/

  18. arXiv:2301.10312  [pdf, ps, other]

    math.CO cs.DM

    Tight infinite matrices

    Authors: Ron Aharoni, He Guo

    Abstract: We give a simple proof of a recent result of Gollin and Joó: if a possibly infinite system of homogeneous linear equations $A\vec{x} = \vec{0}$, where $A = (a_{i, j})$ is an $I \times J$ matrix, has only the trivial solution, then there exists an injection $φ: J \to I$, such that $a_{φ(j), j} \neq 0$ for all $j \in J$.

    Submitted 24 January, 2023; originally announced January 2023.

    Comments: 7 pages

    MSC Class: 15A06; 05C50; 05C63

  19. arXiv:2212.10622  [pdf, other]

    cs.CL

    mFACE: Multilingual Summarization with Factual Consistency Evaluation

    Authors: Roee Aharoni, Shashi Narayan, Joshua Maynez, Jonathan Herzig, Elizabeth Clark, Mirella Lapata

    Abstract: Abstractive summarization has enjoyed renewed interest in recent years, thanks to pre-trained language models and the availability of large-scale datasets. Despite promising results, current models still suffer from generating factually inconsistent summaries, reducing their utility for real-world application. Several recent efforts attempt to address this by devising models that automatically det…

    Submitted 5 January, 2024; v1 submitted 20 December, 2022; originally announced December 2022.

    Comments: 28 pages with links to released data

  20. arXiv:2212.09682  [pdf, other]

    cs.CL

    Multilingual Sequence-to-Sequence Models for Hebrew NLP

    Authors: Matan Eyal, Hila Noga, Roee Aharoni, Idan Szpektor, Reut Tsarfaty

    Abstract: Recent work attributes progress in NLP to large language models (LMs) with increased model size and large quantities of pretraining data. Despite this, current state-of-the-art LMs for Hebrew are both under-parameterized and under-trained compared to LMs in other languages. Additionally, previous work on pretrained Hebrew LMs focused on encoder-only models. While the encoder-only architecture is b…

    Submitted 19 December, 2022; originally announced December 2022.

  21. arXiv:2212.08037  [pdf, other]

    cs.CL

    Attributed Question Answering: Evaluation and Modeling for Attributed Large Language Models

    Authors: Bernd Bohnet, Vinh Q. Tran, Pat Verga, Roee Aharoni, Daniel Andor, Livio Baldini Soares, Massimiliano Ciaramita, Jacob Eisenstein, Kuzman Ganchev, Jonathan Herzig, Kai Hui, Tom Kwiatkowski, Ji Ma, Jianmo Ni, Lierni Sestorain Saralegui, Tal Schuster, William W. Cohen, Michael Collins, Dipanjan Das, Donald Metzler, Slav Petrov, Kellie Webster

    Abstract: Large language models (LLMs) have shown impressive results while requiring little or no direct supervision. Further, there is mounting evidence that LLMs may have potential in information-seeking scenarios. We believe the ability of an LLM to attribute the text that it generates is likely to be crucial in this setting. We formulate and study Attributed QA as a key first step in the development of…

    Submitted 10 February, 2023; v1 submitted 15 December, 2022; originally announced December 2022.

  22. arXiv:2211.05655  [pdf, other]

    cs.CL cs.AI cs.LG

    DisentQA: Disentangling Parametric and Contextual Knowledge with Counterfactual Question Answering

    Authors: Ella Neeman, Roee Aharoni, Or Honovich, Leshem Choshen, Idan Szpektor, Omri Abend

    Abstract: Question answering models commonly have access to two sources of "knowledge" during inference time: (1) parametric knowledge - the factual knowledge encoded in the model weights, and (2) contextual knowledge - external knowledge (e.g., a Wikipedia passage) given to the model to generate a grounded answer. Having these two sources of knowledge entangled together is a core issue for generative QA mo…

    Submitted 10 November, 2022; originally announced November 2022.

    Comments: 12 pages, 2 figures

  23. arXiv:2206.02576  [pdf, ps, other]

    math.CO

    Strongly maximal matchings and strongly minimal covers

    Authors: Ron Aharoni

    Abstract: This is a not-to-be-journal-published paper, aimed to serve as reference. It is a summary of the main ideas on the topic appearing in the title, and an opportunity to state correctly the main conjecture in the field.

    Submitted 3 June, 2022; originally announced June 2022.

  24. arXiv:2204.04991  [pdf, other]

    cs.CL

    TRUE: Re-evaluating Factual Consistency Evaluation

    Authors: Or Honovich, Roee Aharoni, Jonathan Herzig, Hagai Taitelbaum, Doron Kukliansy, Vered Cohen, Thomas Scialom, Idan Szpektor, Avinatan Hassidim, Yossi Matias

    Abstract: Grounded text generation systems often generate text that contains factual inconsistencies, hindering their real-world applicability. Automatic factual consistency evaluation may help alleviate this limitation by accelerating evaluation cycles, filtering inconsistent outputs and augmenting training data. While attracting increasing attention, such evaluation metrics are usually developed and evalu…

    Submitted 3 May, 2022; v1 submitted 11 April, 2022; originally announced April 2022.

    Comments: Accepted as a long paper to NAACL 2022 main conference

  25. arXiv:2110.14332  [pdf, other]

    math.CO cs.DM math.PR

    Rainbow cycles for families of matchings

    Authors: Ron Aharoni, He Guo

    Abstract: Given a graph G and a coloring of its edges, a subgraph of G is called rainbow if its edges have distinct colors. The rainbow girth of an edge coloring of G is the minimum length of a rainbow cycle in G. A generalization of the famous Caccetta-Häggkvist conjecture (CHC), proposed by the first author, is that if G has n vertices, G is n-edge-colored and the size of every color class is k, then the…

    Submitted 24 October, 2022; v1 submitted 27 October, 2021; originally announced October 2021.

    Comments: 5 pages; minor edits; to appear in Israel Journal of Mathematics

    MSC Class: 05C35; 05D40

  26. arXiv:2110.11183  [pdf, ps, other]

    math.CO

    Non-uniform degrees and rainbow versions of the Caccetta-Häggkvist conjecture

    Authors: Ron Aharoni, Eli Berger, Maria Chudnovsky, He Guo, Shira Zerbib

    Abstract: The Caccetta-Häggkvist conjecture (denoted below CHC) states that the directed girth (the smallest length of a directed cycle) $dgirth(D)$ of a directed graph $D$ on $n$ vertices is at most $\lceil \frac{n}{δ^+(D)}\rceil$, where $δ^+(D)$ is the minimum out-degree of $D$. We consider a version involving all out-degrees, not merely the minimum one, and prove that if $D$ does not contain a sink, then…

    Submitted 7 October, 2022; v1 submitted 21 October, 2021; originally announced October 2021.

  27. arXiv:2107.12881  [pdf, ps, other]

    math.CO

    Choice Functions

    Authors: Ron Aharoni, Joseph Briggs

    Abstract: This is a survey paper on rainbow sets (another name for "choice functions"). The main theme is the distinction between two types of choice functions: those having a large (in the sense of belonging to some specified filter, namely closed up set of sets) image, and those that have a large domain and small image, where "smallness" means belonging to some specified complex (a closed-down set). T…

    Submitted 27 July, 2021; originally announced July 2021.

    Comments: 23 pages, survey paper

    MSC Class: 05D15; 05C70; 05C69; 05B35; 05E45

  28. arXiv:2104.08202  [pdf, other]

    cs.CL

    $Q^{2}$: Evaluating Factual Consistency in Knowledge-Grounded Dialogues via Question Generation and Question Answering

    Authors: Or Honovich, Leshem Choshen, Roee Aharoni, Ella Neeman, Idan Szpektor, Omri Abend

    Abstract: Neural knowledge-grounded generative models for dialogue often produce content that is factually inconsistent with the knowledge they rely on, making them unreliable and limiting their applicability. Inspired by recent work on evaluating factual consistency in abstractive summarization, we propose an automatic evaluation metric for factual consistency in knowledge-grounded dialogue using automatic…

    Submitted 9 September, 2021; v1 submitted 16 April, 2021; originally announced April 2021.

    Comments: Accepted to EMNLP 2021

  29. arXiv:2012.14992  [pdf, ps, other]

    math.CO

    Rainbow paths and large rainbow matchings

    Authors: Ron Aharoni, Eli Berger, Maria Chudnovsky, Shira Zerbib

    Abstract: A conjecture of the first two authors is that $n$ matchings of size $n$ in any graph have a rainbow matching of size $n-1$. We prove a lower bound of $\frac{2}{3}n-1$, improving on the trivial $\frac{1}{2}n$, and an analogous result for hypergraphs. For $\{C_3,C_5\}$-free graphs and for disjoint matchings we obtain a lower bound of $\frac{3n}{4}-O(1)$. We also discuss a conjecture on rainbow alter…

    Submitted 7 October, 2021; v1 submitted 29 December, 2020; originally announced December 2020.

  30. arXiv:2011.01053  [pdf, ps, other]

    math.CO

    Fractionally balanced hypergraphs and rainbow KKM theorems

    Authors: Ron Aharoni, Eli Berger, Joseph Briggs, Erel Segal-Halevi, Shira Zerbib

    Abstract: A d-partite hypergraph is called *fractionally balanced* if there exists a non-negative, not identically zero, function on its edge set that has constant degrees in each vertex side. Using a topological version of Hall's theorem we prove lower bounds on the matching number of such hypergraphs. These bounds yield rainbow versions of the KKM theorem for products of simplices, which in turn are used…

    Submitted 14 August, 2022; v1 submitted 2 November, 2020; originally announced November 2020.

  31. arXiv:2009.11027  [pdf, other]

    cs.CL

    KoBE: Knowledge-Based Machine Translation Evaluation

    Authors: Zorik Gekhman, Roee Aharoni, Genady Beryozkin, Markus Freitag, Wolfgang Macherey

    Abstract: We propose a simple and effective method for machine translation evaluation which does not require reference translations. Our approach is based on (1) grounding the entity mentions found in each source sentence and candidate translation against a large-scale multilingual knowledge base, and (2) measuring the recall of the grounded entities found in the candidate vs. those found in the source. Our…

    Submitted 23 September, 2020; originally announced September 2020.

    Comments: Accepted as a short paper in Findings of EMNLP 2020

  32. arXiv:2008.04637  [pdf, other]

    cs.CV cs.CL

    Real-Time Sign Language Detection using Human Pose Estimation

    Authors: Amit Moryossef, Ioannis Tsochantaridis, Roee Aharoni, Sarah Ebling, Srini Narayanan

    Abstract: We propose a lightweight real-time sign language detection model, as we identify the need for such a case in videoconferencing. We extract optical flow features based on human pose estimation and, using a linear classifier, show these features are meaningful with an accuracy of 80%, evaluated on the DGS Corpus. Using a recurrent model directly on the input, we see improvements of up to 91% accurac…

    Submitted 13 September, 2020; v1 submitted 11 August, 2020; originally announced August 2020.

    Comments: 10 pages

  33. Rainbow odd cycles

    Authors: Ron Aharoni, Joseph Briggs, Ron Holzman, Zilin Jiang

    Abstract: We prove that every family of (not necessarily distinct) odd cycles $O_1, \dots, O_{2\lceil n/2 \rceil-1}$ in the complete graph $K_n$ on $n$ vertices has a rainbow odd cycle (that is, a set of edges from distinct $O_i$'s, forming an odd cycle). As part of the proof, we characterize those families of $n$ odd cycles in $K_{n+1}$ that do not have any rainbow odd cycle. We also characterize those fam…

    Submitted 20 September, 2021; v1 submitted 19 July, 2020; originally announced July 2020.

    Comments: 14 pages, 2 figures, accepted to SIAM Journal on Discrete Mathematics (SIDMA)

    MSC Class: 05C38 (Primary) 05C70; 05B35 (Secondary)

    Journal ref: SIAM Journal on Discrete Mathematics, Volume 35, Issue 4, pp 2293-2303, October 2021

  34. arXiv:2004.07590  [pdf, other]

    math.CO

    Badges and rainbow matchings

    Authors: Ron Aharoni, Joseph Briggs, Jinha Kim, Minki Kim

    Abstract: Drisko proved that $2n-1$ matchings of size $n$ in a bipartite graph have a rainbow matching of size $n$. For general graphs it is conjectured that $2n$ matchings suffice for this purpose (and that $2n-1$ matchings suffice when $n$ is even). The known graphs showing sharpness of this conjecture for $n$ even are called badges. We improve the previously best known bound from $3n-2$ to $3n-3$, using…

    Submitted 15 February, 2021; v1 submitted 16 April, 2020; originally announced April 2020.

    Comments: Accepted for publication in Discrete Mathematics. 19 pages, 2 figures

  35. arXiv:2004.02105  [pdf, other]

    cs.CL

    Unsupervised Domain Clusters in Pretrained Language Models

    Authors: Roee Aharoni, Yoav Goldberg

    Abstract: The notion of "in-domain data" in NLP is often over-simplistic and vague, as textual data varies in many nuanced linguistic aspects such as topic, style or level of formality. In addition, domain labels are many times unavailable, making it challenging to build domain-specific systems. We show that massive pre-trained language models implicitly learn sentence representations that cluster by domain…

    Submitted 1 May, 2020; v1 submitted 5 April, 2020; originally announced April 2020.

    Comments: Accepted as a long paper in ACL 2020

  36. arXiv:2003.08247  [pdf, ps, other]

    math.CO

    Cooperative conditions for the existence of rainbow matchings

    Authors: Ron Aharoni, Joseph Briggs, Minho Cho, Jinha Kim

    Abstract: Let $k>1$, and let $\mathcal{F}$ be a family of $2n+k-3$ non-empty sets of edges in a bipartite graph. If the union of every $k$ members of $\mathcal{F}$ contains a matching of size $n$, then there exists an $\mathcal{F}$-rainbow matching of size $n$. Replacing $2n+k-3$ by $2n+k-2$, the result is true also for $k=1$, and it can be proved (for all $k$) both topologically and by a relatively simple…

    Submitted 28 December, 2021; v1 submitted 18 March, 2020; originally announced March 2020.

  37. arXiv:1910.09302  [pdf, other]

    cs.CL

    Diversify Your Datasets: Analyzing Generalization via Controlled Variance in Adversarial Datasets

    Authors: Ohad Rozen, Vered Shwartz, Roee Aharoni, Ido Dagan

    Abstract: Phenomenon-specific "adversarial" datasets have been recently designed to perform targeted stress-tests for particular inference types. Recent work (Liu et al., 2019a) proposed that such datasets can be utilized for training NLI and other types of models, often allowing to learn the phenomenon in focus and improve on the challenge dataset, indicating a "blind spot" in the original training data. Y…

    Submitted 21 October, 2019; originally announced October 2019.

    Comments: CoNLL 2019

  38. arXiv:1909.13143  [pdf, ps, other]

    math.CO

    Rainbow independent sets in certain classes of graphs

    Authors: Ron Aharoni, Joseph Briggs, Jinha Kim, Minki Kim

    Abstract: For a given class $\mathcal{C}$ of graphs and given integers $m \leq n$, let $f_\mathcal{C}(n,m)$ be the minimal number $k$ such that every $k$ independent $n$-sets in any graph belonging to $\mathcal{C}$ have a (possibly partial) rainbow independent $m$-set. Motivated by known results on the finiteness and actual value of $f_\mathcal{C}(n,m)$ when $\mathcal{C}$ is the class of line graphs of grap…

    Submitted 28 September, 2019; originally announced September 2019.

  39. arXiv:1903.07091  [pdf, other]

    cs.CL cs.AI cs.LG

    The Missing Ingredient in Zero-Shot Neural Machine Translation

    Authors: Naveen Arivazhagan, Ankur Bapna, Orhan Firat, Roee Aharoni, Melvin Johnson, Wolfgang Macherey

    Abstract: Multilingual Neural Machine Translation (NMT) models are capable of translating between multiple source and target languages. Despite various approaches to train such models, they have difficulty with zero-shot translation: translating between language pairs that were not together seen during training. In this paper we first diagnose why state-of-the-art multilingual NMT models that rely purely on…

    Submitted 17 March, 2019; originally announced March 2019.

  40. arXiv:1903.03467  [pdf, other]

    cs.CL

    Filling Gender & Number Gaps in Neural Machine Translation with Black-box Context Injection

    Authors: Amit Moryossef, Roee Aharoni, Yoav Goldberg

    Abstract: When translating from a language that does not morphologically mark information such as gender and number into a language that does, translation systems must "guess" this missing information, often leading to incorrect translations in the given context. We propose a black-box approach for injecting the missing information to a pre-trained neural machine translation system, allowing to control the…

    Submitted 8 March, 2019; originally announced March 2019.

    Comments: 6 pages

  41. arXiv:1903.00089  [pdf, other]

    cs.CL

    Massively Multilingual Neural Machine Translation

    Authors: Roee Aharoni, Melvin Johnson, Orhan Firat

    Abstract: Multilingual neural machine translation (NMT) enables training a single model that supports translation from multiple source languages into multiple target languages. In this paper, we push the limits of multilingual NMT in terms of number of languages being used. We perform extensive experiments in training massively multilingual NMT models, translating up to 102 languages to and from English wit…

    Submitted 2 July, 2019; v1 submitted 28 February, 2019; originally announced March 2019.

    Comments: Accepted as a long paper in NAACL 2019

  42. arXiv:1812.11872  [pdf, other]

    math.CO

    A rainbow version of Mantel's Theorem

    Authors: Ron Aharoni, Matt DeVos, Sebastián González Hermosillo de la Maza, Amanda Montejano, Robert Šámal

    Abstract: Mantel's Theorem asserts that a simple $n$ vertex graph with more than $\frac{1}{4}n^2$ edges has a triangle (three mutually adjacent vertices). Here we consider a rainbow variant of this problem. We prove that whenever $G_1, G_2, G_3$ are simple graphs on a common set of $n$ vertices and $|E(G_i)| > ( \frac{ 26 - 2 \sqrt{7} }{81})n^2 \approx 0.2557 n^2$ for $1 \le i \le 3$, then there exist disti…

    Submitted 25 February, 2020; v1 submitted 31 December, 2018; originally announced December 2018.

    Comments: 12 pages, 3 figures

    MSC Class: 05C35

  43. Cooperative colorings of trees and of bipartite graphs

    Authors: Ron Aharoni, Eli Berger, Maria Chudnovsky, Frédéric Havet, Zilin Jiang

    Abstract: Given a system $(G_1, \ldots ,G_m)$ of graphs on the same vertex set $V$, a cooperative coloring is a choice of vertex sets $I_1, \ldots ,I_m$, such that $I_j$ is independent in $G_j$ and $\bigcup_{j=1}^{m}I_j = V$. For a class $\mathcal{G}$ of graphs, let $m_{\mathcal{G}}(d)$ be the minimal $m$ such that every $m$ graphs from $\mathcal{G}$ with maximum degree $d$ have a cooperative coloring. We p…

    Submitted 23 January, 2020; v1 submitted 16 June, 2018; originally announced June 2018.

    Comments: 8 pages, 2 figures, accepted to the Electronic Journal of Combinatorics, corrections suggested by the referees have been incorporated

    MSC Class: 05C15; 05C69

    Journal ref: The Electronic Journal of Combinatorics, volume 27, issue 1, #P1.41, February 2020

  44. Rainbow fractional matchings

    Authors: Ron Aharoni, Ron Holzman, Zilin Jiang

    Abstract: We prove that any family $E_1, \ldots , E_{\lceil rn \rceil}$ of (not necessarily distinct) sets of edges in an $r$-uniform hypergraph, each having a fractional matching of size $n$, has a rainbow fractional matching of size $n$ (that is, a set of edges from distinct $E_i$'s which supports such a fractional matching). When the hypergraph is $r$-partite and $n$ is an integer, the number of sets nee…

    Submitted 6 May, 2019; v1 submitted 24 May, 2018; originally announced May 2018.

    Comments: 10 pages, accepted to Combinatorica, corrections suggested by the referees have been incorporated

    MSC Class: 05D15; 55U10

    Journal ref: Combinatorica, Volume 39, Issue 6, pp 1191-1202, December 2019

  45. arXiv:1805.01035  [pdf, other]

    cs.CL

    Split and Rephrase: Better Evaluation and a Stronger Baseline

    Authors: Roee Aharoni, Yoav Goldberg

    Abstract: Splitting and rephrasing a complex sentence into several shorter sentences that convey the same meaning is a challenging problem in NLP. We show that while vanilla seq2seq models can reach high scores on the proposed benchmark (Narayan et al., 2017), they suffer from memorization of the training set which contains more than 89% of the unique simple sentences from the validation and test sets. To a…

    Submitted 2 May, 2018; originally announced May 2018.

    Comments: Accepted as a short paper in ACL 2018

  46. arXiv:1804.01317  [pdf, ps, other]

    math.CO

    Rainbow triangles and the Caccetta-Häggkvist conjecture

    Authors: Ron Aharoni, Ron Holzman, Matthew DeVos

    Abstract: A famous conjecture of Caccetta and Häggkvist is that in a digraph on $n$ vertices and minimum out-degree at least $\frac{n}{r}$ there is a directed cycle of length $r$ or less. We consider the following generalization: in an undirected graph on $n$ vertices, any collection of $n$ disjoint sets of edges, each of size at least $\frac{n}{r}$, has a rainbow cycle of length $r$ or less. We focus on th…

    Submitted 4 April, 2018; originally announced April 2018.

  47. arXiv:1709.09889  [pdf, ps, other]

    math.CO

    Weighted domination of independent sets

    Authors: Ron Aharoni, Irina Gorelik

    Abstract: The independent domination number $\gamma^i(G)$ of a graph $G$ is the maximum, over all independent sets $I$, of the minimal number of vertices needed to dominate $I$. It is known \cite{abz} that in chordal graphs $\gamma^i$ is equal to $\gamma$, the ordinary domination number. The weighted version of this result is not true, but we show that it does hold for interval graphs, and for the intersection (that…

    Submitted 28 September, 2017; originally announced September 2017.

  48. Ramsey-nice families of graphs

    Authors: Ron Aharoni, Noga Alon, Michal Amir, Penny Haxell, Dan Hefetz, Zilin Jiang, Gal Kronenberg, Alon Naor

    Abstract: For a finite family $\mathcal{F}$ of fixed graphs let $R_k(\mathcal{F})$ be the smallest integer $n$ for which every $k$-coloring of the edges of the complete graph $K_n$ yields a monochromatic copy of some $F\in\mathcal{F}$. We say that $\mathcal{F}$ is $k$-nice if for every graph $G$ with $\chi(G)=R_k(\mathcal{F})$ and for every $k$-coloring of $E(G)$ there exists a monochromatic copy of some…

    Submitted 16 April, 2018; v1 submitted 24 August, 2017; originally announced August 2017.

    Comments: 20 pages, 2 figures

    MSC Class: 05D10; 05C55

    Journal ref: Eur. J. Combin. 72 (2018) 29-44

  49. Finding a best approximation pair of points for two polyhedra

    Authors: Ron Aharoni, Yair Censor, Zilin Jiang

    Abstract: Given two disjoint convex polyhedra, we look for a best approximation pair relative to them, i.e., a pair of points, one in each polyhedron, attaining the minimum distance between the sets. Cheney and Goldstein showed that alternating projections onto the two sets, starting from an arbitrary point, generate a sequence whose two interlaced subsequences converge to a best approximation pair. We prop…

    Submitted 22 June, 2018; v1 submitted 30 July, 2017; originally announced July 2017.

    Comments: 14 pages, 8 figures, accepted to Computational Optimization and Applications (COAP)

    MSC Class: 65K05; 90C20; 90C25

    Journal ref: Comput Optim Appl (2018) 71: 509-523

  50. arXiv:1704.04743  [pdf, other]

    cs.CL

    Towards String-to-Tree Neural Machine Translation

    Authors: Roee Aharoni, Yoav Goldberg

    Abstract: We present a simple method to incorporate syntactic information about the target language in a neural machine translation system by translating into linearized, lexicalized constituency trees. An experiment on the WMT16 German-English news translation task resulted in an improved BLEU score when compared to a syntax-agnostic NMT baseline trained on the same dataset. An analysis of the translations…

    Submitted 6 May, 2017; v1 submitted 16 April, 2017; originally announced April 2017.

    Comments: Accepted as a short paper in ACL 2017