-
ParallelPARC: A Scalable Pipeline for Generating Natural-Language Analogies
Authors:
Oren Sultan,
Yonatan Bitton,
Ron Yosef,
Dafna Shahaf
Abstract:
Analogy-making is central to human cognition, allowing us to adapt to novel situations -- an ability that current AI systems still lack. Most analogy datasets today focus on simple analogies (e.g., word analogies); datasets including complex types of analogies are typically manually curated and very small. We believe that this holds back progress in computational analogy. In this work, we design a data generation pipeline, ParallelPARC (Parallel Paragraph Creator), leveraging state-of-the-art Large Language Models (LLMs) to create complex, paragraph-based analogies, as well as distractors, both simple and challenging. We demonstrate our pipeline and create ProPara-Logy, a dataset of analogies between scientific processes. We publish a gold-set, validated by humans, and a silver-set, generated automatically. We test LLMs' and humans' analogy recognition in binary and multiple-choice settings, and find that humans outperform the best models (~13% gap) after light supervision. We demonstrate that our silver-set is useful for training models. Lastly, we show that challenging distractors confuse LLMs, but not humans. We hope our pipeline will encourage research in this emerging field.
Submitted 14 May, 2024; v1 submitted 2 March, 2024;
originally announced March 2024.
-
IRFL: Image Recognition of Figurative Language
Authors:
Ron Yosef,
Yonatan Bitton,
Dafna Shahaf
Abstract:
Figures of speech such as metaphors, similes, and idioms are integral parts of human communication. They are ubiquitous in many forms of discourse, allowing people to convey complex, abstract ideas and evoke emotion. As figurative forms are often conveyed through multiple modalities (e.g., both text and images), understanding multimodal figurative language is an important AI challenge, weaving together profound vision, language, commonsense and cultural knowledge. In this work, we develop the Image Recognition of Figurative Language (IRFL) dataset. We leverage human annotation and an automatic pipeline we created to generate a multimodal dataset, and introduce two novel tasks as a benchmark for multimodal figurative language understanding. We experimented with state-of-the-art vision and language models and found that the best (22%) performed substantially worse than humans (97%). We release our dataset, benchmark, and code, in hopes of driving the development of models that can better understand figurative language.
Submitted 25 November, 2023; v1 submitted 27 March, 2023;
originally announced March 2023.
-
VASR: Visual Analogies of Situation Recognition
Authors:
Yonatan Bitton,
Ron Yosef,
Eli Strugo,
Dafna Shahaf,
Roy Schwartz,
Gabriel Stanovsky
Abstract:
A core process in human cognition is analogical mapping: the ability to identify a similar relational structure between different situations. We introduce a novel task, Visual Analogies of Situation Recognition, adapting the classical word-analogy task into the visual domain. Given a triplet of images, the task is to select an image candidate B' that completes the analogy (A to A' is like B to what?). Unlike previous work on visual analogy that focused on simple image transformations, we tackle complex analogies requiring understanding of scenes.
We leverage situation recognition annotations and the CLIP model to generate a large set of 500k candidate analogies. Crowdsourced annotations for a sample of the data indicate that humans agree with the dataset label ~80% of the time (chance level 25%). Furthermore, we use human annotations to create a gold-standard dataset of 3,820 validated analogies. Our experiments demonstrate that state-of-the-art models do well when distractors are chosen randomly (~86%), but struggle with carefully chosen distractors (~53%, compared to 90% human accuracy). We hope our dataset will encourage the development of new analogy-making models. Website: https://vasr-dataset.github.io/
Submitted 8 December, 2022;
originally announced December 2022.
-
WinoGAViL: Gamified Association Benchmark to Challenge Vision-and-Language Models
Authors:
Yonatan Bitton,
Nitzan Bitton Guetta,
Ron Yosef,
Yuval Elovici,
Mohit Bansal,
Gabriel Stanovsky,
Roy Schwartz
Abstract:
While vision-and-language models perform well on tasks such as visual question answering, they struggle when it comes to basic human commonsense reasoning skills. In this work, we introduce WinoGAViL: an online game of vision-and-language associations (e.g., between werewolves and a full moon), used as a dynamic evaluation benchmark. Inspired by the popular card game Codenames, a spymaster gives a textual cue related to several visual candidates, and another player tries to identify them. Human players are rewarded for creating associations that are challenging for a rival AI model but still solvable by other human players. We use the game to collect 3.5K instances, finding that they are intuitive for humans (>90% Jaccard index) but challenging for state-of-the-art AI models, where the best model (ViLT) achieves a score of 52%, succeeding mostly where the cue is visually salient. Our analysis as well as the feedback we collect from players indicate that the collected associations require diverse reasoning skills, including general knowledge, common sense, abstraction, and more. We release the dataset, the code and the interactive game, allowing future data collection that can be used to develop models with better association abilities.
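The Jaccard index mentioned above compares two sets by the size of their intersection divided by the size of their union. The following is a minimal sketch of that measure only; the function name and the example candidate labels are hypothetical, and returning 1.0 for two empty sets is an assumed convention, not something stated in the abstract.

```python
def jaccard_index(predicted, gold):
    """Jaccard index |A ∩ B| / |A ∪ B| between two collections of labels."""
    a, b = set(predicted), set(gold)
    union = a | b
    if not union:
        # Assumed convention: two empty selections count as a perfect match.
        return 1.0
    return len(a & b) / len(union)


# Hypothetical example: candidates a player selected vs. the intended ones.
selected = {"moon", "wolf", "forest"}
intended = {"moon", "wolf"}
score = jaccard_index(selected, intended)  # 2 shared items out of 3 distinct
```

A score of 1.0 means the selected set matches the intended set exactly; every extra or missing candidate lowers it.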
Submitted 11 October, 2022; v1 submitted 25 July, 2022;
originally announced July 2022.
-
A Linear Algorithm for Computing Independence Polynomials of Trees
Authors:
Ohr Kadrawi,
Vadim E. Levit,
Ron Yosef,
Matan Mizrachi
Abstract:
An independent set in a graph is a set of pairwise non-adjacent vertices. Let $\alpha(G)$ denote the cardinality of a maximum independent set in the graph $G = (V, E)$. Gutman and Harary defined the independence polynomial of $G$
\[
I(G;x) = \sum_{k=0}^{\alpha(G)} s_k x^{k} = s_0 + s_1 x + s_2 x^{2} + \cdots + s_{\alpha(G)} x^{\alpha(G)},
\]
where $s_k$ denotes the number of independent sets of cardinality $k$ in the graph $G$. A comprehensive survey on the subject is due to Levit and Mandrescu, which includes recursive formulas for calculating the independence polynomial. A direct implementation of these recursions does not yield an efficient algorithm. Yosef, Mizrachi, and Kadrawi developed an efficient method for computing the independence polynomials of trees with $n$ vertices, which requires a database containing the independence polynomials of all trees with up to $n-1$ vertices. This approach is not suitable for large trees, as an extensive database is needed. By contrast, dynamic programming makes it possible to develop an efficient algorithm that avoids repeated calculations. In summary, our dynamic programming algorithm runs over a tree in linear time and does not depend on a database.
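To illustrate the dynamic programming idea, here is a minimal sketch of a standard textbook recursion on a rooted tree; it is not necessarily the authors' algorithm. For each vertex $v$ it keeps two polynomials, one counting independent sets of the subtree that contain $v$ and one counting those that do not. The function names, the edge-list input, and the vertex labels $0, \dots, n-1$ are assumptions made for the example. The sketch visits each vertex once, but it uses naive polynomial multiplication, so it makes no claim about matching the linear running time stated in the abstract.

```python
from collections import defaultdict


def poly_add(a, b):
    """Add two polynomials given as coefficient lists (index = power of x)."""
    n = max(len(a), len(b))
    return [(a[i] if i < len(a) else 0) + (b[i] if i < len(b) else 0)
            for i in range(n)]


def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists (naive product)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def independence_polynomial(n, edges):
    """Return [s_0, s_1, ...] for a tree on vertices 0..n-1 (assumed labels)."""
    if n == 0:
        return [1]  # only the empty set
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)

    # Iterative DFS from vertex 0: parents appear before their children.
    parent = {0: None}
    order = []
    stack = [0]
    while stack:
        v = stack.pop()
        order.append(v)
        for w in adj[v]:
            if w != parent[v]:
                parent[w] = v
                stack.append(w)

    inc = {}  # inc[v]: independent sets of v's subtree that contain v
    exc = {}  # exc[v]: independent sets of v's subtree that avoid v
    for v in reversed(order):  # children are processed before parents
        with_v = [0, 1]   # the factor x contributed by v itself
        without_v = [1]
        for c in adj[v]:
            if c != parent[v]:
                # If v is chosen, no child may be chosen.
                with_v = poly_mul(with_v, exc[c])
                # If v is not chosen, each child is free to be in or out.
                without_v = poly_mul(without_v, poly_add(inc[c], exc[c]))
        inc[v], exc[v] = with_v, without_v
    return poly_add(inc[0], exc[0])


# Path on three vertices: I(P_3; x) = 1 + 3x + x^2
path3 = independence_polynomial(3, [(0, 1), (1, 2)])
```

Each subtree is evaluated exactly once and its result is reused by its parent, which is what removes the repeated work of a direct recursive implementation and the need for a precomputed database.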
Submitted 2 January, 2022;
originally announced January 2022.
-
On Unimodality of Independence Polynomials of Trees
Authors:
Ron Yosef,
Matan Mizrachi,
Ohr Kadrawi
Abstract:
An independent set in a graph is a set of pairwise non-adjacent vertices. The independence number $\alpha(G)$ is the size of a maximum independent set in the graph $G$. The independence polynomial of a graph is the generating function for the sequence of numbers of independent sets of each size. In other words, the $k$-th coefficient of the independence polynomial equals the number of independent sets consisting of $k$ vertices. In particular, the degree of the independence polynomial of the graph $G$ is equal to $\alpha(G)$. In 1987, Alavi, Malde, Schwenk, and Erdős conjectured that the independence polynomial of a tree is unimodal. In what follows, we provide support for this conjecture by considering trees with up to $20$ vertices. Moreover, we show that the corresponding independence polynomials are log-concave and, consequently, unimodal. The algorithm computing the independence polynomial of a given tree makes use of a database of non-isomorphic unlabeled trees to prevent repeated computations.
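The two properties named above can be checked directly on a coefficient sequence $s_0, s_1, \dots, s_{\alpha(G)}$. A sequence is log-concave when $s_k^2 \ge s_{k-1} s_{k+1}$ for every interior index $k$, and unimodal when it is non-decreasing up to some index and non-increasing afterwards. The sketch below only illustrates these definitions; the function names are hypothetical and this is not the authors' verification code.

```python
def is_log_concave(coeffs):
    """True if s_k^2 >= s_{k-1} * s_{k+1} for every interior index k."""
    return all(coeffs[k] ** 2 >= coeffs[k - 1] * coeffs[k + 1]
               for k in range(1, len(coeffs) - 1))


def is_unimodal(coeffs):
    """True if the sequence is non-decreasing and then non-increasing."""
    k = 0
    last = len(coeffs) - 1
    while k < last and coeffs[k] <= coeffs[k + 1]:
        k += 1  # climb the non-decreasing part
    while k < last and coeffs[k] >= coeffs[k + 1]:
        k += 1  # descend the non-increasing part
    return k >= last  # the whole sequence was consumed


# Coefficients of 1 + 4x + 3x^2 + x^3, the independence polynomial of the
# star with three leaves.
star = [1, 4, 3, 1]
checks = (is_log_concave(star), is_unimodal(star))
```

For sequences of positive numbers, log-concavity implies unimodality, which is why establishing the first property for the computed polynomials is enough to conclude the second.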
Submitted 7 March, 2022; v1 submitted 17 January, 2021;
originally announced January 2021.