Make It Count: Text-to-Image Generation with an Accurate Number of Objects

Authors: Lital Binyamin, Yoad Tewel, Hilit Segev, Eran Hirsch, Royi Rassin, Gal Chechik

Abstract: Despite the unprecedented success of text-to-image diffusion models, controlling the number of depicted objects using text is surprisingly hard. This is important for various applications from technical documents, to children's books to illustrating cooking recipes. Generating object-correct counts is fundamentally challenging because the generative model needs to keep a sense of separate identity for every instance of the object, even if several objects look identical or overlap, and then carry out a global computation implicitly during generation. It is still unknown if such representations exist. To address count-correct generation, we first identify features within the diffusion model that can carry the object identity information. We then use them to separate and count instances of objects during the denoising process and detect over-generation and under-generation. We fix the latter by training a model that predicts both the shape and location of a missing object, based on the layout of existing ones, and show how it can be used to guide denoising with correct object count. Our approach, CountGen, does not depend on external source to determine object layout, but rather uses the prior from the diffusion model itself, creating prompt-dependent and seed-dependent layouts. Evaluated on two benchmark datasets, we find that CountGen strongly outperforms the count-accuracy of existing baselines.

Submitted 14 June, 2024; originally announced June 2024.

Comments: Project page is at https://make-it-count-paper.github.io/

arXiv:2403.17104 [pdf, other]

Attribute First, then Generate: Locally-attributable Grounded Text Generation

Authors: Aviv Slobodkin, Eran Hirsch, Arie Cattan, Tal Schuster, Ido Dagan

Abstract: Recent efforts to address hallucinations in Large Language Models (LLMs) have focused on attributed text generation, which supplements generated texts with citations of supporting sources for post-generation fact-checking and corrections. Yet, these citations often point to entire documents or paragraphs, burdening users with extensive verification work. In this paper, we introduce a locally-attributable text generation approach, prioritizing concise attributions. Our method, named "Attribute First, then Generate", breaks down the conventional end-to-end generation process into three intuitive steps: content selection, sentence planning, and sequential sentence generation. By initially identifying relevant source segments ("select first") and then conditioning the generation process on them ("then generate"), we ensure these segments also act as the output's fine-grained attributions ("select" becomes "attribute"). Tested on Multi-document Summarization and Long-form Question-answering, our method not only yields more concise citations than the baselines but also maintains - and in some cases enhances - both generation quality and attribution accuracy. Furthermore, it significantly reduces the time required for fact verification by human assessors.

Submitted 1 April, 2024; v1 submitted 25 March, 2024; originally announced March 2024.

arXiv:2403.13444 [pdf, other]

MedCycle: Unpaired Medical Report Generation via Cycle-Consistency

Authors: Elad Hirsch, Gefen Dawidowicz, Ayellet Tal

Abstract: Generating medical reports for X-ray images presents a significant challenge, particularly in unpaired scenarios where access to paired image-report data for training is unavailable. Previous works have typically learned a joint embedding space for images and reports, necessitating a specific labeling schema for both. We introduce an innovative approach that eliminates the need for consistent labeling schemas, thereby enhancing data accessibility and enabling the use of incompatible datasets. This approach is based on cycle-consistent mapping functions that transform image embeddings into report embeddings, coupled with report auto-encoding for medical report generation. Our model and objectives consider intricate local details and the overarching semantic context within images and reports. This approach facilitates the learning of effective mapping functions, resulting in the generation of coherent reports. It outperforms state-of-the-art results in unpaired chest X-ray report generation, demonstrating improvements in both language and clinical metrics.

Submitted 21 March, 2024; v1 submitted 20 March, 2024; originally announced March 2024.

arXiv:2402.11489 [pdf, other]

What's the Plan? Evaluating and Developing Planning-Aware Techniques for Language Models

Authors: Eran Hirsch, Guy Uziel, Ateret Anaby-Tavor

Abstract: Planning is a fundamental task in artificial intelligence that involves finding a sequence of actions that achieve a specified goal in a given environment. Large language models (LLMs) are increasingly used for applications that require planning capabilities, such as web or embodied agents. In line with recent studies, we demonstrate through experimentation that LLMs lack necessary skills required for planning. Based on these observations, we advocate for the potential of a hybrid approach that combines LLMs with classical planning methodology. Then, we introduce SimPlan, a novel hybrid-method, and evaluate its performance in a new challenging setup. Our extensive experiments across various planning domains demonstrate that SimPlan significantly outperforms existing LLM-based planners.

Submitted 22 May, 2024; v1 submitted 18 February, 2024; originally announced February 2024.

Comments: 9 pages and an appendix

arXiv:2310.09017 [pdf, other]

Don't Add, don't Miss: Effective Content Preserving Generation from Pre-Selected Text Spans

Authors: Aviv Slobodkin, Avi Caciularu, Eran Hirsch, Ido Dagan

Abstract: The recently introduced Controlled Text Reduction (CTR) task isolates the text generation step within typical summarization-style tasks. It does so by challenging models to generate coherent text conforming to pre-selected content within the input text ("highlights"). This framing enables increased modularity in summarization-like tasks, allowing to couple a single CTR model with various content-selection setups and modules. However, there are currently no reliable CTR models, while the performance of the existing baseline for the task is mediocre, falling short of practical utility. Here, we address this gap by introducing a high-quality, open-source CTR model that tackles two prior key limitations: inadequate enforcement of the content-preservation constraint, and suboptimal silver training data. Addressing these, we amplify the content-preservation constraint in both training, via RL, and inference, via a controlled decoding strategy. Further, we substantially improve the silver training data quality via GPT-4 distillation. Overall, pairing the distilled dataset with the highlight-adherence strategies yields marked gains over the current baseline, of up to 30 ROUGE-L points, providing a reliable CTR model for downstream use.

Submitted 25 February, 2024; v1 submitted 13 October, 2023; originally announced October 2023.

Comments: EMNLP 2023, findings

arXiv:2306.08877 [pdf, other]

Linguistic Binding in Diffusion Models: Enhancing Attribute Correspondence through Attention Map Alignment

Authors: Royi Rassin, Eran Hirsch, Daniel Glickman, Shauli Ravfogel, Yoav Goldberg, Gal Chechik

Abstract: Text-conditioned image generation models often generate incorrect associations between entities and their visual attributes. This reflects an impaired mapping between linguistic binding of entities and modifiers in the prompt and visual binding of the corresponding elements in the generated image. As one notable example, a query like "a pink sunflower and a yellow flamingo" may incorrectly produce an image of a yellow sunflower and a pink flamingo. To remedy this issue, we propose SynGen, an approach which first syntactically analyses the prompt to identify entities and their modifiers, and then uses a novel loss function that encourages the cross-attention maps to agree with the linguistic binding reflected by the syntax. Specifically, we encourage large overlap between attention maps of entities and their modifiers, and small overlap with other entities and modifier words. The loss is optimized during inference, without retraining or fine-tuning the model. Human evaluation on three datasets, including one new and challenging set, demonstrate significant improvements of SynGen compared with current state of the art methods. This work highlights how making use of sentence structure during inference can efficiently and substantially improve the faithfulness of text-to-image generation.

Submitted 23 January, 2024; v1 submitted 15 June, 2023; originally announced June 2023.

Comments: Accepted to NeurIPS 2023 (oral). Our code is publicly available at https://github.com/RoyiRa/Syntax-Guided-Generation

arXiv:2306.01418 [pdf, other]

An OPC UA-based industrial Big Data architecture

Authors: Eduard Hirsch, Simon Hoher, Stefan Huber

Abstract: Industry 4.0 factories are complex and data-driven. Data is yielded from many sources, including sensors, PLCs, and other devices, but also from IT, like ERP or CRM systems. We ask how to collect and process this data in a way, such that it includes metadata and can be used for industrial analytics or to derive intelligent support systems. This paper describes a new, query model based approach, which uses a big data architecture to capture data from various sources using OPC UA as a foundation. It buffers and preprocesses the information for the purpose of harmonizing and providing a holistic state space of a factory, as well as mappings to the current state of a production site. That information can be made available to multiple processing sinks, decoupled from the data sources, which enables them to work with the information without interfering with devices of the production, disturbing the network devices they are working in, or influencing the production process negatively. Metadata and connected semantic information is kept throughout the process, allowing to feed algorithms with meaningful data, so that it can be accessed in its entirety to perform time series analysis, machine learning or similar evaluations as well as replaying the data from the buffer for repeatable simulations.

Submitted 2 June, 2023; originally announced June 2023.

arXiv:2305.15605 [pdf, other]

Revisiting Sentence Union Generation as a Testbed for Text Consolidation

Authors: Eran Hirsch, Valentina Pyatkin, Ruben Wolhandler, Avi Caciularu, Asi Shefer, Ido Dagan

Abstract: Tasks involving text generation based on multiple input texts, such as multi-document summarization, long-form question answering and contemporary dialogue applications, challenge models for their ability to properly consolidate partly-overlapping multi-text information. However, these tasks entangle the consolidation phase with the often subjective and ill-defined content selection requirement, impeding proper assessment of models' consolidation capabilities. In this paper, we suggest revisiting the sentence union generation task as an effective well-defined testbed for assessing text consolidation capabilities, decoupling the consolidation challenge from subjective content selection. To support research on this task, we present refined annotation methodology and tools for crowdsourcing sentence union, create the largest union dataset to date and provide an analysis of its rich coverage of various consolidation aspects. We then propose a comprehensive evaluation protocol for union generation, including both human and automatic evaluation. Finally, as baselines, we evaluate state-of-the-art language models on the task, along with a detailed analysis of their capacity to address multi-text consolidation challenges and their limitations.

Submitted 24 May, 2023; originally announced May 2023.

Comments: Findings of the Association for Computational Linguistics (ACL 2023)

arXiv:2305.06001 [pdf, other]

doi 10.1109/INDIN51400.2023.10217905

Digital Twins of Business Processes as Enablers for IT / OT Integration

Authors: Hannes Waclawek, Georg Schäfer, Christoph Binder, Eduard Hirsch, Stefan Huber

Abstract: The vision of Industry 4.0 introduces new requirements to Operational Technology (OT) systems. Solutions for these requirements already exist in the Information Technology (IT) world, however, due to the different characteristics of both worlds, these solutions often cannot be directly used in the world of OT. We therefore propose an Industrial Business Process Twin (IBPT), allowing to apply methods of one world to another not directly but, instead, to a representation, that is in bidirectional exchange with the other world. The proposed IBPT entity acts as an intermediary, decoupling the worlds of IT and OT, thus allowing for an integration of IT and OT components of different manufacturers and platforms. Using this approach, we demonstrate the four essential Industry 4.0 design principles information transparency, technical assistance, interconnection and decentralized decisions based on the gamified Industry 4.0 scenario of playing the game of Nine Men's Morris. This scenario serves well for agent based Artificial Intelligence (AI)-research and education. We develop an Open Platform Communications Unified Architecture (OPC UA) information and communication model and then evaluate the IBPT component with respect to the different views of the Reference Architecture Model Industry 4.0 (RAMI4.0).

Submitted 10 May, 2023; originally announced May 2023.

Comments: Submitted to INDIN2023 conference

Journal ref: 2023 IEEE 21st International Conference on Industrial Informatics (INDIN)

arXiv:2303.17531 [pdf, other]

Asymmetric Image Retrieval with Cross Model Compatible Ensembles

Authors: Ori Linial, Alon Shoshan, Nadav Bhonker, Elad Hirsch, Lior Zamir, Igor Kviatkovsky, Gerard Medioni

Abstract: The asymmetrical retrieval setting is a well suited solution for resource constrained applications such as face recognition and image retrieval. In this setting, a large model is used for indexing the gallery while a lightweight model is used for querying. The key principle in such systems is ensuring that both models share the same embedding space. Most methods in this domain are based on knowledge distillation. While useful, they suffer from several drawbacks: they are upper-bounded by the performance of the single best model found and cannot be extended to use an ensemble of models in a straightforward manner. In this paper we present an approach that does not rely on knowledge distillation, rather it utilizes embedding transformation models. This allows the use of N independently trained and diverse gallery models (e.g., trained on different datasets or having a different architecture) and a single query model. As a result, we improve the overall accuracy beyond that of any single model while maintaining a low computational budget for querying. Additionally, we propose a gallery image rejection method that utilizes the diversity between multiple transformed embeddings to estimate the uncertainty of gallery images.

Submitted 29 October, 2023; v1 submitted 30 March, 2023; originally announced March 2023.

arXiv:2303.11755 [pdf, other]

LIMITR: Leveraging Local Information for Medical Image-Text Representation

Authors: Gefen Dawidowicz, Elad Hirsch, Ayellet Tal

Abstract: Medical imaging analysis plays a critical role in the diagnosis and treatment of various medical conditions. This paper focuses on chest X-ray images and their corresponding radiological reports. It presents a new model that learns a joint X-ray image & report representation. The model is based on a novel alignment scheme between the visual data and the text, which takes into account both local and global information. Furthermore, the model integrates domain-specific information of two types -- lateral images and the consistent visual structure of chest images. Our representation is shown to benefit three types of retrieval tasks: text-image retrieval, class-based retrieval, and phrase-grounding.

Submitted 21 March, 2023; originally announced March 2023.

arXiv:2302.06241 [pdf, ps, other]

Proving Unsatisfiability with Hitting Formulas

Authors: Yuval Filmus, Edward A. Hirsch, Artur Riazanov, Alexander Smal, Marc Vinyals

Abstract: Hitting formulas have been studied in many different contexts at least since [Iwama,89]. A hitting formula is a set of Boolean clauses such that any two of them cannot be simultaneously falsified. [Peitl,Szeider,05] conjectured that hitting formulas should contain the hardest formulas for resolution. They supported their conjecture with experimental findings. Using the fact that hitting formulas are easy to check for satisfiability we use them to build a static proof system Hitting: a refutation of a CNF in Hitting is an unsatisfiable hitting formula such that each of its clauses is a weakening of a clause of the refuted CNF. Comparing this system to resolution and other proof systems is equivalent to studying the hardness of hitting formulas. We show that tree-like resolution and Hitting are quasi-polynomially separated. We prove that Hitting is quasi-polynomially simulated by tree-like resolution, thus hitting formulas cannot be exponentially hard for resolution, so Peitl-Szeider's conjecture is partially refuted. Nevertheless Hitting is surprisingly difficult to polynomially simulate. Using the ideas of PIT for noncommutative circuits [Raz-Shpilka,05] we show that Hitting is simulated by Extended Frege. As a byproduct, we show that a number of static (semi)algebraic systems are verifiable in a deterministic polynomial time. We consider multiple extensions of Hitting. Hitting(+) formulas are conjunctions of clauses containing affine equations instead of just literals, and every assignment falsifies at most one clause. The resulting system is related to Res(+) proof system for which no superpolynomial lower bounds are known: Hitting(+) simulates the tree-like version of Res(+) and is at least quasi-polynomially stronger. We show an exponential lower bound for Hitting(+).

Submitted 13 February, 2023; originally announced February 2023.

MSC Class: 03F20 ACM Class: F.2.2
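
The abstract of "Proving Unsatisfiability with Hitting Formulas" notes that hitting formulas are easy to check for satisfiability. A minimal Python sketch of why, under assumptions that are not from the paper: clauses are encoded as sets of nonzero integers (DIMACS-style, `-v` for a negated variable), no clause contains both a literal and its negation, and the function names are hypothetical. Because no two clauses of a hitting formula can be falsified together, the sets of assignments falsifying each clause are disjoint, so the formula is unsatisfiable exactly when the clause weights 2^(-|C|) sum to 1.

```python
from fractions import Fraction
from itertools import combinations


def is_hitting(clauses):
    """True if every pair of clauses clashes on some variable.

    Two clauses clash when one contains a literal whose negation is in the
    other, so no assignment can falsify both at once.
    """
    return all(
        any(-lit in d for lit in c)
        for c, d in combinations(clauses, 2)
    )


def hitting_is_unsat(clauses):
    """Decide unsatisfiability of a hitting formula by counting.

    Each clause C is falsified by a 2^(-|C|) fraction of all assignments,
    and in a hitting formula these fractions do not overlap.
    """
    if not is_hitting(clauses):
        raise ValueError("not a hitting formula")
    # Exact arithmetic avoids floating-point rounding for long clauses.
    return sum(Fraction(1, 2 ** len(c)) for c in clauses) == 1


# All four clauses over two variables: every assignment falsifies one of them.
full = [{1, 2}, {1, -2}, {-1, 2}, {-1, -2}]
# A hitting formula that leaves a quarter of the assignments satisfying.
partial = [{1}, {-1, 2}]
```

This only illustrates the counting argument for plain hitting formulas; it does not model the Hitting proof system or its Hitting(+) extension.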

arXiv:2211.14835 [pdf, other]

CLID: Controlled-Length Image Descriptions with Limited Data

Authors: Elad Hirsch, Ayellet Tal

Abstract: Controllable image captioning models generate human-like image descriptions, enabling some kind of control over the generated captions. This paper focuses on controlling the caption length, i.e. a short and concise description or a long and detailed one. Since existing image captioning datasets contain mostly short captions, generating long captions is challenging. To address the shortage of long training examples, we propose to enrich the dataset with varying-length self-generated captions. These, however, might be of varying quality and are thus unsuitable for conventional training. We introduce a novel training strategy that selects the data points to be used at different times during the training. Our method dramatically improves the length-control abilities, while exhibiting SoTA performance in terms of caption quality. Our approach is general and is shown to be applicable also to paragraph generation.

Submitted 21 January, 2024; v1 submitted 27 November, 2022; originally announced November 2022.

arXiv:2210.17429 [pdf, ps, other]

The power of the Binary Value Principle

Authors: Yaroslav Alekseev, Edward A. Hirsch

Abstract: The (extended) Binary Value Principle (eBVP: $\sum_{i=1}^n x_i2^{i-1} = -k$ for $k>0$ and $x^2_i=x_i$) has received a lot of attention recently, several lower bounds have been proved for it (Alekseev et al 2020, Alekseev 2021, Part and Tzameret 2021). Also it has been shown (Alekseev et al 2020) that the probabilistically verifiable Ideal Proof System (IPS) (Grochow and Pitassi 2018) together with eBVP polynomially simulates a similar semialgebraic proof system. In this paper we consider Polynomial Calculus with the algebraic version of Tseitin's extension rule (Ext-PC). Contrary to IPS, this is a Cook--Reckhow proof system. We show that in this context eBVP still allows to simulate similar semialgebraic systems. We also prove that it allows to simulate the Square Root Rule (Grigoriev and Hirsch 2003), which is in sharp contrast with the result of (Alekseev 2021) that shows an exponential lower bound on the size of Ext-PC derivations of the Binary Value Principle from its square. On the other hand, we demonstrate that eBVP probably does not help in proving exponential lower bounds for Boolean formulas: we show that an Ext-PC (even with the Square Root Rule) derivation of any unsatisfiable Boolean formula in CNF from eBVP must be of exponential size.

Submitted 29 November, 2022; v1 submitted 31 October, 2022; originally announced October 2022.

Comments: 21 pages

MSC Class: 03F20 ACM Class: F.2.2
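
The eBVP system quoted in the abstract of "The power of the Binary Value Principle" has no Boolean solution, because the constraints $x_i^2 = x_i$ force each $x_i$ into {0, 1} and the weighted sum is then non-negative. A small brute-force Python sketch of that observation for tiny `n`; the function names are hypothetical, and this is not the paper's proof-complexity machinery.

```python
from itertools import product


def binary_value(bits):
    """Value of sum_i x_i * 2^(i-1), with bits[0] playing the role of x_1."""
    return sum(x << i for i, x in enumerate(bits))


def ebvp_has_boolean_solution(n, k):
    """Exhaustively test whether sum_i x_i 2^(i-1) = -k has a 0/1 solution.

    The constraint x_i^2 = x_i is enforced by enumerating only 0/1 values.
    """
    return any(
        binary_value(bits) == -k
        for bits in product((0, 1), repeat=n)
    )
```

For any `k > 0` the search comes back empty, which is the sense in which eBVP is an unsatisfiable system whose refutations can be studied.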

arXiv:2210.13449 [pdf, other]

Controlled Text Reduction

Authors: Aviv Slobodkin, Paul Roit, Eran Hirsch, Ori Ernst, Ido Dagan

Abstract: Producing a reduced version of a source text, as in generic or focused summarization, inherently involves two distinct subtasks: deciding on targeted content and generating a coherent text conveying it. While some popular approaches address summarization as a single end-to-end task, prominent works support decomposed modeling for individual subtasks. Further, semi-automated text reduction is also very appealing, where users may identify targeted content while models would generate a corresponding coherent summary. In this paper, we focus on the second subtask, of generating coherent text given pre-selected content. Concretely, we formalize Controlled Text Reduction as a standalone task, whose input is a source text with marked spans of targeted content ("highlighting"). A model then needs to generate a coherent text that includes all and only the target information. We advocate the potential of such models, both for modular fully-automatic summarization, as well as for semi-automated human-in-the-loop use cases. Facilitating proper research, we crowdsource high-quality dev and test datasets for the task. Further, we automatically generate a larger "silver" training dataset from available summarization benchmarks, leveraging a pretrained summary-source alignment model. Finally, employing these datasets, we present a supervised baseline model, showing promising results and insightful analyses.

Submitted 24 October, 2022; originally announced October 2022.

Comments: Accepted to EMNLP 2022

arXiv:2205.11413 [pdf, other]

QASem Parsing: Text-to-text Modeling of QA-based Semantics

Authors: Ayal Klein, Eran Hirsch, Ron Eliav, Valentina Pyatkin, Avi Caciularu, Ido Dagan

Abstract: Several recent works have suggested to represent semantic relations with questions and answers, decomposing textual information into separate interrogative natural language statements. In this paper, we consider three QA-based semantic tasks - namely, QA-SRL, QANom and QADiscourse, each targeting a certain type of predication - and propose to regard them as jointly providing a comprehensive representation of textual information. To promote this goal, we investigate how to best utilize the power of sequence-to-sequence (seq2seq) pre-trained language models, within the unique setup of semi-structured outputs, consisting of an unordered set of question-answer pairs. We examine different input and output linearization strategies, and assess the effect of multitask learning and of simple data augmentation techniques in the setting of imbalanced training data. Consequently, we release the first unified QASem parsing tool, practical for downstream applications who can benefit from an explicit, QA-based account of information units in a text.

Submitted 14 February, 2023; v1 submitted 23 May, 2022; originally announced May 2022.

arXiv:2110.05156 [pdf, other]

Beyond Desktop Computation: Challenges in Scaling a GPU Infrastructure

Authors: Martin Uray, Eduard Hirsch, Gerold Katzinger, Michael Gadermayr

Abstract: Enterprises and labs performing computationally expensive data science applications sooner or later face the problem of scale but unconnected infrastructure. For this up-scaling process, an IT service provider can be hired or in-house personnel can attempt to implement a software stack. The first option can be quite expensive if it is just about connecting several machines. For the latter option often experience is missing with the data science staff in order to navigate through the software jungle. In this technical report, we illustrate the decision process towards an on-premises infrastructure, our implemented system architecture, and the transformation of the software stack towards a scaleable GPU cluster system.

Submitted 11 October, 2021; originally announced October 2021.

Comments: 6 pages, 2 figures, to be published in Proceedings of the 4th International Data Science Conference - iDSC2021

arXiv:2109.11621 [pdf, other]

iFacetSum: Coreference-based Interactive Faceted Summarization for Multi-Document Exploration

Authors: Eran Hirsch, Alon Eirew, Ori Shapira, Avi Caciularu, Arie Cattan, Ori Ernst, Ramakanth Pasunuru, Hadar Ronen, Mohit Bansal, Ido Dagan

Abstract: We introduce iFacetSum, a web application for exploring topical document sets. iFacetSum integrates interactive summarization together with faceted search, by providing a novel faceted navigation scheme that yields abstractive summaries for the user's selections. This approach offers both a comprehensive overview as well as concise details regarding subtopics of choice. Fine-grained facets are automatically produced based on cross-document coreference pipelines, rendering generic concepts, entities and statements surfacing in the source texts. We analyze the effectiveness of our application through small-scale user studies, which suggest the usefulness of our approach.

Submitted 23 September, 2021; originally announced September 2021.

Comments: Proceedings of EMNLP 2021, System Demonstrations. 7 pages and an appendix

arXiv:2101.05037 [pdf, other]

doi 10.1109/SDS.2019.8768645

Immutable and Democratic Data in permissionless Peer-to-Peer Systems

Authors: Maximilian Ernst Tschuchnig, Dejan Radovanovic, Eduard Hirsch, Anna-Maria Oberluggauer, Georg Schäfer

Abstract: Conventional data storage methods like SQL and NoSQL offer a huge amount of possibilities with one major disadvantage, having to use a centralized authority. This authority may be in the form of a centralized or decentralized master server or a permissioned peer-to-peer setting. This paper looks at different technologies on how to persist data without using a central authority, mainly looking at permissionless peer-to-peer networks, primarily Distributed Ledger Technologies (DLTs) and a combination of DLTs with conventional databases. Afterwards it is shown how a system like this might be implemented in two prototypes which are then evaluated against conventional databases.

Submitted 13 January, 2021; originally announced January 2021.

arXiv:2005.08772 [pdf, other]

Color Visual Illusions: A Statistics-based Computational Model

Authors: Elad Hirsch, Ayellet Tal

Abstract: Visual illusions may be explained by the likelihood of patches in real-world images, as argued by input-driven paradigms in Neuro-Science. However, neither the data nor the tools existed in the past to extensively support these explanations. The era of big data opens a new opportunity to study input-driven approaches. We introduce a tool that computes the likelihood of patches, given a large dataset to learn from. Given this tool, we present a model that supports the approach and explains lightness and color visual illusions in a unified manner. Furthermore, our model generates visual illusions in natural images, by applying the same tool, reversely.

Submitted 22 October, 2020; v1 submitted 18 May, 2020; originally announced May 2020.

arXiv:1911.06738 [pdf, ps, other]

Semi-Algebraic Proofs, IPS Lower Bounds and the $τ$-Conjecture: Can a Natural Number be Negative?

Authors: Yaroslav Alekseev, Dima Grigoriev, Edward A. Hirsch, Iddo Tzameret

Abstract: We introduce the binary value principle which is a simple subset-sum instance expressing that a natural number written in binary cannot be negative, relating it to central problems in proof and algebraic complexity. We prove conditional superpolynomial lower bounds on the Ideal Proof System (IPS) refutation size of this instance, based on a well-known hypothesis by Shub and Smale about the hardness of computing factorials, where IPS is the strong algebraic proof system introduced by Grochow and Pitassi (2018). Conversely, we show that short IPS refutations of this instance bridge the gap between sufficiently strong algebraic and semi-algebraic proof systems. Our results extend to full-fledged IPS the paradigm introduced in Forbes et al. (2016), whereby lower bounds against subsystems of IPS were obtained using restricted algebraic circuit lower bounds, and demonstrate that the binary value principle captures the advantage of semi-algebraic over algebraic reasoning, for sufficiently strong systems. Specifically, we show the following: (abstract continues in document.)

Submitted 15 November, 2019; originally announced November 2019.

arXiv:1810.01605 [pdf, ps, other]

doi 10.1007/s11192-018-2994-1

$\mathbf{h_α}$: An index to quantify an individual's scientific leadership

Authors: J. E. Hirsch

Abstract: The $α$ person is the dominant person in a group. We define the $α$-author of a paper as the author of the paper with the highest $h$-index among all the coauthors, and an $α$-paper of a scientist as a paper authored or coauthored by the scientist where he/she is the $α$-author. For most but not all papers in the literature there is only one $α$-author. We define the $h_α$ index of a scientist as the number of papers in the $h$-core of the scientist (i.e. the set of papers that contribute to the $h$-index of the scientist) where this scientist is the $α$-author. We also define the $h'_α$ index of a scientist as the number of $α$-papers of this scientist that have $\geq$ $h'_α$ citations. $h_α$ and $h'_α$ contain similar information, while $h'_α$ is conceptually more appealing it is harder to obtain from existing databases, hence of less current practical interest. We propose that the $h_α$ and/or $h'_α$ indices, or other variants discussed in the paper, are useful complements to the $h$-index of a scientist to quantify his/her scientific achievement, that rectify an inherent drawback of the $h$-index, its inability to distinguish between authors with different coauthorships patterns. A high $h$ index in conjunction with a high $h_α/h$ ratio is a hallmark of scientific leadership.

Submitted 10 December, 2018; v1 submitted 3 October, 2018; originally announced October 2018.

Comments: Some changes in response to referees' comments

Journal ref: Scientometrics Vol 118, p. 673-686 (2019)

arXiv:0908.2707 [pdf, ps, other]

On optimal heuristic randomized semidecision procedures, with application to proof complexity

Authors: Edward A. Hirsch, Dmitry Itsykson

Abstract: The existence of a (p-)optimal propositional proof system is a major open question in (proof) complexity; many people conjecture that such systems do not exist. Krajicek and Pudlak (1989) show that this question is equivalent to the existence of an algorithm that is optimal on all propositional tautologies. Monroe (2009) recently gave a conjecture implying that such algorithm does not exist. We show that in the presence of errors such optimal algorithms do exist. The concept is motivated by the notion of heuristic algorithms. Namely, we allow the algorithm to claim a small number of false "theorems" (according to any samplable distribution on non-tautologies) and err with bounded probability on other inputs. Our result can also be viewed as the existence of an optimal proof system in a class of proof systems obtained by generalizing automatizable proof systems.

Submitted 3 February, 2010; v1 submitted 19 August, 2009; originally announced August 2009.

Comments: 11 pages, accepted to STACS 2010

ACM Class: F.2

Showing 1–23 of 23 results for author: Hirsch, E