Search | arXiv e-print repository

CoTAR: Chain-of-Thought Attribution Reasoning with Multi-level Granularity

Authors: Moshe Berchansky, Daniel Fleischer, Moshe Wasserblat, Peter Izsak

Abstract: State-of-the-art performance in QA tasks is currently achieved by systems employing Large Language Models (LLMs), however these models tend to hallucinate information in their responses. One approach focuses on enhancing the generation process by incorporating attribution from the given input to the output. However, the challenge of identifying appropriate attributions and verifying their accuracy against a source is a complex task that requires significant improvements in assessing such systems. We introduce an attribution-oriented Chain-of-Thought reasoning method to enhance the accuracy of attributions. This approach focuses the reasoning process on generating an attribution-centric output. Evaluations on two context-enhanced question-answering datasets using GPT-4 demonstrate improved accuracy and correctness of attributions. In addition, the combination of our method with finetuning enhances the response and attribution accuracy of two smaller LLMs, showing their potential to outperform GPT-4 in some cases.

Submitted 16 April, 2024; originally announced April 2024.

arXiv:2306.11908

Accelerating Generalized Random Forests with Fixed-Point Trees

Authors: David Fleischer, David A. Stephens, Archer Yang

Abstract: Generalized random forests arXiv:1610.01271 build upon the well-established success of conventional forests (Breiman, 2001) to offer a flexible and powerful non-parametric method for estimating local solutions of heterogeneous estimating equations. Estimators are constructed by leveraging random forests as an adaptive kernel weighting algorithm and implemented through a gradient-based tree-growing procedure. By expressing this gradient-based approximation as being induced from a single Newton-Raphson root-finding iteration, and drawing upon the connection between estimating equations and fixed-point problems arXiv:2110.11074, we propose a new tree-growing rule for generalized random forests induced from a fixed-point iteration type of approximation, enabling gradient-free optimization, and yielding substantial time savings for tasks involving even modest dimensionality of the target quantity (e.g. multiple/multi-level treatment effects). We develop an asymptotic theory for estimators obtained from forests whose trees are grown through the fixed-point splitting rule, and provide numerical simulations demonstrating that the estimators obtained from such forests are comparable to those obtained from the more costly gradient-based rule.

Submitted 20 June, 2023; originally announced June 2023.

Comments: 22 pages, 5 figures

arXiv:1905.06638

Latent Universal Task-Specific BERT

Authors: Alon Rozental, Zohar Kelrich, Daniel Fleischer

Abstract: This paper describes a language representation model which combines the Bidirectional Encoder Representations from Transformers (BERT) learning mechanism described in Devlin et al. (2018) with a generalization of the Universal Transformer model described in Dehghani et al. (2018). We further improve this model by adding a latent variable that represents the persona and topics of interests of the writer for each training example. We also describe a simple method to improve the usefulness of our language representation for solving problems in a specific domain at the expense of its ability to generalize to other fields. Finally, we release a pre-trained language representation model for social texts that was trained on 100 million tweets.

Submitted 16 May, 2019; originally announced May 2019.

Comments: 6 pages, 2 figures

arXiv:1808.08782

doi 10.18653/v1/W18-6207

Amobee at IEST 2018: Transfer Learning from Language Models

Authors: Alon Rozental, Daniel Fleischer, Zohar Kelrich

Abstract: This paper describes the system developed at Amobee for the WASSA 2018 implicit emotions shared task (IEST). The goal of this task was to predict the emotion expressed by missing words in tweets without an explicit mention of those words. We developed an ensemble system consisting of language models together with LSTM-based networks containing a CNN attention mechanism. Our approach represents a novel use of language models (specifically trained on a large Twitter dataset) to predict and classify emotions. Our system reached 1st place with a macro $\text{F}_1$ score of 0.7145.

Submitted 23 October, 2018; v1 submitted 27 August, 2018; originally announced August 2018.

Comments: 7 pages, accepted to the 9th WASSA Workshop, part of the EMNLP 2018 Conference; added links to open-source material

arXiv:1804.04380

doi 10.18653/v1/S18-1033

Amobee at SemEval-2018 Task 1: GRU Neural Network with a CNN Attention Mechanism for Sentiment Classification

Authors: Alon Rozental, Daniel Fleischer

Abstract: This paper describes the participation of Amobee in the shared sentiment analysis task at SemEval 2018. We participated in all the English sub-tasks and the Spanish valence tasks. Our system consists of three parts: training task-specific word embeddings, training a model consisting of gated-recurrent-units (GRU) with a convolution neural network (CNN) attention mechanism and training stacking-based ensembles for each of the sub-tasks. Our algorithm reached 3rd and 1st places in the valence ordinal classification sub-tasks in English and Spanish, respectively.

Submitted 12 April, 2018; originally announced April 2018.

Comments: 8 pages, accepted to the 12th International Workshop on Semantic Evaluation 2018

arXiv:1705.01306

doi 10.18653/v1/S17-2108

Amobee at SemEval-2017 Task 4: Deep Learning System for Sentiment Detection on Twitter

Authors: Alon Rozental, Daniel Fleischer

Abstract: This paper describes the Amobee sentiment analysis system, adapted to compete in SemEval 2017 task 4. The system consists of two parts: a supervised training of RNN models based on a Twitter sentiment treebank, and the use of feedforward NN, Naive Bayes and logistic regression classifiers to produce predictions for the different sub-tasks. The algorithm reached the 3rd place on the 5-label classification task (sub-task C).

Submitted 3 May, 2017; originally announced May 2017.

Comments: 6 pages, accepted to the 11th International Workshop on Semantic Evaluation (SemEval-2017)

arXiv:1411.5475

doi 10.1007/JHEP02(2015)162

IR Dualities in General 3d Supersymmetric SU(N) QCD Theories

Authors: Ofer Aharony, Daniel Fleischer

Abstract: In the last twenty years, low-energy (IR) dualities have been found for many pairs of supersymmetric gauge theories with four supercharges, both in four space-time dimensions and in three space-time dimensions. In particular, duals have been found for 3d N=2 supersymmetric QCD theories with gauge group U(N), with F chiral multiplets in the fundamental representation, with F' chiral multiplets in the anti-fundamental representation, and with Chern-Simons level k, for all values of N, F, F' and k for which the theory preserves supersymmetry. For SU(N) theories the duals have been found in some cases, such as F=F' and F'=0, but not in the general case. In this note we find the IR dual for SU(N) SQCD theories with general values of N, F, F' and a non-zero k, which preserve supersymmetry.

Submitted 28 February, 2015; v1 submitted 20 November, 2014; originally announced November 2014.

Comments: 12 pages. Added reference and corrected typo; JHEP version

Report number: WIS/09/14-NOV-DPPA

Journal ref: JHEP 1502 (2015) 162

Showing 1–7 of 7 results for author: Fleischer, D