-
CoTAR: Chain-of-Thought Attribution Reasoning with Multi-level Granularity
Authors:
Moshe Berchansky,
Daniel Fleischer,
Moshe Wasserblat,
Peter Izsak
Abstract:
State-of-the-art performance in QA tasks is currently achieved by systems employing Large Language Models (LLMs); however, these models tend to hallucinate information in their responses. One approach focuses on enhancing the generation process by incorporating attribution from the given input to the output. However, the challenge of identifying appropriate attributions and verifying their accuracy against a source is a complex task that requires significant improvements in assessing such systems. We introduce an attribution-oriented Chain-of-Thought reasoning method to enhance the accuracy of attributions. This approach focuses the reasoning process on generating an attribution-centric output. Evaluations on two context-enhanced question-answering datasets using GPT-4 demonstrate improved accuracy and correctness of attributions. In addition, the combination of our method with finetuning enhances the response and attribution accuracy of two smaller LLMs, showing their potential to outperform GPT-4 in some cases.
Submitted 16 April, 2024;
originally announced April 2024.
-
Accelerating Generalized Random Forests with Fixed-Point Trees
Authors:
David Fleischer,
David A. Stephens,
Archer Yang
Abstract:
Generalized random forests (arXiv:1610.01271) build upon the well-established success of conventional forests (Breiman, 2001) to offer a flexible and powerful non-parametric method for estimating local solutions of heterogeneous estimating equations. Estimators are constructed by leveraging random forests as an adaptive kernel weighting algorithm and implemented through a gradient-based tree-growing procedure. By expressing this gradient-based approximation as being induced from a single Newton-Raphson root-finding iteration, and drawing upon the connection between estimating equations and fixed-point problems (arXiv:2110.11074), we propose a new tree-growing rule for generalized random forests induced from a fixed-point iteration type of approximation, enabling gradient-free optimization, and yielding substantial time savings for tasks involving even modest dimensionality of the target quantity (e.g. multiple/multi-level treatment effects). We develop an asymptotic theory for estimators obtained from forests whose trees are grown through the fixed-point splitting rule, and provide numerical simulations demonstrating that the estimators obtained from such forests are comparable to those obtained from the more costly gradient-based rule.
Submitted 20 June, 2023;
originally announced June 2023.
-
Latent Universal Task-Specific BERT
Authors:
Alon Rozental,
Zohar Kelrich,
Daniel Fleischer
Abstract:
This paper describes a language representation model which combines the Bidirectional Encoder Representations from Transformers (BERT) learning mechanism described in Devlin et al. (2018) with a generalization of the Universal Transformer model described in Dehghani et al. (2018). We further improve this model by adding a latent variable that represents the persona and topics of interest of the writer for each training example. We also describe a simple method to improve the usefulness of our language representation for solving problems in a specific domain at the expense of its ability to generalize to other fields. Finally, we release a pre-trained language representation model for social texts that was trained on 100 million tweets.
Submitted 16 May, 2019;
originally announced May 2019.
-
Amobee at IEST 2018: Transfer Learning from Language Models
Authors:
Alon Rozental,
Daniel Fleischer,
Zohar Kelrich
Abstract:
This paper describes the system developed at Amobee for the WASSA 2018 implicit emotions shared task (IEST). The goal of this task was to predict the emotion expressed by missing words in tweets without an explicit mention of those words. We developed an ensemble system consisting of language models together with LSTM-based networks containing a CNN attention mechanism. Our approach represents a novel use of language models (specifically trained on a large Twitter dataset) to predict and classify emotions. Our system reached 1st place with a macro $\text{F}_1$ score of 0.7145.
Submitted 23 October, 2018; v1 submitted 27 August, 2018;
originally announced August 2018.
-
Amobee at SemEval-2018 Task 1: GRU Neural Network with a CNN Attention Mechanism for Sentiment Classification
Authors:
Alon Rozental,
Daniel Fleischer
Abstract:
This paper describes the participation of Amobee in the shared sentiment analysis task at SemEval 2018. We participated in all the English sub-tasks and the Spanish valence tasks. Our system consists of three parts: training task-specific word embeddings, training a model consisting of gated recurrent units (GRU) with a convolutional neural network (CNN) attention mechanism, and training stacking-based ensembles for each of the sub-tasks. Our algorithm reached 3rd and 1st places in the valence ordinal classification sub-tasks in English and Spanish, respectively.
Submitted 12 April, 2018;
originally announced April 2018.
-
Amobee at SemEval-2017 Task 4: Deep Learning System for Sentiment Detection on Twitter
Authors:
Alon Rozental,
Daniel Fleischer
Abstract:
This paper describes the Amobee sentiment analysis system, adapted to compete in SemEval 2017 task 4. The system consists of two parts: supervised training of RNN models based on a Twitter sentiment treebank, and the use of feedforward NN, Naive Bayes and logistic regression classifiers to produce predictions for the different sub-tasks. The algorithm reached 3rd place on the 5-label classification task (sub-task C).
Submitted 3 May, 2017;
originally announced May 2017.
-
IR Dualities in General 3d Supersymmetric SU(N) QCD Theories
Authors:
Ofer Aharony,
Daniel Fleischer
Abstract:
In the last twenty years, low-energy (IR) dualities have been found for many pairs of supersymmetric gauge theories with four supercharges, both in four space-time dimensions and in three space-time dimensions. In particular, duals have been found for 3d N=2 supersymmetric QCD theories with gauge group U(N), with F chiral multiplets in the fundamental representation, with F' chiral multiplets in the anti-fundamental representation, and with Chern-Simons level k, for all values of N, F, F' and k for which the theory preserves supersymmetry. For SU(N) theories the duals have been found in some cases, such as F=F' and F'=0, but not in the general case. In this note we find the IR dual for SU(N) SQCD theories with general values of N, F, F' and a non-zero k, which preserve supersymmetry.
Submitted 28 February, 2015; v1 submitted 20 November, 2014;
originally announced November 2014.