Search | arXiv e-print repository

Exploring Category Structure with Contextual Language Models and Lexical Semantic Networks

Authors: Joseph Renner, Pascal Denis, Rémi Gilleron, Angèle Brunellière

Abstract: Recent work on predicting category structure with distributional models, using either static word embeddings (Heyman and Heyman, 2019) or contextualized language models (CLMs) (Misra et al., 2021), report low correlations with human ratings, thus calling into question their plausibility as models of human semantic memory. In this work, we revisit this question testing a wider array of methods for… ▽ More Recent work on predicting category structure with distributional models, using either static word embeddings (Heyman and Heyman, 2019) or contextualized language models (CLMs) (Misra et al., 2021), report low correlations with human ratings, thus calling into question their plausibility as models of human semantic memory. In this work, we revisit this question testing a wider array of methods for probing CLMs for predicting typicality scores. Our experiments, using BERT (Devlin et al., 2018), show the importance of using the right type of CLM probes, as our best BERT-based typicality prediction methods substantially improve over previous works. Second, our results highlight the importance of polysemy in this task: our best results are obtained when using a disambiguation mechanism. Finally, additional experiments reveal that Information Contentbased WordNet (Miller, 1995), also endowed with disambiguation, match the performance of the best BERT-based method, and in fact capture complementary information, which can be combined with BERT to achieve enhanced typicality predictions. △ Less

Submitted 14 February, 2023; originally announced February 2023.

Journal ref: Conference of the European Chapter of the Association for Computational Linguistics, May 2023, Dubrovnik, Croatia

arXiv:0807.2983 [pdf, ps, other]

On Probability Distributions for Trees: Representations, Inference and Learning

Authors: François Denis, Amaury Habrard, Rémi Gilleron, Marc Tommasi, Édouard Gilbert

Abstract: We study probability distributions over free algebras of trees. Probability distributions can be seen as particular (formal power) tree series [Berstel et al 82, Esik et al 03], i.e. map**s from trees to a semiring K . A widely studied class of tree series is the class of rational (or recognizable) tree series which can be defined either in an algebraic way or by means of multiplicity tree aut… ▽ More We study probability distributions over free algebras of trees. Probability distributions can be seen as particular (formal power) tree series [Berstel et al 82, Esik et al 03], i.e. map**s from trees to a semiring K . A widely studied class of tree series is the class of rational (or recognizable) tree series which can be defined either in an algebraic way or by means of multiplicity tree automata. We argue that the algebraic representation is very convenient to model probability distributions over a free algebra of trees. First, as in the string case, the algebraic representation allows to design learning algorithms for the whole class of probability distributions defined by rational tree series. Note that learning algorithms for rational tree series correspond to learning algorithms for weighted tree automata where both the structure and the weights are learned. Second, the algebraic representation can be easily extended to deal with unranked trees (like XML trees where a symbol may have an unbounded number of children). Both properties are particularly relevant for applications: nondeterministic automata are required for the inference problem to be relevant (recall that Hidden Markov Models are equivalent to nondeterministic string automata); nowadays applications for Web Information Extraction, Web Services and document processing consider unranked trees. △ Less

Submitted 18 July, 2008; originally announced July 2008.

Journal ref: Dans NIPS Workshop on Representations and Inference on Probability Distributions (2007)

arXiv:cs/0609015 [pdf, ps, other]

Residual Finite Tree Automata

Authors: J. Carme, R. Gilleron, A. Lemay, A. Terlutte, M. Tommasi

Abstract: Tree automata based algorithms are essential in many fields in computer science such as verification, specification, program analysis. They become also essential for databases with the development of standards such as XML. In this paper, we define new classes of non deterministic tree automata, namely residual finite tree automata (RFTA). In the bottom-up case, we obtain a new characterization o… ▽ More Tree automata based algorithms are essential in many fields in computer science such as verification, specification, program analysis. They become also essential for databases with the development of standards such as XML. In this paper, we define new classes of non deterministic tree automata, namely residual finite tree automata (RFTA). In the bottom-up case, we obtain a new characterization of regular tree languages. In the top-down case, we obtain a subclass of regular tree languages which contains the class of languages recognized by deterministic top-down tree automata. RFTA also come with the property of existence of canonical non deterministic tree automata. △ Less

Submitted 5 September, 2006; originally announced September 2006.

Showing 1–3 of 3 results for author: Gilleron, R