-
To Word Senses and Beyond: Inducing Concepts with Contextualized Language Models
Authors:
Bastien Liétard,
Pascal Denis,
Mikaela Keller
Abstract:
Polysemy and synonymy are two crucial interrelated facets of lexical ambiguity. While both phenomena have been studied extensively in NLP, leading to dedicated systems, they have often been considered independently. While many tasks dealing with polysemy (e.g. Word Sense Disambiguation or Induction) highlight the role of a word's senses, the study of synonymy is rooted in the study of concepts, i.e. meaning shared across the lexicon. In this paper, we introduce Concept Induction, the unsupervised task of learning a soft clustering among words that defines a set of concepts directly from data. This task generalizes that of Word Sense Induction. We propose a bi-level approach to Concept Induction that leverages both a local lemma-centric view and a global cross-lexicon perspective to induce concepts. We evaluate the obtained clustering on SemCor's annotated data and obtain good performance (BCubed F1 above 0.60). We find that the local and the global levels are mutually beneficial to induce concepts and also senses in our setting. Finally, we create static embeddings representing our induced concepts and use them on the Word-in-Context task, obtaining performance competitive with the state of the art.
Submitted 28 June, 2024;
originally announced June 2024.
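The BCubed F1 score cited in the abstract above can be illustrated with a minimal sketch. This is the standard hard-clustering form of BCubed; the abstract describes a soft clustering, so the paper presumably uses an extended variant that is not specified here. The function name `bcubed_f1` and the toy labels are assumptions made for illustration only.

```python
from collections import defaultdict


def bcubed_f1(predicted, gold):
    """BCubed F1 for hard clusterings.

    predicted, gold: dicts mapping each item to a single cluster label.
    (Illustrative helper; not taken from the paper.)
    """
    items = list(gold)
    pred_members = defaultdict(set)
    gold_members = defaultdict(set)
    for item in items:
        pred_members[predicted[item]].add(item)
        gold_members[gold[item]].add(item)

    precision = 0.0
    recall = 0.0
    for item in items:
        same_pred = pred_members[predicted[item]]
        same_gold = gold_members[gold[item]]
        overlap = len(same_pred & same_gold)  # always >= 1 (the item itself)
        precision += overlap / len(same_pred)
        recall += overlap / len(same_gold)
    precision /= len(items)
    recall /= len(items)
    return 2 * precision * recall / (precision + recall)


# Toy data: two gold concepts, but the system merges everything into one cluster.
gold = {"car": "vehicle", "automobile": "vehicle", "bank": "finance", "lender": "finance"}
merged = {word: "c0" for word in gold}
score_merged = bcubed_f1(merged, gold)
score_perfect = bcubed_f1(gold, gold)
```

Merging all items into one cluster keeps recall perfect but halves precision, which is why the score drops below that of the perfect clustering.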
-
Synthetic Data Generation for Intersectional Fairness by Leveraging Hierarchical Group Structure
Authors:
Gaurav Maheshwari,
Aurélien Bellet,
Pascal Denis,
Mikaela Keller
Abstract:
In this paper, we introduce a data augmentation approach specifically tailored to enhance intersectional fairness in classification tasks. Our method capitalizes on the hierarchical structure inherent to intersectionality, by viewing groups as intersections of their parent categories. This perspective allows us to augment data for smaller groups by learning a transformation function that combines data from these parent groups. Our empirical analysis, conducted on four diverse datasets including both text and images, reveals that classifiers trained with this data augmentation approach achieve superior intersectional fairness and are more robust to "leveling down" when compared to methods optimizing traditional group fairness metrics.
Submitted 23 May, 2024;
originally announced May 2024.
-
Design, fabrication and test of a 5 GHz klystron based on the kladistron principle
Authors:
Juliette Plouin,
Claude Marchand,
Pierrick Hamel,
Sergey Arsenyev,
Antoine Mollard,
Armel Beunas,
Philippe Denis,
Franck Peauger
Abstract:
A new bunching method, named "kladistron", has been developed at CEA in order to provide high-efficiency klystrons. A first "kladistron" prototype was designed and built. It was adapted from the 4.9 GHz TH2166 from Thales, where the interaction line was transformed from 6 to 16 cavities. The design and fabrication phases of this prototype are described in this paper. The kladistron prototype was tested at the Thales facility. Its measured efficiency (41 %) is lower than expected (55 %); moreover, it presents a spurious oscillation at 4.96 GHz. After analysis of the experimental results, it is concluded that the discrepancy between design and real frequencies is the cause of the low efficiency, while the spurious oscillation results from a high gain peak at 4.96 GHz.
Submitted 8 April, 2024;
originally announced April 2024.
-
A Tale of Two Laws of Semantic Change: Predicting Synonym Changes with Distributional Semantic Models
Authors:
Bastien Liétard,
Mikaela Keller,
Pascal Denis
Abstract:
Lexical Semantic Change is the study of how the meaning of words evolves through time. Another related question is whether and how lexical relations over pairs of words, such as synonymy, change over time. There are currently two competing, apparently opposite hypotheses in the historical linguistic literature regarding how synonymous words evolve: the Law of Differentiation (LD) argues that synonyms tend to take on different meanings over time, whereas the Law of Parallel Change (LPC) claims that synonyms tend to undergo the same semantic change and therefore remain synonyms. So far, there has been little research using distributional models to assess to what extent these laws apply to historical corpora. In this work, we take a first step toward detecting whether LD or LPC operates for given word pairs. After recasting the problem into a more tractable task, we combine two linguistic resources to propose the first complete evaluation framework on this problem and provide empirical evidence in favor of a dominance of LD. We then propose various computational approaches to the problem using Distributional Semantic Models and grounded in recent literature on Lexical Semantic Change detection. Our best approaches achieve a balanced accuracy above 0.6 on our dataset. We discuss challenges still faced by these approaches, such as polysemy or the potential confusion between synonymy and hypernymy.
Submitted 30 May, 2023;
originally announced May 2023.
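The balanced accuracy reported in the abstract above is the mean of the per-class recalls, which matters when one class dominates (the abstract reports a dominance of LD). A minimal sketch follows; the label counts are invented for illustration and are not the paper's data, and `balanced_accuracy` is a hypothetical helper name.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall over the classes present in y_true."""
    recalls = []
    for label in sorted(set(y_true)):
        indices = [i for i, y in enumerate(y_true) if y == label]
        correct = sum(1 for i in indices if y_pred[i] == label)
        recalls.append(correct / len(indices))
    return sum(recalls) / len(recalls)


# Invented toy labels: 8 pairs follow LD, 2 follow LPC.
y_true = ["LD"] * 8 + ["LPC"] * 2
# A trivial classifier that always predicts the majority class.
y_pred = ["LD"] * 10

plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
balanced = balanced_accuracy(y_true, y_pred)
```

On this toy data the majority-class predictor reaches a plain accuracy of 0.8 but a balanced accuracy of only 0.5, the chance level for two classes.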
-
Fair Without Leveling Down: A New Intersectional Fairness Definition
Authors:
Gaurav Maheshwari,
Aurélien Bellet,
Pascal Denis,
Mikaela Keller
Abstract:
In this work, we consider the problem of intersectional group fairness in the classification setting, where the objective is to learn discrimination-free models in the presence of several intersecting sensitive groups. First, we illustrate various shortcomings of existing fairness measures commonly used to capture intersectional fairness. Then, we propose a new definition called the $α$-Intersectional Fairness, which combines the absolute and the relative performance across sensitive groups and can be seen as a generalization of the notion of differential fairness. We highlight several desirable properties of the proposed definition and analyze its relation to other fairness measures. Finally, we benchmark multiple popular in-processing fair machine learning approaches using our new fairness definition and show that they do not achieve any improvement over a simple baseline. Our results reveal that the increase in fairness measured by previous definitions hides a "leveling down" effect, i.e., degrading the best performance over groups rather than improving the worst one.
Submitted 7 November, 2023; v1 submitted 21 May, 2023;
originally announced May 2023.
-
Exploring Category Structure with Contextual Language Models and Lexical Semantic Networks
Authors:
Joseph Renner,
Pascal Denis,
Rémi Gilleron,
Angèle Brunellière
Abstract:
Recent work on predicting category structure with distributional models, using either static word embeddings (Heyman and Heyman, 2019) or contextualized language models (CLMs) (Misra et al., 2021), reports low correlations with human ratings, thus calling into question their plausibility as models of human semantic memory. In this work, we revisit this question, testing a wider array of methods for probing CLMs for predicting typicality scores. Our experiments, using BERT (Devlin et al., 2018), show the importance of using the right type of CLM probes, as our best BERT-based typicality prediction methods substantially improve over previous works. Second, our results highlight the importance of polysemy in this task: our best results are obtained when using a disambiguation mechanism. Finally, additional experiments reveal that Information Content-based WordNet (Miller, 1995), also endowed with disambiguation, matches the performance of the best BERT-based method, and in fact captures complementary information, which can be combined with BERT to achieve enhanced typicality predictions.
Submitted 14 February, 2023;
originally announced February 2023.
-
Fair NLP Models with Differentially Private Text Encoders
Authors:
Gaurav Maheshwari,
Pascal Denis,
Mikaela Keller,
Aurélien Bellet
Abstract:
Encoded text representations often capture sensitive attributes about individuals (e.g., race or gender), which raises privacy concerns and can make downstream models unfair to certain groups. In this work, we propose FEDERATE, an approach that combines ideas from differential privacy and adversarial training to learn private text representations that also induce fairer models. We empirically evaluate the trade-off between the privacy of the representations and the fairness and accuracy of the downstream model on four NLP datasets. Our results show that FEDERATE consistently improves upon previous methods, and thus suggest that privacy and fairness can positively reinforce each other.
Submitted 12 May, 2022;
originally announced May 2022.
-
Electronic properties of substitutional impurities in graphene-like C$_2$N, $tg$-C$_3$N$_4$, and $hg$-C$_3$N$_4$
Authors:
Saif Ullah,
Pablo A. Denis,
Marcos G. Menezes,
Fernando Sato,
Rodrigo B. Capaz
Abstract:
We study the electronic and structural properties of substitutional impurities of graphene-like nanoporous materials C$_2$N, $tg$-, and $hg$-C$_3$N$_4$ by means of density functional theory calculations. We consider four types of impurities: boron substitution on carbon sites (B(C)), carbon substitution on nitrogen sites (C(N)), nitrogen substitution on carbon sites (N(C)), and sulfur substitution on nitrogen sites (S(N)). From cohesive energy calculations, we find that the C(N) and B(C) substitutions are the most energetically favorable and induce small bond modifications in the vicinity of the impurity, while the S(N) induces strong lattice distortions. Though all of the studied impurities induce defect levels inside the band gap of these materials, their electronic properties are poles apart depending on the behavior of the impurity as an acceptor or a donor. It is also observed that acceptor (donor) wavefunctions are composed only of $σ$ ($π$) orbitals from the impurity itself and/or neighboring sites. Consequently, acceptor wavefunctions are directed towards the pores and donor wavefunctions are more extended throughout the neighboring atoms, a property that could further be explored to modify the interaction between these materials and adsorbates. Moreover, impurity properties display a strong site sensitivity and ground state binding energies ranging from $0.03$ to $1.13$ eV, thus offering an interesting route for tuning the optical properties of these materials. Finally, spin-polarized calculations reveal that all impurity configurations have a magnetic ground state that arises from the spin splitting of the impurity levels. In a few configurations, more than one impurity level can be found inside the gap and two of them could potentially be explored as two-level systems for single-photon emission, following similar proposals recently made on defect complexes on TMDCs.
Submitted 19 October, 2020;
originally announced October 2020.
-
Probabilistic Inference Using Generators - The Statues Algorithm
Authors:
Pierre Denis
Abstract:
We present here a new probabilistic inference algorithm that gives exact results in the domain of discrete probability distributions. This algorithm, named the Statues algorithm, calculates the marginal probability distribution on probabilistic models defined as directed acyclic graphs. These models are made up of well-defined primitives that make it possible to express, in particular, joint probability distributions, Bayesian networks, discrete Markov chains, conditioning and probabilistic arithmetic. The Statues algorithm relies on a variable binding mechanism based on the generator construct, a special form of coroutine; being related to the enumeration algorithm, this new algorithm brings important improvements in terms of efficiency, which makes it valuable with regard to other exact marginalization algorithms. After introducing several definitions, primitives and compositional rules, we present the Statues algorithm in detail. Then, we briefly discuss the interest of this algorithm compared to others and we present possible extensions. Finally, we introduce Lea and MicroLea, two Python libraries implementing the Statues algorithm, along with several use cases. A proof of the correctness of the algorithm is provided in the appendix.
Submitted 2 August, 2018; v1 submitted 24 June, 2018;
originally announced June 2018.
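To make the idea of exact inference with generators concrete, here is a minimal sketch of plain enumeration over discrete distributions written with Python generators. It is not the Statues algorithm and does not use the Lea or MicroLea APIs; it only shows the baseline enumeration approach that the abstract says Statues is related to. All names (`die`, `joint`, `marginal`) are assumptions introduced for this example.

```python
from collections import defaultdict
from fractions import Fraction


def die(sides):
    """Yield (value, probability) pairs for a fair die."""
    for face in range(1, sides + 1):
        yield face, Fraction(1, sides)


def joint(*variables):
    """Enumerate joint outcomes of independent variables.

    Each variable is a zero-argument callable returning a fresh generator
    of (value, probability) pairs, so it can be re-enumerated as needed.
    """
    if not variables:
        yield (), Fraction(1)
        return
    first, rest = variables[0], variables[1:]
    for value, p in first():
        for values, q in joint(*rest):
            yield (value,) + values, p * q


def marginal(outcomes, func):
    """Exact distribution of func(*values) over the enumerated outcomes."""
    dist = defaultdict(Fraction)  # Fraction() is 0
    for values, p in outcomes:
        dist[func(*values)] += p
    return dict(dist)


def two_dice():
    return joint(lambda: die(6), lambda: die(6))


# Exact distribution of the sum of two fair six-sided dice.
total = marginal(two_dice(), lambda a, b: a + b)
```

Using `Fraction` keeps every probability exact, so `total[7]` is exactly 1/6 rather than a floating-point approximation. Plain enumeration visits every joint outcome, which is the cost the abstract says the Statues algorithm improves on.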
-
Learning Recursive Segments for Discourse Parsing
Authors:
Stergos Afantenos,
Pascal Denis,
Philippe Muller,
Laurence Danlos
Abstract:
Automatically detecting discourse segments is an important preliminary step towards full discourse parsing. Previous research on discourse segmentation has relied on the assumption that elementary discourse units (EDUs) in a document always form a linear sequence (i.e., they can never be nested). Unfortunately, this assumption turns out to be too strong, for some theories of discourse like SDRT allow for nested discourse units. In this paper, we present a simple approach to discourse segmentation that is able to produce nested EDUs. Our approach builds on standard multi-class classification techniques combined with a simple repairing heuristic that enforces global coherence. Our system was developed and evaluated on the first round of annotations provided by the French Annodis project (an ongoing effort to create a discourse bank for French). Cross-validated on only 47 documents (1,445 EDUs), our system achieves encouraging performance results with an F-score of 73% for finding EDUs.
Submitted 28 March, 2010;
originally announced March 2010.
-
Mechanical properties of Graphene Nanoribbons
Authors:
Ricardo Faccio,
Pablo A. Denis,
Helena Pardo,
Cecilia Goyenola,
Alvaro W. Mombru
Abstract:
Herein, we investigate the structural, electronic and mechanical properties of zigzag graphene nanoribbons under stress, applying Density Functional Theory within the GGA-PBE approximation. The uniaxial stress is applied along the periodic direction, allowing a unitary deformation in the range of ±0.02%. The mechanical properties show a linear response within that range, while a non-linear dependence is found for higher strain. The most relevant results indicate that Young's modulus is considerably higher than those determined for graphene and carbon nanotubes. The geometrical reconstruction of the C-C bonds at the edges hardens the nanostructure. Electronic structure features are not sensitive to strain in this linear elastic regime, which is an additional promise for the use of carbon nanostructures in nano-electronic devices in the near future.
Submitted 6 June, 2009; v1 submitted 9 May, 2009;
originally announced May 2009.
-
Induced magnetism by single carbon vacancies in a three-dimensional graphitic network: a supercell study
Authors:
Ricardo Faccio,
Helena Pardo,
Pablo A. Denis,
Rodrigo Yoshikawa Oeiras,
Fernando M. Araújo-Moreira,
Marcos Veríssimo-Alves,
Alvaro W. Mombrú
Abstract:
We present an ab initio DFT study of the magnetic moments that arise in graphite by creating single carbon vacancies in a 3-D graphite network, using full-potential, all-electron, spin-polarized electronic structure calculations. In previous reports the appearance of magnetic moments was explained in a 2-D graphene sheet just through the existence of the vacancies themselves [1-5]. The dependence of the arising magnetic moment on the nature and geometry of the vacancies for different supercells is reported. We found that the highest value of magnetic moment is obtained for a 3x3x1 supercell and that the highly diluted 5x5x1 supercell shows no magnetic ordering. The results obtained in this manuscript are indicative of the importance of interlayer interactions present in a 3-D stacking. We conclude that this should not be underestimated when vacancy-based studies on magnetism in graphitic systems are carried out.
Submitted 8 December, 2007;
originally announced December 2007.
-
Classes de cycles en cohomologie rigide
Authors:
Petrequin Denis
Abstract:
We define rigid homology. The trace morphism in rigid cohomology defines, by duality, the cycle class in rigid homology. We verify the compatibility of these classes with rational equivalence and intersection theory. We deduce some formal consequences such as the Riemann-Roch-Grothendieck theorem in rigid cohomology and the self-intersection formula.
Submitted 20 March, 2001;
originally announced March 2001.
-
Classes de Chern en cohomologie rigide
Authors:
Petrequin Denis
Abstract:
In this paper, I construct Chern classes in the rigid cohomology of P. Berthelot. We start by constructing Chern classes for proper varieties. To prove all the properties we have to reinterpret the construction in a crystalline way. Then we can deal with open varieties using a compactification theorem.
Submitted 19 March, 2001;
originally announced March 2001.