Showing 1–10 of 10 results for author: Cancedda, N

  1. arXiv:2406.16810

    cs.LG cs.AI cs.CL

    PISTOL: Dataset Compilation Pipeline for Structural Unlearning of LLMs

    Authors: Xinchi Qiu, William F. Shen, Yihong Chen, Nicola Cancedda, Pontus Stenetorp, Nicholas D. Lane

    Abstract: Recently, machine unlearning, which seeks to erase specific data stored in the pre-trained or fine-tuned models, has emerged as a crucial protective measure for LLMs. However, unlearning approaches for LLMs that have been considered thus far have focused on the removal of independent data points and have not taken into account that the stored facts are logically connected to one another and form a…

    Submitted 24 June, 2024; originally announced June 2024.

  2. arXiv:2404.05411

    cs.CL

    Know When To Stop: A Study of Semantic Drift in Text Generation

    Authors: Ava Spataru, Eric Hambro, Elena Voita, Nicola Cancedda

    Abstract: In this work, we explicitly show that modern LLMs tend to generate correct facts first, then "drift away" and generate incorrect facts later: this was occasionally observed but never properly measured. We develop a semantic drift score that measures the degree of separation between correct and incorrect facts in generated texts and confirm our hypothesis when generating Wikipedia-style biographies…

    Submitted 8 April, 2024; originally announced April 2024.

  3. arXiv:2402.09221

    cs.AI cs.CL

    Spectral Filters, Dark Signals, and Attention Sinks

    Authors: Nicola Cancedda

    Abstract: Projecting intermediate representations onto the vocabulary is an increasingly popular interpretation tool for transformer-based LLMs, also known as the logit lens. We propose a quantitative extension to this approach and define spectral filters on intermediate representations based on partitioning the singular vectors of the vocabulary embedding and unembedding matrices into bands. We find that t…

    Submitted 14 February, 2024; originally announced February 2024.

  4. arXiv:2306.08896

    cs.CL

    Multilingual End to End Entity Linking

    Authors: Mikhail Plekhanov, Nora Kassner, Kashyap Popat, Louis Martin, Simone Merello, Borislav Kozlovskii, Frédéric A. Dreyer, Nicola Cancedda

    Abstract: Entity Linking is one of the most common Natural Language Processing tasks in practical applications, but so far efficient end-to-end solutions with multilingual coverage have been lacking, leading to complex model stacks. To fill this gap, we release and open source BELA, the first fully end-to-end multilingual entity linking model that efficiently detects and links entities in texts in any of 97…

    Submitted 15 June, 2023; originally announced June 2023.

  5. arXiv:2305.12027

    cs.CL cs.AI

    Polar Ducks and Where to Find Them: Enhancing Entity Linking with Duck Typing and Polar Box Embeddings

    Authors: Mattia Atzeni, Mikhail Plekhanov, Frédéric A. Dreyer, Nora Kassner, Simone Merello, Louis Martin, Nicola Cancedda

    Abstract: Entity linking methods based on dense retrieval are an efficient and widely used solution in large-scale applications, but they fall short of the performance of generative models, as they are sensitive to the structure of the embedding space. In order to address this issue, this paper introduces DUCK, an approach to infusing structural information in the space of entity representations, using prio…

    Submitted 20 October, 2023; v1 submitted 19 May, 2023; originally announced May 2023.

    Comments: Accepted at EMNLP 2023

  6. arXiv:2302.04761

    cs.CL

    Toolformer: Language Models Can Teach Themselves to Use Tools

    Authors: Timo Schick, Jane Dwivedi-Yu, Roberto Dessì, Roberta Raileanu, Maria Lomeli, Luke Zettlemoyer, Nicola Cancedda, Thomas Scialom

    Abstract: Language models (LMs) exhibit remarkable abilities to solve new tasks from just a few examples or textual instructions, especially at scale. They also, paradoxically, struggle with basic functionality, such as arithmetic or factual lookup, where much simpler and smaller models excel. In this paper, we show that LMs can teach themselves to use external tools via simple APIs and achieve the best of…

    Submitted 9 February, 2023; originally announced February 2023.

  7. arXiv:2205.12570

    cs.CL

    EDIN: An End-to-end Benchmark and Pipeline for Unknown Entity Discovery and Indexing

    Authors: Nora Kassner, Fabio Petroni, Mikhail Plekhanov, Sebastian Riedel, Nicola Cancedda

    Abstract: Existing work on Entity Linking mostly assumes that the reference knowledge base is complete, and therefore all mentions can be linked. In practice this is hardly ever the case, as knowledge bases are incomplete and because novel concepts arise constantly. This paper created the Unknown Entity Discovery and Indexing (EDIN) benchmark where unknown entities, that is entities without a description in…

    Submitted 25 May, 2022; originally announced May 2022.

  8. arXiv:2103.12528

    cs.CL cs.AI stat.ML

    Multilingual Autoregressive Entity Linking

    Authors: Nicola De Cao, Ledell Wu, Kashyap Popat, Mikel Artetxe, Naman Goyal, Mikhail Plekhanov, Luke Zettlemoyer, Nicola Cancedda, Sebastian Riedel, Fabio Petroni

    Abstract: We present mGENRE, a sequence-to-sequence system for the Multilingual Entity Linking (MEL) problem -- the task of resolving language-specific mentions to a multilingual Knowledge Base (KB). For a mention in a given language, mGENRE predicts the name of the target entity left-to-right, token-by-token in an autoregressive fashion. The autoregressive formulation allows us to effectively cross-encode…

    Submitted 23 March, 2021; originally announced March 2021.

    Comments: 20 pages, 8 figures, and 11 tables

  9. Reply With: Proactive Recommendation of Email Attachments

    Authors: Christophe Van Gysel, Bhaskar Mitra, Matteo Venanzi, Roy Rosemarin, Grzegorz Kukla, Piotr Grudzien, Nicola Cancedda

    Abstract: Email responses often contain items, such as a file or a hyperlink to an external document, that are attached to or included inline in the body of the message. Analysis of an enterprise email corpus reveals that 35% of the time when users include these items as part of their response, the attachable item is already present in their inbox or sent folder. A modern email client can proactively retrieve…

    Submitted 16 November, 2017; v1 submitted 16 October, 2017; originally announced October 2017.

    Comments: CIKM2017. Proceedings of the 26th ACM International Conference on Information and Knowledge Management. 2017

  10. arXiv:cs/0107017

    cs.CL

    Learning Computational Grammars

    Authors: John Nerbonne, Anja Belz, Nicola Cancedda, Herve Dejean, James Hammerton, Rob Koeling, Stasinos Konstantopoulos, Miles Osborne, Franck Thollard, Erik F. Tjong Kim Sang

    Abstract: This paper reports on the "Learning Computational Grammars" (LCG) project, a postdoc network devoted to studying the application of machine learning techniques to grammars suitable for computational use. We were interested in a more systematic survey to understand the relevance of many factors to the success of learning, esp. the availability of annotated data, the kind of dependencies in the da…

    Submitted 15 July, 2001; originally announced July 2001.

    ACM Class: I.2.7

    Journal ref: In: Walter Daelemans and Remi Zajac (eds.), Proceedings of CoNLL-2001, Toulouse, France, 2001, pp. 97-104