Showing 1–6 of 6 results for author: Nicosia, M

  1. arXiv:2403.05530  [pdf, other]

    cs.CL cs.AI

    Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

    Authors: Gemini Team, Petko Georgiev, Ving Ian Lei, Ryan Burnell, Libin Bai, Anmol Gulati, Garrett Tanzer, Damien Vincent, Zhufeng Pan, Shibo Wang, Soroosh Mariooryad, Yifan Ding, Xinyang Geng, Fred Alcober, Roy Frostig, Mark Omernick, Lexi Walker, Cosmin Paduraru, Christina Sorokin, Andrea Tacchetti, Colin Gaffney, Samira Daruki, Olcan Sercinoglu, Zach Gleicher, Juliette Love, et al. (1092 additional authors not shown)

    Abstract: In this report, we introduce the Gemini 1.5 family of models, representing the next generation of highly compute-efficient multimodal models capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio. The family includes two new models: (1) an updated Gemini 1.5 Pro, which exceeds the February…

    Submitted 14 June, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

  2. arXiv:2305.14224  [pdf, other]

    cs.CL

    mmT5: Modular Multilingual Pre-Training Solves Source Language Hallucinations

    Authors: Jonas Pfeiffer, Francesco Piccinno, Massimo Nicosia, Xinyi Wang, Machel Reid, Sebastian Ruder

    Abstract: Multilingual sequence-to-sequence models perform poorly with increased language coverage and fail to consistently generate text in the correct target language in few-shot settings. To address these challenges, we propose mmT5, a modular multilingual sequence-to-sequence model. mmT5 utilizes language-specific modules during pre-training, which disentangle language-specific information from language…

    Submitted 23 May, 2023; originally announced May 2023.

  3. XTREME-UP: A User-Centric Scarce-Data Benchmark for Under-Represented Languages

    Authors: Sebastian Ruder, Jonathan H. Clark, Alexander Gutkin, Mihir Kale, Min Ma, Massimo Nicosia, Shruti Rijhwani, Parker Riley, Jean-Michel A. Sarr, Xinyi Wang, John Wieting, Nitish Gupta, Anna Katanova, Christo Kirov, Dana L. Dickinson, Brian Roark, Bidisha Samanta, Connie Tao, David I. Adelani, Vera Axelrod, Isaac Caswell, Colin Cherry, Dan Garrette, Reeve Ingle, Melvin Johnson, et al. (2 additional authors not shown)

    Abstract: Data scarcity is a crucial issue for the development of highly multilingual NLP systems. Yet for many under-represented languages (ULs) -- languages for which NLP research is particularly far behind in meeting user needs -- it is feasible to annotate small amounts of data. Motivated by this, we propose XTREME-UP, a benchmark defined by: its focus on the scarce-data scenario rather than zero-shot;…

    Submitted 24 May, 2023; v1 submitted 19 May, 2023; originally announced May 2023.

  4. arXiv:2212.07223  [pdf, other]

    cs.CL

    Evaluating Byte and Wordpiece Level Models for Massively Multilingual Semantic Parsing

    Authors: Massimo Nicosia, Francesco Piccinno

    Abstract: Token free approaches have been successfully applied to a series of word and span level tasks. In this work, we compare a byte-level (ByT5) and a wordpiece based (mT5) sequence to sequence model on the 51 languages of the MASSIVE multilingual semantic parsing dataset. We examine multiple experimental settings: (i) zero-shot, (ii) full gold data and (iii) zero-shot with synthetic data. By leveragin…

    Submitted 14 December, 2022; originally announced December 2022.

    Comments: Massively Multilingual NLU 2022 Workshop Paper @ EMNLP 2022 - Winning approach of the MMNLU-22 Zero-Shot Challenge

  5. arXiv:2109.04319  [pdf, other]

    cs.CL cs.AI cs.LG

    Translate & Fill: Improving Zero-Shot Multilingual Semantic Parsing with Synthetic Data

    Authors: Massimo Nicosia, Zhongdi Qu, Yasemin Altun

    Abstract: While multilingual pretrained language models (LMs) fine-tuned on a single language have shown substantial cross-lingual task transfer capabilities, there is still a wide performance gap in semantic parsing tasks when target language supervision is available. In this paper, we propose a novel Translate-and-Fill (TaF) method to produce silver training data for a multilingual semantic parser. This m…

    Submitted 9 September, 2021; originally announced September 2021.

    Comments: Accepted to EMNLP 2021 (Findings)

  6. arXiv:1908.11787  [pdf, other]

    cs.CL

    Answering Conversational Questions on Structured Data without Logical Forms

    Authors: Thomas Müller, Francesco Piccinno, Massimo Nicosia, Peter Shaw, Yasemin Altun

    Abstract: We present a novel approach to answering sequential questions based on structured objects such as knowledge bases or tables without using a logical form as an intermediate representation. We encode tables as graphs using a graph neural network model based on the Transformer architecture. The answers are then selected from the encoded graph using a pointer network. This model is appropriate for pro…

    Submitted 30 August, 2019; originally announced August 2019.

    Comments: EMNLP 2019