-
MiTTenS: A Dataset for Evaluating Misgendering in Translation
Authors:
Kevin Robinson,
Sneha Kudugunta,
Romina Stella,
Sunipa Dev,
Jasmijn Bastings
Abstract:
Misgendering is the act of referring to someone in a way that does not reflect their gender identity. Translation systems, including foundation models capable of translation, can produce errors that result in misgendering harms. To measure the extent of such potential harms when translating into and out of English, we introduce a dataset, MiTTenS, covering 26 languages from a variety of language families and scripts, including several traditionally underrepresented in digital resources. The dataset is constructed with handcrafted passages that target known failure patterns, longer synthetically generated passages, and natural passages sourced from multiple domains. We demonstrate the usefulness of the dataset by evaluating both dedicated neural machine translation systems and foundation models, and show that all systems exhibit errors resulting in misgendering harms, even in high resource languages.
Submitted 12 January, 2024;
originally announced January 2024.
-
Gemini: A Family of Highly Capable Multimodal Models
Authors:
Gemini Team,
Rohan Anil,
Sebastian Borgeaud,
Jean-Baptiste Alayrac,
Jiahui Yu,
Radu Soricut,
Johan Schalkwyk,
Andrew M. Dai,
Anja Hauth,
Katie Millican,
David Silver,
Melvin Johnson,
Ioannis Antonoglou,
Julian Schrittwieser,
Amelia Glaese,
Jilin Chen,
Emily Pitler,
Timothy Lillicrap,
Angeliki Lazaridou,
Orhan Firat,
James Molloy,
Michael Isard,
Paul R. Barham,
Tom Hennigan,
Benjamin Lee,
et al. (1325 additional authors not shown)
Abstract:
This report introduces a new family of multimodal models, Gemini, that exhibit remarkable capabilities across image, audio, video, and text understanding. The Gemini family consists of Ultra, Pro, and Nano sizes, suitable for applications ranging from complex reasoning tasks to on-device memory-constrained use-cases. Evaluation on a broad range of benchmarks shows that our most-capable Gemini Ultra model advances the state of the art in 30 of 32 of these benchmarks - notably being the first model to achieve human-expert performance on the well-studied exam benchmark MMLU, and improving the state of the art in every one of the 20 multimodal benchmarks we examined. We believe that the new capabilities of the Gemini family in cross-modal reasoning and language understanding will enable a wide variety of use cases. We discuss our approach toward post-training and deploying Gemini models responsibly to users through services including Gemini, Gemini Advanced, Google AI Studio, and Cloud Vertex AI.
Submitted 17 June, 2024; v1 submitted 18 December, 2023;
originally announced December 2023.
-
MADLAD-400: A Multilingual And Document-Level Large Audited Dataset
Authors:
Sneha Kudugunta,
Isaac Caswell,
Biao Zhang,
Xavier Garcia,
Christopher A. Choquette-Choo,
Katherine Lee,
Derrick Xin,
Aditya Kusupati,
Romi Stella,
Ankur Bapna,
Orhan Firat
Abstract:
We introduce MADLAD-400, a manually audited, general domain 3T token monolingual dataset based on CommonCrawl, spanning 419 languages. We discuss the limitations revealed by self-auditing MADLAD-400, and the role data auditing had in the dataset creation process. We then train and release a 10.7B-parameter multilingual machine translation model on 250 billion tokens covering over 450 languages using publicly available data, find that it is competitive with models that are significantly larger, and report the results on different domains. In addition, we train an 8B-parameter language model and assess the results on few-shot translation. We make the baseline models available to the research community.
Submitted 8 September, 2023;
originally announced September 2023.
-
The Dead-Alive Physicist experiment: a case-study disproving the hypothesis that consciousness causes the wave-function collapse in the quantum measurement process
Authors:
Carlo Roselli,
Bruno Raffaele Stella
Abstract:
This paper aims to falsify the hypothesis that the observer's consciousness is necessary for quantum measurement. To this end, we propose a variation of the Schroedinger's cat thought experiment called "DAP", short for "Dead-Alive Physicist", in which a human being replaces the cat. This strategy enables us to logically disprove the consistency of the above hypothesis and to oblige its supporters either to be trapped in solipsism or to rely on an alternative interpretation of quantum mechanics in which the role of the conscious observer has to be reviewed. Our analysis hence helps to clarify the relationship between the observer and the objects of her/his experimental observation; this and a few other implications are discussed in the fourth section and in the conclusions.
Submitted 11 June, 2020;
originally announced June 2020.
-
Y(9.46 GeV) and the gluon discovery (a critical recollection of PLUTO results)
Authors:
Bruno R. Stella,
Hans-Jürgen Meyer
Abstract:
The hadronic decays of the Y(9.46 GeV) were first studied by the PLUTO experiment at the DORIS e+e- storage ring (DESY). To determine the contribution of PLUTO to the discovery of the gluon, we, as members of the collaboration, have reconsidered all the material it produced in 1978 and the first half of 1979. It clearly emerges that the experiment demonstrated the main decay of the Y resonance to be mediated by 3 gluons hadronizing into 3 jets. Jettiness was evident from the <P_T> with respect to the thrust axis, which was the same as that observed by PLUTO itself at nearby continuum c.m.s. energies for 2-quark jet events. In contrast, the average sphericity <S>, further topological variables, and the momentum distribution showed a clear difference from the same data, results compatible with jettiness only in the case of more than 2 jets. Flatness as a consequence of a 3-body decay (and therefore 3 jets) was indicated by the low <P_out>, altogether a model-independent result. The charged multiplicity was observed to be larger than in the continuum and than in the case of MC 3-gluon jets fragmenting like quarks, as expected for gluon jets. In June 1979 PLUTO measured the matrix element of the 3-gluon decay to be quantitatively in accordance with QCD (even after hadronization, which does not obscure the perturbative predictions) and demonstrated the spin 1 nature of the gluon by excluding spin 0 and spin 1/2. Gluon hadronization like that of a quark jet, as in the 3-gluon jet MC, was compatible with the topological data and multiplicity; this was the first experimental study of (identified) gluon jets. The PLUTO results were confirmed both by other experiments at DORIS and later by more sophisticated detectors. At the higher energies of PETRA, the existence of spin 1 gluons was confirmed by PLUTO and by 3 other experiments through measurements of gluon radiation: soft gluons via jet broadening, and hard gluons via the emission of (now clearly visible) gluon jets by quarks.
Submitted 18 November, 2011; v1 submitted 11 August, 2010;
originally announced August 2010.