MiTTenS: A Dataset for Evaluating Misgendering in Translation

Authors: Kevin Robinson, Sneha Kudugunta, Romina Stella, Sunipa Dev, Jasmijn Bastings

Abstract: Misgendering is the act of referring to someone in a way that does not reflect their gender identity. Translation systems, including foundation models capable of translation, can produce errors that result in misgendering harms. To measure the extent of such potential harms when translating into and out of English, we introduce a dataset, MiTTenS, covering 26 languages from a variety of language families and scripts, including several traditionally underrepresented in digital resources. The dataset is constructed with handcrafted passages that target known failure patterns, longer synthetically generated passages, and natural passages sourced from multiple domains. We demonstrate the usefulness of the dataset by evaluating both dedicated neural machine translation systems and foundation models, and show that all systems exhibit errors resulting in misgendering harms, even in high resource languages.

Submitted 12 January, 2024; originally announced January 2024.

Comments: GitHub repository https://github.com/google-research-datasets/mittens

arXiv:2312.11805

Gemini: A Family of Highly Capable Multimodal Models

Authors: Gemini Team, Rohan Anil, Sebastian Borgeaud, Jean-Baptiste Alayrac, Jiahui Yu, Radu Soricut, Johan Schalkwyk, Andrew M. Dai, Anja Hauth, Katie Millican, David Silver, Melvin Johnson, Ioannis Antonoglou, Julian Schrittwieser, Amelia Glaese, Jilin Chen, Emily Pitler, Timothy Lillicrap, Angeliki Lazaridou, Orhan Firat, James Molloy, Michael Isard, Paul R. Barham, Tom Hennigan, Benjamin Lee, et al. (1325 additional authors not shown)

Abstract: This report introduces a new family of multimodal models, Gemini, that exhibit remarkable capabilities across image, audio, video, and text understanding. The Gemini family consists of Ultra, Pro, and Nano sizes, suitable for applications ranging from complex reasoning tasks to on-device memory-constrained use-cases. Evaluation on a broad range of benchmarks shows that our most-capable Gemini Ultra model advances the state of the art in 30 of 32 of these benchmarks - notably being the first model to achieve human-expert performance on the well-studied exam benchmark MMLU, and improving the state of the art in every one of the 20 multimodal benchmarks we examined. We believe that the new capabilities of the Gemini family in cross-modal reasoning and language understanding will enable a wide variety of use cases. We discuss our approach toward post-training and deploying Gemini models responsibly to users through services including Gemini, Gemini Advanced, Google AI Studio, and Cloud Vertex AI.

Submitted 17 June, 2024; v1 submitted 18 December, 2023; originally announced December 2023.

arXiv:2309.04662

MADLAD-400: A Multilingual And Document-Level Large Audited Dataset

Authors: Sneha Kudugunta, Isaac Caswell, Biao Zhang, Xavier Garcia, Christopher A. Choquette-Choo, Katherine Lee, Derrick Xin, Aditya Kusupati, Romi Stella, Ankur Bapna, Orhan Firat

Abstract: We introduce MADLAD-400, a manually audited, general domain 3T token monolingual dataset based on CommonCrawl, spanning 419 languages. We discuss the limitations revealed by self-auditing MADLAD-400, and the role data auditing had in the dataset creation process. We then train and release a 10.7B-parameter multilingual machine translation model on 250 billion tokens covering over 450 languages using publicly available data, and find that it is competitive with models that are significantly larger, and report the results on different domains. In addition, we train an 8B-parameter language model, and assess the results on few-shot translation. We make the baseline models available to the research community.

Submitted 8 September, 2023; originally announced September 2023.

Comments: Preprint

arXiv:2006.06368

doi 10.1007/s10701-021-00427-y

The Dead-Alive Physicist experiment: a case-study disproving the hypothesis that consciousness causes the wave-function collapse in the quantum measurement process

Authors: Carlo Roselli, Bruno Raffaele Stella

Abstract: This paper aims to falsify the hypothesis that the observer's consciousness is necessary for quantum measurement. To achieve our target, we propose a variation of the Schroedinger's cat thought experiment called "DAP", short for "Dead-Alive Physicist", in which a human being replaces the cat. This strategy enables us to logically disprove the consistency of the above hypothesis and to oblige its supporters either to be trapped in solipsism or to rely on an alternative interpretation of quantum mechanics in which the role of the conscious observer has to be reviewed. Our analysis hence helps to clarify the relationship between the observer and the objects of her/his experimental observation; this and a few other implications are discussed in the fourth section and in the conclusions.

Submitted 11 June, 2020; originally announced June 2020.

Comments: 9 pages and 2 figures

arXiv:1008.1869

doi 10.1140/epjh/e2011-10029-3

Y(9.46 GeV) and the gluon discovery (a critical recollection of PLUTO results)

Authors: Bruno R. Stella, Hans-Jürgen Meyer

Abstract: The hadronic decays of Y(9.46GeV) were first studied by the PLUTO experiment at the DORIS e+e- storage ring (DESY). To determine the contribution of PLUTO to the discovery of the gluon, as members of the collaboration, we have reconsidered all the material produced by it in 1978 and the first half of 1979. It emerges clearly that the experiment demonstrated the main decay of the Y resonance to be mediated by 3 gluons hadronizing into 3 jets. Jettiness was evident from the <P_T> with respect to the thrust axis, which matched that observed by PLUTO itself at nearby continuum c.m.s. energies for 2-quark jet events. By contrast, the average sphericity <S>, further topological variables, and the momentum distribution showed a clear difference from the same data, results compatible with jettiness only in the case of more than 2 jets. Flatness as a consequence of a 3-body decay (therefore 3 jets) was indicated by the low <P_out>, altogether a result independent of models. The charged multiplicity was observed to be larger than in the continuum and than in the case of MC 3-gluon jets fragmenting like quarks, as expected for gluon jets. In June 1979 PLUTO measured the matrix element of the 3-gluon decay to be in quantitative agreement with QCD (even after hadronization, which does not obscure the perturbative predictions) and demonstrated the spin 1 nature of the gluon by excluding spin 0 and spin 1/2. The hadronization of a gluon like a quark jet, as in the 3-gluon jet MC, was compatible with topological data and multiplicity; this was the first experimental study of (identified) gluon jets.
The PLUTO results were confirmed both by other experiments at DORIS and later by more sophisticated detectors. At higher energies at PETRA, the existence of gluons of spin 1 was confirmed by PLUTO and by 3 more experiments by measuring the gluon radiation: soft gluons by jet broadening, hard gluons by the emission of (now clearly visible) gluon jets by quarks.

Submitted 18 November, 2011; v1 submitted 11 August, 2010; originally announced August 2010.

Comments: 41 pages, 18 figures, substantially revised version of DESY report 10-130, published 21 Oct. 2011 in "The European Physical Journal H (Perspectives on Contemporary Physics)". The final published version is available at http://www.epj.org. The final author's corrected version is available at arXiv:1008.1869v3 [hep-ex], 18 Nov. 2011

Report number: DESY 10-130

Journal ref: Eur. Phys. J. H. 3, 203-243 (2011)

Showing 1–5 of 5 results for author: Stella, R