
Showing 1–17 of 17 results for author: Ramasesh, V

  1. arXiv:2403.05530  [pdf, other]

    cs.CL cs.AI

    Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

    Authors: Gemini Team, Petko Georgiev, Ving Ian Lei, Ryan Burnell, Libin Bai, Anmol Gulati, Garrett Tanzer, Damien Vincent, Zhufeng Pan, Shibo Wang, Soroosh Mariooryad, Yifan Ding, Xinyang Geng, Fred Alcober, Roy Frostig, Mark Omernick, Lexi Walker, Cosmin Paduraru, Christina Sorokin, Andrea Tacchetti, Colin Gaffney, Samira Daruki, Olcan Sercinoglu, Zach Gleicher, Juliette Love, et al. (1092 additional authors not shown)

    Abstract: In this report, we introduce the Gemini 1.5 family of models, representing the next generation of highly compute-efficient multimodal models capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio. The family includes two new models: (1) an updated Gemini 1.5 Pro, which exceeds the February…

    Submitted 14 June, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

  2. arXiv:2312.11805  [pdf, other]

    cs.CL cs.AI cs.CV

    Gemini: A Family of Highly Capable Multimodal Models

    Authors: Gemini Team, Rohan Anil, Sebastian Borgeaud, Jean-Baptiste Alayrac, Jiahui Yu, Radu Soricut, Johan Schalkwyk, Andrew M. Dai, Anja Hauth, Katie Millican, David Silver, Melvin Johnson, Ioannis Antonoglou, Julian Schrittwieser, Amelia Glaese, Jilin Chen, Emily Pitler, Timothy Lillicrap, Angeliki Lazaridou, Orhan Firat, James Molloy, Michael Isard, Paul R. Barham, Tom Hennigan, Benjamin Lee, et al. (1325 additional authors not shown)

    Abstract: This report introduces a new family of multimodal models, Gemini, that exhibit remarkable capabilities across image, audio, video, and text understanding. The Gemini family consists of Ultra, Pro, and Nano sizes, suitable for applications ranging from complex reasoning tasks to on-device memory-constrained use-cases. Evaluation on a broad range of benchmarks shows that our most-capable Gemini Ultr…

    Submitted 17 June, 2024; v1 submitted 18 December, 2023; originally announced December 2023.

  3. arXiv:2207.04901  [pdf, other]

    cs.CL cs.LG

    Exploring Length Generalization in Large Language Models

    Authors: Cem Anil, Yuhuai Wu, Anders Andreassen, Aitor Lewkowycz, Vedant Misra, Vinay Ramasesh, Ambrose Slone, Guy Gur-Ari, Ethan Dyer, Behnam Neyshabur

    Abstract: The ability to extrapolate from short problem instances to longer ones is an important form of out-of-distribution generalization in reasoning tasks, and is crucial when learning from datasets where longer problem instances are rare. These include theorem proving, solving quantitative mathematics problems, and reading/summarizing novels. In this paper, we run careful empirical studies exploring th…

    Submitted 14 November, 2022; v1 submitted 11 July, 2022; originally announced July 2022.

  4. arXiv:2206.14858  [pdf, other]

    cs.CL cs.AI cs.LG

    Solving Quantitative Reasoning Problems with Language Models

    Authors: Aitor Lewkowycz, Anders Andreassen, David Dohan, Ethan Dyer, Henryk Michalewski, Vinay Ramasesh, Ambrose Slone, Cem Anil, Imanol Schlag, Theo Gutman-Solo, Yuhuai Wu, Behnam Neyshabur, Guy Gur-Ari, Vedant Misra

    Abstract: Language models have achieved remarkable performance on a wide range of tasks that require natural language understanding. Nevertheless, state-of-the-art models have generally struggled with tasks that require quantitative reasoning, such as solving mathematics, science, and engineering problems at the college level. To help close this gap, we introduce Minerva, a large language model pretrained o…

    Submitted 30 June, 2022; v1 submitted 29 June, 2022; originally announced June 2022.

    Comments: 12 pages, 5 figures + references and appendices

  5. arXiv:2206.04615  [pdf, other]

    cs.CL cs.AI cs.CY cs.LG stat.ML

    Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models

    Authors: Aarohi Srivastava, Abhinav Rastogi, Abhishek Rao, Abu Awal Md Shoeb, Abubakar Abid, Adam Fisch, Adam R. Brown, Adam Santoro, Aditya Gupta, Adrià Garriga-Alonso, Agnieszka Kluska, Aitor Lewkowycz, Akshat Agarwal, Alethea Power, Alex Ray, Alex Warstadt, Alexander W. Kocurek, Ali Safaya, Ali Tazarv, Alice Xiang, Alicia Parrish, Allen Nie, Aman Hussain, Amanda Askell, Amanda Dsouza, et al. (426 additional authors not shown)

    Abstract: Language models demonstrate both quantitative improvement and new qualitative capabilities with increasing scale. Despite their potentially transformative impact, these new capabilities are as yet poorly characterized. In order to inform future research, prepare for disruptive new model capabilities, and ameliorate socially harmful effects, it is vital that we understand the present and near-futur…

    Submitted 12 June, 2023; v1 submitted 9 June, 2022; originally announced June 2022.

    Comments: 27 pages, 17 figures + references and appendices, repo: https://github.com/google/BIG-bench

    Journal ref: Transactions on Machine Learning Research, May/2022, https://openreview.net/forum?id=uyTL5Bvosj

  6. arXiv:2110.15253  [pdf, other]

    cs.LG stat.ML

    Understanding How Encoder-Decoder Architectures Attend

    Authors: Kyle Aitken, Vinay V Ramasesh, Yuan Cao, Niru Maheswaranathan

    Abstract: Encoder-decoder networks with attention have proven to be a powerful way to solve many sequence-to-sequence tasks. In these networks, attention aligns encoder and decoder states and is often used for visualizing network behavior. However, the mechanisms used by networks to generate appropriate attention matrices are still mysterious. Moreover, how these mechanisms vary depending on the particular…

    Submitted 28 October, 2021; originally announced October 2021.

    Comments: 10+14 pages, 16 figures. NeurIPS 2021

  7. arXiv:2010.15114  [pdf, other]

    cs.LG cs.CL stat.ML

    The geometry of integration in text classification RNNs

    Authors: Kyle Aitken, Vinay V. Ramasesh, Ankush Garg, Yuan Cao, David Sussillo, Niru Maheswaranathan

    Abstract: Despite the widespread application of recurrent neural networks (RNNs) across a variety of tasks, a unified understanding of how RNNs solve these tasks remains elusive. In particular, it is unclear what dynamical patterns arise in trained RNNs, and how those patterns depend on the training dataset or task. This work addresses these questions in the context of a specific natural language processing…

    Submitted 3 June, 2022; v1 submitted 28 October, 2020; originally announced October 2020.

    Comments: 9+19 pages, 30 figures; v2: smaller file size

  8. Qutrit randomized benchmarking

    Authors: A. Morvan, V. V. Ramasesh, M. S. Blok, J. M. Kreikebaum, K. O'Brien, L. Chen, B. K. Mitchell, R. K. Naik, D. I. Santiago, I. Siddiqi

    Abstract: Ternary quantum processors offer significant computational advantages over conventional qubit technologies, leveraging the encoding and processing of quantum information in qutrits (three-level systems). To evaluate and compare the performance of such emerging quantum hardware it is essential to have robust benchmarking methods suitable for a higher-dimensional Hilbert space. We demonstrate extens…

    Submitted 20 August, 2020; originally announced August 2020.

    Comments: 6 pages (+ 2 pages supplement), 5 figures

    Journal ref: Phys. Rev. Lett. 126, 210504 (2021)

  9. arXiv:2007.07400  [pdf, other]

    cs.LG cs.CV stat.ML

    Anatomy of Catastrophic Forgetting: Hidden Representations and Task Semantics

    Authors: Vinay V. Ramasesh, Ethan Dyer, Maithra Raghu

    Abstract: A central challenge in developing versatile machine learning systems is catastrophic forgetting: a model trained on tasks in sequence will suffer significant performance drops on earlier tasks. Despite the ubiquity of catastrophic forgetting, there is limited understanding of the underlying process and its causes. In this paper, we address this important knowledge gap, investigating how forgetting…

    Submitted 14 July, 2020; originally announced July 2020.

  10. Quantum Information Scrambling in a Superconducting Qutrit Processor

    Authors: M. S. Blok, V. V. Ramasesh, T. Schuster, K. O'Brien, J. M. Kreikebaum, D. Dahlen, A. Morvan, B. Yoshida, N. Y. Yao, I. Siddiqi

    Abstract: The theory of quantum information provides a common language which links disciplines ranging from cosmology to condensed-matter physics. For example, the delocalization of quantum information in strongly-interacting many-body systems, known as quantum information scrambling, has recently begun to unite our understanding of black hole dynamics, transport in exotic non-Fermi liquids, and many-body a…

    Submitted 10 February, 2021; v1 submitted 6 March, 2020; originally announced March 2020.

    Journal ref: Phys. Rev. X 11, 021010 (2021)

  11. arXiv:1710.02875  [pdf, other]

    quant-ph physics.atom-ph

    Scattering into one-dimensional waveguides from a coherently-driven quantum-optical system

    Authors: Kevin A. Fischer, Rahul Trivedi, Vinay Ramasesh, Irfan Siddiqi, Jelena Vučković

    Abstract: We develop a new computational tool and framework for characterizing the scattering of photons by energy-nonconserving Hamiltonians into unidirectional (chiral) waveguides, for example, with coherent pulsed excitation. The temporal waveguide modes are a natural basis for characterizing scattering in quantum optics, and afford a powerful technique based on a coarse discretization of time. This over…

    Submitted 19 May, 2018; v1 submitted 8 October, 2017; originally announced October 2017.

    Comments: Numerical package in collaboration with Ben Bartlett (Stanford University), implemented in QuTiP: The Quantum Toolbox in Python, Quantum 2018

    Journal ref: Quantum 2, 69 (2018)

  12. Robust determination of molecular spectra on a quantum processor

    Authors: James I. Colless, Vinay V. Ramasesh, Dar Dahlen, Machiel S. Blok, Jarrod R. McClean, Jonathan Carter, Wibe A. de Jong, Irfan Siddiqi

    Abstract: Harnessing the full power of nascent quantum processors requires the efficient management of a limited number of quantum bits with finite lifetime. Hybrid algorithms leveraging classical resources have demonstrated promising initial results in the efficient calculation of Hamiltonian ground states--an important eigenvalue problem in the physical sciences that is often classically intractable. In t…

    Submitted 20 July, 2017; originally announced July 2017.

    Journal ref: Phys. Rev. X 8, 011021 (2018)

  13. arXiv:1610.03069  [pdf, other]

    quant-ph cond-mat.mes-hall

    Observing Topological Invariants Using Quantum Walk in Superconducting Circuits

    Authors: Emmanuel Flurin, Vinay V. Ramasesh, Shay Hacohen-Gourgy, Leigh S. Martin, Norman Y. Yao, Irfan Siddiqi

    Abstract: The direct measurement of topological invariants in both engineered and naturally occurring quantum materials is a key step in classifying quantum phases of matter. Here we motivate a toolbox based on time-dependent quantum walks as a method to digitally simulate single-particle topological band structures. Using a superconducting qubit dispersively coupled to a microwave cavity, we implement two…

    Submitted 10 October, 2016; originally announced October 2016.

    Comments: 5 pages, 4 figures

    Journal ref: Phys. Rev. X 7, 031023 (2017)

  14. Direct Probe of Topological Invariants Using Bloch Oscillating Quantum Walks

    Authors: Vinay V. Ramasesh, Emmanuel Flurin, Mark S. Rudner, Irfan Siddiqi, Norman Y. Yao

    Abstract: The topology of a single-particle band structure plays a fundamental role in understanding a multitude of physical phenomena. Motivated by the connection between quantum walks and such topological band structures, we demonstrate that a simple time-dependent, Bloch-oscillating quantum walk enables the direct measurement of topological invariants. We consider two classes of one-dimensional quantum w…

    Submitted 29 September, 2016; originally announced September 2016.

    Comments: Main text: 6 pages, 4 figures; Supplement: 4 pages, 0 figures

    Journal ref: Phys. Rev. Lett. 118, 130501 (2017)

  15. Dynamics of simultaneously measured non-commuting observables

    Authors: Shay Hacohen-Gourgy, Leigh S. Martin, Emmanuel Flurin, Vinay V. Ramasesh, K. Birgitta Whaley, Irfan Siddiqi

    Abstract: In quantum mechanics, measurements cause wavefunction collapse that yields precise outcomes, whereas for non-commuting observables such as position and momentum Heisenberg's uncertainty principle limits the intrinsic precision of a state. Although theoretical work has demonstrated the possibility to perform simultaneous non-commuting measurements and has revealed the limits on measurement outcomes, only r…

    Submitted 25 October, 2016; v1 submitted 23 August, 2016; originally announced August 2016.

    Journal ref: Nature (2016)

  16. arXiv:1506.05837  [pdf, other]

    quant-ph cond-mat.mes-hall cond-mat.supr-con

    Cooling and Autonomous Feedback in a Bose-Hubbard chain with Attractive Interactions

    Authors: Shay Hacohen-Gourgy, Vinay V. Ramasesh, Claudia De Grandi, Irfan Siddiqi, Steve M. Girvin

    Abstract: We engineer a quantum bath that enables entropy and energy exchange with a one-dimensional Bose-Hubbard lattice with attractive on-site interactions. We implement this in an array of three superconducting transmon qubits coupled to a single cavity mode; the transmons represent lattice sites and their excitation quanta embody bosonic particles. Our cooling protocol preserves particle number--realiz…

    Submitted 15 December, 2015; v1 submitted 18 June, 2015; originally announced June 2015.

    Comments: 5 pages paper, 21 pages supplementary

    Journal ref: Phys. Rev. Lett. 115, 240501 (2015)

  17. A Quantum Gas Microscope for Fermionic Atoms

    Authors: Lawrence W. Cheuk, Matthew A. Nichols, Melih Okan, Thomas Gersdorf, Vinay V. Ramasesh, Waseem S. Bakr, Thomas Lompe, Martin W. Zwierlein

    Abstract: Strongly interacting fermions define the properties of complex matter at all densities, from atomic nuclei to modern solid state materials and neutron stars. Ultracold atomic Fermi gases have emerged as a pristine platform for the study of many-fermion systems. Here we realize a quantum gas microscope for fermionic $^{40}$K atoms trapped in an optical lattice, which allows one to probe strongly co…

    Submitted 10 March, 2015; v1 submitted 9 March, 2015; originally announced March 2015.

    Journal ref: Phys. Rev. Lett. 114, 193001 (2015)