Search | arXiv e-print repository

Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

Authors: Gemini Team, Petko Georgiev, Ving Ian Lei, Ryan Burnell, Libin Bai, Anmol Gulati, Garrett Tanzer, Damien Vincent, Zhufeng Pan, Shibo Wang, Soroosh Mariooryad, Yifan Ding, Xinyang Geng, Fred Alcober, Roy Frostig, Mark Omernick, Lexi Walker, Cosmin Paduraru, Christina Sorokin, Andrea Tacchetti, Colin Gaffney, Samira Daruki, Olcan Sercinoglu, Zach Gleicher, Juliette Love, et al. (1092 additional authors not shown)

Abstract: In this report, we introduce the Gemini 1.5 family of models, representing the next generation of highly compute-efficient multimodal models capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio. The family includes two new models: (1) an updated Gemini 1.5 Pro, which exceeds the February version on the great majority of capabilities and benchmarks; (2) Gemini 1.5 Flash, a more lightweight variant designed for efficiency with minimal regression in quality. Gemini 1.5 models achieve near-perfect recall on long-context retrieval tasks across modalities, improve the state-of-the-art in long-document QA, long-video QA and long-context ASR, and match or surpass Gemini 1.0 Ultra's state-of-the-art performance across a broad set of benchmarks. Studying the limits of Gemini 1.5's long-context ability, we find continued improvement in next-token prediction and near-perfect retrieval (>99%) up to at least 10M tokens, a generational leap over existing models such as Claude 3.0 (200k) and GPT-4 Turbo (128k). Finally, we highlight real-world use cases, such as Gemini 1.5 collaborating with professionals on completing their tasks achieving 26 to 75% time savings across 10 different job categories, as well as surprising new capabilities of large language models at the frontier; when given a grammar manual for Kalamang, a language with fewer than 200 speakers worldwide, the model learns to translate English to Kalamang at a similar level to a person who learned from the same content.

Submitted 14 June, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

arXiv:2305.10938 [pdf]

A method for the ethical analysis of brain-inspired AI

Authors: Michele Farisco, Gianluca Baldassarre, Emilio Cartoni, Antonia Leach, Mihai A. Petrovici, Achim Rosemann, Arleen Salles, Bernd Stahl, Sacha J. van Albada

Abstract: Despite its successes, to date Artificial Intelligence (AI) is still characterized by a number of shortcomings with regards to different application domains and goals. These limitations are arguably both conceptual (e.g., related to underlying theoretical models, such as symbolic vs. connectionist), and operational (e.g., related to robustness and ability to generalize). Biologically inspired AI, and more specifically brain-inspired AI, promises to provide further biological aspects beyond those that are already traditionally included in AI, making it possible to assess and possibly overcome some of its present shortcomings. This article examines some conceptual, technical, and ethical issues raised by the development and use of brain-inspired AI. Against this background, the paper asks whether there is anything ethically unique about brain-inspired AI. The aim of the paper is to introduce a method that has a heuristic nature and that can be applied to identify and address the ethical issues arising from brain-inspired AI. The conclusion resulting from the application of this method is that, compared to traditional AI, brain-inspired AI raises new foundational ethical issues and some new practical ethical issues, and exacerbates some of the issues raised by traditional AI.

Submitted 18 May, 2023; originally announced May 2023.

Comments: 30 pages theoretical article resulting from a multidisciplinary collaboration about technical, theoretical and ethical aspects of brain-inspired AI

arXiv:2304.08424 [pdf, other]

Long-term Forecasting with TiDE: Time-series Dense Encoder

Authors: Abhimanyu Das, Weihao Kong, Andrew Leach, Shaan Mathur, Rajat Sen, Rose Yu

Abstract: Recent work has shown that simple linear models can outperform several Transformer based approaches in long term time-series forecasting. Motivated by this, we propose a Multi-layer Perceptron (MLP) based encoder-decoder model, Time-series Dense Encoder (TiDE), for long-term time-series forecasting that enjoys the simplicity and speed of linear models while also being able to handle covariates and non-linear dependencies. Theoretically, we prove that the simplest linear analogue of our model can achieve near optimal error rate for linear dynamical systems (LDS) under some assumptions. Empirically, we show that our method can match or outperform prior approaches on popular long-term time-series forecasting benchmarks while being 5-10x faster than the best Transformer based model.

Submitted 4 April, 2024; v1 submitted 17 April, 2023; originally announced April 2023.

arXiv:2103.04922 [pdf, other]

doi 10.1109/TPAMI.2021.3116668

Deep Generative Modelling: A Comparative Review of VAEs, GANs, Normalizing Flows, Energy-Based and Autoregressive Models

Authors: Sam Bond-Taylor, Adam Leach, Yang Long, Chris G. Willcocks

Abstract: Deep generative models are a class of techniques that train deep neural networks to model the distribution of training samples. Research has fragmented into various interconnected approaches, each of which make trade-offs including run-time, diversity, and architectural restrictions. In particular, this compendium covers energy-based models, variational autoencoders, generative adversarial networks, autoregressive models, normalizing flows, in addition to numerous hybrid approaches. These techniques are compared and contrasted, explaining the premises behind each and how they are interrelated, while reviewing current state-of-the-art advances and implementations.

Submitted 28 March, 2022; v1 submitted 8 March, 2021; originally announced March 2021.

Comments: 20 pages, 9 figures, will appear in IEEE Transactions on Pattern Analysis and Machine Intelligence

MSC Class: 68T01 (Primary); 68T07 (Secondary) ACM Class: I.5.0; I.4.0; G.3

arXiv:1912.06290 [pdf, other]

Meta-Learning Initializations for Image Segmentation

Authors: Sean M. Hendryx, Andrew B. Leach, Paul D. Hein, Clayton T. Morrison

Abstract: We extend first-order model agnostic meta-learning algorithms (including FOMAML and Reptile) to image segmentation, present a novel neural network architecture built for fast learning which we call EfficientLab, and leverage a formal definition of the test error of meta-learning algorithms to decrease error on out of distribution tasks. We show state of the art results on the FSS-1000 dataset by meta-training EfficientLab with FOMAML and using Bayesian optimization to infer the optimal test-time adaptation routine hyperparameters. We also construct a small benchmark dataset, FP-k, for the empirical study of how meta-learning systems perform in both few- and many-shot settings. On the FP-k dataset, we show that meta-learned initializations provide value for canonical few-shot image segmentation but their performance is quickly matched by conventional transfer learning with performance being equal beyond 10 labeled examples. Our code, meta-learned model, and the FP-k dataset are available at https://github.com/ml4ai/mliis .

Submitted 7 May, 2020; v1 submitted 12 December, 2019; originally announced December 2019.

arXiv:1911.06932 [pdf, other]

3D Conditional Generative Adversarial Networks to enable large-scale seismic image enhancement

Authors: Praneet Dutta, Bruce Power, Adam Halpert, Carlos Ezequiel, Aravind Subramanian, Chanchal Chatterjee, Sindhu Hari, Kenton Prindle, Vishal Vaddina, Andrew Leach, Raj Domala, Laura Bandura, Massimo Mascaro

Abstract: We propose GAN-based image enhancement models for frequency enhancement of 2D and 3D seismic images. Seismic imagery is used to understand and characterize the Earth's subsurface for energy exploration. Because these images often suffer from resolution limitations and noise contamination, our proposed method performs large-scale seismic volume frequency enhancement and denoising. The enhanced images reduce uncertainty and improve decisions about issues, such as optimal well placement, that often rely on low signal-to-noise ratio (SNR) seismic volumes. We explored the impact of adding lithology class information to the models, resulting in improved performance on PSNR and SSIM metrics over a baseline model with no conditional information.

Submitted 15 November, 2019; originally announced November 2019.

Comments: To be presented at NeurIPS 2019, Second Workshop on Machine Learning and the Physical Sciences, Vancouver, Canada

arXiv:1806.11308 [pdf, other]

doi 10.1016/j.nima.2018.06.078

Characterisation and Testing of CHEC-M - a camera prototype for the Small-Sized Telescopes of the Cherenkov Telescope Array

Authors: J. Zorn, R. White, J. J. Watson, T. P. Armstrong, A. Balzer, M. Barcelo, D. Berge, R. Bose, A. M. Brown, M. Bryan, P. M. Chadwick, P. Clark, H. Costantini, G. Cotter, L. Dangeon, M. Daniel, A. De Franco, P. Deiml, G. Fasola, S. Funk, M. Gebyehu, J. Gironnet, J. A. Graham, T. Greenshaw, J. A. Hinton, et al. (20 additional authors not shown)

Abstract: The Compact High Energy Camera (CHEC) is a camera design for the Small-Sized Telescopes (SSTs; 4 m diameter mirror) of the Cherenkov Telescope Array (CTA). The SSTs are focused on very-high-energy $γ$-ray detection via atmospheric Cherenkov light detection over a very large area. This implies many individual units and hence cost-effective implementation. CHEC relies on dual-mirror optics to reduce the plate-scale and make use of 6 $\times$ 6 mm$^2$ pixels, leading to a low-cost ($\sim$150 kEuro), compact (0.5 m $\times$ 0.5 m), and light ($\sim$45 kg) camera with 2048 pixels providing a camera FoV of $\sim$9 degrees. The electronics are based on custom TARGET (TeV array readout with GSa/s sampling and event trigger) ASICs and FPGAs sampling incoming signals at a gigasample per second, with flexible camera-level triggering within a single backplane FPGA. CHEC is designed to observe in the $γ$-ray energy range of 1$-$300 TeV, and at impact distances up to $\sim$500 m. To accommodate this and provide full flexibility for later data analysis, full waveforms with 96 samples for all 2048 pixels can be read out at rates up to $\sim$900 Hz. The first prototype, CHEC-M, based on multi-anode photomultipliers (MAPMs) as photosensors, was commissioned and characterised in the laboratory and during two measurement campaigns on a telescope structure at the Paris Observatory in Meudon. In this paper, the results and conclusions from the laboratory and on-site testing of CHEC-M are presented. They have provided essential input on the system design and on operational and data analysis procedures for a camera of this type. A second full-camera prototype based on Silicon photomultipliers (SiPMs), addressing the drawbacks of CHEC-M identified during the first prototype phase, has already been built and is currently being commissioned and tested in the laboratory.

Submitted 16 July, 2018; v1 submitted 29 June, 2018; originally announced June 2018.

arXiv:1707.02695 [pdf, other]

doi 10.2140/camcos.2018.13.215

Symmetrized importance samplers for stochastic differential equations

Authors: Andrew Leach, Kevin K. Lin, Matthias Morzfeld

Abstract: We study a class of importance sampling methods for stochastic differential equations (SDEs). A small-noise analysis is performed, and the results suggest that a simple symmetrization procedure can significantly improve the performance of our importance sampling schemes when the noise is not too large. We demonstrate that this is indeed the case for a number of linear and nonlinear examples. Potential applications, e.g., data assimilation, are discussed.

Submitted 28 March, 2018; v1 submitted 10 July, 2017; originally announced July 2017.

Comments: Added brief discussion of Hamilton-Jacobi equation. Also made various minor corrections. To appear in Communications in Applied Mathematics and Computational Science

Journal ref: Commun. Appl. Math. Comput. Sci. 13 (2018) 215-241

arXiv:1210.3547 [pdf, ps, other]

doi 10.1016/j.jmps.2013.04.008

A Surface Stacking Fault Energy Approach to Predicting Defect Nucleation in Surface-Dominated Nanostructures

Authors: Jin-Wu Jiang, Austin M. Leach, Ken Gall, Harold S. Park, Timon Rabczuk

Abstract: We present a surface stacking fault (SSF) energy approach to predicting defect nucleation from the surfaces of surface-dominated nanostructures such as FCC metal nanowires. The approach leads to a criterion that predicts the initial yield mechanism via either slip or twinning depending on whether the unstable twinning energy or unstable slip energy is smaller as determined from the resulting SSF energy curve. The approach is validated through a comparison between the SSF energy calculation and low-temperature classical molecular dynamics simulations of copper nanowires with different axial and transverse surface orientations, and cross sectional geometries. We focus on the effects of the geometric cross section by studying the transition from slip to twinning previously predicted in moving from a square to rectangular cross section for $\langle 100 \rangle / \{100\}$ nanowires, and also for moving from a rhombic to truncated rhombic cross sectional geometry for $\langle 110 \rangle$ nanowires. We also provide the important demonstration that the criterion is able to predict the correct deformation mechanism when full dislocation slip is considered concurrently with partial dislocation slip and twinning. This is done in the context of rhombic aluminum nanowires which do not show a tensile reorientation due to full dislocation slip. We show that the SSF energy criterion successfully predicts the initial mode of surface-nucleated plasticity at low temperature, while also discussing the effects of strain and temperature on the applicability of the criterion.

Submitted 22 March, 2013; v1 submitted 12 October, 2012; originally announced October 2012.

Comments: revisions according to referee suggestions, full dislocation fault discussed, 15 pages, 21 figures

Journal ref: Journal of the Mechanics and Physics of Solids 61, 1915 (2013)

arXiv:0710.0819 [pdf, ps, other]

doi 10.1088/0264-9381/25/3/035013

Compactifying the state space for alternative theories of gravity

Authors: Naureen Goheer, Jannie A. Leach, Peter K. S. Dunsby

Abstract: In this paper we address important issues surrounding the choice of variables when performing a dynamical systems analysis of alternative theories of gravity. We discuss the advantages and disadvantages of compactifying the state space, and illustrate this using two examples. We first show how to define a compact state space for the class of LRS Bianchi type I models in $R^n$-gravity and compare to a non--compact expansion--normalised approach. In the second example we consider the flat Friedmann matter subspace of the previous example, and compare the compact analysis to studies where non-compact non--expansion--normalised variables were used. In both examples we comment on the existence of bouncing or recollapsing orbits as well as the existence of static models.

Submitted 22 January, 2008; v1 submitted 3 October, 2007; originally announced October 2007.

Comments: 18 pages, revised to match published version

Journal ref: Class.Quant.Grav.25:035013,2008

arXiv:0710.0814 [pdf, ps, other]

doi 10.1088/0264-9381/24/22/026

Dynamical systems analysis of anisotropic cosmologies in $R^n$-gravity

Authors: Naureen Goheer, Jannie A. Leach, Peter K. S. Dunsby

Abstract: In this paper we study the dynamics of {\it orthogonal spatially homogeneous} Bianchi cosmologies in $R^n$-gravity. We construct a compact state space by dividing the state space into different sectors. We perform a detailed analysis of the cosmological behaviour in terms of the parameter $n$, determining all the equilibrium points, their stability and corresponding cosmological evolution. In particular, the appropriately compactified state space allows us to investigate static and bouncing solutions. We find no Einstein static solutions, but there do exist cosmologies with bounce behaviours. We also investigate the isotropisation of these models and find that all isotropic points are flat Friedmann like.

Submitted 6 November, 2007; v1 submitted 3 October, 2007; originally announced October 2007.

Comments: 21 pages, revised to match published version

Journal ref: Class.Quant.Grav.24:5689-5708,2007

arXiv:gr-qc/0702122 [pdf, ps, other]

An analysis of the shear dynamics in Bianchi I cosmologies with $R^n$-gravity

Authors: Jannie A. Leach, Peter K. S. Dunsby, Sante Carloni

Abstract: We consider the case of $R^n$-gravity and perform a detailed analysis of the dynamics in Bianchi I cosmologies which exhibit {\it local rotational symmetry} (LRS). We find exact solutions and study their behaviour and stability in terms of the values of the parameter $n$. In particular, we found a set of cosmic histories in which the universe is initially isotropic, then develops shear anisotropies which approach a constant value.

Submitted 23 February, 2007; originally announced February 2007.

Comments: 4 pages, 2 figures; to appear in the proceedings of the MG11

arXiv:gr-qc/0701009 [pdf, ps, other]

doi 10.1088/0264-9381/25/3/035008

Cosmological dynamics of Scalar--Tensor Gravity

Authors: S Carloni, J A Leach, S Capozziello, P K S Dunsby

Abstract: We study the phase--space of FLRW models derived from Scalar--Tensor Gravity where the non--minimal coupling is $F(φ)=ξφ^2$ and the effective potential is $V(φ)=λφ^n$. Our analysis allows us to unfold many features of the cosmology of this class of theories. For example, the evolution mechanism towards states indistinguishable from GR is recovered and proved to depend critically on the form of the potential $V(φ)$. Also, transient almost--Friedmann phases evolving towards accelerated expansion and unstable inflationary phases evolving towards stable ones are found. Some of our results are shown to hold also for the String-Dilaton action.

Submitted 13 August, 2007; v1 submitted 31 December, 2006; originally announced January 2007.

Comments: 25 pages, 4 figures, 12 tables, submitted to CQG

Journal ref: Class.Quant.Grav.25:035008,2008

arXiv:gr-qc/0603012 [pdf, ps, other]

doi 10.1088/0264-9381/23/15/011

Shear dynamics in Bianchi I cosmologies with R^n-gravity

Authors: Jannie A Leach, Sante Carloni, Peter K S Dunsby

Abstract: We give the equations governing the shear evolution in Bianchi spacetimes for general f(R)-theories of gravity. We consider the case of R^n-gravity and perform a detailed analysis of the dynamics in Bianchi I cosmologies which exhibit local rotational symmetry. We find exact solutions and study their behaviour and stability in terms of the values of the parameter n. In particular, we found a set of cosmic histories in which the universe is initially isotropic, then develops shear anisotropies which approach a constant value.

Submitted 23 June, 2006; v1 submitted 6 March, 2006; originally announced March 2006.

Comments: 25 pages LaTeX, 6 figures. Revised to match the final version accepted for publication in CQG

Journal ref: Class.Quant.Grav.23:4915-4937,2006

arXiv:gr-qc/0502109

Conditional escape of gravitons from the brane

Authors: Jannie A. Leach, William M. Lesame

Abstract: In this paper we consider a cosmological Friedmann Robertson Walker brane world imbedded in a 5-dimensional anti-de Sitter Schwarzschild bulk. We show, using potential diagrams, that for an anti-de Sitter bulk the null geodesics never return to the brane. Null geodesics do however return for k=+1 when we include the Schwarzschild like mass and the condition of return is obtained from the corresponding effective potential. We next obtain the condition for a gravitational signal to be emitted from a FRW brane with isotropically distributed matter. We use these results to investigate the conditions under which shortcuts through the bulk are possible.

Submitted 4 October, 2014; v1 submitted 25 February, 2005; originally announced February 2005.

Comments: This paper has been withdrawn as it was never published

Showing 1–15 of 15 results for author: Leach, A