-
Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context
Authors:
Gemini Team,
Petko Georgiev,
Ving Ian Lei,
Ryan Burnell,
Libin Bai,
Anmol Gulati,
Garrett Tanzer,
Damien Vincent,
Zhufeng Pan,
Shibo Wang,
Soroosh Mariooryad,
Yifan Ding,
Xinyang Geng,
Fred Alcober,
Roy Frostig,
Mark Omernick,
Lexi Walker,
Cosmin Paduraru,
Christina Sorokin,
Andrea Tacchetti,
Colin Gaffney,
Samira Daruki,
Olcan Sercinoglu,
Zach Gleicher,
Juliette Love
, et al. (1092 additional authors not shown)
Abstract:
In this report, we introduce the Gemini 1.5 family of models, representing the next generation of highly compute-efficient multimodal models capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio. The family includes two new models: (1) an updated Gemini 1.5 Pro, which exceeds the February version on the great majority of capabilities and benchmarks; (2) Gemini 1.5 Flash, a more lightweight variant designed for efficiency with minimal regression in quality. Gemini 1.5 models achieve near-perfect recall on long-context retrieval tasks across modalities, improve the state-of-the-art in long-document QA, long-video QA and long-context ASR, and match or surpass Gemini 1.0 Ultra's state-of-the-art performance across a broad set of benchmarks. Studying the limits of Gemini 1.5's long-context ability, we find continued improvement in next-token prediction and near-perfect retrieval (>99%) up to at least 10M tokens, a generational leap over existing models such as Claude 3.0 (200k) and GPT-4 Turbo (128k). Finally, we highlight real-world use cases, such as Gemini 1.5 collaborating with professionals on completing their tasks achieving 26 to 75% time savings across 10 different job categories, as well as surprising new capabilities of large language models at the frontier; when given a grammar manual for Kalamang, a language with fewer than 200 speakers worldwide, the model learns to translate English to Kalamang at a similar level to a person who learned from the same content.
Submitted 14 June, 2024; v1 submitted 8 March, 2024;
originally announced March 2024.
-
A method for the ethical analysis of brain-inspired AI
Authors:
Michele Farisco,
Gianluca Baldassarre,
Emilio Cartoni,
Antonia Leach,
Mihai A. Petrovici,
Achim Rosemann,
Arleen Salles,
Bernd Stahl,
Sacha J. van Albada
Abstract:
Despite its successes, to date Artificial Intelligence (AI) is still characterized by a number of shortcomings with regard to different application domains and goals. These limitations are arguably both conceptual (e.g., related to underlying theoretical models, such as symbolic vs. connectionist), and operational (e.g., related to robustness and ability to generalize). Biologically inspired AI, and more specifically brain-inspired AI, promises to provide further biological aspects beyond those that are already traditionally included in AI, making it possible to assess and possibly overcome some of its present shortcomings. This article examines some conceptual, technical, and ethical issues raised by the development and use of brain-inspired AI. Against this background, the paper asks whether there is anything ethically unique about brain-inspired AI. The aim of the paper is to introduce a method that has a heuristic nature and that can be applied to identify and address the ethical issues arising from brain-inspired AI. The conclusion resulting from the application of this method is that, compared to traditional AI, brain-inspired AI raises new foundational ethical issues and some new practical ethical issues, and exacerbates some of the issues raised by traditional AI.
Submitted 18 May, 2023;
originally announced May 2023.
-
Long-term Forecasting with TiDE: Time-series Dense Encoder
Authors:
Abhimanyu Das,
Weihao Kong,
Andrew Leach,
Shaan Mathur,
Rajat Sen,
Rose Yu
Abstract:
Recent work has shown that simple linear models can outperform several Transformer-based approaches in long-term time-series forecasting. Motivated by this, we propose a Multi-layer Perceptron (MLP)-based encoder-decoder model, Time-series Dense Encoder (TiDE), for long-term time-series forecasting that enjoys the simplicity and speed of linear models while also being able to handle covariates and non-linear dependencies. Theoretically, we prove that the simplest linear analogue of our model can achieve a near-optimal error rate for linear dynamical systems (LDS) under some assumptions. Empirically, we show that our method can match or outperform prior approaches on popular long-term time-series forecasting benchmarks while being 5-10x faster than the best Transformer-based model.
Submitted 4 April, 2024; v1 submitted 17 April, 2023;
originally announced April 2023.
-
Deep Generative Modelling: A Comparative Review of VAEs, GANs, Normalizing Flows, Energy-Based and Autoregressive Models
Authors:
Sam Bond-Taylor,
Adam Leach,
Yang Long,
Chris G. Willcocks
Abstract:
Deep generative models are a class of techniques that train deep neural networks to model the distribution of training samples. Research has fragmented into various interconnected approaches, each of which makes trade-offs including run-time, diversity, and architectural restrictions. In particular, this compendium covers energy-based models, variational autoencoders, generative adversarial networks, autoregressive models, and normalizing flows, in addition to numerous hybrid approaches. These techniques are compared and contrasted, explaining the premises behind each and how they are interrelated, while reviewing current state-of-the-art advances and implementations.
Submitted 28 March, 2022; v1 submitted 8 March, 2021;
originally announced March 2021.
-
Meta-Learning Initializations for Image Segmentation
Authors:
Sean M. Hendryx,
Andrew B. Leach,
Paul D. Hein,
Clayton T. Morrison
Abstract:
We extend first-order model-agnostic meta-learning algorithms (including FOMAML and Reptile) to image segmentation, present a novel neural network architecture built for fast learning which we call EfficientLab, and leverage a formal definition of the test error of meta-learning algorithms to decrease error on out-of-distribution tasks. We show state-of-the-art results on the FSS-1000 dataset by meta-training EfficientLab with FOMAML and using Bayesian optimization to infer the optimal test-time adaptation routine hyperparameters. We also construct a small benchmark dataset, FP-k, for the empirical study of how meta-learning systems perform in both few- and many-shot settings. On the FP-k dataset, we show that meta-learned initializations provide value for canonical few-shot image segmentation but their performance is quickly matched by conventional transfer learning, with performance being equal beyond 10 labeled examples. Our code, meta-learned model, and the FP-k dataset are available at https://github.com/ml4ai/mliis.
Submitted 7 May, 2020; v1 submitted 12 December, 2019;
originally announced December 2019.
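For readers unfamiliar with Reptile, which the abstract above names, the following is a minimal sketch of its standard meta-update on a toy quadratic task. It is not the authors' EfficientLab pipeline: the function name `reptile_step`, the toy gradient, and all step sizes are assumptions chosen for illustration.

```python
import numpy as np

def reptile_step(theta, task_grad, inner_lr=0.1, inner_steps=5, meta_lr=0.5):
    """One Reptile meta-update (hypothetical helper, illustrative defaults).

    theta     : current meta-initialization (1-D array)
    task_grad : function returning the task-loss gradient at given weights
    """
    phi = theta.copy()
    # Inner loop: adapt to the sampled task with plain gradient descent.
    for _ in range(inner_steps):
        phi = phi - inner_lr * task_grad(phi)
    # Outer update: move the initialization toward the adapted weights.
    return theta + meta_lr * (phi - theta)

# Toy task (assumption): loss = 0.5 * ||w - target||^2, so the gradient is w - target.
target = np.array([1.0, -1.0])
theta0 = np.zeros(2)
theta1 = reptile_step(theta0, lambda w: w - target)
```

After one step the initialization has moved part of the way from `theta0` toward `target`; repeating the step over many sampled tasks is what yields a meta-learned initialization.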
-
3D Conditional Generative Adversarial Networks to enable large-scale seismic image enhancement
Authors:
Praneet Dutta,
Bruce Power,
Adam Halpert,
Carlos Ezequiel,
Aravind Subramanian,
Chanchal Chatterjee,
Sindhu Hari,
Kenton Prindle,
Vishal Vaddina,
Andrew Leach,
Raj Domala,
Laura Bandura,
Massimo Mascaro
Abstract:
We propose GAN-based image enhancement models for frequency enhancement of 2D and 3D seismic images. Seismic imagery is used to understand and characterize the Earth's subsurface for energy exploration. Because these images often suffer from resolution limitations and noise contamination, our proposed method performs large-scale seismic volume frequency enhancement and denoising. The enhanced images reduce uncertainty and improve decisions about issues, such as optimal well placement, that often rely on low signal-to-noise ratio (SNR) seismic volumes. We explored the impact of adding lithology class information to the models, resulting in improved performance on PSNR and SSIM metrics over a baseline model with no conditional information.
Submitted 15 November, 2019;
originally announced November 2019.
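The abstract above reports results in PSNR. As a reminder of what that metric measures, here is a minimal sketch using the textbook definition, PSNR = 10 log10(R^2 / MSE), where R is the assumed dynamic range of the data. The function name `psnr` and the default `data_range=1.0` are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def psnr(reference, estimate, data_range=1.0):
    """Peak signal-to-noise ratio in dB for arrays of any shape (2D or 3D volumes).

    data_range is the assumed span of valid signal values (illustrative default).
    """
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0.0:
        return float("inf")  # identical inputs: PSNR is unbounded
    return 10.0 * np.log10(data_range ** 2 / mse)

# Toy 3D volume (assumption): a constant offset of 0.1 gives an MSE of 0.01.
clean = np.zeros((4, 4, 4))
noisy = np.full((4, 4, 4), 0.1)
value_db = psnr(clean, noisy)
```

Higher values indicate that the enhanced volume is closer to the reference.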
-
Characterisation and Testing of CHEC-M - a camera prototype for the Small-Sized Telescopes of the Cherenkov Telescope Array
Authors:
J. Zorn,
R. White,
J. J. Watson,
T. P. Armstrong,
A. Balzer,
M. Barcelo,
D. Berge,
R. Bose,
A. M. Brown,
M. Bryan,
P. M. Chadwick,
P. Clark,
H. Costantini,
G. Cotter,
L. Dangeon,
M. Daniel,
A. De Franco,
P. Deiml,
G. Fasola,
S. Funk,
M. Gebyehu,
J. Gironnet,
J. A. Graham,
T. Greenshaw,
J. A. Hinton
, et al. (20 additional authors not shown)
Abstract:
The Compact High Energy Camera (CHEC) is a camera design for the Small-Sized Telescopes (SSTs; 4 m diameter mirror) of the Cherenkov Telescope Array (CTA). The SSTs are focused on very-high-energy $γ$-ray detection via atmospheric Cherenkov light detection over a very large area. This implies many individual units and hence cost-effective implementation. CHEC relies on dual-mirror optics to reduce the plate-scale and make use of 6 $\times$ 6 mm$^2$ pixels, leading to a low-cost ($\sim$150 kEuro), compact (0.5 m $\times$ 0.5 m), and light ($\sim$45 kg) camera with 2048 pixels providing a camera FoV of $\sim$9 degrees. The electronics are based on custom TARGET (TeV array readout with GSa/s sampling and event trigger) ASICs and FPGAs sampling incoming signals at a gigasample per second, with flexible camera-level triggering within a single backplane FPGA. CHEC is designed to observe in the $γ$-ray energy range of 1$-$300 TeV, and at impact distances up to $\sim$500 m. To accommodate this and provide full flexibility for later data analysis, full waveforms with 96 samples for all 2048 pixels can be read out at rates up to $\sim$900 Hz. The first prototype, CHEC-M, based on multi-anode photomultipliers (MAPMs) as photosensors, was commissioned and characterised in the laboratory and during two measurement campaigns on a telescope structure at the Paris Observatory in Meudon. In this paper, the results and conclusions from the laboratory and on-site testing of CHEC-M are presented. They have provided essential input on the system design and on operational and data analysis procedures for a camera of this type. A second full-camera prototype based on Silicon photomultipliers (SiPMs), addressing the drawbacks of CHEC-M identified during the first prototype phase, has already been built and is currently being commissioned and tested in the laboratory.
Submitted 16 July, 2018; v1 submitted 29 June, 2018;
originally announced June 2018.
-
Symmetrized importance samplers for stochastic differential equations
Authors:
Andrew Leach,
Kevin K. Lin,
Matthias Morzfeld
Abstract:
We study a class of importance sampling methods for stochastic differential equations (SDEs). A small-noise analysis is performed, and the results suggest that a simple symmetrization procedure can significantly improve the performance of our importance sampling schemes when the noise is not too large. We demonstrate that this is indeed the case for a number of linear and nonlinear examples. Potential applications, e.g., data assimilation, are discussed.
Submitted 28 March, 2018; v1 submitted 10 July, 2017;
originally announced July 2017.
-
A Surface Stacking Fault Energy Approach to Predicting Defect Nucleation in Surface-Dominated Nanostructures
Authors:
Jin-Wu Jiang,
Austin M. Leach,
Ken Gall,
Harold S. Park,
Timon Rabczuk
Abstract:
We present a surface stacking fault (SSF) energy approach to predicting defect nucleation from the surfaces of surface-dominated nanostructures such as FCC metal nanowires. The approach leads to a criterion that predicts the initial yield mechanism via either slip or twinning depending on whether the unstable twinning energy or unstable slip energy is smaller as determined from the resulting SSF energy curve. The approach is validated through a comparison between the SSF energy calculation and low-temperature classical molecular dynamics simulations of copper nanowires with different axial and transverse surface orientations, and cross-sectional geometries. We focus on the effects of the geometric cross section by studying the transition from slip to twinning previously predicted in moving from a square to rectangular cross section for $\langle 100 \rangle/\{100\}$ nanowires, and also for moving from a rhombic to truncated rhombic cross-sectional geometry for $\langle 110 \rangle$ nanowires. We also provide the important demonstration that the criterion is able to predict the correct deformation mechanism when full dislocation slip is considered concurrently with partial dislocation slip and twinning. This is done in the context of rhombic aluminum nanowires, which do not show a tensile reorientation due to full dislocation slip. We show that the SSF energy criterion successfully predicts the initial mode of surface-nucleated plasticity at low temperature, while also discussing the effects of strain and temperature on the applicability of the criterion.
Submitted 22 March, 2013; v1 submitted 12 October, 2012;
originally announced October 2012.
-
Compactifying the state space for alternative theories of gravity
Authors:
Naureen Goheer,
Jannie A. Leach,
Peter K. S. Dunsby
Abstract:
In this paper we address important issues surrounding the choice of variables when performing a dynamical systems analysis of alternative theories of gravity. We discuss the advantages and disadvantages of compactifying the state space, and illustrate this using two examples. We first show how to define a compact state space for the class of LRS Bianchi type I models in $R^n$-gravity and compare to a non-compact expansion-normalised approach. In the second example we consider the flat Friedmann matter subspace of the previous example, and compare the compact analysis to studies where non-compact non-expansion-normalised variables were used. In both examples we comment on the existence of bouncing or recollapsing orbits as well as the existence of static models.
Submitted 22 January, 2008; v1 submitted 3 October, 2007;
originally announced October 2007.
-
Dynamical systems analysis of anisotropic cosmologies in $R^n$-gravity
Authors:
Naureen Goheer,
Jannie A. Leach,
Peter K. S. Dunsby
Abstract:
In this paper we study the dynamics of orthogonal spatially homogeneous Bianchi cosmologies in $R^n$-gravity. We construct a compact state space by dividing the state space into different sectors. We perform a detailed analysis of the cosmological behaviour in terms of the parameter $n$, determining all the equilibrium points, their stability and corresponding cosmological evolution. In particular, the appropriately compactified state space allows us to investigate static and bouncing solutions. We find no Einstein static solutions, but there do exist cosmologies with bounce behaviours. We also investigate the isotropisation of these models and find that all isotropic points are flat Friedmann-like.
Submitted 6 November, 2007; v1 submitted 3 October, 2007;
originally announced October 2007.
-
An analysis of the shear dynamics in Bianchi I cosmologies with $R^n$-gravity
Authors:
Jannie A. Leach,
Peter K. S. Dunsby,
Sante Carloni
Abstract:
We consider the case of $R^n$-gravity and perform a detailed analysis of the dynamics in Bianchi I cosmologies which exhibit local rotational symmetry (LRS). We find exact solutions and study their behaviour and stability in terms of the values of the parameter $n$. In particular, we find a set of cosmic histories in which the universe is initially isotropic, then develops shear anisotropies which approach a constant value.
Submitted 23 February, 2007;
originally announced February 2007.
-
Cosmological dynamics of Scalar-Tensor Gravity
Authors:
S Carloni,
J A Leach,
S Capozziello,
P K S Dunsby
Abstract:
We study the phase space of FLRW models derived from Scalar-Tensor Gravity where the non-minimal coupling is $F(φ)=ξφ^2$ and the effective potential is $V(φ)=λφ^n$. Our analysis allows us to unfold many features of the cosmology of this class of theories. For example, the evolution mechanism towards states indistinguishable from GR is recovered and proved to depend critically on the form of the potential $V(φ)$. Also, transient almost-Friedmann phases evolving towards accelerated expansion and unstable inflationary phases evolving towards stable ones are found. Some of our results are shown to hold also for the String-Dilaton action.
Submitted 13 August, 2007; v1 submitted 31 December, 2006;
originally announced January 2007.
-
Shear dynamics in Bianchi I cosmologies with R^n-gravity
Authors:
Jannie A Leach,
Sante Carloni,
Peter K S Dunsby
Abstract:
We give the equations governing the shear evolution in Bianchi spacetimes for general f(R)-theories of gravity. We consider the case of R^n-gravity and perform a detailed analysis of the dynamics in Bianchi I cosmologies which exhibit local rotational symmetry. We find exact solutions and study their behaviour and stability in terms of the values of the parameter n. In particular, we find a set of cosmic histories in which the universe is initially isotropic, then develops shear anisotropies which approach a constant value.
Submitted 23 June, 2006; v1 submitted 6 March, 2006;
originally announced March 2006.
-
Conditional escape of gravitons from the brane
Authors:
Jannie A. Leach,
William M. Lesame
Abstract:
In this paper we consider a cosmological Friedmann-Robertson-Walker brane world embedded in a 5-dimensional anti-de Sitter Schwarzschild bulk. We show, using potential diagrams, that for an anti-de Sitter bulk the null geodesics never return to the brane. Null geodesics do, however, return for k=+1 when we include the Schwarzschild-like mass, and the condition of return is obtained from the corresponding effective potential. We next obtain the condition for a gravitational signal to be emitted from a FRW brane with isotropically distributed matter. We use these results to investigate the conditions under which shortcuts through the bulk are possible.
Submitted 4 October, 2014; v1 submitted 25 February, 2005;
originally announced February 2005.