Showing 1–14 of 14 results for author: Ritter, M

  1. arXiv:2403.05530  [pdf, other]

    cs.CL cs.AI

    Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

    Authors: Gemini Team, Petko Georgiev, Ving Ian Lei, Ryan Burnell, Libin Bai, Anmol Gulati, Garrett Tanzer, Damien Vincent, Zhufeng Pan, Shibo Wang, Soroosh Mariooryad, Yifan Ding, Xinyang Geng, Fred Alcober, Roy Frostig, Mark Omernick, Lexi Walker, Cosmin Paduraru, Christina Sorokin, Andrea Tacchetti, Colin Gaffney, Samira Daruki, Olcan Sercinoglu, Zach Gleicher, Juliette Love, et al. (1092 additional authors not shown)

    Abstract: In this report, we introduce the Gemini 1.5 family of models, representing the next generation of highly compute-efficient multimodal models capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio. The family includes two new models: (1) an updated Gemini 1.5 Pro, which exceeds the February…

    Submitted 14 June, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

  2. arXiv:2307.07439  [pdf, other]

    eess.IV cs.CV cs.LG

    Atlas-Based Interpretable Age Prediction In Whole-Body MR Images

    Authors: Sophie Starck, Yadunandan Vivekanand Kini, Jessica Johanna Maria Ritter, Rickmer Braren, Daniel Rueckert, Tamara Mueller

    Abstract: Age prediction is an important part of medical assessments and research. It can aid in detecting diseases as well as abnormal ageing by highlighting the discrepancy between chronological and biological age. To gain a comprehensive understanding of age-related changes observed in various body parts, we investigate them on a larger scale by using whole-body 3D images. We utilise the Grad-CAM interpr…

    Submitted 2 November, 2023; v1 submitted 14 July, 2023; originally announced July 2023.

  3. arXiv:2305.20013  [pdf]

    quant-ph cs.NI

    Software Architecture for Operation and Use of Quantum Communications Networks

    Authors: Dinesh Verma, Eden Figueroa, Gabriella Carini, Mark Ritter

    Abstract: Quantum Communications Networks using the properties of qubits, namely state superposition, no-cloning and entanglement, can enable the exchange of information in a very secure manner across optical links or free space. New innovations enable the use of optical repeaters as well as multi-cast communication in the networks. Some types of quantum communications mechanisms can be implemented at room-…

    Submitted 31 May, 2023; originally announced May 2023.

  4. arXiv:2305.18565  [pdf, other]

    cs.CV cs.CL cs.LG

    PaLI-X: On Scaling up a Multilingual Vision and Language Model

    Authors: Xi Chen, Josip Djolonga, Piotr Padlewski, Basil Mustafa, Soravit Changpinyo, Jialin Wu, Carlos Riquelme Ruiz, Sebastian Goodman, Xiao Wang, Yi Tay, Siamak Shakeri, Mostafa Dehghani, Daniel Salz, Mario Lucic, Michael Tschannen, Arsha Nagrani, Hexiang Hu, Mandar Joshi, Bo Pang, Ceslee Montgomery, Paulina Pietrzyk, Marvin Ritter, AJ Piergiovanni, Matthias Minderer, Filip Pavetic, et al. (18 additional authors not shown)

    Abstract: We present the training recipe and results of scaling up PaLI-X, a multilingual vision and language model, both in terms of size of the components and the breadth of its training task mixture. Our model achieves new levels of performance on a wide-range of varied and complex tasks, including multiple image-based captioning and question-answering tasks, image-based document understanding and few-sh…

    Submitted 29 May, 2023; originally announced May 2023.

  5. arXiv:2303.00693  [pdf, other]

    hep-ph cs.CV stat.ML

    PE-GAN: Prior Embedding GAN for PXD images at Belle II

    Authors: Baran Hashemi, Nikolai Hartmann, Thomas Kuhr, Martin Ritter, Matej Srebre

    Abstract: The pixel vertex detector (PXD) is an essential part of the Belle II detector recording particle positions. Data from the PXD and other sensors allow us to reconstruct particle tracks and decay vertices. The effect of background hits on track reconstruction is simulated by adding measured or simulated background hit patterns to the hits produced by simulated signal particles. This model requires…

    Submitted 1 March, 2023; originally announced March 2023.

    Comments: 25th International Conference on Computing in High Energy and Nuclear Physics (CHEP 2021)

  6. arXiv:2208.10607  [pdf, other]

    cs.CV

    Individual Tree Detection in Large-Scale Urban Environments using High-Resolution Multispectral Imagery

    Authors: Jonathan Ventura, Camille Pawlak, Milo Honsberger, Cameron Gonsalves, Julian Rice, Natalie L. R. Love, Skyler Han, Viet Nguyen, Keilana Sugano, Jacqueline Doremus, G. Andrew Fricker, Jenn Yost, Matt Ritter

    Abstract: We introduce a novel deep learning method for detection of individual trees in urban environments using high-resolution multispectral aerial imagery. We use a convolutional neural network to regress a confidence map indicating the locations of individual trees, which are localized using a peak finding algorithm. Our method provides complete spatial coverage by detecting trees in both public and pr…

    Submitted 27 October, 2022; v1 submitted 22 August, 2022; originally announced August 2022.

  7. arXiv:2203.17189  [pdf, other]

    cs.LG cs.CL

    Scaling Up Models and Data with $\texttt{t5x}$ and $\texttt{seqio}$

    Authors: Adam Roberts, Hyung Won Chung, Anselm Levskaya, Gaurav Mishra, James Bradbury, Daniel Andor, Sharan Narang, Brian Lester, Colin Gaffney, Afroz Mohiuddin, Curtis Hawthorne, Aitor Lewkowycz, Alex Salcianu, Marc van Zee, Jacob Austin, Sebastian Goodman, Livio Baldini Soares, Haitang Hu, Sasha Tsvyashchenko, Aakanksha Chowdhery, Jasmijn Bastings, Jannis Bulian, Xavier Garcia, Jianmo Ni, Andrew Chen, et al. (18 additional authors not shown)

    Abstract: Recent neural network-based language models have benefited greatly from scaling up the size of training datasets and the number of parameters in the models themselves. Scaling can be complicated due to various factors including the need to distribute computation on supercomputer clusters (e.g., TPUs), prevent bottlenecks when infeeding data, and ensure reproducible results. In this work, we presen…

    Submitted 31 March, 2022; originally announced March 2022.

  8. arXiv:2107.12283  [pdf, other]

    cs.CV

    Continental-Scale Building Detection from High Resolution Satellite Imagery

    Authors: Wojciech Sirko, Sergii Kashubin, Marvin Ritter, Abigail Annkah, Yasser Salah Eddine Bouchareb, Yann Dauphin, Daniel Keysers, Maxim Neumann, Moustapha Cisse, John Quinn

    Abstract: Identifying the locations and footprints of buildings is vital for many practical and scientific purposes. Such information can be particularly useful in developing regions where alternative data sources may be scarce. In this work, we describe a model training pipeline for detecting buildings across the entire continent of Africa, using 50 cm satellite imagery. Starting with the U-Net model, wide…

    Submitted 29 July, 2021; v1 submitted 26 July, 2021; originally announced July 2021.

  9. arXiv:2010.02808  [pdf, other]

    cs.CV

    Representation learning from videos in-the-wild: An object-centric approach

    Authors: Rob Romijnders, Aravindh Mahendran, Michael Tschannen, Josip Djolonga, Marvin Ritter, Neil Houlsby, Mario Lucic

    Abstract: We propose a method to learn image representations from uncurated videos. We combine a supervised loss from off-the-shelf object detectors and self-supervised losses which naturally arise from the video-shot-frame-object hierarchy present in each video. We report competitive results on 19 transfer learning tasks of the Visual Task Adaptation Benchmark (VTAB), and on 8 out-of-distribution-generaliz…

    Submitted 9 February, 2021; v1 submitted 6 October, 2020; originally announced October 2020.

    Comments: Published at WACV 2021

  10. User documentation and training at Belle II

    Authors: Sam Cunliffe, Ilya Komarov, Thomas Kuhr, Martin Ritter, Francesco Tenchini

    Abstract: Belle II is a rapidly growing collaboration with members from one hundred and nineteen institutes spread around the globe. The software development team of the experiment, as well as the software users, are very much decentralised. Together with the active development of the software, such decentralisation makes the adoption of the latest software releases by users an essential, but quite challeng…

    Submitted 9 September, 2020; originally announced September 2020.

  11. arXiv:1912.02783  [pdf, other]

    cs.CV cs.LG

    Self-Supervised Learning of Video-Induced Visual Invariances

    Authors: Michael Tschannen, Josip Djolonga, Marvin Ritter, Aravindh Mahendran, Xiaohua Zhai, Neil Houlsby, Sylvain Gelly, Mario Lucic

    Abstract: We propose a general framework for self-supervised learning of transferable visual representations based on Video-Induced Visual Invariances (VIVI). We consider the implicit hierarchy present in the videos and make use of (i) frame-level invariances (e.g. stability to color and contrast perturbations), (ii) shot/clip-level invariances (e.g. robustness to changes in object orientation and lighting…

    Submitted 1 April, 2020; v1 submitted 5 December, 2019; originally announced December 2019.

    Comments: CVPR 2020

  12. arXiv:1903.02271  [pdf, other]

    cs.LG cs.CV stat.ML

    High-Fidelity Image Generation With Fewer Labels

    Authors: Mario Lucic, Michael Tschannen, Marvin Ritter, Xiaohua Zhai, Olivier Bachem, Sylvain Gelly

    Abstract: Deep generative models are becoming a cornerstone of modern machine learning. Recent work on conditional generative adversarial networks has shown that learning complex, high-dimensional distributions over natural images is within reach. While the latest models are able to generate high-fidelity, diverse natural images at high resolution, they rely on a vast quantity of labeled data. In this work…

    Submitted 14 May, 2019; v1 submitted 6 March, 2019; originally announced March 2019.

    Comments: Mario Lucic, Michael Tschannen, and Marvin Ritter contributed equally to this work. ICML 2019 camera-ready version. Code available at https://github.com/google/compare_gan

  13. arXiv:1811.11212  [pdf, other]

    cs.LG cs.CV stat.ML

    Self-Supervised GANs via Auxiliary Rotation Loss

    Authors: Ting Chen, Xiaohua Zhai, Marvin Ritter, Mario Lucic, Neil Houlsby

    Abstract: Conditional GANs are at the forefront of natural image synthesis. The main drawback of such models is the necessity for labeled data. In this work we exploit two popular unsupervised learning techniques, adversarial training and self-supervision, and take a step towards bridging the gap between conditional and unconditional GANs. In particular, we allow the networks to collaborate on the task of r…

    Submitted 9 April, 2019; v1 submitted 27 November, 2018; originally announced November 2018.

  14. arXiv:1711.10958  [pdf, other]

    cs.SD cs.AI eess.AS

    Now Playing: Continuous low-power music recognition

    Authors: Blaise Agüera y Arcas, Beat Gfeller, Ruiqi Guo, Kevin Kilgour, Sanjiv Kumar, James Lyon, Julian Odell, Marvin Ritter, Dominik Roblek, Matthew Sharifi, Mihajlo Velimirović

    Abstract: Existing music recognition applications require a connection to a server that performs the actual recognition. In this paper we present a low-power music recognizer that runs entirely on a mobile device and automatically recognizes music without user interaction. To reduce battery consumption, a small music detector runs continuously on the mobile device's DSP chip and wakes up the main applicatio…

    Submitted 29 November, 2017; originally announced November 2017.

    Comments: Authors are listed in alphabetical order by last name