Skip to main content

Showing 1–12 of 12 results for author: Farabet, C

.
  1. arXiv:2404.07839  [pdf, other

    cs.LG cs.AI cs.CL

    RecurrentGemma: Moving Past Transformers for Efficient Open Language Models

    Authors: Aleksandar Botev, Soham De, Samuel L Smith, Anushan Fernando, George-Cristian Muraru, Ruba Haroun, Leonard Berrada, Razvan Pascanu, Pier Giuseppe Sessa, Robert Dadashi, Léonard Hussenot, Johan Ferret, Sertan Girgin, Olivier Bachem, Alek Andreev, Kathleen Kenealy, Thomas Mesnard, Cassidy Hardin, Surya Bhupatiraju, Shreya Pathak, Laurent Sifre, Morgane Rivière, Mihir Sanjay Kale, Juliette Love, Pouya Tafti , et al. (37 additional authors not shown)

    Abstract: We introduce RecurrentGemma, an open language model which uses Google's novel Griffin architecture. Griffin combines linear recurrences with local attention to achieve excellent performance on language. It has a fixed-sized state, which reduces memory use and enables efficient inference on long sequences. We provide a pre-trained model with 2B non-embedding parameters, and an instruction tuned var… ▽ More

    Submitted 11 April, 2024; originally announced April 2024.

  2. arXiv:2403.08295  [pdf, other

    cs.CL cs.AI

    Gemma: Open Models Based on Gemini Research and Technology

    Authors: Gemma Team, Thomas Mesnard, Cassidy Hardin, Robert Dadashi, Surya Bhupatiraju, Shreya Pathak, Laurent Sifre, Morgane Rivière, Mihir Sanjay Kale, Juliette Love, Pouya Tafti, Léonard Hussenot, Pier Giuseppe Sessa, Aakanksha Chowdhery, Adam Roberts, Aditya Barua, Alex Botev, Alex Castro-Ros, Ambrose Slone, Amélie Héliou, Andrea Tacchetti, Anna Bulanova, Antonia Paterson, Beth Tsai, Bobak Shahriari , et al. (83 additional authors not shown)

    Abstract: This work introduces Gemma, a family of lightweight, state-of-the art open models built from the research and technology used to create Gemini models. Gemma models demonstrate strong performance across academic benchmarks for language understanding, reasoning, and safety. We release two sizes of models (2 billion and 7 billion parameters), and provide both pretrained and fine-tuned checkpoints. Ge… ▽ More

    Submitted 16 April, 2024; v1 submitted 13 March, 2024; originally announced March 2024.

  3. arXiv:2403.05530  [pdf, other

    cs.CL cs.AI

    Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

    Authors: Gemini Team, Petko Georgiev, Ving Ian Lei, Ryan Burnell, Libin Bai, Anmol Gulati, Garrett Tanzer, Damien Vincent, Zhufeng Pan, Shibo Wang, Soroosh Mariooryad, Yifan Ding, Xinyang Geng, Fred Alcober, Roy Frostig, Mark Omernick, Lexi Walker, Cosmin Paduraru, Christina Sorokin, Andrea Tacchetti, Colin Gaffney, Samira Daruki, Olcan Sercinoglu, Zach Gleicher, Juliette Love , et al. (1092 additional authors not shown)

    Abstract: In this report, we introduce the Gemini 1.5 family of models, representing the next generation of highly compute-efficient multimodal models capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio. The family includes two new models: (1) an updated Gemini 1.5 Pro, which exceeds the February… ▽ More

    Submitted 14 June, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

  4. arXiv:2312.11805  [pdf, other

    cs.CL cs.AI cs.CV

    Gemini: A Family of Highly Capable Multimodal Models

    Authors: Gemini Team, Rohan Anil, Sebastian Borgeaud, Jean-Baptiste Alayrac, Jiahui Yu, Radu Soricut, Johan Schalkwyk, Andrew M. Dai, Anja Hauth, Katie Millican, David Silver, Melvin Johnson, Ioannis Antonoglou, Julian Schrittwieser, Amelia Glaese, Jilin Chen, Emily Pitler, Timothy Lillicrap, Angeliki Lazaridou, Orhan Firat, James Molloy, Michael Isard, Paul R. Barham, Tom Hennigan, Benjamin Lee , et al. (1325 additional authors not shown)

    Abstract: This report introduces a new family of multimodal models, Gemini, that exhibit remarkable capabilities across image, audio, video, and text understanding. The Gemini family consists of Ultra, Pro, and Nano sizes, suitable for applications ranging from complex reasoning tasks to on-device memory-constrained use-cases. Evaluation on a broad range of benchmarks shows that our most-capable Gemini Ultr… ▽ More

    Submitted 17 June, 2024; v1 submitted 18 December, 2023; originally announced December 2023.

  5. arXiv:2311.02462  [pdf, ps, other

    cs.AI

    Levels of AGI for Operationalizing Progress on the Path to AGI

    Authors: Meredith Ringel Morris, Jascha Sohl-dickstein, Noah Fiedel, Tris Warkentin, Allan Dafoe, Aleksandra Faust, Clement Farabet, Shane Legg

    Abstract: We propose a framework for classifying the capabilities and behavior of Artificial General Intelligence (AGI) models and their precursors. This framework introduces levels of AGI performance, generality, and autonomy, providing a common language to compare models, assess risks, and measure progress along the path to AGI. To develop our framework, we analyze existing definitions of AGI, and distill… ▽ More

    Submitted 5 June, 2024; v1 submitted 4 November, 2023; originally announced November 2023.

    Comments: version 4 - Position Paper accepted to ICML 2024. Note that due to ICML position paper titling format requirements, the title has changed slightly from that of the original arXiv pre-print. The original pre-print title was "Levels of AGI: Operationalizing Progress on the Path to AGI" but the official published title for ICML 2024 is "Levels of AGI for Operationalizing Progress on the Path to AGI"

    Journal ref: Proceedings of ICML 2024

  6. arXiv:2103.16130  [pdf, other

    cs.CV

    Active Learning for Deep Object Detection via Probabilistic Modeling

    Authors: Jiwoong Choi, Ismail Elezi, Hyuk-Jae Lee, Clement Farabet, Jose M. Alvarez

    Abstract: Active learning aims to reduce labeling costs by selecting only the most informative samples on a dataset. Few existing works have addressed active learning for object detection. Most of these methods are based on multiple models or are straightforward extensions of classification methods, hence estimate an image's informativeness using only the classification head. In this paper, we propose a nov… ▽ More

    Submitted 20 August, 2021; v1 submitted 30 March, 2021; originally announced March 2021.

    Comments: ICCV 2021, includes supplementary material

  7. arXiv:2004.04699  [pdf, other

    cs.CV

    Scalable Active Learning for Object Detection

    Authors: Elmar Haussmann, Michele Fenzi, Kashyap Chitta, Jan Ivanecky, Hanson Xu, Donna Roy, Akshita Mittel, Nicolas Koumchatzky, Clement Farabet, Jose M. Alvarez

    Abstract: Deep Neural Networks trained in a fully supervised fashion are the dominant technology in perception-based autonomous driving systems. While collecting large amounts of unlabeled data is already a major undertaking, only a subset of it can be labeled by humans due to the effort needed for high-quality annotation. Therefore, finding the right data to label has become a key challenge. Active learnin… ▽ More

    Submitted 9 April, 2020; originally announced April 2020.

    Comments: accepted at IEEE-IV2020

  8. arXiv:1905.12737  [pdf, other

    cs.LG stat.ML

    Training Data Subset Search with Ensemble Active Learning

    Authors: Kashyap Chitta, Jose M. Alvarez, Elmar Haussmann, Clement Farabet

    Abstract: Deep Neural Networks (DNNs) often rely on very large datasets for training. Given the large size of such datasets, it is conceivable that they contain certain samples that either do not contribute or negatively impact the DNN's optimization. Modifying the training distribution in a way that excludes such samples could provide an effective solution to both improve performance and reduce training ti… ▽ More

    Submitted 7 November, 2020; v1 submitted 29 May, 2019; originally announced May 2019.

  9. arXiv:1301.3572  [pdf, other

    cs.CV

    Indoor Semantic Segmentation using depth information

    Authors: Camille Couprie, Clément Farabet, Laurent Najman, Yann LeCun

    Abstract: This work addresses multi-class segmentation of indoor scenes with RGB-D inputs. While this area of research has gained much attention recently, most works still rely on hand-crafted features. In contrast, we apply a multiscale convolutional network to learn features directly from the images and the depth information. We obtain state-of-the-art on the NYU-v2 depth dataset with an accuracy of 64.5%… ▽ More

    Submitted 14 March, 2013; v1 submitted 15 January, 2013; originally announced January 2013.

    Comments: 8 pages, 3 figures

  10. arXiv:1301.2820  [pdf, other

    cs.CV

    Clustering Learning for Robotic Vision

    Authors: Eugenio Culurciello, Jordan Bates, Aysegul Dundar, Jose Carrasco, Clement Farabet

    Abstract: We present the clustering learning technique applied to multi-layer feedforward deep neural networks. We show that this unsupervised learning technique can compute network filters with only a few minutes and a much reduced set of parameters. The goal of this paper is to promote the technique for general-purpose robotic vision systems. We report its use in static image datasets and object tracking… ▽ More

    Submitted 13 March, 2013; v1 submitted 13 January, 2013; originally announced January 2013.

    Comments: Code for this paper is available here: https://github.com/culurciello/CL_paper1_code

  11. arXiv:1301.1671  [pdf, other

    cs.CV

    Causal graph-based video segmentation

    Authors: Camille Couprie, Clément Farabet, Yann LeCun

    Abstract: Numerous approaches in image processing and computer vision are making use of super-pixels as a pre-processing step. Among the different methods producing such over-segmentation of an image, the graph-based approach of Felzenszwalb and Huttenlocher is broadly employed. One of its interesting properties is that the regions are computed in a greedy manner in quasi-linear time. The algorithm may be t… ▽ More

    Submitted 8 January, 2013; originally announced January 2013.

    Comments: 6 pages, 5 figures

  12. arXiv:1202.2160  [pdf, other

    cs.CV cs.LG

    Scene Parsing with Multiscale Feature Learning, Purity Trees, and Optimal Covers

    Authors: Clément Farabet, Camille Couprie, Laurent Najman, Yann LeCun

    Abstract: Scene parsing, or semantic segmentation, consists in labeling each pixel in an image with the category of the object it belongs to. It is a challenging task that involves the simultaneous detection, segmentation and recognition of all the objects in the image. The scene parsing method proposed here starts by computing a tree of segments from a graph of pixel dissimilarities. Simultaneously, a se… ▽ More

    Submitted 13 July, 2012; v1 submitted 9 February, 2012; originally announced February 2012.

    Comments: 9 pages, 4 figures - Published in 29th International Conference on Machine Learning (ICML 2012), Jun 2012, Edinburgh, United Kingdom