
Showing 1–18 of 18 results for author: Lacroix, T

Searching in archive cs.
  1. arXiv:2401.04088

    cs.LG cs.CL

    Mixtral of Experts

    Authors: Albert Q. Jiang, Alexandre Sablayrolles, Antoine Roux, Arthur Mensch, Blanche Savary, Chris Bamford, Devendra Singh Chaplot, Diego de las Casas, Emma Bou Hanna, Florian Bressand, Gianna Lengyel, Guillaume Bour, Guillaume Lample, Lélio Renard Lavaud, Lucile Saulnier, Marie-Anne Lachaux, Pierre Stock, Sandeep Subramanian, Sophia Yang, Szymon Antoniak, Teven Le Scao, Théophile Gervet, Thibaut Lavril, Thomas Wang, Timothée Lacroix, et al. (1 additional author not shown)

    Abstract: We introduce Mixtral 8x7B, a Sparse Mixture of Experts (SMoE) language model. Mixtral has the same architecture as Mistral 7B, with the difference that each layer is composed of 8 feedforward blocks (i.e. experts). For every token, at each layer, a router network selects two experts to process the current state and combine their outputs. Even though each token only sees two experts, the selected e…

    Submitted 8 January, 2024; originally announced January 2024.

    Comments: See more details at https://mistral.ai/news/mixtral-of-experts/

  2. arXiv:2310.06825

    cs.CL cs.AI cs.LG

    Mistral 7B

    Authors: Albert Q. Jiang, Alexandre Sablayrolles, Arthur Mensch, Chris Bamford, Devendra Singh Chaplot, Diego de las Casas, Florian Bressand, Gianna Lengyel, Guillaume Lample, Lucile Saulnier, Lélio Renard Lavaud, Marie-Anne Lachaux, Pierre Stock, Teven Le Scao, Thibaut Lavril, Thomas Wang, Timothée Lacroix, William El Sayed

    Abstract: We introduce Mistral 7B v0.1, a 7-billion-parameter language model engineered for superior performance and efficiency. Mistral 7B outperforms Llama 2 13B across all evaluated benchmarks, and Llama 1 34B in reasoning, mathematics, and code generation. Our model leverages grouped-query attention (GQA) for faster inference, coupled with sliding window attention (SWA) to effectively handle sequences o…

    Submitted 10 October, 2023; originally announced October 2023.

    Comments: Models and code are available at https://mistral.ai/news/announcing-mistral-7b/

  3. arXiv:2305.15239

    cs.AI cs.CY cs.LG

    Deep Learning and Ethics

    Authors: Travis LaCroix, Simon J. D. Prince

    Abstract: This article appears as chapter 21 of Prince (2023, Understanding Deep Learning); a complete draft of the textbook is available here: http://udlbook.com. This chapter considers potential harms arising from the design and use of AI systems. These include algorithmic bias, lack of explainability, data privacy violations, militarization, fraud, and environmental concerns. The aim is not to provide ad…

    Submitted 20 June, 2023; v1 submitted 24 May, 2023; originally announced May 2023.

    Comments: Copyright in this Work has been licensed exclusively to The MIT Press, https://mitpress.mit.edu, which will be releasing the final version to the public in 2023. All inquiries regarding rights should be addressed to The MIT Press, Rights and Permissions Department

  4. arXiv:2302.13971

    cs.CL

    LLaMA: Open and Efficient Foundation Language Models

    Authors: Hugo Touvron, Thibaut Lavril, Gautier Izacard, Xavier Martinet, Marie-Anne Lachaux, Timothée Lacroix, Baptiste Rozière, Naman Goyal, Eric Hambro, Faisal Azhar, Aurelien Rodriguez, Armand Joulin, Edouard Grave, Guillaume Lample

    Abstract: We introduce LLaMA, a collection of foundation language models ranging from 7B to 65B parameters. We train our models on trillions of tokens, and show that it is possible to train state-of-the-art models using publicly available datasets exclusively, without resorting to proprietary and inaccessible datasets. In particular, LLaMA-13B outperforms GPT-3 (175B) on most benchmarks, and LLaMA-65B is co…

    Submitted 27 February, 2023; originally announced February 2023.

  5. arXiv:2210.12283

    cs.AI cs.LG

    Draft, Sketch, and Prove: Guiding Formal Theorem Provers with Informal Proofs

    Authors: Albert Q. Jiang, Sean Welleck, Jin Peng Zhou, Wenda Li, Jiacheng Liu, Mateja Jamnik, Timothée Lacroix, Yuhuai Wu, Guillaume Lample

    Abstract: The formalization of existing mathematical proofs is a notoriously difficult process. Despite decades of research on automation and proof assistants, writing formal proofs remains arduous and only accessible to a few experts. While previous studies to automate formalization focused on powerful search algorithms, no attempts were made to take advantage of available informal proofs. In this work, we…

    Submitted 20 February, 2023; v1 submitted 21 October, 2022; originally announced October 2022.

  6. arXiv:2207.00868

    cs.AI cs.CL cs.CY cs.LG

    The Linguistic Blind Spot of Value-Aligned Agency, Natural and Artificial

    Authors: Travis LaCroix

    Abstract: The value-alignment problem for artificial intelligence (AI) asks how we can ensure that the 'values' (i.e., objective functions) of artificial systems are aligned with the values of humanity. In this paper, I argue that linguistic communication (natural language) is a necessary condition for robust value alignment. I discuss the consequences that the truth of this claim would have for research pr…

    Submitted 2 July, 2022; originally announced July 2022.

    Comments: 49 pages; 2 Figures; 1 Table; -- Under Review

  7. arXiv:2205.11491

    cs.CL cs.AI

    HyperTree Proof Search for Neural Theorem Proving

    Authors: Guillaume Lample, Marie-Anne Lachaux, Thibaut Lavril, Xavier Martinet, Amaury Hayat, Gabriel Ebner, Aurélien Rodriguez, Timothée Lacroix

    Abstract: We propose an online training procedure for a transformer-based automated theorem prover. Our approach leverages a new search algorithm, HyperTree Proof Search (HTPS), inspired by the recent success of AlphaZero. Our model learns from previous proof searches through online training, allowing it to generalize to domains far from the training distribution. We report detailed ablations of our pipelin…

    Submitted 23 May, 2022; originally announced May 2022.

  8. arXiv:2204.05151

    cs.CY cs.AI cs.LG

    Metaethical Perspectives on 'Benchmarking' AI Ethics

    Authors: Travis LaCroix, Alexandra Sasha Luccioni

    Abstract: Benchmarks are seen as the cornerstone for measuring technical progress in Artificial Intelligence (AI) research and have been developed for a variety of tasks ranging from question answering to facial recognition. An increasingly prominent research area in AI is ethics, which currently has no set of benchmarks nor commonly accepted way for measuring the 'ethicality' of an AI system. In this paper…

    Submitted 11 April, 2022; originally announced April 2022.

    Comments: 39 Pages

  9. Moral Dilemmas for Moral Machines

    Authors: Travis LaCroix

    Abstract: Autonomous systems are being developed and deployed in situations that may require some degree of ethical decision-making ability. As a result, research in machine ethics has proliferated in recent years. This work has included using moral dilemmas as validation mechanisms for implementing decision-making algorithms in ethically-loaded situations. Using trolley-style problems in the context of aut…

    Submitted 11 March, 2022; originally announced March 2022.

    Comments: AI Ethics (2022)

  10. arXiv:2112.08256

    cs.CY cs.AI cs.CL

    Est-ce que vous compute? Code-switching, cultural identity, and AI

    Authors: Arianna Falbo, Travis LaCroix

    Abstract: Cultural code-switching concerns how we adjust our overall behaviours, manners of speaking, and appearance in response to a perceived change in our social environment. We defend the need to investigate cultural code-switching capacities in artificial intelligence systems. We explore a series of ethical and epistemic issues that arise when bringing cultural code-switching to bear on artificial inte…

    Submitted 15 December, 2021; originally announced December 2021.

    Comments: 19 pages. Under Review. Please cite published version, if available

  11. arXiv:2101.10276

    cs.LG cs.AI cs.MA

    Emergent Communication under Competition

    Authors: Michael Noukhovitch, Travis LaCroix, Angeliki Lazaridou, Aaron Courville

    Abstract: The literature in modern machine learning has only negative results for learning to communicate between competitive agents using standard RL. We introduce a modified sender-receiver game to study the spectrum of partially-competitive scenarios and show communication can indeed emerge in a competitive setting. We empirically demonstrate three key takeaways for future research. First, we show that c…

    Submitted 25 January, 2021; originally announced January 2021.

    Comments: To be presented at AAMAS 2021

  12. arXiv:2006.05203

    cs.CY cs.AI cs.GT cs.LG

    The Tragedy of the AI Commons

    Authors: Travis LaCroix, Aydin Mohseni

    Abstract: Policy and guideline proposals for ethical artificial-intelligence research have proliferated in recent years. These are supposed to guide the socially-responsible development of AI for the common good. However, there typically exist incentives for non-cooperation (i.e., non-adherence to such policies and guidelines); and, these proposals often lack effective mechanisms to enforce their own normat…

    Submitted 18 January, 2021; v1 submitted 9 June, 2020; originally announced June 2020.

    Comments: 40 Pages, 5 Figures

  13. arXiv:2004.04926

    stat.ML cs.LG

    Tensor Decompositions for temporal knowledge base completion

    Authors: Timothée Lacroix, Guillaume Obozinski, Nicolas Usunier

    Abstract: Most algorithms for representation learning and link prediction in relational data have been designed for static data. However, the data they are applied to usually evolves with time, such as friend graphs in social networks or user interactions with items in recommender systems. This is also the case for knowledge bases, which contain facts such as (US, has president, B. Obama, [2009-2017]) that…

    Submitted 10 April, 2020; originally announced April 2020.

  14. arXiv:2001.00006

    cs.CY cs.AI cs.LG stat.ML

    Learning from Learning Machines: Optimisation, Rules, and Social Norms

    Authors: Travis LaCroix, Yoshua Bengio

    Abstract: There is an analogy between machine learning systems and economic entities in that they are both adaptive, and their behaviour is specified in a more-or-less explicit way. It appears that the area of AI that is most analogous to the behaviour of economic entities is that of morally good decision-making, but it is an open question as to how precisely moral behaviour can be achieved in an AI system.…

    Submitted 29 December, 2019; originally announced January 2020.

  15. arXiv:1911.11668

    cs.LG cs.AI cs.CL stat.ML

    Biology and Compositionality: Empirical Considerations for Emergent-Communication Protocols

    Authors: Travis LaCroix

    Abstract: Significant advances have been made in artificial systems by using biological systems as a guide. However, there is often little interaction between computational models for emergent communication and biological models of the emergence of language. Many researchers in language origins and emergent communication take compositionality as their primary target for explaining how simple communication s…

    Submitted 27 December, 2019; v1 submitted 26 November, 2019; originally announced November 2019.

    Comments: Accepted for NeurIPS 2019 workshop Emergent Communication: Towards Natural Language

  16. arXiv:1903.12287

    cs.LG cs.AI cs.DC cs.SI stat.ML

    PyTorch-BigGraph: A Large-scale Graph Embedding System

    Authors: Adam Lerer, Ledell Wu, Jiajun Shen, Timothee Lacroix, Luca Wehrstedt, Abhijit Bose, Alex Peysakhovich

    Abstract: Graph embedding methods produce unsupervised node features from graphs that can then be used for a variety of machine learning tasks. Modern graphs, particularly in industrial applications, contain billions of nodes and trillions of edges, which exceeds the capability of existing embedding systems. We present PyTorch-BigGraph (PBG), an embedding system that incorporates several modifications to tr…

    Submitted 9 April, 2019; v1 submitted 28 March, 2019; originally announced March 2019.

    Journal ref: Proceedings of The Conference on Systems and Machine Learning, 2019

  17. arXiv:1806.07297

    stat.ML cs.AI cs.LG cs.SI

    Canonical Tensor Decomposition for Knowledge Base Completion

    Authors: Timothée Lacroix, Nicolas Usunier, Guillaume Obozinski

    Abstract: The problem of Knowledge Base Completion can be framed as a 3rd-order binary tensor completion problem. In this light, the Canonical Tensor Decomposition (CP) (Hitchcock, 1927) seems like a natural solution; however, current implementations of CP on standard Knowledge Base Completion benchmarks are lagging behind their competitors. In this work, we attempt to understand the limits of CP for knowle…

    Submitted 19 June, 2018; originally announced June 2018.

  18. arXiv:1611.00625

    cs.LG cs.AI

    TorchCraft: a Library for Machine Learning Research on Real-Time Strategy Games

    Authors: Gabriel Synnaeve, Nantas Nardelli, Alex Auvolat, Soumith Chintala, Timothée Lacroix, Zeming Lin, Florian Richoux, Nicolas Usunier

    Abstract: We present TorchCraft, a library that enables deep learning research on Real-Time Strategy (RTS) games such as StarCraft: Brood War, by making it easier to control these games from a machine learning framework, here Torch. This white paper argues for using RTS games as a benchmark for AI research, and describes the design and components of TorchCraft.

    Submitted 3 November, 2016; v1 submitted 1 November, 2016; originally announced November 2016.

    ACM Class: I.2.1