Showing 1–10 of 10 results for author: Furlanello, T

  1. arXiv:2404.00399  [pdf, other]

    cs.CL cs.AI cs.LG

    Aurora-M: The First Open Source Multilingual Language Model Red-teamed according to the U.S. Executive Order

    Authors: Taishi Nakamura, Mayank Mishra, Simone Tedeschi, Yekun Chai, Jason T Stillerman, Felix Friedrich, Prateek Yadav, Tanmay Laud, Vu Minh Chien, Terry Yue Zhuo, Diganta Misra, Ben Bogin, Xuan-Son Vu, Marzena Karpinska, Arnav Varma Dantuluri, Wojciech Kusa, Tommaso Furlanello, Rio Yokota, Niklas Muennighoff, Suhas Pai, Tosin Adewumi, Veronika Laippala, Xiaozhe Yao, Adalberto Junior, Alpay Ariyak, et al. (20 additional authors not shown)

    Abstract: Pretrained language models underpin several AI applications, but their high computational cost for training limits accessibility. Initiatives such as BLOOM and StarCoder aim to democratize access to pretrained models for collaborative community development. However, such existing models face challenges: limited multilingual capabilities, continual pretraining causing catastrophic forgetting, where…

    Submitted 23 April, 2024; v1 submitted 30 March, 2024; originally announced April 2024.

    Comments: Preprint

  2. arXiv:1906.10437  [pdf, other]

    cs.LG stat.ML

    Learning Causal State Representations of Partially Observable Environments

    Authors: Amy Zhang, Zachary C. Lipton, Luis Pineda, Kamyar Azizzadenesheli, Anima Anandkumar, Laurent Itti, Joelle Pineau, Tommaso Furlanello

    Abstract: Intelligent agents can cope with sensory-rich environments by learning task-agnostic state abstractions. In this paper, we propose an algorithm to approximate causal states, which are the coarsest partition of the joint history of actions and observations in partially-observable Markov decision processes (POMDP). Our method learns approximate causal state representations from RNNs trained to predi…

    Submitted 8 February, 2021; v1 submitted 25 June, 2019; originally announced June 2019.

    Comments: 35 pages, 8 figures

  3. arXiv:1805.04770  [pdf, other]

    stat.ML cs.AI cs.LG

    Born Again Neural Networks

    Authors: Tommaso Furlanello, Zachary C. Lipton, Michael Tschannen, Laurent Itti, Anima Anandkumar

    Abstract: Knowledge Distillation (KD) consists of transferring “knowledge” from one machine learning model (the teacher) to another (the student). Commonly, the teacher is a high-capacity model with formidable performance, while the student is more compact. By transferring knowledge, one hopes to benefit from the student’s compactness, without sacrificing too much performance. We study KD from a new p…

    Submitted 29 June, 2018; v1 submitted 12 May, 2018; originally announced May 2018.

    Comments: Published @ICML 2018

  4. arXiv:1804.02088  [pdf, other]

    cs.CV

    Question Type Guided Attention in Visual Question Answering

    Authors: Yang Shi, Tommaso Furlanello, Sheng Zha, Animashree Anandkumar

    Abstract: Visual Question Answering (VQA) requires integration of feature maps with drastically different structures and focus of the correct regions. Image descriptors have structures at multiple spatial scales, while lexical inputs inherently follow a temporal sequence and naturally cluster into semantically different question types. A lot of previous works use complex models to extract feature representa…

    Submitted 18 July, 2018; v1 submitted 5 April, 2018; originally announced April 2018.

  5. arXiv:1707.08308  [pdf, other]

    cs.LG

    Tensor Regression Networks

    Authors: Jean Kossaifi, Zachary C. Lipton, Arinbjorn Kolbeinsson, Aran Khanna, Tommaso Furlanello, Anima Anandkumar

    Abstract: Convolutional neural networks typically consist of many convolutional layers followed by one or more fully connected layers. While convolutional layers map between high-order activation tensors, the fully connected layers operate on flattened activation vectors. Despite empirical success, this approach has notable drawbacks. Flattening followed by fully connected layers discards multilinear struct…

    Submitted 20 July, 2020; v1 submitted 26 July, 2017; originally announced July 2017.

  6. arXiv:1706.06706  [pdf, other]

    cs.CV

    Compact Tensor Pooling for Visual Question Answering

    Authors: Yang Shi, Tommaso Furlanello, Anima Anandkumar

    Abstract: Performing high level cognitive tasks requires the integration of feature maps with drastically different structure. In Visual Question Answering (VQA) image descriptors have spatial structures, while lexical inputs inherently follow a temporal sequence. The recently proposed Multimodal Compact Bilinear pooling (MCB) forms the outer products, via count-sketch approximation, of the visual and textu…

    Submitted 20 June, 2017; originally announced June 2017.

  7. arXiv:1706.00439  [pdf, other]

    cs.LG

    Tensor Contraction Layers for Parsimonious Deep Nets

    Authors: Jean Kossaifi, Aran Khanna, Zachary C. Lipton, Tommaso Furlanello, Anima Anandkumar

    Abstract: Tensors offer a natural representation for many kinds of data frequently encountered in machine learning. Images, for example, are naturally represented as third order tensors, where the modes correspond to height, width, and channels. Tensor methods are noted for their ability to discover multi-dimensional dependencies, and tensor decompositions in particular, have been used to produce compact lo…

    Submitted 1 June, 2017; originally announced June 2017.

  8. arXiv:1606.02355  [pdf, other]

    cs.LG cs.AI stat.ML

    Active Long Term Memory Networks

    Authors: Tommaso Furlanello, Jiaping Zhao, Andrew M. Saxe, Laurent Itti, Bosco S. Tjan

    Abstract: Continual Learning in artificial neural networks suffers from interference and forgetting when different tasks are learned sequentially. This paper introduces the Active Long Term Memory Networks (A-LTM), a model of sequential multi-task deep learning that is able to maintain previously learned association between sensory input and behavioral output while acquiring new knowledge. A-LTM exploits t…

    Submitted 7 June, 2016; originally announced June 2016.

  9. arXiv:1411.0827  [pdf, other]

    cs.SI math.DS physics.soc-ph

    Entropy Dynamics of Community Alignment in the Italian Parliament Time-Dependent Network

    Authors: Gabriele Lami, Marco Cristoforetti, Giuseppe Jurman, Cesare Furlanello, Tommaso Furlanello

    Abstract: Complex institutions are typically characterized by meso-scale structures which are fundamental for the successful coordination of multiple agents. Here we introduce a framework to study the temporal dynamics of the node-community relationship based on the concept of community alignment, a measure derived from the modularity matrix that defines the alignment of a node with respect to the core of i…

    Submitted 4 November, 2014; originally announced November 2014.

  10. arXiv:1310.6547  [pdf, ps, other]

    q-bio.NC q-bio.QM stat.ML

    Sparse Predictive Structure of Deconvolved Functional Brain Networks

    Authors: Tommaso Furlanello, Marco Cristoforetti, Cesare Furlanello, Giuseppe Jurman

    Abstract: The functional and structural representation of the brain as a complex network is marked by the fact that the comparison of noisy and intrinsically correlated high-dimensional structures between experimental conditions or groups shuns typical mass univariate methods. Furthermore most network estimation methods cannot distinguish between real and spurious correlation arising from the convolution du…

    Submitted 24 October, 2013; originally announced October 2013.