Search | arXiv e-print repository

LOCOST: State-Space Models for Long Document Abstractive Summarization

Authors: Florian Le Bronnec, Song Duong, Mathieu Ravaut, Alexandre Allauzen, Nancy F. Chen, Vincent Guigue, Alberto Lumbreras, Laure Soulier, Patrick Gallinari

Abstract: State-space models are a low-complexity alternative to transformers for encoding long sequences and capturing long-term dependencies. We propose LOCOST: an encoder-decoder architecture based on state-space models for conditional text generation with long context inputs. With a computational complexity of $O(L \log L)$, this architecture can handle significantly longer sequences than state-of-the-art models that are based on sparse attention patterns. We evaluate our model on a series of long document abstractive summarization tasks. The model reaches a performance level that is 93-96% comparable to the top-performing sparse transformers of the same size while saving up to 50% memory during training and up to 87% during inference. Additionally, LOCOST effectively handles input texts exceeding 600K tokens at inference time, setting new state-of-the-art results on full-book summarization and opening new perspectives for long input processing.

Submitted 25 March, 2024; v1 submitted 31 January, 2024; originally announced January 2024.

Comments: 9 pages, 5 figures, 7 tables, EACL 2024 conference

arXiv:2302.11269 [pdf, other]

Learning from Multiple Sources for Data-to-Text and Text-to-Data

Authors: Song Duong, Alberto Lumbreras, Mike Gartrell, Patrick Gallinari

Abstract: Data-to-text (D2T) and text-to-data (T2D) are dual tasks that convert structured data, such as graphs or tables into fluent text, and vice versa. These tasks are usually handled separately and use corpora extracted from a single source. Current systems leverage pre-trained language models fine-tuned on D2T or T2D tasks. This approach has two main limitations: first, a separate system has to be tuned for each task and source; second, learning is limited by the scarcity of available corpora. This paper considers a more general scenario where data are available from multiple heterogeneous sources. Each source, with its specific data format and semantic domain, provides a non-parallel corpus of text and structured data. We introduce a variational auto-encoder model with disentangled style and content variables that allows us to represent the diversity that stems from multiple sources of text and data. Our model is designed to handle the tasks of D2T and T2D jointly. We evaluate our model on several datasets, and show that by learning from multiple sources, our model closes the performance gap with its supervised single-source counterpart and outperforms it in some cases.

Submitted 22 February, 2023; originally announced February 2023.

Comments: AISTATS 2023

arXiv:1812.07360 [pdf, other]

doi 10.1007/s00180-016-0668-0

Non-parametric clustering over user features and latent behavioral functions with dual-view mixture models

Authors: Alberto Lumbreras, Julien Velcin, Marie Guégan, Bertrand Jouve

Abstract: We present a dual-view mixture model to cluster users based on their features and latent behavioral functions. Every component of the mixture model represents a probability density over a feature view for observed user attributes and a behavior view for latent behavioral functions that are indirectly observed through user actions or behaviors. Our task is to infer the groups of users as well as their latent behavioral functions. We also propose a non-parametric version based on a Dirichlet Process to automatically infer the number of clusters. We test the properties and performance of the model on a synthetic dataset that represents the participation of users in the threads of an online forum. Experiments show that dual-view models outperform single-view ones when one of the views lacks information.

Submitted 18 December, 2018; originally announced December 2018.

Journal ref: Lumbreras, A., Velcin, J., Guégan, M. et al. Comput Stat (2017) 32:145

arXiv:1812.06866 [pdf, other]

Bayesian Mean-parameterized Nonnegative Binary Matrix Factorization

Authors: Alberto Lumbreras, Louis Filstroff, Cédric Févotte

Abstract: Binary data matrices can represent many types of data such as social networks, votes, or gene expression. In some cases, the analysis of binary matrices can be tackled with nonnegative matrix factorization (NMF), where the observed data matrix is approximated by the product of two smaller nonnegative matrices. In this context, probabilistic NMF assumes a generative model where the data is usually Bernoulli-distributed. Often, a link function is used to map the factorization to the $[0,1]$ range, ensuring a valid Bernoulli mean parameter. However, link functions have the potential disadvantage of leading to uninterpretable models. Mean-parameterized NMF, on the contrary, overcomes this problem. We propose a unified framework for Bayesian mean-parameterized nonnegative binary matrix factorization models (NBMF). We analyze three models which correspond to three possible constraints that respect the mean-parametrization without the need for link functions. Furthermore, we derive a novel collapsed Gibbs sampler and a collapsed variational algorithm to infer the posterior distribution of the factors. Next, we extend the proposed models to a nonparametric setting where the number of used latent dimensions is automatically driven by the observed data. We analyze the performance of our NBMF methods in multiple datasets for different tasks such as dictionary learning and prediction of missing data. Experiments show that our methods provide results similar or superior to the state of the art, while automatically detecting the number of relevant components.

Submitted 20 June, 2020; v1 submitted 17 December, 2018; originally announced December 2018.

arXiv:1810.00002 [pdf, other]

doi 10.1051/0004-6361/201834312

The missing light of the Hubble Ultra Deep Field

Authors: Alejandro Borlaff, Ignacio Trujillo, Javier Román, John E. Beckman, M. Carmen Eliche-Moral, Raúl Infante-Sáinz, Alejandro Lumbreras, Rodrigo Takuro Sato Martín de Almagro, Carlos Gómez-Guijarro, María Cebrián, Antonio Dorta, Nicolás Cardiel, Mohammad Akhlaghi, Cristina Martínez-Lombilla

Abstract: The Hubble Ultra Deep Field (HUDF) is the deepest region ever observed with the Hubble Space Telescope. With the main objective of unveiling the nature of galaxies up to $z \sim 7-8$, the observing and reduction strategies have focused on the properties of small and unresolved objects, rather than the outskirts of the largest objects, which are usually over-subtracted. We aim to create a new set of WFC3/IR mosaics of the HUDF using novel techniques to preserve the properties of the low surface brightness regions. We created ABYSS: a pipeline that optimises the estimate and modelling of low-level systematic effects to obtain a robust background subtraction. We have improved four key points in the reduction: 1) creation of new absolute sky flat fields, 2) extended persistence models, 3) dedicated sky background subtraction and 4) robust co-adding. The new mosaics successfully recover the low surface brightness structure removed in the previously published HUDF reductions. The amount of light recovered with a mean surface brightness dimmer than $\overline{\mu}=26$ mag arcsec$^{-2}$ is equivalent to a m=19 mag source when compared to the XDF and a m=20 mag compared to the HUDF12. We present a set of techniques to reduce ultra-deep images ($\mu>32.5$ mag arcsec$^{-2}$, $3\sigma$ in $10\times10$ arcsec boxes), that successfully allow the detection of the low surface brightness structure of extended sources on ultra deep surveys. The developed procedures are applicable to HST, JWST, EUCLID and many other space and ground-based observatories. We will make the final ABYSS WFC3/IR HUDF mosaics publicly available at http://www.iac.es/proyecto/abyss/.

Submitted 4 February, 2019; v1 submitted 28 September, 2018; originally announced October 2018.

Comments: Published in Astronomy & Astrophysics

Journal ref: A&A 621, A133 (2019)

arXiv:1801.01799 [pdf, other]

Closed-form Marginal Likelihood in Gamma-Poisson Matrix Factorization

Authors: Louis Filstroff, Alberto Lumbreras, Cédric Févotte

Abstract: We present novel understandings of the Gamma-Poisson (GaP) model, a probabilistic matrix factorization model for count data. We show that GaP can be rewritten free of the score/activation matrix. This gives us new insights about the estimation of the topic/dictionary matrix by maximum marginal likelihood estimation. In particular, this explains the robustness of this estimator to over-specified values of the factorization rank, especially its ability to automatically prune irrelevant dictionary columns, as empirically observed in previous work. The marginalization of the activation matrix leads in turn to a new Monte Carlo Expectation-Maximization algorithm with favorable properties.

Submitted 31 May, 2018; v1 submitted 5 January, 2018; originally announced January 2018.

Comments: Accepted for publication at ICML 2018

arXiv:1401.0455 [pdf, ps, other]

doi 10.1088/0004-6256/147/4/72

SMA Submillimeter Observations of HL Tau: Revealing a compact molecular outflow

Authors: Alba M. Lumbreras, Luis A. Zapata

Abstract: We present archival high angular resolution ($\sim$ 2$''$) $^{12}$CO(3-2) line and continuum submillimeter observations of the young stellar object HL Tau made with the Submillimeter Array (SMA). The $^{12}$CO(3-2) line observations reveal the presence of a compact and wide opening angle bipolar outflow with a northeast and southwest orientation (P.A. = 50$^\circ$), and that is associated with the optical and infrared jet emanating from HL Tau with a similar orientation. On the other hand, the 850 $\mu$m continuum emission observations exhibit a strong and compact source in the position of HL Tau that has a spatial size of $\sim$ 200 $\times$ 70 AU with a P.A. $=$ 145$^\circ$, and a dust mass of around 0.1 M$_\odot$. These physical parameters are in agreement with values obtained recently from millimeter observations. This submillimeter source is therefore related to the disk surrounding HL Tau.

Submitted 2 January, 2014; originally announced January 2014.

Comments: Accepted to AJ

arXiv:1309.7187 [pdf]

Analyse des rôles dans les communautés virtuelles : définitions et premières expérimentations sur IMDb

Authors: Alberto Lumbreras, James Lanagan, Julien Velcin, Bertrand Jouve

Abstract: Role analysis in online communities allows us to understand and predict users' behavior. Though several approaches have been followed, there is still a lack of generalization of their methods and their results. In this paper, we discuss the ground theory of roles and search for a consistent and computable definition that allows the automatic detection of roles played by users in forum threads on the internet. We analyze the web site IMDb to illustrate the discussion.

Submitted 11 March, 2016; v1 submitted 27 September, 2013; originally announced September 2013.

Comments: 4e Conférence sur les modèles et l'analyse des réseaux : Approches mathématiques et informatiques, MARAMI 2013, in French

Showing 1–8 of 8 results for author: Lumbreras, A