
Showing 1–45 of 45 results for author: Torres-Moreno, J

Searching in archive cs.
  1. arXiv:2401.08870  [pdf, other]

    cs.RO stat.AP

    Benchmarking Particle Filter Algorithms for Efficient Velodyne-Based Vehicle Localization

    Authors: Jose Luis Blanco-Claraco, Francisco Mañas-Alvarez, Jose Luis Torres-Moreno, Francisco Rodriguez, Antonio Gimenez-Fernandez

    Abstract: Keeping a vehicle well-localized within a prebuilt map is at the core of any autonomous vehicle navigation system. In this work, we show that both standard SIR sampling and rejection-based optimal sampling are suitable for efficient (10 to 20 ms) real-time pose tracking without feature detection, that is, using raw point clouds from a 3D LiDAR. Motivated by the large amount of information captured b…

    Submitted 16 January, 2024; originally announced January 2024.

    Comments: 24 pages, 13 figures

  2. arXiv:2112.13241  [pdf, other]

    cs.CL cs.AI

    A Preliminary Study for Literary Rhyme Generation based on Neuronal Representation, Semantics and Shallow Parsing

    Authors: Luis-Gil Moreno-Jiménez, Juan-Manuel Torres-Moreno, Roseli S. Wedemann

    Abstract: In recent years, researchers in the area of Computational Creativity have studied the human creative process, proposing different approaches to reproduce it with a formal procedure. In this paper, we introduce a model for the generation of literary rhymes in Spanish, combining structures of language and neural network models (Word2vec) into a structure for semantic assimilation. The re…

    Submitted 25 December, 2021; originally announced December 2021.

    Comments: 7 pages, 2 figures

    Journal ref: STIL 2021 - Symposium in Information and Human Language Technology / Bracis

  3. arXiv:2112.10189  [pdf, ps, other]

    cs.CL cs.IR

    LUC at ComMA-2021 Shared Task: Multilingual Gender Biased and Communal Language Identification without using linguistic features

    Authors: Rodrigo Cuéllar-Hidalgo, Julio de Jesús Guerrero-Zambrano, Dominic Forest, Gerardo Reyes-Salgado, Juan-Manuel Torres-Moreno

    Abstract: This work aims to evaluate the ability that both probabilistic and state-of-the-art vector space modeling (VSM) methods provide to well-known machine learning algorithms to identify social network documents to be classified as aggressive, gender biased or communally charged. To this end, an exploratory stage was performed first in order to find relevant settings to test, i.e. by using training and…

    Submitted 19 December, 2021; originally announced December 2021.

    Comments: 6 pages

    Journal ref: ComMA-2021 Shared Task: Multilingual Gender Biased and Communal Language Identification

  4. arXiv:2005.08223  [pdf, ps, other]

    cs.CL cs.IR

    LiSSS: A toy corpus of Spanish Literary Sentences for Emotions detection

    Authors: Juan-Manuel Torres-Moreno, Luis-Gil Moreno-Jiménez

    Abstract: In this work we present a new small dataset in the Computational Creativity (CC) field, the Spanish Literary Sentences for emotions detection corpus (LISSS). We address this corpus of literary sentences in order to evaluate or design algorithms of emotions classification and detection. We have constituted this corpus by manually classifying the sentences in a set of emotions: Love, Fear, Happiness, An…

    Submitted 6 June, 2020; v1 submitted 17 May, 2020; originally announced May 2020.

    Comments: 8 pages, 3 tables

  5. arXiv:2005.00468  [pdf]

    cs.IR cs.CL

    Automatic Discourse Segmentation: Review and Perspectives

    Authors: Iria da Cunha, Juan-Manuel Torres-Moreno

    Abstract: Multilingual discourse parsing is a very prominent research topic. The first stage for discourse parsing is discourse segmentation. The study reported in this article addresses a review of two on-line available discourse segmenters (for English and Portuguese). We evaluate the possibility of developing similar discourse segmenters for Spanish, French and African languages.

    Submitted 1 May, 2020; originally announced May 2020.

    Comments: 5 pages, 1 figure

    Journal ref: International Workshop on African Human Language Technologies. 17-20 Jan 2010

  6. arXiv:2004.06747  [pdf, other]

    cs.IR cs.CL

    Extending Text Informativeness Measures to Passage Interestingness Evaluation (Language Model vs. Word Embedding)

    Authors: Carlos-Emiliano González-Gallardo, Eric SanJuan, Juan-Manuel Torres-Moreno

    Abstract: Standard informativeness measures used to evaluate Automatic Text Summarization mostly rely on n-gram overlapping between the automatic summary and the reference summaries. These measures differ in the metric they use (cosine, ROUGE, Kullback-Leibler, Logarithm Similarity, etc.) and the bag of terms they consider (single words, word n-grams, entities, nuggets, etc.). Recent word embedding approa…

    Submitted 14 April, 2020; originally announced April 2020.

  7. arXiv:2004.04468  [pdf, other]

    cs.CL

    A Multilingual Study of Multi-Sentence Compression using Word Vertex-Labeled Graphs and Integer Linear Programming

    Authors: Elvys Linhares Pontes, Stéphane Huet, Juan-Manuel Torres-Moreno, Thiago G. da Silva, Andréa Carneiro Linhares

    Abstract: Multi-Sentence Compression (MSC) aims to generate a short sentence with the key information from a cluster of similar sentences. MSC enables summarization and question-answering systems to generate outputs combining fully formed sentences from one or several documents. This paper describes an Integer Linear Programming method for MSC using a vertex-labeled graph to select different keywords, with…

    Submitted 9 April, 2020; originally announced April 2020.

    Comments: Preprint version

    Journal ref: Computación y Sistemas Vol. 24, No. 2, 2020

  8. arXiv:2002.04095  [pdf, other]

    cs.CL

    Automatic Discourse Segmentation: an evaluation in French

    Authors: Rémy Saksik, Alejandro Molina-Villegas, Andréa Carneiro Linhares, Juan-Manuel Torres-Moreno

    Abstract: In this article, we describe some discursive segmentation methods as well as a preliminary evaluation of the segmentation quality. Although our experiments were carried out on documents in French, we have developed three discursive segmentation models solely based on resources simultaneously available in several languages: marker lists and statistical POS labeling. We have also carried out automatic e…

    Submitted 11 June, 2020; v1 submitted 10 February, 2020; originally announced February 2020.

    Comments: 7 pages, 2 figures, 2 tables

  9. arXiv:2001.11382  [pdf, ps, other]

    cs.CL cs.IR

    Intweetive Text Summarization

    Authors: Jean Valère Cossu, Juan-Manuel Torres-Moreno, Eric SanJuan, Marc El-Bèze

    Abstract: The amount of user-generated content from various social media allows analysts to handle a wide view of conversations on several topics related to their business. Nevertheless, keeping up to date with this amount of information is not humanly feasible. Automatic Summarization then provides an interesting means to digest the dynamics and the mass volume of content. In this paper, we address the iss…

    Submitted 16 January, 2020; originally announced January 2020.

    Comments: 8 pages, 4 tables

    Journal ref: International Journal of Computational Linguistics and Applications vol. 7, no. 1, 2016, pp. 67-83

  10. arXiv:2001.11381  [pdf, other]

    cs.CL

    Generación automática de frases literarias en español

    Authors: Luis-Gil Moreno-Jiménez, Juan-Manuel Torres-Moreno, Roseli S. Wedemann

    Abstract: In this work we present a state of the art in the area of Computational Creativity (CC). In particular, we address the automatic generation of literary sentences in Spanish. We propose three models of text generation based mainly on statistical algorithms and shallow parsing analysis. We also present some rather encouraging preliminary results.

    Submitted 17 January, 2020; originally announced January 2020.

    Comments: 13 pages, in Spanish, 6 figures, 3 tables

  11. arXiv:2001.10613  [pdf, other]

    cs.CY cs.CL cs.IR

    Predicting Personalized Academic and Career Roads: First Steps Toward a Multi-Uses Recommender System

    Authors: Alexandre Nadjem, Juan-Manuel Torres-Moreno, Marc El-Bèze, Guillaume Marrel, Benoît Bonte

    Abstract: Nobody knows what they will do in the future, and everyone will have a different answer to the question: how do you see yourself five years after your current job/diploma? In this paper we introduce concepts, large categories of fields of studies or job domains in order to represent the vision of the future of the user's trajectory. Then, we show how they can influence the prediction when propo…

    Submitted 3 January, 2020; originally announced January 2020.

    Comments: 4 pages, 3 figures, 4 tables

    Journal ref: Digital Tools & Uses Congress (DTUC '18), pp. 1-4, 2018, Paris, France

  12. arXiv:2001.07098  [pdf, other]

    cs.CL cs.IR

    Audio Summarization with Audio Features and Probability Distribution Divergence

    Authors: Carlos-Emiliano González-Gallardo, Romain Deveaud, Eric SanJuan, Juan-Manuel Torres-Moreno

    Abstract: The automatic summarization of multimedia sources is an important task that facilitates the understanding of an individual by condensing the source while maintaining relevant information. In this paper we focus on audio summarization based on audio features and probability distribution divergence. Our method, based on an extractive summarization approach, aims to select the most relevant se…

    Submitted 2 April, 2020; v1 submitted 20 January, 2020; originally announced January 2020.

    Comments: 20th International Conference on Computational Linguistics and Intelligent Text Processing

  13. arXiv:2001.06190  [pdf]

    cs.AI

    Visual Simplified Characters' Emotion Emulator Implementing OCC Model

    Authors: Ana Lilia Laureano-Cruces, Laura Hernández-Domínguez, Martha Mora-Torres, Juan-Manuel Torres-Moreno, Jaime Enrique Cabrera-López

    Abstract: In this paper, we present a visual emulator of the emotions seen in characters in stories. This system is based on a simplified view of the cognitive structure of emotions proposed by Ortony, Clore and Collins (OCC Model). The goal of this paper is to provide a visual platform that allows us to observe changes in the characters' different emotions, and the intricate interrelationships between: 1)…

    Submitted 17 January, 2020; originally announced January 2020.

    Comments: 7 pages, 14 figures, 2 tables

    Journal ref: CGST Conference on Computer Science and Engineering, Istanbul, Turkey, 19-21 December 2011

  14. arXiv:2001.05285  [pdf, other]

    cs.CL

    Detecting New Word Meanings: A Comparison of Word Embedding Models in Spanish

    Authors: Andrés Torres-Rivera, Juan-Manuel Torres-Moreno

    Abstract: Semantic neologisms (SN) are defined as words that acquire a new word meaning while maintaining their form. Given the nature of this kind of neologism, the task of identifying these new word meanings is currently performed manually by specialists at observatories of neology. To detect SN in a semi-automatic way, we developed a system that implements a combination of the following strategies: topi…

    Submitted 12 January, 2020; originally announced January 2020.

    Comments: 16 pages, 3 figures

    Journal ref: Conférence en Recherche d'Informations et Applications (CORIA) 2019, France

  15. arXiv:1912.09558  [pdf, ps, other]

    cs.CL

    RIMAX: Ranking Semantic Rhymes by calculating Definition Similarity

    Authors: Alfonso Medina-Urrea, Juan-Manuel Torres-Moreno

    Abstract: This paper presents RIMAX, a new system for detecting semantic rhymes, using a Comprehensive Mexican Spanish Dictionary (DEM) and its Rhyming Dictionary (REM). We use the Vector Space Model to calculate the similarity of the definition of a query with the definitions corresponding to the assonant and consonant rhymes of the query. The preliminary results using a manual evaluation are very encourag…

    Submitted 25 December, 2019; v1 submitted 19 December, 2019; originally announced December 2019.

    Comments: 5 pages

  16. arXiv:1903.07397  [pdf, other]

    cs.CL

    Un duel probabiliste pour départager deux présidents (LIA @ DEFT'2005)

    Authors: Marc El-Bèze, Juan-Manuel Torres-Moreno, Frédéric Béchet

    Abstract: We present a set of probabilistic models applied to binary classification as defined in the DEFT'05 challenge. The challenge consisted of a mixture of two different problems in Natural Language Processing: identification of author (a sequence of François Mitterrand's sentences might have been inserted into a speech of Jacques Chirac) and thematic break detection (the subjects addressed by the two a…

    Submitted 11 March, 2019; originally announced March 2019.

    Comments: 27 figures, 1 table (in French)

    Journal ref: RNTI (E10)776:1889-1918, 2007

  17. arXiv:1810.10641  [pdf, other]

    cs.CL

    Predicting the Semantic Textual Similarity with Siamese CNN and LSTM

    Authors: Elvys Linhares Pontes, Stéphane Huet, Andréa Carneiro Linhares, Juan-Manuel Torres-Moreno

    Abstract: Semantic Textual Similarity (STS) is the basis of many applications in Natural Language Processing (NLP). Our system combines convolution and recurrent neural networks to measure the semantic similarity of sentences. It uses a convolution network to take account of the local context of words and an LSTM to consider the global context of sentences. This combination of networks helps to preserve the…

    Submitted 24 October, 2018; originally announced October 2018.

  18. arXiv:1810.10639  [pdf, ps, other]

    cs.CL

    A Multilingual Study of Compressive Cross-Language Text Summarization

    Authors: Elvys Linhares Pontes, Stéphane Huet, Juan-Manuel Torres-Moreno

    Abstract: Cross-Language Text Summarization (CLTS) generates summaries in a language different from the language of the source documents. Recent methods use information from both languages to generate summaries with the most informative sentences. However, these methods have performance that can vary according to languages, which can reduce the quality of summaries. In this paper, we propose a compressive f…

    Submitted 24 October, 2018; originally announced October 2018.

  19. arXiv:1809.00994  [pdf, other]

    cs.CL

    Étude de l'informativité des transcriptions : une approche basée sur le résumé automatique

    Authors: Carlos-Emiliano González-Gallardo, Malek Hajjem, Eric SanJuan, Juan-Manuel Torres-Moreno

    Abstract: In this paper we propose a new approach to evaluate the informativeness of transcriptions coming from Automatic Speech Recognition systems. This approach, based on the notion of informativeness, is focused on the framework of Automatic Text Summarization performed over these transcriptions. First, we estimate the informative content of the various automatic transcriptions, then we explo…

    Submitted 4 September, 2018; originally announced September 2018.

    Comments: in French, 15e Conférence en Recherche d'Information et Applications (CORIA)

  20. arXiv:1808.08850  [pdf, ps, other]

    cs.CL

    WiSeBE: Window-based Sentence Boundary Evaluation

    Authors: Carlos-Emiliano González-Gallardo, Juan-Manuel Torres-Moreno

    Abstract: Sentence Boundary Detection (SBD) has been a major research topic since Automatic Speech Recognition transcripts have been used for further Natural Language Processing tasks like Part of Speech Tagging, Question Answering or Automatic Summarization. But what about evaluation? Do standard evaluation metrics like precision, recall, F-score or classification error; and more important, evaluating an a…

    Submitted 27 August, 2018; originally announced August 2018.

    Comments: In proceedings of the 17th Mexican International Conference on Artificial Intelligence (MICAI), 2018

  21. arXiv:1802.04559  [pdf, other]

    cs.CL

    Sentence Boundary Detection for French with Subword-Level Information Vectors and Convolutional Neural Networks

    Authors: Carlos-Emiliano González-Gallardo, Juan-Manuel Torres-Moreno

    Abstract: In this work we tackle the problem of sentence boundary detection applied to French as a binary classification task ("sentence boundary" or "not sentence boundary"). We combine convolutional neural networks with subword-level information vectors, which are word embedding representations learned from Wikipedia that take advantage of the words' morphology; so each word is represented as a bag of thei…

    Submitted 13 February, 2018; originally announced February 2018.

    Comments: In proceedings of the International Conference on Natural Language, Signal and Speech Processing (ICNLSSP) 2017

  22. arXiv:1710.06524  [pdf, ps, other]

    cs.CL

    Unsupervised Sentence Representations as Word Information Series: Revisiting TF-IDF

    Authors: Ignacio Arroyo-Fernández, Carlos-Francisco Méndez-Cruz, Gerardo Sierra, Juan-Manuel Torres-Moreno, Grigori Sidorov

    Abstract: Sentence representation at the semantic level is a challenging task for Natural Language Processing and Artificial Intelligence. Despite the advances in word embeddings (i.e. word vector representations), capturing sentence meaning is an open question due to complexities of semantic interactions among words. In this paper, we present an embedding method, which is aimed at learning unsupervised sen…

    Submitted 19 October, 2017; v1 submitted 17 October, 2017; originally announced October 2017.

  23. arXiv:1703.06630  [pdf, other]

    cs.IR cs.CL

    Automatic Text Summarization Approaches to Speed up Topic Model Learning Process

    Authors: Mohamed Morchid, Juan-Manuel Torres-Moreno, Richard Dufour, Javier Ramírez-Rodríguez, Georges Linarès

    Abstract: The number of documents available on the Internet increases every day. For this reason, processing this amount of information effectively and expressibly becomes a major concern for companies and scientists. Methods that represent a textual document by a topic representation are widely used in Information Retrieval (IR) to process big data such as Wikipedia articles. One of the main difficulties in usin…

    Submitted 20 March, 2017; originally announced March 2017.

    Comments: 16 pages, 4 tables, 8 figures

    Journal ref: International Journal of Computational Linguistics and Applications, 7(2):87-109, 2016

  24. arXiv:1703.06501  [pdf, other]

    cs.CL

    Métodos de Otimização Combinatória Aplicados ao Problema de Compressão MultiFrases

    Authors: Elvys Linhares Pontes, Thiago Gouveia da Silva, Andréa Carneiro Linhares, Juan-Manuel Torres-Moreno, Stéphane Huet

    Abstract: The Internet has led to a dramatic increase in the amount of available information. In this context, reading and understanding this flow of information have become costly tasks. In recent years, to help people understand textual data, various Natural Language Processing (NLP) applications based on Combinatorial Optimization have been devised. However, for Multi-Sentences Compression (MSC),…

    Submitted 19 March, 2017; originally announced March 2017.

    Comments: 12 pages, 1 figure, 3 tables (paper in Portuguese), Preprint of XLVIII Simpósio Brasileiro de Pesquisa Operacional, 2016, Vitória, ES, (Brazil)

  25. arXiv:1703.04718  [pdf, ps, other]

    cs.CL

    Extending Automatic Discourse Segmentation for Texts in Spanish to Catalan

    Authors: Iria da Cunha, Eric SanJuan, Juan-Manuel Torres-Moreno, Irene Castellón

    Abstract: At present, automatic discourse analysis is a relevant research topic in the field of NLP. However, discourse is one of the most difficult phenomena to process. Although discourse parsers have already been developed for several languages, this tool does not exist for Catalan. In order to implement this kind of parser, the first step is to develop a discourse segmenter. In this article we present t…

    Submitted 11 March, 2017; originally announced March 2017.

    Journal ref: Proceedings of the First Workshop on Modeling, Learning and Mining for Cross/Multilinguality (MultiLingMine 2016), 38th European Conference on Information Retrieval (ECIR 2016)

  26. arXiv:1703.03923  [pdf, other]

    cs.IR cs.CL

    A German Corpus for Text Similarity Detection Tasks

    Authors: Juan-Manuel Torres-Moreno, Gerardo Sierra, Peter Peinl

    Abstract: Text similarity detection aims at measuring the degree of similarity between a pair of texts. Corpora available for text similarity detection are designed to evaluate the algorithms to assess the paraphrase level among documents. In this paper we present a textual German corpus for similarity detection. The purpose of this corpus is to automatically assess the similarity between a pair of texts an…

    Submitted 11 March, 2017; originally announced March 2017.

    Comments: 1 figure; 13 pages

    Journal ref: Preprint of International Journal of Computational Linguistics and Applications, vol. 5, no. 2, 2014, pp. 9-24

  27. arXiv:1702.06510  [pdf, ps, other]

    cs.IR cs.CL

    Algorithmes de classification et d'optimisation: participation du LIA/ADOC à DEFT'14

    Authors: Luis Adrián Cabrera-Diego, Stéphane Huet, Bassam Jabaian, Alejandro Molina, Juan-Manuel Torres-Moreno, Marc El-Bèze, Barthélémy Durette

    Abstract: This year, the DEFT campaign (Défi Fouilles de Textes) incorporates a task which aims at identifying the session in which articles of previous TALN conferences were presented. We describe the three statistical systems developed at LIA/ADOC for this task. A fusion of these systems enables us to obtain interesting results (micro-precision score of 0.76 measured on the test corpus).

    Submitted 21 February, 2017; originally announced February 2017.

    Comments: 8 pages, 3 tables, Conference paper (in French)

  28. arXiv:1702.06478  [pdf, ps, other]

    cs.CL cs.IR

    Systèmes du LIA à DEFT'13

    Authors: Xavier Bost, Ilaria Brunetti, Luis Adrián Cabrera-Diego, Jean-Valère Cossu, Andréa Linhares, Mohamed Morchid, Juan-Manuel Torres-Moreno, Marc El-Bèze, Richard Dufour

    Abstract: The 2013 Défi de Fouille de Textes (DEFT) campaign is interested in two types of language analysis tasks: document classification and information extraction in the specialized domain of cuisine recipes. We present the systems that the LIA has used in DEFT 2013. Our systems show interesting results, despite the complexity of the proposed tasks.

    Submitted 21 February, 2017; originally announced February 2017.

    Comments: 12 pages, 3 tables, (Paper in French)

    Journal ref: Proceedings of the Ninth DEFT Workshop, DEFT2013, Les Sables-d'Olonne, France, 21st June 2013

  29. arXiv:1702.06467  [pdf, other]

    cs.IR cs.CL cs.SI

    Efficient Social Network Multilingual Classification using Character, POS n-grams and Dynamic Normalization

    Authors: Carlos-Emiliano González-Gallardo, Juan-Manuel Torres-Moreno, Azucena Montes Rendón, Gerardo Sierra

    Abstract: In this paper we describe a dynamic normalization process applied to social network multilingual documents (Facebook and Twitter) to improve the performance of the Author profiling task for short texts. After the normalization process, n-grams of characters and n-grams of POS tags are obtained to extract all the possible stylistic information encoded in the documents (emoticons, character floodi…

    Submitted 21 February, 2017; originally announced February 2017.

    Comments: 8 pages, 6 figures, Conference paper

    Journal ref: Proceedings of the 8th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management, Vol 1: KDIR, 307-314, 2016, Porto, Portugal

  30. arXiv:1601.07124  [pdf, other]

    cs.CL cs.IR

    LIA-RAG: a system based on graphs and divergence of probabilities applied to Speech-To-Text Summarization

    Authors: Elvys Linhares Pontes, Juan-Manuel Torres-Moreno, Andréa Carneiro Linhares

    Abstract: This paper introduces a new algorithm for automatic speech-to-text summarization based on statistical divergences of probabilities and graphs. The input is a text from speech conversations with noise, and the output is a compact text summary. Our results on the pilot task CCCS Multiling 2015 French corpus are very encouraging.

    Submitted 26 January, 2016; originally announced January 2016.

    Comments: 7 pages, 2 figures, CCCS Multiling 2015 Workshop

  31. arXiv:1506.06205  [pdf, other]

    cs.IT

    Trivergence of Probability Distributions, at glance

    Authors: Juan-Manuel Torres-Moreno

    Abstract: In this paper we introduce the intuitive notion of trivergence of probability distributions (TPD). This notion allows us to calculate the similarity among triplets of objects. For this computation, we can use the well-known measures of probability divergences like Kullback-Leibler and Jensen-Shannon. Divergence measures may be used in Information Retrieval tasks such as Automatic Text Summarization, Tex…

    Submitted 20 June, 2015; originally announced June 2015.

    Comments: 10 pages, 1 figure

  32. arXiv:1501.04920  [pdf]

    cs.IR cs.CL

    Regroupement sémantique de définitions en espagnol

    Authors: Gerardo Sierra, Juan-Manuel Torres-Moreno, Alejandro Molina

    Abstract: This article focuses on the description and evaluation of a new unsupervised learning method for clustering definitions in Spanish according to their semantics. Textual Energy was used as a clustering measure, and we study an adaptation of Precision and Recall to evaluate our method.

    Submitted 20 January, 2015; originally announced January 2015.

    Comments: 11 pages, in French, 5 figures. Workshop Evaluation des méthodes d'Extraction de Connaissances dans les Données EvalECD EGC'10, 2010 Tunis

  33. Optimisation using Natural Language Processing: Personalized Tour Recommendation for Museums

    Authors: Mayeul Mathias, Assema Moussa, Fen Zhou, Juan-Manuel Torres-Moreno, Marie-Sylvie Poli, Didier Josselin, Marc El-Bèze, Andréa Carneiro Linhares, Francoise Rigat

    Abstract: This paper proposes a new method to provide personalized tour recommendation for museum visits. It combines an optimization of preference criteria of visitors with an automatic extraction of artwork importance from museum information based on Natural Language Processing using textual energy. This project includes researchers from computer and social sciences. Some results are obtained with numeric…

    Submitted 6 January, 2015; originally announced January 2015.

    Comments: 8 pages, 4 figures; Proceedings of the 2014 Federated Conference on Computer Science and Information Systems pp. 439-446

  34. arXiv:1501.01243  [pdf]

    cs.CL

    Un résumeur à base de graphes, indépéndant de la langue

    Authors: Juan-Manuel Torres-Moreno, Javier Ramirez, Iria da Cunha

    Abstract: In this paper we present REG, a graph-based approach to study a fundamental problem of Natural Language Processing (NLP): automatic text summarization. The algorithm maps a document as a graph, then it computes the weight of its sentences. We have applied this approach to summarize documents in three languages.

    Submitted 6 January, 2015; originally announced January 2015.

    Comments: 8 pages, in French, 2 figures; International Workshop on African Human Language Technologies

  35. arXiv:1212.3493  [pdf, ps, other]

    cs.CL cs.IR

    Sentence Compression in Spanish driven by Discourse Segmentation and Language Models

    Authors: Alejandro Molina, Juan-Manuel Torres-Moreno, Iria da Cunha, Eric SanJuan, Gerardo Sierra

    Abstract: Previous works demonstrated that Automatic Text Summarization (ATS) by sentence extraction may be improved using sentence compression. In this work we present a sentence compression approach guided by sentence-level discourse segmentation and probabilistic language models (LM). The results presented here show that the proposed solution is able to generate coherent summaries with grammatical comp…

    Submitted 17 December, 2012; v1 submitted 14 December, 2012; originally announced December 2012.

    Comments: 7 pages, 3 tables

  36. arXiv:1212.1918  [pdf]

    cs.IR cs.CL

    Condensés de textes par des méthodes numériques

    Authors: Juan-Manuel Torres-Moreno, Patricia Velázquez-Morales, Jean-Guy Meunier

    Abstract: Since information in electronic form is already standard, and the variety and quantity of information become increasingly large, methods for summarizing or automatically condensing texts are a critical phase of text analysis. This article describes CORTEX, a system based on numerical methods, which allows obtaining a condensation of a text, which is independent of the topic an…

    Submitted 9 December, 2012; originally announced December 2012.

    Comments: Conférence JADT 2002, Saint-Malo/France. 12 pages, 7 figures

  37. arXiv:1210.3312  [pdf, other]

    cs.IR cs.AI cs.CL

    Artex is AnotheR TEXt summarizer

    Authors: Juan-Manuel Torres-Moreno

    Abstract: This paper describes Artex, another algorithm for Automatic Text Summarization. In order to rank sentences, a simple inner product is calculated between each sentence, a document vector (text topic) and a lexical vector (vocabulary used by a sentence). Summaries are then generated by assembling the highest ranked sentences. No rule-based linguistic post-processing is necessary in order to obtain…

    Submitted 11 October, 2012; originally announced October 2012.

    Comments: 11 pages, 5 figures. arXiv admin note: substantial text overlap with arXiv:1209.3126

  38. arXiv:1209.3126  [pdf, other]

    cs.IR cs.CL

    Beyond Stemming and Lemmatization: Ultra-stemming to Improve Automatic Text Summarization

    Authors: Juan-Manuel Torres-Moreno

    Abstract: In Automatic Text Summarization, preprocessing is an important phase to reduce the space of textual representation. Classically, stemming and lemmatization have been widely used for normalizing words. However, even using normalization on large texts, the curse of dimensionality can disturb the performance of summarizers. This paper describes a new method for normalization of words to further reduc…

    Submitted 14 September, 2012; originally announced September 2012.

    Comments: 22 pages, 12 figures, 9 tables

  39. arXiv:1004.3371  [pdf, other]

    cs.IR

    Improving Update Summarization by Revisiting the MMR Criterion

    Authors: Florian Boudin, Juan-Manuel Torres-Moreno, Marc El-Bèze

    Abstract: This paper describes a method for multi-document update summarization that relies on a double maximization criterion. A modified Maximal Marginal Relevance-like criterion, called Smmr, is used to select sentences that are close to the topic and, at the same time, distant from sentences used in already read documents. Summaries are then generated by assembling the high-ranked material and app…

    Submitted 20 April, 2010; originally announced April 2010.

    Comments: 20 pages, 3 figures and 8 tables.

    ACM Class: I.2.7

  40. arXiv:1001.1093  [pdf, other]

    math.OC cs.NI math.CO

    Solving the Frequency Assignment Problem by Site Availability and Constraint Programming

    Authors: Andrea Carneiro Linhares, Juan-Manuel Torres-Moreno, Peter Peinl, Philippe Michelon

    Abstract: The efficient use of bandwidth for radio communications becomes more and more crucial when developing new information technologies and their applications. The core issues are addressed by the so-called Frequency Assignment Problems (FAP). Our work investigates static FAP, where an attempt is first made to configure a kernel of links. We study the problem based on the concepts and techniques of C…

    Submitted 7 January, 2010; originally announced January 2010.

    Comments: 11 pages, 1 figure and 3 tables

  41. arXiv:0906.0470  [pdf, ps, other]

    cs.LG

    An optimal linear separator for the Sonar Signals Classification task

    Authors: Juan-Manuel Torres-Moreno, Mirta B. Gordon

    Abstract: The problem of classifying sonar signals from rocks and mines first studied by Gorman and Sejnowski has become a benchmark against which many learning algorithms have been tested. We show that both the training set and the test set of this benchmark are linearly separable, although with different hyperplanes. Moreover, the complete set of learning and test patterns together is also linearly sep…

    Submitted 2 June, 2009; originally announced June 2009.

    Comments: 8 pages, 6 tables

  42. arXiv:0905.2990  [pdf, other]

    cs.IR cs.CL

    Automatic Summarization System coupled with a Question-Answering System (QAAS)

    Authors: Juan-Manuel Torres-Moreno, Pier-Luc St-Onge, Michel Gagnon, Marc El-Bèze, Patrice Bellot

    Abstract: To select the most relevant sentences of a document, the summarization system uses an optimal decision algorithm that combines several metrics. The metrics process, weight and extract pertinent sentences by statistical and informational algorithms. This technique might improve a Question-Answering system, whose function is to provide an exact answer to a question in natural language. In this paper, we present t…

    Submitted 18 May, 2009; originally announced May 2009.

    Comments: 28 pages, 11 figures

  43. arXiv:0905.2347  [pdf, other]

    cs.LG

    Combining Supervised and Unsupervised Learning for GIS Classification

    Authors: Juan-Manuel Torres-Moreno, Laurent Bougrain, Frédéric Alexandre

    Abstract: This paper presents a new hybrid learning algorithm for unsupervised classification tasks. We combined the Fuzzy c-means learning algorithm and a supervised version of Minimerror to develop a hybrid incremental strategy allowing unsupervised classifications. We applied this new approach to a real-world database in order to know if the information contained in unlabeled features of a Geographic Infor…

    Submitted 14 May, 2009; originally announced May 2009.

    Comments: 8 pages, 3 figures

  44. arXiv:0905.1130  [pdf, other]

    cs.IR cs.CL

    Statistical Automatic Summarization in Organic Chemistry

    Authors: Florian Boudin, Patricia Velazquez-Morales, Juan-Manuel Torres-Moreno

    Abstract: We present an oriented numerical summarizer algorithm, applied to producing automatic summaries of scientific documents in Organic Chemistry. We present its implementation named Yachs (Yet Another Chemistry Summarizer) that combines a specific document pre-processing with a sentence scoring method relying on the statistical properties of documents. We show that Yachs achieves the best results am…

    Submitted 7 May, 2009; originally announced May 2009.

    Comments: 10 pages, 3 figures

  45. arXiv:0904.4587  [pdf, ps, other]

    cs.AI cs.NE

    Adaptive Learning with Binary Neurons

    Authors: Juan-Manuel Torres-Moreno, Mirta B. Gordon

    Abstract: An efficient incremental learning algorithm for classification tasks, called NetLines, well adapted for both binary and real-valued input patterns, is presented. It generates small compact feedforward neural networks with one hidden layer of binary units and binary output units. A convergence theorem ensures that solutions with a finite number of hidden units exist for both binary and real-valued…

    Submitted 29 April, 2009; originally announced April 2009.

    Comments: 29 pages, 7 figures