Showing 1–8 of 8 results for author: Martínez-Prieto, M A

  1. On the Reproducibility of Experiments of Indexing Repetitive Document Collections

    Authors: Antonio Fariña, Miguel A. Martínez-Prieto, Francisco Claude, Gonzalo Navarro, Juan J. Lastra-Díaz, Nicola Prezza, Diego Seco

    Abstract: This work introduces a companion reproducible paper with the aim of allowing the exact replication of the methods, experiments, and results discussed in a previous work [5]. In that parent paper, we proposed many and varied techniques for compressing indexes which exploit that highly repetitive collections are formed mostly of documents that are near-copies of others. More concretely, we describe…

    Submitted 26 December, 2019; originally announced December 2019.

    Comments: This research has received funding from the European Union's Horizon 2020 research and innovation programme under the Marie Sklodowska-Curie Actions H2020-MSCA-RISE-2015 BIRDS GA No. 690941. Replication framework available at: https://github.com/migumar2/uiHRDC/

    Journal ref: Information Systems; Volume 83, July 2019; pages 181-194

  2. A Grammar-based Compressed Representation of 3D Trajectories

    Authors: Nieves R. Brisaboa, Adrián Gómez-Brandón, Miguel A. Martínez-Prieto, José R. Paramá

    Abstract: Much research has been published about trajectory management on the ground or at the sea, but compression or indexing of flight trajectories have usually been less explored. However, air traffic management is a challenge because airspace is becoming more and more congested, and large flight data collections must be preserved and exploited for varied purposes. This paper proposes 3DGraCT, a new met…

    Submitted 28 December, 2018; originally announced December 2018.

    Comments: This research has received funding from the European Union's Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie Actions H2020-MSCA-RISE-2015 BIRDS GA No. 690941

    Journal ref: String Processing and Information Retrieval: 25th International Symposium, SPIRE 2018, Lima, Peru, October 9-11, 2018, Proceedings. Springer International Publishing. pp 102-116. ISBN: 9783030004781

  3. Universal Indexes for Highly Repetitive Document Collections

    Authors: Francisco Claude, Antonio Fariña, Miguel A. Martínez-Prieto, Gonzalo Navarro

    Abstract: Indexing highly repetitive collections has become a relevant problem with the emergence of large repositories of versioned documents, among other applications. These collections may reach huge sizes, but are formed mostly of documents that are near-copies of others. Traditional techniques for indexing these collections fail to properly exploit their regularities in order to reduce space. We intr…

    Submitted 23 May, 2016; v1 submitted 29 April, 2016; originally announced April 2016.

    Comments: This research has received funding from the European Union's Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie Actions H2020-MSCA-RISE-2015 BIRDS GA No. 690941

    Journal ref: Information Systems, Volume 61, Pages 1-23, 2016

  4. Generalized Biwords for Bitext Compression and Translation Spotting

    Authors: Felipe Sánchez-Martínez, Rafael C. Carrasco, Miguel A. Martínez-Prieto, Joaquin Adiego

    Abstract: Large bilingual parallel texts (also known as bitexts) are usually stored in a compressed form, and previous work has shown that they can be more efficiently compressed if the fact that the two texts are mutual translations is exploited. For example, a bitext can be seen as a sequence of biwords ---pairs of parallel words with a high probability of co-occurrence--- that can be used as an intermedi…

    Submitted 18 January, 2014; originally announced January 2014.

    Journal ref: Journal Of Artificial Intelligence Research, Volume 43, pages 389-418, 2012

  5. arXiv:1310.4954  [pdf, ps, other]

    cs.DB cs.DS cs.IR

    Compressed Vertical Partitioning for Full-In-Memory RDF Management

    Authors: Sandra Álvarez-García, Nieves R. Brisaboa, Javier D. Fernández, Miguel A. Martínez-Prieto, Gonzalo Navarro

    Abstract: The Web of Data has been gaining momentum and this leads to increasingly publish more semi-structured datasets following the RDF model, based on atomic triple units of subject, predicate, and object. Although it is a simple model, compression methods become necessary because datasets are increasingly larger and various scalability issues arise around their organization and storage. This requiremen…

    Submitted 21 October, 2013; v1 submitted 18 October, 2013; originally announced October 2013.

  6. arXiv:1105.4004  [pdf]

    cs.IR cs.DB

    Compressed k2-Triples for Full-In-Memory RDF Engines

    Authors: Sandra Álvarez-García, Nieves R. Brisaboa, Javier D. Fernández, Miguel A. Martínez-Prieto

    Abstract: Current "data deluge" has flooded the Web of Data with very large RDF datasets. They are hosted and queried through SPARQL endpoints which act as nodes of a semantic net built on the principles of the Linked Data project. Although this is a realistic philosophy for global data publishing, its query performance is diminished when the RDF engines (behind the endpoints) manage these huge datasets. Th…

    Submitted 19 May, 2011; originally announced May 2011.

    Comments: In Proc. of AMCIS'2011

  7. arXiv:1103.5043  [pdf, other]

    cs.IR cs.AI cs.HC

    An Empirical Study of Real-World SPARQL Queries

    Authors: Mario Arias, Javier D. Fernández, Miguel A. Martínez-Prieto, Pablo de la Fuente

    Abstract: Understanding how users tailor their SPARQL queries is crucial when designing query evaluation engines or fine-tuning RDF stores with performance in mind. In this paper we analyze 3 million real-world SPARQL queries extracted from logs of the DBPedia and SWDF public endpoints. We aim at finding which are the most used language elements both from syntactical and structural perspectives, paying spec…

    Submitted 25 March, 2011; originally announced March 2011.

    Comments: 1st International Workshop on Usage Analysis and the Web of Data (USEWOD2011) in the 20th International World Wide Web Conference (WWW2011), Hyderabad, India, March 28th, 2011

    Report number: WWW2011USEWOD/2011/arifermarfue

    ACM Class: H.2.3

  8. arXiv:1101.5506  [pdf, ps, other]

    cs.DS

    Compressed String Dictionaries

    Authors: Nieves R. Brisaboa, Rodrigo Cánovas, Miguel A. Martínez-Prieto, Gonzalo Navarro

    Abstract: The problem of storing a set of strings --- a string dictionary --- in compact form appears naturally in many cases. While classically it has represented a small part of the whole data to be processed (e.g., for Natural Language processing or for indexing text collections), more recent applications in Web engines, Web mining, RDF graphs, Internet routing, Bioinformatics, and many others, make use…

    Submitted 28 January, 2011; originally announced January 2011.