Showing 1–26 of 26 results for author: Brisaboa, N R

  1. arXiv:2009.10045  [pdf, other]

    cs.DS

    Space/time-efficient RDF stores based on circular suffix sorting

    Authors: Nieves R. Brisaboa, Ana Cerdeira-Pena, Guillermo de Bernardo, Antonio Fariña, Gonzalo Navarro

    Abstract: In recent years, RDF has gained popularity as a format for the standardized publication and exchange of information in the Web of Data. In this paper we introduce RDFCSA, a data structure that is able to self-index an RDF dataset in small space and supports efficient querying. RDFCSA regards the triples of the RDF store as short circular strings and applies suffix sorting on those strings, so that…

    Submitted 13 April, 2022; v1 submitted 21 September, 2020; originally announced September 2020.

    Comments: This work has been submitted to a Journal for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

  2. arXiv:2002.12050  [pdf, other]

    cs.DS

    Semantrix: A Compressed Semantic Matrix

    Authors: Nieves R. Brisaboa, Antonio Fariña, Gonzalo Navarro, Tirso V. Rodeiro

    Abstract: We present a compact data structure to represent both the duration and length of homogeneous segments of trajectories from moving objects in a way that, as a data warehouse, it allows us to efficiently answer cumulative queries. The division of trajectories into relevant segments has been studied in the literature under the topic of Trajectory Segmentation. In this paper, we design a data structur…

    Submitted 28 February, 2020; v1 submitted 27 February, 2020; originally announced February 2020.

    Comments: 10 pages, Data Compression Conference 2020. This research has received funding from the European Union's Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie Actions H2020-MSCA-RISE-2015 BIRDS GA No. 690941

  3. arXiv:2002.11622  [pdf, ps, other]

    cs.DB cs.DS

    Revisiting compact RDF stores based on k2-trees

    Authors: Nieves R. Brisaboa, Ana Cerdeira-Pena, Guillermo de Bernardo, Antonio Fariña

    Abstract: We present a new compact representation to efficiently store and query large RDF datasets in main memory. Our proposal, called BMatrix, is based on the k2-tree, a data structure devised to represent binary matrices in a compressed way, and aims at improving the results of previous state-of-the-art alternatives, especially in datasets with a relatively large number of predicates. We introduce our t…

    Submitted 26 February, 2020; originally announced February 2020.

    Comments: This research has received funding from the European Union's Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie Actions H2020-MSCA-RISE-2015 BIRDS GA No. 690941

  4. Grammar Compressed Sequences with Rank/Select Support

    Authors: Alberto Ordóñez, Gonzalo Navarro, Nieves R. Brisaboa

    Abstract: Sequence representations supporting not only direct access to their symbols, but also rank/select operations, are a fundamental building block in many compressed data structures. Several recent applications need to represent highly repetitive sequences, and classical statistical compression proves ineffective. We introduce, instead, grammar-based representations for repetitive sequences, which use…

    Submitted 21 November, 2019; v1 submitted 20 November, 2019; originally announced November 2019.

    Comments: This research has received funding from the European Union's Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie Actions H2020-MSCA-RISE-2015 BIRDS GA No. 690941

    Journal ref: Journal of Discrete Algorithms 43, pp. 54-71 (2017)

  5. New structures to solve aggregated queries for trips over public transportation networks

    Authors: Nieves R. Brisaboa, Antonio Fariña, Daniil Galaktionov, Tirso V. Rodeiro, M. Andrea Rodríguez

    Abstract: Representing the trajectories of mobile objects is a hot topic from the widespread use of smartphones and other GPS devices. However, few works have focused on representing trips over public transportation networks (buses, subway, and trains) where a user's trips can be seen as a sequence of stages performed within a vehicle shared with many other users. In this context, representing vehicle journ…

    Submitted 20 November, 2019; originally announced November 2019.

    Comments: This research has received funding from the European Union's Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie Actions H2020-MSCA-RISE-2015 BIRDS GA No. 690941

    Journal ref: Proc. of the 25th International Symposium on String Processing and Information Retrieval (SPIRE), Lima, Peru, October 9-11th, pp 85-101 (2018)

  6. Extending General Compact Querieable Representations to GIS Applications

    Authors: Nieves R. Brisaboa, Ana Cerdeira-Pena, Guillermo de Bernardo, Gonzalo Navarro, Oscar Pedreira

    Abstract: The raster model is commonly used for the representation of images in many domains, and is especially useful in Geographic Information Systems (GIS) to store information about continuous variables of the space (elevation, temperature, etc.). Current representations of raster data are usually designed for external memory or, when stored in main memory, lack efficient query capabilities. In this pap…

    Submitted 19 November, 2019; originally announced November 2019.

    Comments: This research has received funding from the European Union's Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie Actions H2020-MSCA-RISE-2015 BIRDS GA No. 690941

    Journal ref: Information Sciences 2020

  7. Improved Compressed String Dictionaries

    Authors: Nieves R. Brisaboa, Ana Cerdeira-Pena, Guillermo de Bernardo, Gonzalo Navarro

    Abstract: We introduce a new family of compressed data structures to efficiently store and query large string dictionaries in main memory. Our main technique is a combination of hierarchical Front-coding with ideas from longest-common-prefix computation in suffix arrays. Our data structures yield relevant space-time tradeoffs in real-world dictionaries. We focus on two domains where string dictionaries are…

    Submitted 19 November, 2019; originally announced November 2019.

    Comments: This research has received funding from the European Union's Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie Actions H2020-MSCA-RISE-2015 BIRDS GA No. 690941

    Journal ref: Proc. 28th ACM International Conference on Information and Knowledge Management (CIKM 2019)

  8. Dv2v: A Dynamic Variable-to-Variable Compressor

    Authors: Nieves R. Brisaboa, Antonio Fariña, Adrián Gómez-Brandón, Gonzalo Navarro, Tirso V. Rodeiro

    Abstract: We present Dv2v, a new dynamic (one-pass) variable-to-variable compressor. Variable-to-variable compression aims at using a modeler that gathers variable-length input symbols and a variable-length statistical coder that assigns shorter codewords to the more frequent symbols. In Dv2v, we process the input text word-wise to gather variable-length symbols that can be either terminals (new words) or n…

    Submitted 11 November, 2019; originally announced November 2019.

    Comments: This research has received funding from the European Union's Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie Actions H2020-MSCA-RISE-2015 BIRDS GA No. 690941

    Journal ref: Dv2v: A Dynamic Variable-to-Variable Compressor. In 2019 Data Compression Conference (DCC) (pp. 83-92). IEEE

  9. GraCT: A Grammar-based Compressed Index for Trajectory Data

    Authors: Nieves R. Brisaboa, Adrián Gómez-Brandón, Gonzalo Navarro, José R. Paramá

    Abstract: We introduce a compressed data structure for the storage of free trajectories of moving objects (such as ships and planes) that efficiently supports various spatio-temporal queries. Our structure, dubbed GraCT, stores the absolute positions of all the objects at regular time intervals (snapshots) using a $k^2$-tree, which is a space- and time-efficient version of a region quadtree. Positions betwe…

    Submitted 11 November, 2019; originally announced November 2019.

    Comments: This research has received funding from the European Union's Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie Actions H2020-MSCA-RISE-2015 BIRDS GA No. 690941

    Journal ref: Information Sciences, 2019, vol. 483, p. 106-135

  10. A Compact Representation for Trips over Networks built on self-indexes

    Authors: Nieves R. Brisaboa, Antonio Fariña, Daniil Galaktionov, M. Andrea Rodríguez

    Abstract: Representing the movements of objects (trips) over a network in a compact way while retaining the capability of exploiting such data effectively is an important challenge of real applications. We present a new Compact Trip Representation (CTR) that handles the spatio-temporal data associated with users' trips over transportation networks. Depending on the network and types of queries, nodes in the…

    Submitted 28 December, 2018; originally announced December 2018.

    Comments: 42 pages

    Journal ref: Information Systems, Volume 78, November 2018, Pages 1-22

  11. Using Compressed Suffix-Arrays for a Compact Representation of Temporal-Graphs

    Authors: Nieves R. Brisaboa, Diego Caro, Antonio Fariña, M. Andrea Rodríguez

    Abstract: Temporal graphs represent binary relationships that change along time. They can model the dynamism of, for example, social and communication networks. Temporal graphs are defined as sets of contacts that are edges tagged with the temporal intervals when they are active. This work explores the use of the Compressed Suffix Array (CSA), a well-known compact and self-indexed data structure in the area…

    Submitted 28 December, 2018; originally announced December 2018.

    Comments: 41 pages, Information Sciences

    Journal ref: Information Sciences Volume 465, October 2018, Pages 459-483

  12. A Grammar-based Compressed Representation of 3D Trajectories

    Authors: Nieves R. Brisaboa, Adrián Gómez-Brandón, Miguel A. Martínez-Prieto, José R. Paramá

    Abstract: Much research has been published about trajectory management on the ground or at the sea, but compression or indexing of flight trajectories have usually been less explored. However, air traffic management is a challenge because airspace is becoming more and more congested, and large flight data collections must be preserved and exploited for varied purposes. This paper proposes 3DGraCT, a new met…

    Submitted 28 December, 2018; originally announced December 2018.

    Comments: This research has received funding from the European Union's Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie Actions H2020-MSCA-RISE-2015 BIRDS GA No. 690941

    Journal ref: String Processing and Information Retrieval: 25th International Symposium, SPIRE 2018, Lima, Peru, October 9-11, 2018, Proceedings. Springer International Publishing. pp 102-116. ISBN: 9783030004781

  13. arXiv:1810.05753  [pdf, other]

    cs.DS

    Relative compression of trajectories

    Authors: Nieves R. Brisaboa, Travis Gagie, Adrián Gómez-Brandón, Gonzalo Navarro, José R. Paramá

    Abstract: We present RCT, a new compact data structure to represent trajectories of objects. It is based on a relative compression technique called Relative Lempel-Ziv (RLZ), which compresses sequences by applying an LZ77 encoding with respect to an artificial reference. Combined with $O(z)$-sized data structures on the sequence of phrases that allows to solve trajectory and spatio-temporal queries efficien…

    Submitted 12 October, 2018; originally announced October 2018.

  14. arXiv:1803.02576  [pdf, ps, other]

    cs.DS

    Compact Representations of Event Sequences

    Authors: Nieves R. Brisaboa, Guillermo de Bernardo, Gonzalo Navarro, Tirso V. Rodeiro, Diego Seco

    Abstract: We introduce a new technique for the efficient management of large sequences of multidimensional data, which takes advantage of regularities that arise in real-world datasets and supports different types of aggregation queries. More importantly, our representation is flexible in the sense that the relevant dimensions and queries may be used to guide the construction process, easily providing a spa…

    Submitted 7 March, 2018; originally announced March 2018.

    Comments: This research has received funding from the European Union's Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie Actions H2020-MSCA-RISE-2015 BIRDS GA No. 690941

  15. arXiv:1803.01362  [pdf, ps, other]

    cs.DS

    Two-Dimensional Block Trees

    Authors: Nieves R. Brisaboa, Travis Gagie, Adrián Gómez-Brandón, Gonzalo Navarro

    Abstract: The Block Tree (BT) is a novel compact data structure designed to compress sequence collections. It obtains compression ratios close to Lempel-Ziv and supports efficient direct access to any substring. The BT divides the text recursively into fixed-size blocks and those appearing earlier are represented with pointers. On repetitive collections, a few blocks can represent all the others, and thus t…

    Submitted 4 March, 2018; originally announced March 2018.

    Comments: This research has received funding from the European Union's Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie Actions H2020-MSCA-RISE-2015 BIRDS GA No. 690941

  16. Efficient Compression and Indexing of Trajectories

    Authors: Nieves R. Brisaboa, Travis Gagie, Adrián Gómez-Brandón, Gonzalo Navarro, José R. Paramá

    Abstract: We present a new compressed representation of free trajectories of moving objects. It combines a partial-sums-based structure that retrieves in constant time the position of the object at any instant, with a hierarchical minimum-bounding-boxes representation that allows determining if the object is seen in a certain rectangular area during a time period. Combined with spatial snapshots at regular…

    Submitted 5 October, 2017; originally announced October 2017.

    Comments: This research has received funding from the European Union's Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie Actions H2020-MSCA-RISE-2015 BIRDS GA No. 690941

    Journal ref: String Processing and Information Retrieval: 24th International Symposium, SPIRE 2017, Palermo, Italy, September 26-29, 2017, Proceedings. Springer International Publishing. pp 103-115. ISBN: 9783319674278

  17. arXiv:1707.02769  [pdf, ps, other]

    cs.DS

    Compressed Representation of Dynamic Binary Relations with Applications

    Authors: Nieves R. Brisaboa, Ana Cerdeira-Pena, Guillermo de Bernardo, Gonzalo Navarro

    Abstract: We introduce a dynamic data structure for the compact representation of binary relations $\mathcal{R} \subseteq A \times B$. The data structure is a dynamic variant of the k$^2$-tree, a static compact representation that takes advantage of clustering in the binary relation to achieve compression. Our structure can efficiently check whether two objects $(a,b) \in A \times B$ are related, and list t…

    Submitted 10 July, 2017; originally announced July 2017.

    Comments: This research has received funding from the European Union's Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie Actions H2020-MSCA-RISE-2015 BIRDS GA No. 690941, Information Systems (2017)

  18. A succinct data structure for self-indexing ternary relations

    Authors: Sandra Álvarez-García, Guillermo de Bernardo, Nieves R. Brisaboa, Gonzalo Navarro

    Abstract: The representation of binary relations has been intensively studied and many different theoretical and practical representations have been proposed to answer the usual queries in multiple domains. However, ternary relations have not received as much attention, even though many real-world applications require the processing of ternary relations. In this paper we present a new compressed and self-in…

    Submitted 10 July, 2017; originally announced July 2017.

    Comments: This research has received funding from the European Union's Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie Actions H2020-MSCA-RISE-2015 BIRDS GA No. 690941, Journal of Discrete Algorithms (2017)

  19. Compact Trip Representation over Networks

    Authors: Nieves R. Brisaboa, Antonio Fariña, Daniil Galaktionov, M. Andrea Rodríguez

    Abstract: We present a new Compact Trip Representation (CTR) that allows us to manage users' trips (moving objects) over networks. These could be public transportation networks (buses, subway, trains, and so on) where nodes are stations or stops, or road networks where nodes are intersections. CTR represents the sequences of nodes and time instants in users' trips. The spatial component is handled with a da…

    Submitted 13 December, 2016; originally announced December 2016.

    Comments: This research has received funding from the European Union's Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie Actions H2020-MSCA-RISE-2015 BIRDS GA No. 690941

    Journal ref: 23rd International Symposium, SPIRE 2016, Beppu, Japan, October 18-20, 2016, Proceedings pp 240-253

  20. Efficient Representation of Multidimensional Data over Hierarchical Domains

    Authors: Nieves R. Brisaboa, Ana Cerdeira-Pena, Narciso López-López, Gonzalo Navarro, Miguel R. Penabad, Fernando Silva-Coira

    Abstract: We consider the problem of representing multidimensional data where the domain of each dimension is organized hierarchically, and the queries require summary information at a different node in the hierarchy of each dimension. This is the typical case of OLAP databases. A basic approach is to represent each hierarchy as a one-dimensional line and recast the queries as multidimensional range queries…

    Submitted 13 December, 2016; originally announced December 2016.

    Comments: This research has received funding from the European Union's Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie Actions H2020-MSCA-RISE-2015 BIRDS GA No. 690941

    Journal ref: String Processing and Information Retrieval: 23rd International Symposium, SPIRE 2016, Beppu, Japan, October 18-20, 2016, Proceedings. Springer International Publishing. pp 191-203. ISBN: 9783319460482

  21. GraCT: A Grammar based Compressed representation of Trajectories

    Authors: Nieves R. Brisaboa, Adrián Gómez-Brandón, Gonzalo Navarro, José R. Paramá

    Abstract: We present a compressed data structure to store free trajectories of moving objects (ships over the sea, for example) allowing spatio-temporal queries. Our method, GraCT, uses a $k^2$-tree to store the absolute positions of all objects at regular time intervals (snapshots), whereas the positions between snapshots are represented as logs of relative movements compressed with Re-Pair. Our experiment…

    Submitted 10 December, 2016; originally announced December 2016.

    Comments: This research has received funding from the European Union's Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie Actions H2020-MSCA-RISE-2015 BIRDS GA No. 690941

    Journal ref: String Processing and Information Retrieval: 23rd International Symposium, SPIRE 2016, Beppu, Japan, October 18-20, 2016, Proceedings. Springer International Publishing. pp 218-230. ISBN: 9783319460482

  22. Aggregated 2D Range Queries on Clustered Points

    Authors: Nieves R. Brisaboa, Guillermo de Bernardo, Roberto Konow, Gonzalo Navarro, Diego Seco

    Abstract: Efficient processing of aggregated range queries on two-dimensional grids is a common requirement in information retrieval and data mining systems, for example in Geographic Information Systems and OLAP cubes. We introduce a technique to represent grids supporting aggregated range queries that requires little space when the data points in the grid are clustered, which is common in practice. We sho…

    Submitted 30 March, 2016; v1 submitted 7 March, 2016; originally announced March 2016.

    Comments: This research has received funding from the European Union's Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie Actions H2020-MSCA-RISE-2015 BIRDS GA No. 690941

    Journal ref: Information Systems, Volume 60, Pages 34-49, 2016

  23. arXiv:1310.4954  [pdf, ps, other]

    cs.DB cs.DS cs.IR

    Compressed Vertical Partitioning for Full-In-Memory RDF Management

    Authors: Sandra Álvarez-García, Nieves R. Brisaboa, Javier D. Fernández, Miguel A. Martínez-Prieto, Gonzalo Navarro

    Abstract: The Web of Data has been gaining momentum and this leads to increasingly publish more semi-structured datasets following the RDF model, based on atomic triple units of subject, predicate, and object. Although it is a simple model, compression methods become necessary because datasets are increasingly larger and various scalability issues arise around their organization and storage. This requiremen…

    Submitted 21 October, 2013; v1 submitted 18 October, 2013; originally announced October 2013.

  24. arXiv:1207.5425  [pdf, ps, other]

    cs.IR cs.DB

    Ranked Document Retrieval in (Almost) No Space

    Authors: Nieves R. Brisaboa, Ana Cerdeira-Pena, Gonzalo Navarro, Oscar Pedreira

    Abstract: Ranked document retrieval is a fundamental task in search engines. Such queries are solved with inverted indexes that require additional 45%-80% of the compressed text space, and take tens to hundreds of microseconds per query. In this paper we show how ranked document retrieval queries can be solved within tens of milliseconds using essentially no extra space over an in-memory compressed represen…

    Submitted 23 July, 2012; originally announced July 2012.

    Comments: This is an extended version of the paper that will appear in Proc. of SPIRE'2012

  25. arXiv:1105.4004  [pdf]

    cs.IR cs.DB

    Compressed k2-Triples for Full-In-Memory RDF Engines

    Authors: Sandra Álvarez-García, Nieves R. Brisaboa, Javier D. Fernández, Miguel A. Martínez-Prieto

    Abstract: Current "data deluge" has flooded the Web of Data with very large RDF datasets. They are hosted and queried through SPARQL endpoints which act as nodes of a semantic net built on the principles of the Linked Data project. Although this is a realistic philosophy for global data publishing, its query performance is diminished when the RDF engines (behind the endpoints) manage these huge datasets. Th…

    Submitted 19 May, 2011; originally announced May 2011.

    Comments: In Proc. of AMCIS'2011

  26. arXiv:1101.5506  [pdf, ps, other]

    cs.DS

    Compressed String Dictionaries

    Authors: Nieves R. Brisaboa, Rodrigo Cánovas, Miguel A. Martínez-Prieto, Gonzalo Navarro

    Abstract: The problem of storing a set of strings --- a string dictionary --- in compact form appears naturally in many cases. While classically it has represented a small part of the whole data to be processed (e.g., for Natural Language processing or for indexing text collections), more recent applications in Web engines, Web mining, RDF graphs, Internet routing, Bioinformatics, and many others, make use…

    Submitted 28 January, 2011; originally announced January 2011.