Showing 1–26 of 26 results for author: Ayguadé, E

  1. arXiv:2405.10170

    cs.AR

    A Mess of Memory System Benchmarking, Simulation and Application Profiling

    Authors: Pouya Esmaili-Dokht, Francesco Sgherzi, Valeria Soldera Girelli, Isaac Boixaderas, Mariana Carmin, Alireza Momeni, Adria Armejach, Estanislao Mercadal, German Llort, Petar Radojkovic, Miquel Moreto, Judit Gimenez, Xavier Martorell, Eduard Ayguade, Jesus Labarta, Emanuele Confalonieri, Rishabh Dubey, Jason Adlard

    Abstract: The Memory stress (Mess) framework provides a unified view of memory system benchmarking, simulation and application profiling. The Mess benchmark provides a holistic and detailed memory system characterization. It is based on hundreds of measurements that are represented as a family of bandwidth-latency curves. The benchmark increases the coverage of all the previous tools and leads to new f…

    Submitted 16 May, 2024; originally announced May 2024.

    Comments: 17 pages

  2. Automated Generation of High-Performance Computational Fluid Dynamics Codes

    Authors: Sandra Macià, Pedro J. Martínez-Ferrer, Eduard Ayguadé, Vicenç Beltran

    Abstract: Domain-Specific Languages (DSLs) improve programmer productivity by decoupling problem descriptions from algorithmic implementations. However, DSLs for High-Performance Computing (HPC) have two additional critical requirements: performance and scalability. This paper presents the automated process of generating, from abstract mathematical specifications of Computational Fluid Dynamics (CFD) probl…

    Submitted 27 April, 2022; v1 submitted 26 April, 2022; originally announced April 2022.

    Comments: 30 pages, 18 figures. Postprint submitted to the Journal of Computational Science (Elsevier). Article updated with reviewers' comments, additional material in section 4.3 including figures and correction of typos

    ACM Class: D.1.3; J.2

    Journal ref: Journal of Computational Science Volume 61, May 2022, 101664

  3. Enhancing Resource Management through Prediction-based Policies

    Authors: Antoni Navarro, Arthur F. Lorenzon, Eduard Ayguadé, Vicenç Beltran

    Abstract: Task-based programming models are emerging as a promising alternative to make the most of multi-/many-core systems. These programming models rely on runtime systems, and their goal is to improve application performance by properly scheduling application tasks to cores. Additionally, these runtime systems offer policies to cope with application phases that lack enough parallelism to fill all cores. How…

    Submitted 23 September, 2020; originally announced September 2020.

    Comments: Postprint submitted and published at Euro-Par2020: International European Conference on Parallel and Distributed Computing (Springer) (https://link.springer.com/chapter/10.1007%2F978-3-030-57675-2_31)

    Journal ref: International European Conference on Parallel and Distributed Computing, 12247, 493-509 (2020)

  4. arXiv:2009.08698

    cs.NE cs.LG

    Generating Efficient DNN-Ensembles with Evolutionary Computation

    Authors: Marc Ortiz, Florian Scheidegger, Marc Casas, Cristiano Malossi, Eduard Ayguadé

    Abstract: In this work, we leverage ensemble learning as a tool for the creation of faster, smaller, and more accurate deep learning models. We demonstrate that we can jointly optimize for accuracy, inference time, and the number of parameters by combining DNN classifiers. To achieve this, we combine multiple ensemble strategies: bagging, boosting, and an ordered chain of classifiers. To reduce the number o…

    Submitted 3 May, 2021; v1 submitted 18 September, 2020; originally announced September 2020.

    Comments: 8 pages

  5. Asynchronous Runtime with Distributed Manager for Task-based Programming Models

    Authors: Jaume Bosch, Carlos Álvarez, Daniel Jiménez-González, Xavier Martorell, Eduard Ayguadé

    Abstract: Parallel task-based programming models, like OpenMP, allow application developers to easily create a parallel version of their sequential codes. The OpenMP 4.0 standard introduced the possibility of describing a set of data dependences per task that the runtime uses to order task execution. This order is calculated using shared graphs, which are updated by all threads in exclusive access usin…

    Submitted 8 September, 2020; v1 submitted 7 September, 2020; originally announced September 2020.

    Comments: 2020 Parallel Computing

    Journal ref: Parallel Computing, Volume 97, 2020

  6. arXiv:2007.13693

    cs.CV cs.LG

    The MAMe Dataset: On the relevance of High Resolution and Variable Shape image properties

    Authors: Ferran Parés, Anna Arias-Duart, Dario Garcia-Gasulla, Gema Campo-Francés, Nina Viladrich, Eduard Ayguadé, Jesús Labarta

    Abstract: In the image classification task, the most common approach is to resize all images in a dataset to a unique shape, while reducing their precision to a size which facilitates experimentation at scale. This practice has benefits from a computational perspective, but it entails negative side-effects on performance due to loss of information and image deformation. In this work we introduce the MAMe da…

    Submitted 20 May, 2021; v1 submitted 27 July, 2020; originally announced July 2020.

  7. Extending the OpenCHK Model with Advanced Checkpoint Features

    Authors: Marcos Maroñas, Sergi Mateo, Kai Keller, Leonardo Bautista-Gomez, Eduard Ayguadé, Vicenç Beltran

    Abstract: One of the major challenges in using extreme scale systems efficiently is to mitigate the impact of faults. Application-level checkpoint/restart (CR) methods provide the best trade-off between productivity, robustness, and performance. There are many solutions implementing CR at the application level. They all provide advanced I/O capabilities to minimize the overhead introduced by CR. Nevertheles…

    Submitted 1 July, 2020; v1 submitted 30 June, 2020; originally announced June 2020.

    Journal ref: Future Generation Computer Systems, Volume 112, 2020, Pages 738-750

  8. Worksharing Tasks: An Efficient Way to Exploit Irregular and Fine-Grained Loop Parallelism

    Authors: M. Maronas, K. Sala, S. Mateo, E. Ayguadé, V. Beltran (Barcelona Supercomputing Center)

    Abstract: Shared memory programming models usually provide worksharing and task constructs. The former relies on the efficient fork-join execution model to exploit structured parallelism, while the latter relies on fine-grained synchronization among tasks and a flexible data-flow execution model to exploit dynamic, irregular, and nested parallelism. On applications that show both structured and unstructured…

    Submitted 7 April, 2020; originally announced April 2020.

    Journal ref: 2019 IEEE 26th International Conference on High Performance Computing, Data, and Analytics (HiPC), Hyderabad, India, 2019, pp. 383-394

  9. arXiv:1911.11471

    q-bio.GN cs.LG

    Random Forest as a Tumour Genetic Marker Extractor

    Authors: Raquel Pérez-Arnal, Dario Garcia-Gasulla, David Torrents, Ferran Parés, Ulises Cortés, Jesús Labarta, Eduard Ayguadé

    Abstract: Finding tumour genetic markers is essential to biomedicine due to their relevance for cancer detection and therapy development. In this paper, we explore a recently released dataset of chromosome rearrangements in 2,586 cancer patients, where different sorts of alterations have been detected. Using a Random Forest classifier, we evaluate the relevance of several features (some directly available i…

    Submitted 26 November, 2019; originally announced November 2019.

  10. arXiv:1911.08953   

    cs.CV

    MetH: A family of high-resolution and variable-shape image challenges

    Authors: Ferran Parés, Dario Garcia-Gasulla, Harald Servat, Jesús Labarta, Eduard Ayguadé

    Abstract: High-resolution and variable-shape images have not yet been properly addressed by the AI community. The approach of down-sampling data often used with convolutional neural networks is sub-optimal for many tasks, and has too many drawbacks to be considered a sustainable alternative. In light of the increasing importance of problems that can benefit from exploiting high-resolution (HR) and variable-…

    Submitted 29 September, 2020; v1 submitted 20 November, 2019; originally announced November 2019.

    Comments: An improved and extended version of this paper has been published in arXiv:2007.13693. This version is now obsolete.

  11. Feature discriminativity estimation in CNNs for transfer learning

    Authors: Victor Gimenez-Abalos, Armand Vilalta, Dario Garcia-Gasulla, Jesus Labarta, Eduard Ayguadé

    Abstract: The purpose of feature extraction on convolutional neural networks is to reuse deep representations learnt for a pre-trained model to solve a new, potentially unrelated problem. However, raw feature extraction from all layers is unfeasible given the massive size of these networks. Recently, a supervised method using complexity reduction was proposed, resulting in significant improvements in perfor…

    Submitted 8 November, 2019; originally announced November 2019.

    Comments: Presented in the 22nd International Conference of the Catalan Association for Artificial Intelligence (CCIA 19)

    Journal ref: Volume 319: Artificial Intelligence Research and Development 2019

  12. arXiv:1905.05881

    cs.LG stat.ML

    Resource-aware Elastic Swap Random Forest for Evolving Data Streams

    Authors: Diego Marrón, Eduard Ayguadé, José Ramon Herrero, Albert Bifet

    Abstract: Continual learning based on data stream mining deals with ubiquitous sources of Big Data arriving at high velocity and in real time. Adaptive Random Forest (ARF) is a popular ensemble method used for continual learning due to its simplicity in combining adaptive leveraging bagging with fast random Hoeffding trees. While the default ARF size provides competitive accuracy, it is usually over-p…

    Submitted 14 May, 2019; originally announced May 2019.

  13. arXiv:1804.09558

    cs.CL cs.AI cs.LG cs.NE stat.ML

    A Visual Distance for WordNet

    Authors: Raquel Pérez-Arnal, Armand Vilalta, Dario Garcia-Gasulla, Ulises Cortés, Eduard Ayguadé, Jesus Labarta

    Abstract: Measuring the distance between concepts is an important field of study of Natural Language Processing, as it can be used to improve tasks related to the interpretation of those same concepts. WordNet, which includes a wide variety of concepts associated with words (i.e., synsets), is often used as a source for computing those distances. In this paper, we explore a distance for WordNet synsets base…

    Submitted 27 April, 2018; v1 submitted 24 April, 2018; originally announced April 2018.

  14. arXiv:1804.05267

    cs.LG cs.NE stat.ML

    Low-Precision Floating-Point Schemes for Neural Network Training

    Authors: Marc Ortiz, Adrián Cristal, Eduard Ayguadé, Marc Casas

    Abstract: The use of low-precision fixed-point arithmetic along with stochastic rounding has been proposed as a promising alternative to the commonly used 32-bit floating point arithmetic to enhance neural network training in terms of performance and energy efficiency. In the first part of this paper, the behaviour of the 12-bit fixed-point arithmetic when training a convolutional neural network w…

    Submitted 14 April, 2018; originally announced April 2018.

    Comments: 16 pages, 9 figures and 4 tables

    ACM Class: I.2.6; I.5

  15. arXiv:1707.09872

    cs.CV cs.CL cs.NE

    Full-Network Embedding in a Multimodal Embedding Pipeline

    Authors: Armand Vilalta, Dario Garcia-Gasulla, Ferran Parés, Eduard Ayguadé, Jesus Labarta, Ulises Cortés, Toyotaro Suzumura

    Abstract: The current state-of-the-art for image annotation and image retrieval tasks is obtained through deep neural networks, which combine an image representation and a text representation into a shared embedding space. In this paper we evaluate the impact of using the Full-Network embedding in this setting, replacing the original image representation in a competitive multimodal embedding generation sche…

    Submitted 9 August, 2017; v1 submitted 24 July, 2017; originally announced July 2017.

    Comments: In 2nd Workshop on Semantic Deep Learning (SemDeep-2) at the 12th International Conference on Computational Semantics (IWCS) 2017

  16. arXiv:1707.09323

    cs.DC

    Identifying the potential of Near Data Computing for Apache Spark

    Authors: Ahsan Javed Awan, Mats Brorsson, Vladimir Vlassov, Eduard Ayguade

    Abstract: While cluster computing frameworks are continuously evolving to provide real-time data analysis capabilities, Apache Spark has managed to be at the forefront of big data analytics for being a unified framework for both batch and stream data processing. There is also a renewed interest in Near Data Computing (NDC) due to technological advancement in the last decade. However, it is not known if NDC…

    Submitted 8 May, 2017; originally announced July 2017.

    Comments: position paper

  17. arXiv:1707.07465

    cs.NE

    Building Graph Representations of Deep Vector Embeddings

    Authors: Dario Garcia-Gasulla, Armand Vilalta, Ferran Parés, Jonatan Moreno, Eduard Ayguadé, Jesus Labarta, Ulises Cortés, Toyotaro Suzumura

    Abstract: Patterns stored within pre-trained deep neural networks compose large and powerful descriptive languages that can be used for many different purposes. Typically, deep network representations are implemented within vector embedding spaces, which enables the use of traditional machine learning algorithms on top of them. In this short paper we propose the construction of a graph embedding space inste…

    Submitted 9 August, 2017; v1 submitted 24 July, 2017; originally announced July 2017.

    Comments: Accepted at the 2nd Workshop on Semantic Deep Learning (SemDeep-2)

  18. arXiv:1705.07706

    cs.LG cs.NE

    An Out-of-the-box Full-network Embedding for Convolutional Neural Networks

    Authors: Dario Garcia-Gasulla, Armand Vilalta, Ferran Parés, Jonatan Moreno, Eduard Ayguadé, Jesus Labarta, Ulises Cortés, Toyotaro Suzumura

    Abstract: Transfer learning for feature extraction can be used to exploit deep representations in contexts where there is very little training data, where there are limited computational resources, or when tuning the hyper-parameters needed for training is not an option. While previous contributions to feature extraction propose embeddings based on a single layer of the network, in this paper we propose a full…

    Submitted 22 May, 2017; originally announced May 2017.

  19. arXiv:1703.09307

    cs.DS cs.SI physics.soc-ph

    Fluid Communities: A Competitive, Scalable and Diverse Community Detection Algorithm

    Authors: Ferran Parés, Dario Garcia-Gasulla, Armand Vilalta, Jonatan Moreno, Eduard Ayguadé, Jesús Labarta, Ulises Cortés, Toyotaro Suzumura

    Abstract: We introduce a community detection algorithm (Fluid Communities) based on the idea of fluids interacting in an environment, expanding and contracting as a result of that interaction. Fluid Communities is based on the propagation methodology, which represents the state-of-the-art in terms of computational cost and scalability. While being highly efficient, Fluid Communities is able to find communit…

    Submitted 9 October, 2017; v1 submitted 27 March, 2017; originally announced March 2017.

    Comments: Accepted at the 6th International Conference on Complex Networks and Their Applications

  20. arXiv:1703.01127

    cs.NE cs.AI cs.LG stat.ML

    On the Behavior of Convolutional Nets for Feature Extraction

    Authors: Dario Garcia-Gasulla, Ferran Parés, Armand Vilalta, Jonatan Moreno, Eduard Ayguadé, Jesús Labarta, Ulises Cortés, Toyotaro Suzumura

    Abstract: Deep neural networks are representation learning techniques. During training, a deep net is capable of generating a descriptive language of unprecedented size and detail in machine learning. Extracting the descriptive language coded within a trained CNN model (in the case of image data), and reusing it for other purposes is a field of interest, as it provides access to the visual descriptors previ…

    Submitted 29 January, 2018; v1 submitted 3 March, 2017; originally announced March 2017.

    Comments: Published in the Journal of Artificial Intelligence Research (JAIR), Special Track on Deep Learning, Knowledge Representation, and Reasoning

  21. arXiv:1611.09084

    cs.DS cs.IR cs.SI

    Hierarchical Hyperlink Prediction for the WWW

    Authors: Dario Garcia-Gasulla, Eduard Ayguadé, Jesús Labarta, Ulises Cortés, Toyotaro Suzumura

    Abstract: The hyperlink prediction task, that of proposing new links between webpages, can be used to improve search engines, expand the visibility of web pages, and increase the connectivity and navigability of the web. Hyperlink prediction is typically performed on webgraphs composed of thousands or millions of vertices, where on average each webpage contains fewer than fifty links. Algorithms processing g…

    Submitted 28 November, 2016; originally announced November 2016.

    Comments: Submitted to Transactions on Internet Technology journal

  22. arXiv:1611.00547

    cs.SI cs.AI cs.DB

    Limitations and Alternatives for the Evaluation of Large-scale Link Prediction

    Authors: Dario Garcia-Gasulla, Eduard Ayguadé, Jesús Labarta, Ulises Cortés

    Abstract: Link prediction, the problem of identifying missing links among a set of inter-related data entities, is a popular field of research due to its application to graph-like domains. Producing consistent evaluations of the performance of the many link prediction algorithms being proposed can be challenging due to variable graph properties, such as size and density. In this paper we first discuss tradi…

    Submitted 25 November, 2016; v1 submitted 2 November, 2016; originally announced November 2016.

    Comments: Submitted to New Generation Computing, 15 pages, 4 tables, 4 figures

  23. arXiv:1604.08484

    cs.DC cs.AR cs.PF

    Architectural Impact on Performance of In-memory Data Analytics: Apache Spark Case Study

    Authors: Ahsan Javed Awan, Mats Brorsson, Vladimir Vlassov, Eduard Ayguade

    Abstract: While cluster computing frameworks are continuously evolving to provide real-time data analysis capabilities, Apache Spark has managed to be at the forefront of big data analytics for being a unified framework for both batch and stream data processing. However, recent studies on micro-architectural characterization of in-memory data analytics are limited to only batch processing workloads. We com…

    Submitted 28 April, 2016; originally announced April 2016.

  24. arXiv:1507.08818

    cs.CV cs.LG cs.NE

    A Visual Embedding for the Unsupervised Extraction of Abstract Semantics

    Authors: D. Garcia-Gasulla, J. Béjar, U. Cortés, E. Ayguadé, J. Labarta, T. Suzumura, R. Chen

    Abstract: Vector-space word representations obtained from neural network models have been shown to enable semantic operations based on vector arithmetic. In this paper, we explore the existence of similar information on vector representations of images. For that purpose we define a methodology to obtain large, sparse vector representations of image classes, and generate vectors through the state-of-the-art…

    Submitted 16 December, 2016; v1 submitted 31 July, 2015; originally announced July 2015.

    Comments: 14 pages, 5 figures, accepted at Cognitive Systems Research

  25. arXiv:1507.08340

    cs.DC cs.AR cs.PF

    How Data Volume Affects Spark Based Data Analytics on a Scale-up Server

    Authors: Ahsan Javed Awan, Mats Brorsson, Vladimir Vlassov, Eduard Ayguade

    Abstract: The sheer increase in the volume of data over the last decade has triggered research in cluster computing frameworks that enable web enterprises to extract big insights from big data. While Apache Spark is gaining popularity for exhibiting superior scale-out performance on commodity machines, the impact of data volume on the performance of Spark based data analytics in scale-up configuration is not we…

    Submitted 29 July, 2015; originally announced July 2015.

    Comments: accepted to 6th International Workshop on Big Data Benchmarks, Performance Optimization and Emerging Hardware (BpoE-6) held in conjunction with VLDB 2015. arXiv admin note: text overlap with arXiv:1506.07742

  26. Performance Characterization of In-Memory Data Analytics on a Modern Cloud Server

    Authors: Ahsan Javed Awan, Mats Brorsson, Vladimir Vlassov, Eduard Ayguade

    Abstract: In the last decade, data analytics have rapidly progressed from traditional disk-based processing to modern in-memory processing. However, little effort has been devoted to enhancing performance at the micro-architecture level. This paper characterizes the performance of in-memory data analytics using the Apache Spark framework. We use a single node NUMA machine and identify the bottlenecks hampering the scal…

    Submitted 25 June, 2015; originally announced June 2015.

    Comments: Accepted to The 5th IEEE International Conference on Big Data and Cloud Computing (BDCloud 2015)