Showing 1–16 of 16 results for author: Igual, F D

  1. Energy efficiency optimization of task-parallel codes on asymmetric architectures

    Authors: Luis Costero, Francisco D. Igual, Katzalin Olcoz, Francisco Tirado

    Abstract: We present a family of policies that, integrated within a runtime task scheduler (Nanox), pursue the goal of improving the energy efficiency of task-parallel executions with no intervention from the programmer. The proposed policies tackle the problem by modifying the core operating frequency via DVFS mechanisms, or by enabling/disabling the mapping of tasks to specific cores at selected execution…

    Submitted 9 February, 2024; originally announced February 2024.

  2. Leveraging knowledge-as-a-service (KaaS) for QoS-aware resource management in multi-user video transcoding

    Authors: Luis Costero, Francisco D. Igual, Katzalin Olcoz, Francisco Tirado

    Abstract: The coexistence of parallel applications in shared computing nodes, each one featuring different Quality of Service (QoS) requirements, carries out new challenges to improve resource occupation while keeping acceptable rates in terms of QoS. As more application-specific and system-wide metrics are included as QoS dimensions, or under situations in which resource-usage limits are strict, building a…

    Submitted 7 February, 2024; originally announced February 2024.

    Journal ref: Journal of Supercomputing 76, pp. 9388 to 9403 (2020)

  3. Acceleration and energy consumption optimization in cascading classifiers for face detection on low-cost ARM big.LITTLE asymmetric architectures

    Authors: Alberto Corpas, Luis Costero, Guillermo Botella, Francisco D. Igual, Carlos García, Manuel Rodríguez

    Abstract: This paper proposes a mechanism to accelerate and optimize the energy consumption of a face detection software based on Haar-like cascading classifiers, taking advantage of the features of low-cost Asymmetric Multicore Processors (AMPs) with limited power budget. A modelling and task scheduling/allocation is proposed in order to efficiently make use of the existing features on big.LITTLE ARM proce…

    Submitted 6 February, 2024; originally announced February 2024.

    Journal ref: International Journal of Circuit Theory and Applications. 2018. 46, pp. 1756-1776

  4. arXiv:2310.20347  [pdf, other]

    cs.CL

    Automatic Generators for a Family of Matrix Multiplication Routines with Apache TVM

    Authors: Guillermo Alaejos, Adrián Castelló, Pedro Alonso-Jordá, Francisco D. Igual, Héctor Martínez, Enrique S. Quintana-Ortí

    Abstract: We explore the utilization of the Apache TVM open source framework to automatically generate a family of algorithms that follow the approach taken by popular linear algebra libraries, such as GotoBLAS2, BLIS and OpenBLAS, in order to obtain high-performance blocked formulations of the general matrix multiplication (GEMM). In addition, we fully automatize the generation process, by also leveragin…

    Submitted 31 October, 2023; originally announced October 2023.

    Comments: 35 pages, 22 figures. Submitted to ACM TOMS

  5. arXiv:2304.14480  [pdf, other]

    cs.DC

    Co-Design of the Dense Linear Algebra Software Stack for Multicore Processors

    Authors: Héctor Martínez, Sandra Catalán, Francisco D. Igual, José R. Herrero, Rafael Rodríguez-Sánchez, Enrique S. Quintana-Ortí

    Abstract: This paper advocates for an intertwined design of the dense linear algebra software stack that breaks down the strict barriers between the high-level, blocked algorithms in LAPACK (Linear Algebra PACKage) and the low-level, architecture-dependent kernels in BLAS (Basic Linear Algebra Subprograms). Specifically, we propose customizing the GEMM (general matrix multiplication) kernel, which is invoke…

    Submitted 27 April, 2023; originally announced April 2023.

  6. arXiv:2104.05782  [pdf, other]

    cs.MS

    Efficient algorithms for computing a rank-revealing UTV factorization on parallel computing architectures

    Authors: N. Heavner, F. D. Igual, G. Quintana-Ortí, P. G. Martinsson

    Abstract: The randomized singular value decomposition (RSVD) is by now a well established technique for efficiently computing an approximate singular value decomposition of a matrix. Building on the ideas that underpin the RSVD, the recently proposed algorithm "randUTV" computes a FULL factorization of a given matrix that provides low-rank approximations with near-optimal error. Because the bulk of randUTV…

    Submitted 12 April, 2021; originally announced April 2021.

    Comments: 31 pages and 20 figures

    ACM Class: G.1.3; G.4; C.4; D.1.3; F.2.1

  7. arXiv:1911.08963  [pdf, ps, other]

    cs.DC cs.IT

    Parallel Implementations for Computing the Minimum Distance of a Random Linear Code on Multicomputers

    Authors: Gregorio Quintana-Ortí, Fernando Hernando, Francisco D. Igual

    Abstract: The minimum distance of a linear code is a key concept in information theory. Therefore, the time required by its computation is very important to many problems in this area. In this paper, we introduce a family of implementations of the Brouwer-Zimmermann algorithm for distributed-memory architectures for computing the minimum distance of a random linear code over F2. Both current commercial and…

    Submitted 20 November, 2019; originally announced November 2019.

  8. arXiv:1904.11268  [pdf, other]

    cs.CR

    Detecting time-fragmented cache attacks against AES using Performance Monitoring Counters

    Authors: Iván Prada, Francisco D. Igual, Katzalin Olcoz

    Abstract: Cache timing attacks use shared caches in multi-core processors as side channels to extract information from victim processes. These attacks are particularly dangerous in cloud infrastructures, in which the deployed countermeasures cause collateral effects in terms of performance loss and increase in energy consumption. We propose to monitor the victim process using an independent monitoring (dete…

    Submitted 25 April, 2019; originally announced April 2019.

  9. arXiv:1804.07017  [pdf, other]

    cs.DC cs.MS

    Programming Parallel Dense Matrix Factorizations with Look-Ahead and OpenMP

    Authors: Sandra Catalán, Adrián Castelló, Francisco D. Igual, Rafael Rodríguez-Sánchez, Enrique S. Quintana-Ortí

    Abstract: We investigate a parallelization strategy for dense matrix factorization (DMF) algorithms, using OpenMP, that departs from the legacy (or conventional) solution, which simply extracts concurrency from a multithreaded version of BLAS. This approach is also different from the more sophisticated runtime-assisted implementations, which decompose the operation into tasks and identify dependencies via d…

    Submitted 19 April, 2018; originally announced April 2018.

    Comments: 28 pages

  10. Fast Algorithms for the Computation of the Minimum Distance of a Random Linear Code

    Authors: Fernando Hernando, Francisco D. Igual, Gregorio Quintana-Ortí

    Abstract: The minimum distance of a code is an important concept in information theory. Hence, computing the minimum distance of a code with a minimum computational cost is a crucial process to many problems in this area. In this paper, we present and evaluate a family of algorithms and implementations to compute the minimum distance of a random linear code over $\mathbb{F}_{2}$ that are faster than differe…

    Submitted 30 January, 2017; v1 submitted 22 March, 2016; originally announced March 2016.

    MSC Class: 68Q30; 68Q25; 65F30; 11Y16

    Journal ref: ACM Transactions on Mathematical Software (TOMS). Volume 45 Issue 2, June 2019

  11. arXiv:1602.05510  [pdf, other]

    cs.DC

    HeSP: a simulation framework for solving the task scheduling-partitioning problem on heterogeneous architectures

    Authors: Anton Rey, Francisco D. Igual, Manuel Prieto-Matías

    Abstract: In this paper we describe HeSP, a complete simulation framework to study a general task scheduling-partitioning problem on heterogeneous architectures, which treats recursive task partitioning and scheduling decisions on equal footing. Considering recursive partitioning as an additional degree of freedom, tasks can be dynamically partitioned or merged at runtime for each available processor type,…

    Submitted 17 February, 2016; originally announced February 2016.

  12. arXiv:1511.02171  [pdf, other]

    cs.MS cs.DC

    Multi-Threaded Dense Linear Algebra Libraries for Low-Power Asymmetric Multicore Processors

    Authors: Sandra Catalán, José R. Herrero, Francisco D. Igual, Rafael Rodríguez-Sánchez, Enrique S. Quintana-Ortí

    Abstract: Dense linear algebra libraries, such as BLAS and LAPACK, provide a relevant collection of numerical tools for many scientific and engineering applications. While there exist high performance implementations of the BLAS (and LAPACK) functionality for many current multi-threaded architectures, the adaption of these libraries for asymmetric multicore processors (AMPs) is still pending. In this paper we…

    Submitted 6 November, 2015; originally announced November 2015.

  13. arXiv:1509.02058  [pdf, other]

    cs.DC

    Revisiting Conventional Task Schedulers to Exploit Asymmetry in ARM big.LITTLE Architectures for Dense Linear Algebra

    Authors: Luis Costero, Francisco D. Igual, Katzalin Olcoz, Enrique S. Quintana-Ortí

    Abstract: Dealing with asymmetry in the architecture opens a plethora of questions from the perspective of scheduling task-parallel applications, and there exist early attempts to address this problem via ad-hoc strategies embedded into a runtime framework. In this paper we take a different path, which consists in addressing the complexity of the problem at the library level, via a few asymmetry-aware funda…

    Submitted 7 September, 2015; originally announced September 2015.

  14. arXiv:1507.05129  [pdf, ps, other]

    cs.DC

    Performance and Energy Optimization of Matrix Multiplication on Asymmetric big.LITTLE Processors

    Authors: Sandra Catalán, Francisco D. Igual, Rafael Mayo, Luis Piñuel, Enrique S. Quintana-Ortí, Rafael Rodríguez-Sánchez

    Abstract: Asymmetric processors have emerged as an appealing technology for severely energy-constrained environments, especially in the mobile market where heterogeneity in applications is mainstream. In addition, given the growing interest on ultra low-power architectures for high performance computing, this type of platforms are also being investigated in the road towards the implementation of energy-eff…

    Submitted 17 July, 2015; originally announced July 2015.

    Comments: Presented at HiPEAC 2015, Amsterdam. Foundation of the Asymmetric BLIS implementation

  15. arXiv:1506.08988  [pdf, other]

    cs.PF cs.DC cs.MS math.NA

    Architecture-Aware Configuration and Scheduling of Matrix Multiplication on Asymmetric Multicore Processors

    Authors: Sandra Catalán, Francisco D. Igual, Rafael Mayo, Rafael Rodríguez-Sánchez, Enrique S. Quintana-Ortí

    Abstract: Asymmetric multicore processors (AMPs) have recently emerged as an appealing technology for severely energy-constrained environments, especially in mobile appliances where heterogeneity in applications is mainstream. In addition, given the growing interest for low-power high performance computing, this type of architectures is also being investigated as a means to improve the throughput-per-Watt o…

    Submitted 30 June, 2015; originally announced June 2015.

  16. arXiv:1111.6374  [pdf, ps, other]

    cs.PF cond-mat.mtrl-sci cs.DC cs.MS

    Solving Dense Generalized Eigenproblems on Multi-threaded Architectures

    Authors: José I. Aliaga, Paolo Bientinesi, Davor Davidović, Edoardo Di Napoli, Francisco D. Igual, Enrique S. Quintana-Ortí

    Abstract: We compare two approaches to compute a portion of the spectrum of dense symmetric definite generalized eigenproblems: one is based on the reduction to tridiagonal form, and the other on the Krylov-subspace iteration. Two large-scale applications, arising in molecular dynamics and material science, are employed to investigate the contributions of the application, architecture, and parallelism of th…

    Submitted 17 June, 2012; v1 submitted 28 November, 2011; originally announced November 2011.

    Comments: 5 tables and 4 figures. In press by Applied Mathematics and Computation. Accepted version