Search | arXiv e-print repository

doi 10.1109/HPCA57654.2024.00046

A Two Level Neural Approach Combining Off-Chip Prediction with Adaptive Prefetch Filtering

Authors: Alexandre Valentin Jamet, Georgios Vavouliotis, Daniel A. Jiménez, Lluc Alvarez, Marc Casas

Abstract: To alleviate the performance and energy overheads of contemporary applications with large data footprints, we propose the Two Level Perceptron (TLP) predictor, a neural mechanism that effectively combines predicting whether an access will be off-chip with adaptive prefetch filtering at the first-level data cache (L1D). TLP is composed of two connected microarchitectural perceptron predictors, named First Level Predictor (FLP) and Second Level Predictor (SLP). FLP performs accurate off-chip prediction by using several program features based on virtual addresses and a novel selective delay component. The novelty of SLP relies on leveraging off-chip prediction to drive L1D prefetch filtering by using physical addresses and the FLP prediction as features. TLP constitutes the first hardware proposal targeting both off-chip prediction and prefetch filtering using a multi-level perceptron hardware approach. TLP only requires 7KB of storage. To demonstrate the benefits of TLP we compare its performance with state-of-the-art approaches using off-chip prediction and prefetch filtering on a wide range of single-core and multi-core workloads. Our experiments show that TLP reduces the average DRAM transactions by 30.7% and 17.7%, as compared to a baseline using state-of-the-art cache prefetchers but no off-chip prediction mechanism, across the single-core and multi-core workloads, respectively, while recent work significantly increases DRAM transactions. As a result, TLP achieves geometric mean performance speedups of 6.2% and 11.8% across single-core and multi-core workloads, respectively. In addition, our evaluation demonstrates that TLP is effective independently of the L1D prefetching logic.

Submitted 22 March, 2024; originally announced March 2024.

Comments: To appear in 30th International Symposium on High-Performance Computer Architecture (HPCA), 2024
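
Illustrative note: the abstract above describes perceptron predictors that combine several program features into a single prediction. The following is a minimal sketch of a generic hashed-perceptron predictor in Python, included only to show the general technique. It is not the paper's FLP or SLP design: the class name, table size, weight range, training threshold, and feature values are all hypothetical choices made for this example.

```python
class HashedPerceptron:
    """Generic hashed-perceptron predictor (illustrative, not the TLP design)."""

    def __init__(self, num_features, table_size=1024, theta=8, w_max=15):
        # One weight table per feature; all sizes here are hypothetical.
        self.tables = [[0] * table_size for _ in range(num_features)]
        self.table_size = table_size
        self.theta = theta    # training threshold
        self.w_max = w_max    # saturating weight magnitude

    def _indices(self, features):
        # Map each integer feature value to a slot in its own table.
        return [f % self.table_size for f in features]

    def _sum(self, features):
        return sum(t[i] for t, i in zip(self.tables, self._indices(features)))

    def predict(self, features):
        # Predict "positive" (e.g. off-chip) when the weight sum is non-negative.
        return self._sum(features) >= 0

    def train(self, features, outcome):
        total = self._sum(features)
        mispredicted = (total >= 0) != outcome
        # Update on a misprediction or when confidence is below the threshold.
        if mispredicted or abs(total) < self.theta:
            step = 1 if outcome else -1
            for t, i in zip(self.tables, self._indices(features)):
                t[i] = max(-self.w_max, min(self.w_max, t[i] + step))


predictor = HashedPerceptron(num_features=2)
features = (0x4005D0, 0x7FFE12)  # made-up values, e.g. a load PC and an address hash
for _ in range(10):
    predictor.train(features, outcome=False)
print(predictor.predict(features))
```

In this sketch, training stops adjusting weights once the magnitude of the sum reaches the threshold, which is the usual way perceptron predictors avoid over-training on a stable pattern.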

arXiv:2401.08124 [pdf, other]

A Large-Scale Epidemic Simulation Framework for Realistic Social Contact Networks

Authors: Joy Kitson, Ian Costello, Jiangzhuo Chen, Diego Jiménez, Stefan Hoops, Henning Mortveit, Esteban Meneses, Jae-Seung Yeom, Madhav V. Marathe, Abhinav Bhatele

Abstract: Global pandemics can wreak havoc and lead to significant social, economic, and personal losses. Preventing the spread of infectious diseases requires implementing interventions at different levels of government, and evaluating the potential impact and efficacy of those preemptive measures. Agent-based modeling can be used for detailed studies of epidemic diffusion and possible interventions. We present Loimos, a highly parallel simulation of epidemic diffusion written on top of Charm++, an asynchronous task-based parallel runtime. Loimos uses a hybrid of time-stepping and discrete-event simulation to model disease spread. We demonstrate that our implementation of Loimos is able to scale to large core counts on an HPC system. In particular, Loimos is able to simulate a US-scale synthetic interaction network in an average of 1.497 seconds per simulation day when executed on 16 nodes on Rivanna at the University of Virginia, processing around 428 billion interactions (person-person edges) in under five minutes for an average of 1.4 billion traversed edges per second (TEPS).

Submitted 16 January, 2024; originally announced January 2024.

Comments: 13 pages (including references), 9 figures

arXiv:2307.02912 [pdf, other]

LEA: Improving Sentence Similarity Robustness to Typos Using Lexical Attention Bias

Authors: Mario Almagro, Emilio Almazán, Diego Ortego, David Jiménez

Abstract: Textual noise, such as typos or abbreviations, is a well-known issue that penalizes vanilla Transformers for most downstream tasks. We show that this is also the case for sentence similarity, a fundamental task in multiple domains, e.g. matching, retrieval or paraphrasing. Sentence similarity can be approached using cross-encoders, where the two sentences are concatenated in the input allowing the model to exploit the inter-relations between them. Previous works addressing the noise issue mainly rely on data augmentation strategies, showing improved robustness when dealing with corrupted samples that are similar to the ones used for training. However, all these methods still suffer from the token distribution shift induced by typos. In this work, we propose to tackle textual noise by equipping cross-encoders with a novel LExical-aware Attention module (LEA) that incorporates lexical similarities between words in both sentences. By using raw text similarities, our approach avoids the tokenization shift problem obtaining improved robustness. We demonstrate that the attention bias introduced by LEA helps cross-encoders to tackle complex scenarios with textual noise, especially in domains with short-text descriptions and limited context. Experiments using three popular Transformer encoders in five e-commerce datasets for product matching show that LEA consistently boosts performance under the presence of noise, while remaining competitive on the original (clean) splits. We also evaluate our approach in two datasets for textual entailment and paraphrasing showing that LEA is robust to typos in domains with longer sentences and more natural context. Additionally, we thoroughly analyze several design choices in our approach, providing insights about the impact of the decisions made and fostering future research in cross-encoders dealing with typos.

Submitted 6 July, 2023; originally announced July 2023.

Comments: KDD'23 conference (main research track). (*) These authors contributed equally

arXiv:2303.00795 [pdf, other]

Improved Segmentation of Deep Sulci in Cortical Gray Matter Using a Deep Learning Framework Incorporating Laplace's Equation

Authors: Sadhana Ravikumar, Ranjit Ittyerah, Sydney Lim, Long Xie, Sandhitsu Das, Pulkit Khandelwal, Laura E. M. Wisse, Madigan L. Bedard, John L. Robinson, Terry Schuck, Murray Grossman, John Q. Trojanowski, Edward B. Lee, M. Dylan Tisdall, Karthik Prabhakaran, John A. Detre, David J. Irwin, Winifred Trotman, Gabor Mizsei, Emilio Artacho-Pérula, Maria Mercedes Iñiguez de Onzono Martin, Maria del Mar Arroyo Jiménez, Monica Muñoz, Francisco Javier Molina Romero, Maria del Pilar Marcos Rabal , et al. (7 additional authors not shown)

Abstract: When developing tools for automated cortical segmentation, the ability to produce topologically correct segmentations is important in order to compute geometrically valid morphometry measures. In practice, accurate cortical segmentation is challenged by image artifacts and the highly convoluted anatomy of the cortex itself. To address this, we propose a novel deep learning-based cortical segmentation method in which prior knowledge about the geometry of the cortex is incorporated into the network during the training process. We design a loss function which uses the theory of Laplace's equation applied to the cortex to locally penalize unresolved boundaries between tightly folded sulci. Using an ex vivo MRI dataset of human medial temporal lobe specimens, we demonstrate that our approach outperforms baseline segmentation networks, both quantitatively and qualitatively.

Submitted 3 March, 2023; v1 submitted 1 March, 2023; originally announced March 2023.

Comments: Accepted at the 28th biennial international conference on Information Processing in Medical Imaging (IPMI 2023)

arXiv:2210.14324 [pdf, other]

The Championship Simulator: Architectural Simulation for Education and Competition

Authors: Nathan Gober, Gino Chacon, Lei Wang, Paul V. Gratz, Daniel A. Jimenez, Elvira Teran, Seth Pugsley, Jinchun Kim

Abstract: Recent years have seen a dramatic increase in the microarchitectural complexity of processors. This increase in complexity presents a twofold challenge for the field of computer architecture. First, no individual architect can fully comprehend the complexity of the entire microarchitecture of the core. This leads to increasingly specialized architects, who treat parts of the core outside their particular expertise as black boxes. Second, with increasing complexity, the field becomes decreasingly accessible to new students of the field. When learning core microarchitecture, new students must first learn the big picture of how the system works in order to understand how the pieces all fit together. The tools used to study microarchitecture experience a similar struggle. As with the microarchitectures they simulate, an increase in complexity reduces accessibility to new users. In this work, we present ChampSim. ChampSim uses a modular design and configurable structure to achieve a low barrier to entry into the field of microarchitectural simulation. ChampSim has shown itself to be useful in multiple areas of research, competition, and education. In this way, we seek to promote access and inclusion despite the increasing complexity of the field of computer architecture.

Submitted 25 October, 2022; originally announced October 2022.

arXiv:2210.08330 [pdf, other]

Aplicación de redes neuronales convolucionales profundas al diagnóstico asistido de la enfermedad de Alzheimer

Authors: Ángel de la Vega Jiménez

Abstract: Currently, the diagnosis of Alzheimer's disease is a complex and error-prone process. Improving this diagnosis could allow earlier detection of the disease and improve the quality of life of patients and their families. For this work, we will use 249 brain images from two modalities: PET and MRI, taken from the ADNI database, and labelled into three classes according to the degree of development of Alzheimer's disease. We propose the development of a convolutional neural network to perform the classification of these images, during which, we will study the appropriate depth of the networks for this problem, the importance of pre-processing medical images, the use of transfer learning and data augmentation techniques as tools to reduce the effects of the problem of having too little data, and the simultaneous use of multiple medical imaging modalities. We also propose the application of an evaluation method that guarantees a good degree of repeatability of the results even when using a small dataset. Following this evaluation method, our best final model, which makes use of transfer learning with COVID-19 data, achieves an accuracy of 68\%. In addition, in an independent test set, this same model achieves 70\% accuracy, a promising result given the small size of our dataset. We further conclude that augmenting the depth of the networks helps with this problem, that image pre-processing is a fundamental process to address this type of medical problem, and that the use of data augmentation and the use of pre-trained networks with images of other diseases can provide significant improvements.

Submitted 15 October, 2022; originally announced October 2022.

Comments: in Spanish language

arXiv:2207.02008 [pdf, other]

Block-SCL: Blocking Matters for Supervised Contrastive Learning in Product Matching

Authors: Mario Almagro, David Jiménez, Diego Ortego, Emilio Almazán, Eva Martínez

Abstract: Product matching is a fundamental step for the global understanding of consumer behavior in e-commerce. In practice, product matching refers to the task of deciding if two product offers from different data sources (e.g. retailers) represent the same product. Standard pipelines use a previous stage called blocking, where for a given product offer a set of potential matching candidates is retrieved based on similar characteristics (e.g. same brand, category, flavor, etc.). From these similar product candidates, those that are not a match can be considered hard negatives. We present Block-SCL, a strategy that uses the blocking output to make the most of Supervised Contrastive Learning (SCL). Concretely, Block-SCL builds enriched batches using the hard-negative samples obtained in the blocking stage. These batches provide a strong training signal leading the model to learn more meaningful sentence embeddings for product matching. Experimental results in several public datasets demonstrate that Block-SCL achieves state-of-the-art results despite only using short product titles as input, no data augmentation, and a lighter transformer backbone than competing methods.

Submitted 5 July, 2022; originally announced July 2022.

Comments: 7 pages, 2 figures, e-commerce, conference

arXiv:2007.07802 [pdf, ps, other]

doi 10.5802/alco.249

Permutree sorting

Authors: Vincent Pilaud, Viviane Pons, Daniel Tamayo Jiménez

Abstract: Generalizing stack sorting and $c$-sorting for permutations, we define the permutree sorting algorithm. Given two disjoint subsets $U$ and $D$ of $\{2, \dots, n-1\}$, the $(U,D)$-permutree sorting tries to sort the permutation $π\in \mathfrak{S}_n$ and fails if and only if there are $1 \le i < j < k \le n$ such that $π$ contains the subword $jki$ if $j \in U$ and $kij$ if $j \in D$. This algorithm is seen as a way to explore an automaton which either rejects all reduced expressions of $π$, or accepts those reduced expressions for $π$ whose prefixes are all $(U,D)$-permutree sortable.

Submitted 6 March, 2023; v1 submitted 15 July, 2020; originally announced July 2020.

Comments: 18 pages, 5 figures

MSC Class: 68P10; 68Q45; 68R05; 05E99

Journal ref: Alg. Comb., 6(1):53-74, 2023
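
Illustrative note: the failure condition stated in the abstract above can be checked by brute force on small permutations. The Python sketch below assumes one reading of that condition: "subword" means a subsequence that need not be contiguous, and sorting fails when some triple $i < j < k$ has either $j \in U$ with $jki$ occurring in $π$, or $j \in D$ with $kij$ occurring in $π$. The function name is hypothetical, and this is a direct test of the pattern condition, not the sorting algorithm or automaton from the paper.

```python
from itertools import combinations


def fails_permutree_sorting(perm, U, D):
    """Return True if perm contains a forbidden subword for the given (U, D)."""
    # Position of each value in the permutation (one-line notation).
    pos = {value: index for index, value in enumerate(perm)}
    n = len(perm)
    for i, j, k in combinations(range(1, n + 1), 3):  # values with i < j < k
        # Subword j k i: j appears before k, which appears before i.
        if j in U and pos[j] < pos[k] < pos[i]:
            return True
        # Subword k i j: k appears before i, which appears before j.
        if j in D and pos[k] < pos[i] < pos[j]:
            return True
    return False


print(fails_permutree_sorting([2, 3, 1], U={2}, D=set()))  # True
print(fails_permutree_sorting([3, 1, 2], U={2}, D=set()))  # False
print(fails_permutree_sorting([3, 1, 2], U=set(), D={2}))  # True
print(fails_permutree_sorting([1, 2, 3], U={2}, D=set()))  # False
```

Under this reading, taking $U = \{2, \dots, n-1\}$ and $D = \emptyset$ reduces the test to avoidance of the pattern 231, which is the classical characterization of stack-sortable permutations.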

arXiv:2004.05222 [pdf]

Give more data, awareness and control to individual citizens, and they will help COVID-19 containment

Authors: Mirco Nanni, Gennady Andrienko, Albert-László Barabási, Chiara Boldrini, Francesco Bonchi, Ciro Cattuto, Francesca Chiaromonte, Giovanni Comandé, Marco Conti, Mark Coté, Frank Dignum, Virginia Dignum, Josep Domingo-Ferrer, Paolo Ferragina, Fosca Giannotti, Riccardo Guidotti, Dirk Helbing, Kimmo Kaski, Janos Kertesz, Sune Lehmann, Bruno Lepri, Paul Lukowicz, Stan Matwin, David Megías Jiménez, Anna Monreale , et al. (14 additional authors not shown)

Abstract: The rapid dynamics of COVID-19 calls for quick and effective tracking of virus transmission chains and early detection of outbreaks, especially in the phase 2 of the pandemic, when lockdown and other restriction measures are progressively withdrawn, in order to avoid or minimize contagion resurgence. For this purpose, contact-tracing apps are being proposed for large scale adoption by many countries. A centralized approach, where data sensed by the app are all sent to a nation-wide server, raises concerns about citizens' privacy and needlessly strong digital surveillance, thus alerting us to the need to minimize personal data collection and avoiding location tracking. We advocate the conceptual advantage of a decentralized approach, where both contact and location data are collected exclusively in individual citizens' "personal data stores", to be shared separately and selectively, voluntarily, only when the citizen has tested positive for COVID-19, and with a privacy preserving level of granularity. This approach better protects the personal sphere of citizens and affords multiple benefits: it allows for detailed information gathering for infected people in a privacy-preserving fashion; and, in turn this enables both contact tracing, and, the early detection of outbreak hotspots on more finely-granulated geographic scale. Our recommendation is two-fold. First to extend existing decentralized architectures with a light touch, in order to manage the collection of location data locally on the device, and allow the user to share spatio-temporal aggregates - if and when they want, for specific aims - with health authorities, for instance. Second, we favour a longer-term pursuit of realizing a Personal Data Store vision, giving users the opportunity to contribute to collective good in the measure they want, enhancing self-awareness, and cultivating collective efforts for rebuilding society.

Submitted 16 April, 2020; v1 submitted 10 April, 2020; originally announced April 2020.

Comments: Revised text. Additional authors

Journal ref: Transactions on Data Privacy 13(1): 61-66 (2020), http://www.tdp.cat/issues16/abs.a389a20.php

arXiv:2002.07725 [pdf, other]

Gradient-Based Adversarial Training on Transformer Networks for Detecting Check-Worthy Factual Claims

Authors: Kevin Meng, Damian Jimenez, Fatma Arslan, Jacob Daniel Devasier, Daniel Obembe, Chengkai Li

Abstract: We present a study on the efficacy of adversarial training on transformer neural network models, with respect to the task of detecting check-worthy claims. In this work, we introduce the first adversarially-regularized, transformer-based claim spotter model that achieves state-of-the-art results on multiple challenging benchmarks. We obtain a 4.70 point F1-score improvement over current state-of-the-art models on the ClaimBuster Dataset and CLEF2019 Dataset, respectively. In the process, we propose a method to apply adversarial training to transformer models, which has the potential to be generalized to many similar text classification tasks. Along with our results, we are releasing our codebase and manually labeled datasets. We also showcase our models' real world usage via a live public API.

Submitted 21 May, 2020; v1 submitted 18 February, 2020; originally announced February 2020.

Comments: 11 pages, 4 figures, 6 tables

arXiv:1912.01563 [pdf, other]

LEGaTO: Low-Energy, Secure, and Resilient Toolset for Heterogeneous Computing

Authors: B. Salami, K. Parasyris, A. Cristal, O. Unsal, X. Martorell, P. Carpenter, R. De La Cruz, L. Bautista, D. Jimenez, C. Alvarez, S. Nabavi, S. Madonar, M. Pericas, P. Trancoso, M. Abduljabbar, J. Chen, P. N. Soomro, M Manivannan, M. Berge, S. Krupop, F. Klawonn, Al Mekhlafi, S. May, T. Becker, G. Gaydadjiev , et al. (20 additional authors not shown)

Abstract: The LEGaTO project leverages task-based programming models to provide a software ecosystem for Made-in-Europe heterogeneous hardware composed of CPUs, GPUs, FPGAs and dataflow engines. The aim is to attain one order of magnitude energy savings from the edge to the converged cloud/HPC, balanced with the security and resilience challenges. LEGaTO is an ongoing three-year EU H2020 project started in December 2017.

Submitted 1 December, 2019; originally announced December 2019.

Comments: 6 pages, 9 figures

arXiv:1909.01891 [pdf, other]

Let's agree to disagree: learning highly debatable multirater labelling

Authors: Carole H. Sudre, Beatriz Gomez Anson, Silvia Ingala, Chris D. Lane, Daniel Jimenez, Lukas Haider, Thomas Varsavsky, Ryutaro Tanno, Lorna Smith, Sébastien Ourselin, Rolf H. Jäger, M. Jorge Cardoso

Abstract: Classification and differentiation of small pathological objects may greatly vary among human raters due to differences in training, expertise and their consistency over time. In a radiological setting, objects commonly have high within-class appearance variability whilst sharing certain characteristics across different classes, making their distinction even more difficult. As an example, markers of cerebral small vessel disease, such as enlarged perivascular spaces (EPVS) and lacunes, can be very varied in their appearance while exhibiting high inter-class similarity, making this task highly challenging for human raters. In this work, we investigate joint models of individual rater behaviour and multirater consensus in a deep learning setting, and apply it to a brain lesion object-detection task. Results show that jointly modelling both individual and consensus estimates leads to significant improvements in performance when compared to directly predicting consensus labels, while also allowing the characterization of human-rater consistency.

Submitted 4 September, 2019; originally announced September 2019.

Comments: Accepted at MICCAI 2019

arXiv:1812.09046 [pdf, other]

3D multirater RCNN for multimodal multiclass detection and characterisation of extremely small objects

Authors: Carole H. Sudre, Beatriz Gomez Anson, Silvia Ingala, Chris D. Lane, Daniel Jimenez, Lukas Haider, Thomas Varsavsky, Lorna Smith, H. Rolf Jäger, M. Jorge Cardoso

Abstract: Extremely small objects (ESO) have become observable on clinical routine magnetic resonance imaging acquisitions, thanks to a reduction in acquisition time at higher resolution. Despite their small size (usually $<$10 voxels per object for an image of more than $10^6$ voxels), these markers reflect tissue damage and need to be accounted for to investigate the complete phenotype of complex pathological pathways. In addition to their very small size, variability in shape and appearance leads to high labelling variability across human raters, resulting in a very noisy gold standard. Such objects are notably present in the context of cerebral small vessel disease where enlarged perivascular spaces and lacunes, commonly observed in the ageing population, are thought to be associated with acceleration of cognitive decline and risk of dementia onset. In this work, we redesign the RCNN model to scale to 3D data, and to jointly detect and characterise these important markers of age-related neurovascular changes. We also propose training strategies enforcing the detection of extremely small objects, ensuring a tractable and stable training process.

Submitted 21 December, 2018; originally announced December 2018.

arXiv:1008.3147

doi 10.4204/EPTCS.33

Proceedings First Workshop on Applications of Membrane computing, Concurrency and Agent-based modelling in POPulation biology

Authors: Paolo Milazzo, Mario de J. Pérez Jiménez

Abstract: This volume contains the papers presented at the first International Workshop on Applications of Membrane Computing, Concurrency and Agent-based Modelling in Population Biology (AMCA-POP 2010) held in Jena, Germany on August 25th, 2010 as a satellite event of the 11th Conference on Membrane Computing (CMC11). The aim of the workshop is to investigate whether formal modelling and analysis techniques could be applied with profit to systems of interest for population biology and ecology. The considered modelling notations include membrane systems, Petri nets, agent-based notations, process calculi, automata-based notations, rewriting systems and cellular automata. Such notations enable the application of analysis techniques such as simulation, model checking, abstract interpretation and type systems to study systems of interest in disciplines such as population biology, ecosystem science, epidemiology, genetics, sustainability science, evolution and other disciplines in which population dynamics and interactions with the environment are studied. Papers contain results and experiences in the modelling and analysis of systems of interest in these fields.

Submitted 18 August, 2010; originally announced August 2010.

Journal ref: EPTCS 33, 2010

Showing 1–14 of 14 results for author: Jiménez, D