Search | arXiv e-print repository

SYNPA: SMT Performance Analysis and Allocation of Threads to Cores in ARM Processors

Authors: Marta Navarro, Josué Feliu, Salvador Petit, María E. Gómez, Julio Sahuquillo

Abstract: Simultaneous multithreading processors improve throughput over single-threaded processors thanks to sharing internal core resources among instructions from distinct threads. However, resource sharing introduces inter-thread interference within the core, which has a negative impact on individual application performance and can significantly increase the turnaround time of multi-program workloads. T… ▽ More Simultaneous multithreading processors improve throughput over single-threaded processors thanks to sharing internal core resources among instructions from distinct threads. However, resource sharing introduces inter-thread interference within the core, which has a negative impact on individual application performance and can significantly increase the turnaround time of multi-program workloads. The severity of the interference effects depends on the competing co-runners sharing the core. Thus, it can be mitigated by applying a thread-to-core allocation policy that smartly selects applications to be run in the same core to minimize their interference. This paper presents SYNPA, a simple approach that dynamically allocates threads to cores in an SMT processor based on their run-time dynamic behavior. The approach uses a regression model to select synergistic pairs to mitigate intra-core interference. The main novelty of SYNPA is that it uses just three variables collected from the performance counters available in current ARM processors at the dispatch stage. Experimental results show that SYNPA outperforms the default Linux scheduler by around 36%, on average, in terms of turnaround time in 8-application workloads combining frontend bound and backend bound benchmarks. △ Less

Submitted 19 October, 2023; originally announced October 2023.

Comments: 11 pages, 9 figures

arXiv:2211.09566 [pdf, other]

Enabling Collagen Quantification on HE-stained Slides Through Stain Deconvolution and Restained HE-HES

Authors: Guillaume Balezo, Christof A. Bertram, Cyprien Tilmant, Stéphanie Petit, Saima Ben Hadj, Rutger H. J. Fick

Abstract: In histology, the presence of collagen in the extra-cellular matrix has both diagnostic and prognostic value for cancer malignancy, and can be highlighted by adding Saffron (S) to a routine Hematoxylin and Eosin (HE) staining. However, Saffron is not usually added because of the additional cost and because pathologists are accustomed to HE, with the exception of France-based laboratories. In this… ▽ More In histology, the presence of collagen in the extra-cellular matrix has both diagnostic and prognostic value for cancer malignancy, and can be highlighted by adding Saffron (S) to a routine Hematoxylin and Eosin (HE) staining. However, Saffron is not usually added because of the additional cost and because pathologists are accustomed to HE, with the exception of France-based laboratories. In this paper, we show that it is possible to quantify the collagen content from the HE image alone and to digitally create an HES image. To do so, we trained a UNet to predict the Saffron densities from HE images. We created a dataset of registered, restained HE-HES slides and we extracted the Saffron concentrations as ground truth using stain deconvolution on the HES images. Our model reached a Mean Absolute Error of 0.0668 $\pm$ 0.0002 (Saffron values between 0 and 1) on a 3-fold testing set. We hope our approach can aid in improving the clinical workflow while reducing reagent costs for laboratories. △ Less

Submitted 17 November, 2022; originally announced November 2022.

Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

arXiv:2211.09559 [pdf, other]

Interpretable HER2 scoring by evaluating clinical Guidelines through a weakly supervised, constrained Deep Learning Approach

Authors: Manh Dan Pham, Cyprien Tilmant, Stéphanie Petit, Isabelle Salmon, Saima Ben Hadj, Rutger H. J. Fick

Abstract: The evaluation of the Human Epidermal growth factor Receptor-2 (HER2) expression is an important prognostic biomarker for breast cancer treatment selection. However, HER2 scoring has notoriously high interobserver variability due to stain variations between centers and the need to estimate visually the staining intensity in specific percentages of tumor area. In this paper, focusing on the interpr… ▽ More The evaluation of the Human Epidermal growth factor Receptor-2 (HER2) expression is an important prognostic biomarker for breast cancer treatment selection. However, HER2 scoring has notoriously high interobserver variability due to stain variations between centers and the need to estimate visually the staining intensity in specific percentages of tumor area. In this paper, focusing on the interpretability of HER2 scoring by a pathologist, we propose a semi-automatic, two-stage deep learning approach that directly evaluates the clinical HER2 guidelines defined by the American Society of Clinical Oncology/ College of American Pathologists (ASCO/CAP). In the first stage, we segment the invasive tumor over the user-indicated Region of Interest (ROI). Then, in the second stage, we classify the tumor tissue into four HER2 classes. For the classification stage, we use weakly supervised, constrained optimization to find a model that classifies cancerous patches such that the tumor surface percentage meets the guidelines specification of each HER2 class. We end the second stage by freezing the model and refining its output logits in a supervised way to all slide labels in the training set. To ensure the quality of our dataset's labels, we conducted a multi-pathologist HER2 scoring consensus. For the assessment of doubtful cases where no consensus was found, our model can help by interpreting its HER2 class percentages output. We achieve a performance of 0.78 in F1-score on the test set while kee** our model interpretable for the pathologist, hopefully contributing to interpretable AI models in digital pathology. △ Less

Submitted 17 November, 2022; originally announced November 2022.

Comments: Submitted to Elsevier

arXiv:2203.16288 [pdf, ps, other]

Region of Interest focused MRI to Synthetic CT Translation using Regression and Classification Multi-task Network

Authors: Sandeep Kaushik, Mikael Bylund, Cristina Cozzini, Dattesh Shanbhag, Steven F Petit, Jonathan J Wyatt, Marion I Menzel, Carolin Pirkl, Bhairav Mehta, Vikas Chauhan, Kesavadas Chandrasekharan, Joakim Jonsson, Tufve Nyholm, Florian Wiesinger, Bjoern Menze

Abstract: In this work, we present a method for synthetic CT (sCT) generation from zero-echo-time (ZTE) MRI aimed at structural and quantitative accuracies of the image, with a particular focus on the accurate bone density value prediction. We propose a loss function that favors a spatially sparse region in the image. We harness the ability of a multi-task network to produce correlated outputs as a framewor… ▽ More In this work, we present a method for synthetic CT (sCT) generation from zero-echo-time (ZTE) MRI aimed at structural and quantitative accuracies of the image, with a particular focus on the accurate bone density value prediction. We propose a loss function that favors a spatially sparse region in the image. We harness the ability of a multi-task network to produce correlated outputs as a framework to enable localisation of region of interest (RoI) via classification, emphasize regression of values within RoI and still retain the overall accuracy via global regression. The network is optimized by a composite loss function that combines a dedicated loss from each task. We demonstrate how the multi-task network with RoI focused loss offers an advantage over other configurations of the network to achieve higher accuracy of performance. This is relevant to sCT where failure to accurately estimate high Hounsfield Unit values of bone could lead to impaired accuracy in clinical applications. We compare the dose calculation maps from the proposed sCT and the real CT in a radiation therapy treatment planning setup. △ Less

Submitted 25 October, 2022; v1 submitted 30 March, 2022; originally announced March 2022.

Comments: Submitted to Physics in Medicine & Biology

arXiv:2109.01878 [pdf, other]

Robust Mitosis Detection Using a Cascade Mask-RCNN Approach With Domain-Specific Residual Cycle-GAN Data Augmentation

Authors: Gauthier Roy, Jules Dedieu, Capucine Bertrand, Alireza Moshayedi, Ali Mammadov, Stéphanie Petit, Saima Ben Hadj, Rutger H. J. Fick

Abstract: For the MIDOG mitosis detection challenge, we created a cascade algorithm consisting of a Mask-RCNN detector, followed by a classification ensemble consisting of ResNet50 and DenseNet201 to refine detected mitotic candidates. The MIDOG training data consists of 200 frames originating from four scanners, three of which are annotated for mitotic instances with centroid annotations. Our main algorith… ▽ More For the MIDOG mitosis detection challenge, we created a cascade algorithm consisting of a Mask-RCNN detector, followed by a classification ensemble consisting of ResNet50 and DenseNet201 to refine detected mitotic candidates. The MIDOG training data consists of 200 frames originating from four scanners, three of which are annotated for mitotic instances with centroid annotations. Our main algorithmic choices are as follows: first, to enhance the generalizability of our detector and classification networks, we use a state-of-the-art residual Cycle-GAN to transform each scanner domain to every other scanner domain. During training, we then randomly load, for each image, one of the four domains. In this way, our networks can learn from the fourth non-annotated scanner domain even if we don't have annotations for it. Second, for training the detector network, rather than using centroid-based fixed-size bounding boxes, we create mitosis-specific bounding boxes. We do this by manually annotating a small selection of mitoses, training a Mask-RCNN on this small dataset, and applying it to the rest of the data to obtain full annotations. We trained the follow-up classification ensemble using only the challenge-provided positive and hard-negative examples. On the preliminary test set, the algorithm scores an F1 score of 0.7578, putting us as the second-place team on the leaderboard. △ Less

Submitted 28 September, 2021; v1 submitted 4 September, 2021; originally announced September 2021.

Comments: Gauthier Roy and Jules Dedieu contributed equally to this work

arXiv:2101.09747 [pdf, ps, other]

doi 10.1007/978-3-030-95470-3_9

Numerical issues in maximum likelihood parameter estimation for Gaussian process interpolation

Authors: Subhasish Basak, Sébastien Petit, Julien Bect, Emmanuel Vazquez

Abstract: This article investigates the origin of numerical issues in maximum likelihood parameter estimation for Gaussian process (GP) interpolation and investigates simple but effective strategies for improving commonly used open-source software implementations. This work targets a basic problem but a host of studies, particularly in the literature of Bayesian optimization, rely on off-the-shelf GP implem… ▽ More This article investigates the origin of numerical issues in maximum likelihood parameter estimation for Gaussian process (GP) interpolation and investigates simple but effective strategies for improving commonly used open-source software implementations. This work targets a basic problem but a host of studies, particularly in the literature of Bayesian optimization, rely on off-the-shelf GP implementations. For the conclusions of these studies to be reliable and reproducible, robust GP implementations are critical. △ Less

Submitted 27 July, 2021; v1 submitted 24 January, 2021; originally announced January 2021.

arXiv:2010.05031 [pdf, other]

Understanding Cloud Workloads Performance in a Production like Environment

Authors: Lucia Pons, Josué Feliu, José Puche, Chaoyi Huang, Salvador Petit, Julio Pons, María E. Gómez, Julio Sahuquillo

Abstract: Understanding inter-VM interference is of paramount importance to provide a sound knowledge and understand where performance degradation comes from in the current public cloud. With this aim, this paper devises a workload taxonomy that classifies applications according to how the major system resources affect their performance (e.g., tail latency) as a function of the level of load (e.g., QPS). Af… ▽ More Understanding inter-VM interference is of paramount importance to provide a sound knowledge and understand where performance degradation comes from in the current public cloud. With this aim, this paper devises a workload taxonomy that classifies applications according to how the major system resources affect their performance (e.g., tail latency) as a function of the level of load (e.g., QPS). After that, we present three main studies addressing three major concerns to improve the cloud performance: impact of the level of load on performance, impact of hyper-threading on performance, and impact of limiting the major system resources (e.g., last level cache) on performance. In all these studies we identified important findings that we hope help cloud providers improve their system utilization. △ Less

Submitted 10 October, 2020; originally announced October 2020.

Comments: 16 pages, 17 figures. Submitted to Journal of Parallel and Distributed Computing

Showing 1–7 of 7 results for author: Petit, S