SYNPA: SMT Performance Analysis and Allocation of Threads to Cores in ARM Processors
Authors:
Marta Navarro,
Josué Feliu,
Salvador Petit,
María E. Gómez,
Julio Sahuquillo
Abstract:
Simultaneous multithreading processors improve throughput over single-threaded processors thanks to sharing internal core resources among instructions from distinct threads. However, resource sharing introduces inter-thread interference within the core, which has a negative impact on individual application performance and can significantly increase the turnaround time of multi-program workloads. T…
▽ More
Simultaneous multithreading processors improve throughput over single-threaded processors thanks to sharing internal core resources among instructions from distinct threads. However, resource sharing introduces inter-thread interference within the core, which has a negative impact on individual application performance and can significantly increase the turnaround time of multi-program workloads. The severity of the interference effects depends on the competing co-runners sharing the core. Thus, it can be mitigated by applying a thread-to-core allocation policy that smartly selects applications to be run in the same core to minimize their interference.
This paper presents SYNPA, a simple approach that dynamically allocates threads to cores in an SMT processor based on their run-time dynamic behavior. The approach uses a regression model to select synergistic pairs to mitigate intra-core interference. The main novelty of SYNPA is that it uses just three variables collected from the performance counters available in current ARM processors at the dispatch stage. Experimental results show that SYNPA outperforms the default Linux scheduler by around 36%, on average, in terms of turnaround time in 8-application workloads combining frontend bound and backend bound benchmarks.
△ Less
Submitted 19 October, 2023;
originally announced October 2023.
Understanding Cloud Workloads Performance in a Production like Environment
Authors:
Lucia Pons,
Josué Feliu,
José Puche,
Chaoyi Huang,
Salvador Petit,
Julio Pons,
María E. Gómez,
Julio Sahuquillo
Abstract:
Understanding inter-VM interference is of paramount importance to provide a sound knowledge and understand where performance degradation comes from in the current public cloud. With this aim, this paper devises a workload taxonomy that classifies applications according to how the major system resources affect their performance (e.g., tail latency) as a function of the level of load (e.g., QPS). Af…
▽ More
Understanding inter-VM interference is of paramount importance to provide a sound knowledge and understand where performance degradation comes from in the current public cloud. With this aim, this paper devises a workload taxonomy that classifies applications according to how the major system resources affect their performance (e.g., tail latency) as a function of the level of load (e.g., QPS). After that, we present three main studies addressing three major concerns to improve the cloud performance: impact of the level of load on performance, impact of hyper-threading on performance, and impact of limiting the major system resources (e.g., last level cache) on performance. In all these studies we identified important findings that we hope help cloud providers improve their system utilization.
△ Less
Submitted 10 October, 2020;
originally announced October 2020.