-
Recommending Podcasts for Cold-Start Users Based on Music Listening and Taste
Authors:
Zahra Nazari,
Christophe Charbuillet,
Johan Pages,
Martin Laurent,
Denis Charrier,
Briana Vecchione,
Ben Carterette
Abstract:
Recommender systems are increasingly used to predict and serve content that aligns with user taste, yet the task of matching new users with relevant content remains a challenge. We consider podcasting to be an emerging medium with rapid growth in adoption, and discuss challenges that arise when applying traditional recommendation approaches to address the cold-start problem. Using music consumption behavior, we examine two main techniques for inferring Spotify users' preferences over more than 200k podcasts. Our results show significant improvements in consumption of up to 50% for both offline and online experiments. We provide extensive analysis on model performance and examine the degree to which music data as an input source introduces bias in recommendations.
Submitted 26 July, 2020;
originally announced July 2020.
-
Lightweight Task Offloading Exploiting MPI Wait Times for Parallel Adaptive Mesh Refinement
Authors:
Philipp Samfass,
Tobias Weinzierl,
Dominic E. Charrier,
Michael Bader
Abstract:
Balancing the workload of sophisticated simulations is inherently difficult, since we have to balance both computational workload and memory footprint over meshes that can change at any time or yield unpredictable cost per mesh entity, while modern supercomputers and their interconnects start to exhibit fluctuating performance. We propose a novel lightweight balancing technique for MPI+X to accompany traditional, prediction-based load balancing. It is a reactive diffusion approach that uses online measurements of MPI idle time to migrate tasks temporarily from overloaded to underemployed ranks. Tasks are deployed to ranks which otherwise would wait, processed with high priority, and made available to the overloaded ranks again. This migration is non-persistent. Our approach hijacks idle time to do meaningful work and is totally non-blocking, asynchronous and distributed without a global data view. Tests with a seismic simulation code developed in the ExaHyPE engine uncover the method's potential. We found speed-ups of up to 2-3 for ill-balanced scenarios without logical modifications of the code base and show that the strategy is capable of reacting quickly to temporarily changing workload or node performance.
Submitted 14 April, 2020; v1 submitted 13 September, 2019;
originally announced September 2019.
-
ExaHyPE: An Engine for Parallel Dynamically Adaptive Simulations of Wave Problems
Authors:
Anne Reinarz,
Dominic E. Charrier,
Michael Bader,
Luke Bovard,
Michael Dumbser,
Kenneth Duru,
Francesco Fambri,
Alice-Agnes Gabriel,
Jean-Matthieu Gallard,
Sven Köppel,
Lukas Krenz,
Leonhard Rannabauer,
Luciano Rezzolla,
Philipp Samfass,
Maurizio Tavelli,
Tobias Weinzierl
Abstract:
ExaHyPE ("An Exascale Hyperbolic PDE Engine") is a software engine for solving systems of first-order hyperbolic partial differential equations (PDEs). Hyperbolic PDEs are typically derived from the conservation laws of physics and are useful in a wide range of application areas. Applications powered by ExaHyPE can be run on a student's laptop, but are also able to exploit thousands of processor cores on state-of-the-art supercomputers. The engine is able to dynamically increase the accuracy of the simulation using adaptive mesh refinement where required. Due to the robustness and shock capturing abilities of ExaHyPE's numerical methods, users of the engine can simulate linear and non-linear hyperbolic PDEs with very high accuracy. Users can tailor the engine to their particular PDE by specifying evolved quantities, fluxes, and source terms. A complete simulation code for a new hyperbolic PDE can often be realised within a few hours - a task that, traditionally, can take weeks, months, often years for researchers starting from scratch. In this paper, we showcase ExaHyPE's workflow and capabilities through real-world scenarios from our two main application areas: seismology and astrophysics.
Submitted 18 May, 2020; v1 submitted 20 May, 2019;
originally announced May 2019.
-
Studies on the energy and deep memory behaviour of a cache-oblivious, task-based hyperbolic PDE solver
Authors:
Dominic E. Charrier,
Benjamin Hazelwood,
Ekaterina Tutlyaeva,
Michael Bader,
Michael Dumbser,
Andrey Kudryavtsev,
Alexander Moskovsky,
Tobias Weinzierl
Abstract:
We study the performance behaviour of a seismic simulation using the ExaHyPE engine with a specific focus on memory characteristics and energy needs. ExaHyPE combines dynamically adaptive mesh refinement (AMR) with ADER-DG. It is parallelized using tasks, and it is cache efficient. AMR plus ADER-DG yields a task graph which is highly dynamic in nature and comprises both arithmetically expensive tasks and tasks which challenge the memory's latency. The expensive tasks and thus the whole code benefit from AVX vectorization, though we suffer from memory access bursts. A frequency reduction of the chip improves the code's energy-to-solution. Yet, it does not mitigate burst effects. The bursts' latency penalty becomes worse once we add Intel Optane technology, increase the core count significantly, or make individual, computationally heavy tasks fall out of close caches. Thread overbooking to hide away these latency penalties is counterproductive with non-inclusive caches as it destroys the cache and vectorization character. In cases where memory-intense and computationally expensive tasks overlap, ExaHyPE's cache-oblivious implementation can exploit deep, non-inclusive, heterogeneous memory effectively, as main memory misses arise infrequently and slow down only a few cores. We thus propose that upcoming supercomputing simulation codes with dynamic, inhomogeneous task graphs are actively supported by thread runtimes in intermixing tasks of different compute character, and we propose that future hardware actively allows codes to downclock the cores running particular task types.
Submitted 25 March, 2019; v1 submitted 9 October, 2018;
originally announced October 2018.
-
Enclave Tasking for Discontinuous Galerkin Methods on Dynamically Adaptive Meshes
Authors:
Dominic E. Charrier,
Benjamin Hazelwood,
Tobias Weinzierl
Abstract:
High-order Discontinuous Galerkin (DG) methods promise to be an excellent discretisation paradigm for partial differential equation solvers by combining high arithmetic intensity with localised data access. They also facilitate dynamic adaptivity without the need for conformal meshes. A parallel evaluation of DG's weak formulation within a mesh traversal is non-trivial, as dependency graphs over dynamically adaptive meshes change, as causal constraints along resolution transitions have to be preserved, and as data sends along MPI domain boundaries have to be triggered in the correct order. We propose to process mesh elements subject to constraints with high priority or, where needed, serially throughout a traversal. The remaining cells form enclaves and are spawned into a task system. This introduces concurrency, mixes memory-intensive DG integrations with compute-bound Riemann solves, and overlaps computation and communication. We discuss implications on MPI and show that MPI parallelisation improves by a factor of three through enclave tasking, while we obtain an additional factor of two from shared memory if grids are dynamically adaptive.
Submitted 24 February, 2020; v1 submitted 19 June, 2018;
originally announced June 2018.
-
Stop talking to me -- a communication-avoiding ADER-DG realisation
Authors:
Dominic E. Charrier,
Tobias Weinzierl
Abstract:
We present a communication- and data-sensitive formulation of ADER-DG for hyperbolic differential equation systems. Sensitive here has multiple flavours: First, the formulation reduces the persistent memory footprint. This reduces pressure on the memory subsystem. Second, the formulation realises the underlying predictor-corrector scheme with single-touch semantics, i.e., each degree of freedom is read on average only once per time step from the main memory. This reduces communication through the memory controllers. Third, the formulation breaks up the tight coupling of the explicit time stepping's algorithmic steps to mesh traversals. This averages out data access peaks. Different operations and algorithmic steps are run on different grid entities. Finally, the formulation hides distributed memory data transfer behind the computation aligned with the mesh traversal. This reduces pressure on the machine interconnects. All techniques applied by our formulation are elaborated by means of a rigorous task formalism. They break up ADER-DG's tight causal coupling of compute steps and can be generalised to other predictor-corrector schemes.
Submitted 26 January, 2018;
originally announced January 2018.