-
Satellite observations reveal shorter periodic inner core oscillation
Authors:
Yachong An,
Hao Ding,
Fred D. Richards,
Wei Jiang,
Jiancheng Li,
Wenbin Shen
Abstract:
Detecting the Earth's inner core motions relative to the mantle presents a considerable challenge due to their indirect accessibility. Seismological observations initially provided evidence for differential/super-rotation of the inner core, but have more recently suggested a periodic oscillation of possibly about 70 years. The contrasting results underscore the ongoing enigma surrounding inner core motion, leaving debates unresolved, including the precise oscillation period. In parallel to seismic observations, satellite geodesy has accumulated decades of global high-precision records, providing a novel avenue to probe inner core motions. Here, we detect an about 6-year oscillation from the gravitational field degree-2 order-2 Stokes coefficients derived from satellite observations, and find it has a unique phase correlation with the about 6-year signal in the Earth's length-of-day variations. This correlation is attributed to an inner core oscillation which is controlled by the gravitational coupling between the inner core and lower mantle (mainly due to the density heterogeneity of the two large low-velocity provinces; LLVPs). That is, we independently corroborate the inner core periodic oscillation, albeit with a significantly shorter period than previously suggested. Our findings demonstrate the dense layer of the LLVPs (mean density anomalies of about +0.9 percent at the bottom), consistent with inversions from tidal tomography and Stoneley modes. Furthermore, our research reveals equatorial topographic undulations of about 187 m at the inner core boundary.
Submitted 14 April, 2024;
originally announced April 2024.
-
Characterization of Low Surface Brightness structures in annotated deep images
Authors:
Elisabeth Sola,
Pierre-Alain Duc,
Felix Richards,
Adeline Paiement,
Mathias Urbano,
Julie Klehammer,
Michal Bílek,
Jean-Charles Cuillandre,
Stephen Gwyn,
Alan McConnachie
Abstract:
The characterization of Low Surface Brightness (LSB) stellar structures around galaxies such as tidal debris of on-going or past collisions is essential to constrain models of galactic evolution. Our goal is to obtain quantitative measurements of LSB structures identified in deep images of samples consisting of hundreds of galaxies. We developed an online annotation tool that enables contributors to delineate the shapes of diffuse extended stellar structures, as well as artefacts or foreground structures. All parameters are automatically stored in a database which may be queried to retrieve quantitative measurements. We annotated LSB structures around 352 nearby massive galaxies with deep images obtained with the CFHT as part of two large programs: MATLAS and UNIONS/CFIS. Each LSB structure was delineated and labeled according to its likely nature: stellar shells, streams associated to a disrupted satellite, tails formed in major mergers, ghost reflections or cirrus. From our database containing 8441 annotations, the area, size, median surface brightness and distance to the host of 228 structures were computed. The results confirm the fact that tidal structures defined as streams are thinner than tails, as expected by numerical simulations. In addition, tidal tails appear to exhibit a higher surface brightness than streams (by about 1 mag), which may be related to different survival times for the two types of collisional debris. We did not detect any tidal feature fainter than 27.5 mag.arcsec$^{-2}$, while the nominal surface brightness limits of our surveys range between 28.3 and 29 mag.arcsec$^{-2}$, a difference that needs to be taken into account when estimating the sensitivity of future surveys to identify LSB structures. Our annotation database of observed LSB structures may be used for quantitative analysis and as a training set for machine learning algorithms (abbreviated).
Submitted 8 March, 2022;
originally announced March 2022.
-
Improving Scalability with GPU-Aware Asynchronous Tasks
Authors:
Jaemin Choi,
David F. Richards,
Laxmikant V. Kale
Abstract:
Asynchronous tasks, when created with over-decomposition, enable automatic computation-communication overlap which can substantially improve performance and scalability. This is not only applicable to traditional CPU-based systems, but also to modern GPU-accelerated platforms. While the ability to hide communication behind computation can be highly effective in weak scaling scenarios, performance begins to suffer with smaller problem sizes or in strong scaling due to fine-grained overheads and reduced room for overlap. In this work, we integrate GPU-aware communication into asynchronous tasks in addition to computation-communication overlap, with the goal of reducing time spent in communication and further increasing GPU utilization. We demonstrate the performance impact of our approach using a proxy application that performs the Jacobi iterative method, Jacobi3D. In addition to optimizations to minimize synchronizations between the host and GPU devices and increase the concurrency of GPU operations, we explore techniques such as kernel fusion and CUDA Graphs to mitigate fine-grained overheads at scale.
Submitted 21 March, 2022; v1 submitted 23 February, 2022;
originally announced February 2022.
-
Accelerating Communication for Parallel Programming Models on GPU Systems
Authors:
Jaemin Choi,
Zane Fink,
Sam White,
Nitin Bhat,
David F. Richards,
Laxmikant V. Kale
Abstract:
As an increasing number of leadership-class systems embrace GPU accelerators in the race towards exascale, efficient communication of GPU data is becoming one of the most critical components of high-performance computing. For developers of parallel programming models, implementing support for GPU-aware communication using native APIs for GPUs such as CUDA can be a daunting task as it requires considerable effort with little guarantee of performance. In this work, we demonstrate the capability of the Unified Communication X (UCX) framework to compose a GPU-aware communication layer that serves multiple parallel programming models of the Charm++ ecosystem: Charm++, Adaptive MPI (AMPI), and Charm4py. We demonstrate the performance impact of our designs with microbenchmarks adapted from the OSU benchmark suite, obtaining improvements in latency of up to 10.1x in Charm++, 11.7x in AMPI, and 17.4x in Charm4py. We also observe increases in bandwidth of up to 10.1x in Charm++, 10x in AMPI, and 10.5x in Charm4py. We show the potential impact of our designs on real-world applications by evaluating a proxy application for the Jacobi iterative method, improving the communication performance by up to 12.4x in Charm++, 12.8x in AMPI, and 19.7x in Charm4py.
Submitted 21 March, 2022; v1 submitted 24 February, 2021;
originally announced February 2021.
-
Learnable Gabor modulated complex-valued networks for orientation robustness
Authors:
Felix Richards,
Adeline Paiement,
Xianghua Xie,
Elisabeth Sola,
Pierre-Alain Duc
Abstract:
Robustness to transformation is desirable in many computer vision tasks, given that input data often exhibits pose variance. While translation invariance and equivariance are documented properties of CNNs, sensitivity to other transformations is typically encouraged through data augmentation. We investigate the modulation of complex-valued convolutional weights with learned Gabor filters to enable orientation robustness. The resulting network can generate orientation-dependent features free of interpolation with a single set of learnable rotation-governing parameters. By choosing to either retain or pool orientation channels, the choice of equivariance versus invariance can be directly controlled. Moreover, we introduce rotational weight-tying through a proposed cyclic Gabor convolution, further enabling generalisation over rotations. We combine these innovations into Learnable Gabor Convolutional Networks (LGCNs), which are parameter-efficient and offer increased model complexity. We demonstrate their rotation invariance and equivariance on MNIST, BSD and a dataset of simulated and real astronomical images of Galactic cirri.
Submitted 5 October, 2021; v1 submitted 23 November, 2020;
originally announced November 2020.
-
ClangJIT: Enhancing C++ with Just-in-Time Compilation
Authors:
Hal Finkel,
David Poliakoff,
David F. Richards
Abstract:
The C++ programming language is not only a keystone of the high-performance-computing ecosystem but has proven to be a successful base for portable parallel-programming frameworks. As is well known, C++ programmers use templates to specialize algorithms, thus allowing the compiler to generate highly-efficient code for specific parameters, data structures, and so on. This capability has been limited to those specializations that can be identified when the application is compiled, and in many critical cases, compiling all potentially-relevant specializations is not practical. ClangJIT provides a well-integrated C++ language extension allowing template-based specialization to occur during program execution. This capability has been implemented for use in large-scale applications, and we demonstrate that just-in-time-compilation-based dynamic specialization can be integrated into applications, often requiring minimal changes (or no changes) to the applications themselves, providing significant performance improvements, programmer-productivity improvements, and decreased compilation time.
Submitted 27 April, 2019; v1 submitted 17 April, 2019;
originally announced April 2019.
-
Atoms in the Surf: Molecular Dynamics Simulation of the Kelvin-Helmholtz Instability using 9 Billion Atoms
Authors:
D. F. Richards,
L. D. Krauss,
W. H. Cabot,
K. J. Caspersen,
A. W. Cook,
J. N. Glosli,
R. E. Rudd,
F. H. Streitz
Abstract:
We present a fluid dynamics video showing the results of a 9-billion-atom molecular dynamics simulation of complex fluid flow in molten copper and aluminum. Starting with an atomically flat interface, a shear is imposed along the copper-aluminum interface and random atomic fluctuations seed the formation of vortices. These vortices grow due to the Kelvin-Helmholtz instability. The resulting vortical structures are beautifully intricate, decorated with secondary instabilities and complex mixing phenomena. This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344.
Submitted 16 October, 2008;
originally announced October 2008.