Towards latent space evolution of spatiotemporal dynamics of six-dimensional phase space of charged particle beams

Work supported by the LANL LDRD Program Directed Research (DR) project 20220074DR

M. Rautela 1 [email protected]    A. Williams1,2    A. Scheinker1
1Los Alamos National Laboratory, NM, US
2University of California, San Diego, CA, US
Abstract

Charged particle beam diagnostics in accelerators pose a formidable challenge, demanding high-fidelity simulations in limited computational time. Machine learning (ML) based surrogate models have emerged as a promising tool for non-invasive charged particle beam diagnostics. Trained ML models can make predictions much faster than computationally expensive physics simulations. In this work, we propose a temporally structured variational autoencoder (VAE) model to autoregressively forecast the spatiotemporal dynamics of the 15 unique 2D projections of the 6D phase space of a charged particle beam as it travels through the LANSCE linear accelerator. In the model, a VAE embeds the phase space projections into a lower-dimensional latent space. A long short-term memory (LSTM) network then learns the temporal correlations in the latent space. The trained network can evolve the phase space projections across subsequent modules given the first few modules as inputs. The model predicts all the projections across different modules with low mean squared error (MSE) and high structural similarity index (SSIM).

1 INTRODUCTION

With the advancement in parallel processing, machine learning (ML) and deep learning (DL) have shown promising capabilities in solving problems in physics. Most of the problems in physics are governed by spatiotemporal dynamics, where complex spatial behavior evolves with time [rautela2023bayesian]. In a particle accelerator, the dynamics of charged particles evolve temporally in a six-dimensional phase space made up of three position and three momentum components for each particle, i.e., $(x, y, z, p_x, p_y, p_z)$, with $z$ typically chosen as the direction along the accelerator axis [scheinker2020adaptive_JAP].

The majority of ML research focuses on learning either spatial or temporal dynamics, with limited emphasis on spatiotemporal dynamics. DL techniques for solving spatiotemporal dynamical problems include three-dimensional convolutional neural networks (3DCNN) [wandel2021teaching], convolutional long short-term memory (ConvLSTM) [shi2015convolutional], deep convolutional generative adversarial networks (DCGAN) [cheng2020data], and graph neural networks (GNN) [kipf2016semi].

Recently, latent evolution models have gained traction for solving spatiotemporal dynamics problems. In these computationally efficient models, a dimensionality reduction framework learns spatial correlations by mapping higher-dimensional images to a lower-dimensional latent space. For example, in Ref. [scheinker2021adaptive_JOI], an adaptive virtual 6D phase space diagnostic was developed for particle accelerator beams, where an autoencoder maps high-dimensional images representing the states of complex time-varying particle accelerator beams to a low-dimensional latent space from which it then generates all 15 unique 2D projections of the beam’s 6D phase space. Adaptive feedback is used within the low-dimensional latent representation to track an unknown time-varying beam’s properties with time as the accelerator parameters change. A robustness study of this adaptive latent space tuning method has demonstrated an ability to extrapolate beyond the span of the training data with a more physically consistent generated 6D phase space [scheinker2023adaptive, scheinker2021adaptive_SciRep]. A combination of autoencoders for learning spatial dynamics and LSTMs for temporal dynamics has also been employed for addressing fluid flow problems [wiewel2019latent, nakamura2021convolutional, maulik2021reduced, vlachas2022multiscale].

In this paper, we introduce a two-step deep learning modeling framework, wherein we study the full spatiotemporal dynamical nature of the evolution of a charged particle beam through various sections of a linear accelerator. In this model, a conditional variational autoencoder (CVAE) is used to learn a low-dimensional latent space distribution of the high-dimensional phase space of charged particles. An LSTM-based recurrent neural network is employed to learn the temporal dynamics within the latent space. It gives the model two promising abilities, allowing both the generation of realistic projections across different modules and the forecasting of phase space in further modules [rautela2024conditional].

2 METHODS

2.1 Multi-Particle Tracking Simulations

The behavior of the charged particles in accelerators is governed by numerous accelerator parameters, such as radio frequency (RF) cavity fields and magnetic field strengths. During accelerator operation, these parameters are manually adjusted to minimize beam loss. However, this manual adjustment process is time-consuming and often results in suboptimal performance. Moreover, the time-varying nature of these parameters introduces further uncertainties. Non-destructive beam measurements are scarce in most accelerators, primarily due to resolution limits with short-duration beam pulses and short run times, posing challenges to meaningful data collection. To achieve optimal functionality, understanding beam dynamics is crucial.

Various simulation tools have been devised for this purpose [tenenbaum2005lucretia, young2003particle, pang2014gpu]. High-Performance Simulator (HPSim) is an advanced multiple-particle beam dynamics simulator taking into account the effects of external accelerating and focusing forces as well as space charge forces [pang2014gpu]. In this work, we generate synthetic data by randomly sampling RF set points (amplitude and phase) of the first four modules from a uniform distribution, keeping other beam and accelerator parameters fixed. This investigation centers on the LANSCE linear accelerator at Los Alamos National Laboratory [wangler2008rf]. More details about the optimization and tuning challenges of the LANSCE accelerator are given in Ref. [scheinker2021extremum_LANSCE].

From the HPSim output, we generate 2D histograms which are the 15 unique projections of the beam’s 6D $(x, y, z, p_x, p_y, p_z)$ phase space at each of the 48 modules. For our application, $(z, p_z)$ is converted to $(\phi, E)$, where $\phi$ is the phase of a particle in a bunch relative to the design phase. In Fig. 1, the $E$-$\phi$ projection is plotted at various accelerating modules. The plots are shown on a logarithmic scale for better visualization.

Figure 1: $E$-$\phi$ projection at various accelerating modules. The images are unnormalized and shown on a logarithmic scale.
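
The construction of the 15 projections can be sketched in a few lines. This is a minimal illustration rather than the HPSim post-processing used in this work: the coordinate labels, the common plotting window, the bin count, and the Gaussian toy beam are all assumptions made for the example. It shows that choosing 2 of the 6 phase space coordinates gives 15 unordered pairs, and that each pair yields one 2D histogram.

```python
from itertools import combinations
import random

# Hypothetical coordinate labels after converting (z, p_z) to (phi, E).
COORDS = ("x", "y", "phi", "px", "py", "E")

def projection_pairs(coords=COORDS):
    """Return all unordered coordinate pairs: C(6, 2) = 15 for six coordinates."""
    return list(combinations(coords, 2))

def histogram2d(a, b, bins, a_range, b_range):
    """Bin paired samples (a[i], b[i]) into a bins x bins grid of counts."""
    (a_lo, a_hi), (b_lo, b_hi) = a_range, b_range
    grid = [[0] * bins for _ in range(bins)]
    for u, v in zip(a, b):
        if not (a_lo <= u <= a_hi and b_lo <= v <= b_hi):
            continue  # drop particles outside the plotting window
        i = min(int((u - a_lo) / (a_hi - a_lo) * bins), bins - 1)
        j = min(int((v - b_lo) / (b_hi - b_lo) * bins), bins - 1)
        grid[i][j] += 1
    return grid

def all_projections(particles, bins=256, lo=-4.0, hi=4.0):
    """particles: dict mapping coordinate name -> list of per-particle values.

    A single window [lo, hi] for every coordinate is an assumption of this sketch.
    """
    return {
        (p, q): histogram2d(particles[p], particles[q], bins, (lo, hi), (lo, hi))
        for p, q in projection_pairs()
    }

# Toy beam: independent unit Gaussians in every coordinate (not real beam data).
rng = random.Random(0)
beam = {name: [rng.gauss(0.0, 1.0) for _ in range(2000)] for name in COORDS}
images = all_projections(beam, bins=32)
```

Stacking the 15 images along a channel axis would give one multi-channel input per module; the example uses 32 bins only to keep the pure-Python loop fast.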

2.2 Latent Space Evolution via CVAE-LSTM

Autoencoders (AE) are able to learn low-dimensional latent representations of complex data and then generate new high-dimensional data from the latent embedding [rautela2022delamination]. Variational autoencoders (VAE) are AEs that map to a probabilistic latent space. Due to this, VAEs enable the generation of new realistic samples by traversing the latent space [rautela2022towards, rautela2023deep]. In this work, we propose a CVAE-LSTM-based latent evolution model that utilizes a conditional VAE to transform 6D phase space into a lower dimensional latent space. An LSTM-based recurrent neural network learns the temporal dynamics within the latent space. The uniqueness of the model lies in its ability to independently learn spatial and temporal dynamics through a two-step process.

The architecture of the CVAE-LSTM is represented in Fig. 2. The initial 15 unique phase space projections are depicted as 15 channels, each represented by a $256 \times 256$-pixel image, resulting in a $\sim 10^6$-dimensional input that is encoded into an 8-dimensional latent space. The projections along with the module number (1 to 48) are input to the CVAE. The encoder performs convolution operations, and the extracted features are concatenated with the module number and then input to the CVAE’s latent space. The learned latent space is then processed by an LSTM network, which is designed to forecast the next latent space point (corresponding to projections in the next module) based on previous points (corresponding to projections in the previous modules). The continuous latent space of the CVAE allows for conditional sampling, followed by a decoder which generates realistic projections across different modules of the accelerator.

Figure 2: CVAE-LSTM as a latent evolution model. The CVAE maps phase space projections to a latent space, and an LSTM learns to forecast future downstream states autoregressively given previous upstream states.
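
The dimension bookkeeping and the conditional sampling step described above can be illustrated with a short sketch. It assumes the standard VAE reparameterization, $z = \mu + \sigma \epsilon$ with $\epsilon \sim \mathcal{N}(0, 1)$, which the text does not spell out; the feature vector, the values of `mu` and `log_var`, and the helper names are hypothetical placeholders standing in for the outputs of trained layers.

```python
import math
import random

N_CHANNELS, IMG_SIZE, LATENT_DIM = 15, 256, 8
INPUT_DIM = N_CHANNELS * IMG_SIZE * IMG_SIZE  # 983040, i.e. roughly 10^6

def with_condition(features, c):
    """Append the scalar module condition c to a feature vector."""
    return list(features) + [c]

def reparameterize(mu, log_var, rng):
    """Draw z = mu + sigma * eps elementwise, with sigma = exp(log_var / 2)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]

# Placeholder for flattened convolutional features (hypothetical size).
features = [0.0] * 16
encoder_vector = with_condition(features, c=0.5)

# Placeholders for the mean and log-variance a trained encoder would output.
rng = random.Random(0)
mu = [0.1 * k for k in range(LATENT_DIM)]
log_var = [-2.0] * LATENT_DIM
z = reparameterize(mu, log_var, rng)
```

With these numbers the encoder compresses each input by a factor of 983040 / 8 = 122880, which is why the LSTM can operate cheaply on the latent points.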

3 RESULTS

To generate data, the RF field set points (amplitude and phase) of the drift tube linac (DTL) sections (first 4 modules) were randomly sampled (1400 simulations) while keeping the other 88 RF parameters fixed. HPSim then simulated the dynamics of a beam through the entire accelerator, from which the 15 unique phase space projections (each a $256 \times 256$ image) at each of the 48 RF modules were generated. 1400 simulation data sets were used for training and 100 for testing. A single input to the VAE is a set of 15 $256 \times 256$-pixel images. The conditional input $c$ to the encoder is the module number, a scalar between 1 and 48, normalized to the range $[0, 1]$.
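
One way to build the conditional input is min-max scaling of the module number. The text only states that the module number is normalized to the unit interval, so the exact mapping below (module 1 to 0 and module 48 to 1) and the function names are assumptions of this sketch.

```python
N_MODULES = 48

def normalize_module(m, n_modules=N_MODULES):
    """Map module number m in {1, ..., n_modules} linearly onto [0, 1]."""
    if not 1 <= m <= n_modules:
        raise ValueError(f"module number must be in 1..{n_modules}, got {m}")
    return (m - 1) / (n_modules - 1)

def denormalize_module(c, n_modules=N_MODULES):
    """Inverse map: recover the integer module number from c in [0, 1]."""
    return round(c * (n_modules - 1)) + 1

conditions = [normalize_module(m) for m in range(1, N_MODULES + 1)]
```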

3.1 Latent Space Visualization

Visualization of the latent space is important for the interpretability of the network. However, the visualization is restricted by the high dimensionality (8D) of the latent space. We have therefore transformed the 8D latent space into several 2D spaces, as shown in Fig. 3. The first method is a linear dimensionality reduction technique called principal component analysis (PCA). The other two methods are manifold learning techniques called t-distributed Stochastic Neighbor Embedding (t-SNE) [van2008visualizing] and Uniform Manifold Approximation and Projection (UMAP) [mcinnes2018umap]. Both of them are non-linear dimensionality reduction approaches, in contrast to PCA. While t-SNE is a more popular method for various problems, UMAP performs better in terms of preserving both local and global structure, computational efficiency, and parameter robustness [mcinnes2018umap].

Figure 3: 2D PCA, t-SNE, and UMAP embeddings of the 8D latent space. The colors correspond to different module numbers.
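
The linear (PCA) part of this visualization can be sketched without external libraries. The version below is an illustration only: it finds the two leading principal directions by power iteration with deflation, which is one of several ways to compute PCA and not necessarily the one used for Fig. 3, and it runs on synthetic 8D points with assumed per-axis spreads rather than on the trained latent space.

```python
import math
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def center(points):
    """Subtract the per-dimension mean from every point."""
    dim = len(points[0])
    mean = [sum(p[k] for p in points) / len(points) for k in range(dim)]
    return [[p[k] - mean[k] for k in range(dim)] for p in points]

def covariance(centered):
    """Sample covariance matrix of already-centered points."""
    n, dim = len(centered), len(centered[0])
    return [[sum(p[i] * p[j] for p in centered) / (n - 1) for j in range(dim)]
            for i in range(dim)]

def top_eigenvector(mat, iters=500):
    """Power iteration: repeatedly apply mat to a vector and renormalize."""
    dim = len(mat)
    v = [1.0 / math.sqrt(dim)] * dim
    for _ in range(iters):
        w = [dot(row, v) for row in mat]
        norm = math.sqrt(dot(w, w))
        v = [x / norm for x in w]
    return v

def pca_2d(points):
    """Project points onto their first two principal components."""
    centered = center(points)
    cov = covariance(centered)
    dim = len(cov)
    pc1 = top_eigenvector(cov)
    lam1 = dot(pc1, [dot(row, pc1) for row in cov])  # Rayleigh quotient
    # Deflation removes the pc1 direction so power iteration finds pc2.
    deflated = [[cov[i][j] - lam1 * pc1[i] * pc1[j] for j in range(dim)]
                for i in range(dim)]
    pc2 = top_eigenvector(deflated)
    return [(dot(p, pc1), dot(p, pc2)) for p in centered]

# Synthetic stand-in for 8D latent points (assumed spreads, not model output).
rng = random.Random(0)
scales = [3.0, 2.0, 0.5, 0.5, 0.3, 0.3, 0.1, 0.1]
latents = [[rng.gauss(0.0, s) for s in scales] for _ in range(200)]
embedded = pca_2d(latents)
```

Each 2D point in `embedded` could then be colored by its module number, as in Fig. 3. t-SNE and UMAP have no comparably short closed-form recipe and are normally run through dedicated libraries.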

3.2 Forecasting Ability

The test set is projected onto the latent space, and the LSTM is used to forecast downstream latent points given upstream latent points. The forecasted latent points are passed through the decoder of the CVAE to reconstruct the phase space projections. In Fig. 4, we showcase the $E$-$\phi$ projection using the initial four phase space projections from the test set as input. The figure depicts the original projection, the predicted projection, and their absolute difference at various modules. The corresponding MSE and SSIM for the projection reveal high similarity, with MSE of the order of $10^{-7}$ and SSIM exceeding 0.99. The discrepancies in the forecasted projections, particularly the MSE, grow for later modules. This growth is expected: the LSTM’s inputs are the true values of $M_1$ to $M_4$, from which it predicts an estimate $\hat{M}_5$ of $M_5$; it then uses its own prediction to generate $\hat{M}_6$, and so on. In this iterative procedure, the errors introduced by the CVAE and LSTM propagate and accumulate, leading to a continuous increase in error.

Figure 4: Forecasting results. Forecasted projections (only $E$-$\phi$ shown) across different modules given the first four projections as inputs. The original projection is presented against the forecasted one, along with the absolute difference between the two.
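
The autoregressive procedure and the resulting error accumulation can be reproduced on a toy problem. In the sketch below, a simple affine map stands in for both the true dynamics and the learned forecaster; the coefficients, the function names, and the slight mismatch between "truth" and "model" are assumptions chosen for illustration, and the MSE is measured directly on the toy latent vectors rather than on decoded projections.

```python
def rollout(step, seed, n_total):
    """Extend `seed` (a list of latent vectors) to n_total points.

    Each new point is predicted from the most recent len(seed) points, so
    after the seed is exhausted the forecaster consumes its own predictions.
    """
    traj = [list(z) for z in seed]
    window = len(seed)
    while len(traj) < n_total:
        traj.append(step(traj[-window:]))
    return traj

def mse(a, b):
    """Mean squared error between two equal-length vectors."""
    return sum((u - v) ** 2 for u, v in zip(a, b)) / len(a)

def make_step(a, b):
    """Toy one-step map z_next = a * z_last + b (stand-in for physics or LSTM)."""
    def step(window):
        return [a * z + b for z in window[-1]]
    return step

N_MODULES, N_SEED, LATENT_DIM = 48, 4, 8
true_step = make_step(a=0.95, b=0.10)   # plays the role of the simulator
model_step = make_step(a=0.96, b=0.10)  # slightly imperfect surrogate

truth = rollout(true_step, [[1.0] * LATENT_DIM], N_MODULES)
forecast = rollout(model_step, truth[:N_SEED], N_MODULES)

# Error per forecasted module, from module 5 through module 48.
errors = [mse(truth[t], forecast[t]) for t in range(N_SEED, N_MODULES)]
```

Because each forecast is fed back as input, the small one-step mismatch compounds, and the error at the last module is much larger than at module 5, mirroring the trend described for the CVAE-LSTM.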

The forecasting of all the projections in all the modules takes less than one second, whereas HPSim takes around 10 minutes with similar computing infrastructure, resulting in a speed-up by a factor of $\sim 600$. The exceptional computational speed of the method makes it extremely well-suited for various real-time accelerator applications. The method can be used as a virtual diagnostic in which CVAE-LSTM predicts a detailed evolution of the beam’s phase space through the entire LANSCE accelerator based on the current RF module settings and using only 4 initial steps from the much slower HPSim physics-based model as its initial points. In general, the application of such an approach to any large accelerator will provide a substantial benefit for simulating beam dynamics and for accelerator optimization.

Uncertainty analysis is a byproduct of probabilistic models (like the VAE), and it plays an important role in understanding uncertainties associated with accelerator operation. In our proposed method, just by sampling the latent space for the first few modules, the LSTM and decoder can be used to generate phase space projections in all the modules. A detailed investigation of the uncertainty analysis aspect is a part of future research work.

4 CONCLUSION

A novel latent evolution model, CVAE-LSTM, is proposed for learning spatiotemporal dynamics. The application of the model is shown for forecasting complex dynamics of charged particle beams through linear accelerators without any supervision from RF field set points. The forecasting results are promising when tested with different evaluation metrics. We have visualized the latent space using PCA, t-SNE, and UMAP. The proposed methodology brings computational speed, robustness, and enhanced interpretability for solving spatiotemporal dynamics problems. This general method provides a computational speed-up of approximately 600x for complex beam dynamics and is applicable to a wide range of accelerator tuning, optimization, and virtual diagnostics applications.

5 ACKNOWLEDGEMENTS

This work was supported by the Los Alamos National Laboratory LDRD Program Directed Research (DR) project 20220074DR.

References