ML-based Calibration and Control of the GlueX Central Drift Chamber

T. Britton    M. Goodrich    N. Jarvis    T. Jeske    N. Kalra    D. Lawrence (corresponding author)    D. McSpadden    K. Rajput
Abstract

The GlueX Central Drift Chamber (CDC) in Hall D at Jefferson Lab, used for detecting and tracking charged particles, is calibrated and controlled during data taking using a Gaussian process. The system dynamically adjusts the high voltage applied to the anode wires inside the chamber in response to changing environmental and experimental conditions so that the gain is stabilized. Control policies have been established to manage the CDC’s behavior. These policies are activated when the model’s uncertainty exceeds a configurable threshold or during human-initiated tests in normal production running. We demonstrate that the system reduces the time detector experts dedicate to offline calibration of the data, leading to a marked decrease in computing resource usage without compromising detector performance.

1 Introduction

The GlueX Central Drift Chamber (CDC) consists of 3522 wires, each contained in a straw tube with a conductive coating[1, 2]. A high voltage (HV) of around 2125 V is maintained between the wire at the center of the tube and the tube itself. This creates an intense electric field that accelerates electrons liberated by passing charged particles towards the wire[6]. These electrons create a signal on the wire whose amplitude is related to the energy lost to the gas in the chamber by the charged particle moving through it. This amplitude can be used to help identify the type of that charged particle (see figure 1). The amplification of the signal, or “gain”, is determined by multiple factors, including the HV, the atmospheric pressure, the rate at which charged particles pass through the chamber (due to beam flux × target thickness), and the temperature of the gas. Traditionally, the data is analyzed after the experiment to determine the gain, and the derived calibration constants are then used to correct the data during analysis. The goal of this project was to use a Machine Learning (ML) model to predict the calibration before the data was taken, using three of these parameters as inputs, and then adjust the HV to counterbalance the effect on the gain, as illustrated in figure 2. The result would be to operate the detector in a more stable mode and significantly reduce the time needed for calibration after the data was taken.

Figure 1: Energy loss rate as a function of momentum in the CDC. For lower momentum particles this can be used for identifying the particle type. Accurate gain calibration helps sharpen the resolution between the various bands.
Figure 2: Diagram illustrating how environmental parameters are used to predict gain calibrations. The predicted calibrations are then used to determine adjustments to the detector high voltage that counterbalance the predicted changes, resulting in actual gain calibrations that are nearly constant over time.

2 Initial System Development and Testing

Development of the system began in 2021. A model was trained using historic gain correction factor (GCF) calibrations (derived from recorded data[3]) and environmental parameters read just before the data was acquired[4]. A Python script was used to gather the current environmental conditions, execute the model, and convert the results into a suggestion for a new HV setting. The script was run manually, and it was up to shift takers to actually set the HV using the standard controls GUI. Keeping a human in the pipeline was a precaution and a way to assure collaborators that the system would not risk data quality. Figure 3 shows the HV and estimated GCF as a function of run number. The atmospheric pressure is also shown to help illustrate its correlation with the gain. For this test, HV values were changed only in increments of 5 V. This initial test succeeded in improving the stability of the GCF.

The next step involved automating the system so that it would not require a human. This test was done during a period when beam was not available and used signals from cosmic rays instead. The policy was changed to allow 1 V changes in the HV, as opposed to the 5 V policy used for the initial test. The automated system was allowed to modify the voltage on half of the wires automatically every 5 minutes over a two-week period. The other half of the wires were maintained at constant voltage during this same period. Guardrails were implemented to prevent the ML model from setting the HV outside of a limited range that was deemed safe under all circumstances. Figure 4 shows the GCF based on a later analysis of the data for both halves of the detector. The half controlled by the ML model had significantly more stable GCF values.
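The step-size policy and guardrails described above can be sketched as follows. This is a minimal illustration rather than the deployed code: the function name `apply_hv_policy` and the limits `HV_MIN` and `HV_MAX` are hypothetical (the text does not state the safe range), while the 1 V and 5 V step sizes come from the text.

```python
# Hypothetical guardrail limits; the actual safe range is not given in the text.
HV_MIN = 2100.0  # V, assumed lower guardrail
HV_MAX = 2150.0  # V, assumed upper guardrail


def apply_hv_policy(suggested_hv, step=1.0, hv_min=HV_MIN, hv_max=HV_MAX):
    """Snap a model-suggested HV to the allowed step size, then clamp it to the safe range."""
    if step <= 0:
        raise ValueError("step must be positive")
    # Quantize to the nearest multiple of the step (5 V in the first test, 1 V later).
    quantized = round(suggested_hv / step) * step
    # Guardrail: never leave the range deemed safe.
    return min(max(quantized, hv_min), hv_max)


print(apply_hv_policy(2127.4))            # 1 V policy -> 2127.0
print(apply_hv_policy(2127.4, step=5.0))  # 5 V policy -> 2125.0
print(apply_hv_policy(2300.0))            # clamped at the assumed upper limit -> 2150.0
```

Applying the clamp after quantization ensures that the value finally sent to the hardware always lies inside the allowed range, whatever the model suggests.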

Figure 3: Plots from an initial semi-manual test with beam. This occurred during the first run of the PrimEx experiment in Fall of 2021. The x-axis of both plots is the run number, which roughly correlates with time. The top plot shows the gain (blue dots plotted against the left y-axis) and the high voltage setting (magenta line plotted against the right y-axis). The bottom plot shows the atmospheric pressure during this same time period.
Figure 4: (Reproduced from [4]) Plot showing the first automated test of the AI/ML controlled system during a cosmic data run. The x-axis is the event number, which roughly corresponds to time over a two-week period. The orange dots are from the half of the detector that was not controlled by the AI/ML model, while the blue dots show the much more stable behavior of the half of the detector that was controlled by the AI/ML model.

3 Gaussian Process Regression

The ML model used for the project was based on Gaussian Process Regression (GPR)[7]. This was chosen because it naturally provides a value quantifying the uncertainty of its prediction. GPR is a non-parametric approach to regression that assumes the outputs and input variables follow a multivariate Gaussian distribution. For a single-target GPR, the mean of this distribution is the predicted output, and the covariance captures the uncertainty associated with the prediction. An underlying function $R^{d} \rightarrow R$ maps the training inputs, $X$, to the corresponding targets, $y$. The mapping uses a chosen covariance function, $k(\cdot,\cdot)$, which determines the smoothness of the predicted function between any two data points ($x$ and $x'$). This function is used to construct the Gram matrix, $K$, which represents pairwise similarities between all data points in the training data:

$$K_{n,n'} = k(x^{n}, x^{n'}),$$

where $k$, when applied to $X$, produces an $N \times N$ matrix for a dataset of size $N$.
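For context, the Gram matrix enters the prediction through the standard GPR posterior. The expressions below are the textbook result for a zero-mean prior (see, e.g., [7]) and are not written out in this paper; $x_*$ (a new input) and $k_*$ (the vector of its covariances with the training inputs) are notation introduced here for illustration.

```latex
\begin{align*}
  \mu(x_*)      &= k_*^{\top} K^{-1} y, \\
  \sigma^2(x_*) &= k(x_*, x_*) - k_*^{\top} K^{-1} k_*,
  \qquad (k_*)_n = k(x^{n}, x_*).
\end{align*}
```

Here $K$ is built from the full covariance function, including any noise term. The mean $\mu(x_*)$ is the prediction and $\sigma^2(x_*)$ is the variance that quantifies its uncertainty.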

Our work employs a kernel function, $k(\cdot,\cdot)$, constructed as the sum of Scikit-learn’s [5] Squared Exponential (RBF) and White Noise kernels. The RBF kernel, Eq. 3.1, captures the covariance and inherent smoothness between data points, while the White Noise kernel, Eq. 3.2, accounts for the overall noise level present in the data.

$$k_{RBF}(x,x') = \exp\left(-\frac{(|x-x'|)^{2}}{2l^{2}}\right), \tag{3.1}$$

where $l$ is the learned length-scale parameter used to scale the distance between training observations.
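As a small illustration of how Eq. 3.1 populates the Gram matrix, the sketch below evaluates the RBF kernel for every pair of rows in a toy input array. The helper names `rbf_kernel` and `gram_matrix` and the input values are made up for this example; they are not taken from the project code.

```python
import numpy as np


def rbf_kernel(x, x_prime, length_scale=1.0):
    """Squared-exponential kernel of Eq. 3.1 for two input vectors."""
    sq_dist = np.sum((np.asarray(x) - np.asarray(x_prime)) ** 2)
    return np.exp(-sq_dist / (2.0 * length_scale**2))


def gram_matrix(X, kernel, **kernel_args):
    """Build K[n, n'] = k(x^n, x^n') for every pair of rows in X."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    K = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            K[i, j] = kernel(X[i], X[j], **kernel_args)
    return K


# Three made-up points with three features each (values are illustrative only).
X = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 2.0, 0.0]])
K = gram_matrix(X, rbf_kernel, length_scale=1.0)
print(K.shape)  # (3, 3)
```

The resulting matrix is symmetric with ones on the diagonal, and its off-diagonal entries shrink as the distance between two points grows relative to the length scale.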

$$k(x,x') = \sigma^{2} I_{n}, \tag{3.2}$$

where $\sigma^{2}$ is the variance of the noise and $I_{n}$ is the identity matrix.
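A minimal sketch of such a model in Scikit-learn is shown below. It assumes synthetic stand-in data with three input features; the initial hyperparameters, the use of `normalize_y`, and the target function are choices made for this example and are not the settings used for the CDC.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(0)

# Synthetic stand-in data: 3 input features and one target (not GlueX data).
X_train = rng.uniform(-1.0, 1.0, size=(40, 3))
y_train = (
    1.0
    + 0.1 * np.sin(3.0 * X_train[:, 0])
    + 0.05 * X_train[:, 1] ** 2
    + rng.normal(0.0, 0.01, size=40)
)

# Sum of the RBF kernel (Eq. 3.1) and the White Noise kernel (Eq. 3.2).
kernel = RBF(length_scale=1.0) + WhiteKernel(noise_level=1e-3)
gpr = GaussianProcessRegressor(kernel=kernel, normalize_y=True, random_state=0)
gpr.fit(X_train, y_train)  # hyperparameters are optimized during the fit

# The predictive mean is the estimate; the standard deviation is its uncertainty.
# The second query point lies far outside the training data on purpose.
X_new = np.array([[0.0, 0.0, 0.0], [5.0, 5.0, 5.0]])
mean, std = gpr.predict(X_new, return_std=True)
print(mean, std)
```

The standard deviation returned for the distant point is larger than for the point inside the training region, which is the property that makes the uncertainty usable as an input to a control policy.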

4 Integration with Controls System

Before deploying the system in a running experiment, a mechanism was integrated that allows shift workers to easily turn ML control of the CDC on or off. This included a new button on the standard control GUI for the CDC detector, as shown in figure 5. Shift workers range from graduate students to senior professors and from seasoned shift takers to novices. Data taking occurs 24/7 during a running experiment, so system experts are not able to monitor the system continuously. Placing an on/off switch in an easily accessible location and updating the shift worker documentation was considered necessary for such a new system. When the system was turned off, the ML model was still active and the recommended HV was still recorded in a database. The HV itself was simply not modified by the ML system.

The ML model was trained on historic calibrations which included several regions of the input feature space. These did not cover all possible regions, so the uncertainty quantification of the ML model output was needed to inform a policy that decides which setting to use. Figure 6 shows a 3D rendering of the 3% surface of model uncertainty. The final policy deployed uses the model recommendation for points within this 3% surface; for points outside of it, the system reverts to observation mode, which automatically sets the HV to its default value of 2125 V. Data gathered while in observation mode will contribute to future model training, causing the region enclosed by the surface to grow as more areas of the feature space are encountered.
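The decision logic described in this section can be sketched as below. The function name `choose_hv`, the mode labels, and the convention of expressing the uncertainty as a fraction are assumptions made for this example; the 3% threshold and the 2125 V default are taken from the text.

```python
DEFAULT_HV = 2125.0           # V, default setting quoted in the text
UNCERTAINTY_THRESHOLD = 0.03  # 3% uncertainty, expressed here as a fraction


def choose_hv(recommended_hv, relative_uncertainty, ml_control_enabled=True,
              threshold=UNCERTAINTY_THRESHOLD, default_hv=DEFAULT_HV):
    """Return (hv_to_set, mode) for one control cycle.

    hv_to_set is None when ML control is switched off, meaning the HV is left
    untouched; the recommendation can still be logged by the caller.
    """
    if not ml_control_enabled:
        return None, "disabled"
    if relative_uncertainty <= threshold:
        # Inside the region of certainty: trust the model recommendation.
        return recommended_hv, "ml_control"
    # Outside the region of certainty: fall back to the default HV.
    return default_hv, "observation"


print(choose_hv(2130.0, 0.01))  # (2130.0, 'ml_control')
print(choose_hv(2130.0, 0.05))  # (2125.0, 'observation')
print(choose_hv(2130.0, 0.01, ml_control_enabled=False))  # (None, 'disabled')
```

Returning the mode alongside the voltage makes it straightforward to record which runs were taken in observation mode, so that they can be selected for later model training.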

Figure 5: The CDC detector controls GUI. For this project, the button indicated by the green circle was added to allow the AI/ML control of the detector to be easily turned on/off at any time by the shift workers.
Figure 6: Visualization of the region of certainty for the AI/ML model. Points within the volume have an uncertainty of ≤ 3%. The AI/ML model is not allowed to control the detector high voltage for points outside of this surface. (See text for details.)

5 Results from Production Running

The fully automated system is now deployed as part of standard production for the GlueX detector. Figure 7 shows the GCF for the second run of the PrimEx experiment. This experiment included running conditions at the edge of the feature space on which the ML model was trained. Thus, it includes several regions where the ML system dropped to observation mode and used a constant HV setting. The green region indicates the ±5% band that was an initial goal of the system.

A final concern with this system was whether stabilizing the GCF calibration for the CDC would lead to less stability in the other calibration constants for the detector that are used to determine the time-to-distance (TtoD) conversion. Figure 8 shows the TtoD residual width as a function of run. The plots include periods of both constant HV (red points, mostly to the left of the plot) and ML controlled HV (blue points, mostly to the right of the plot). The top plot indicates that the ML controlled period was no less stable than the period using the legacy mode of running with constant HV. The bottom plot shows the residual widths after applying a correction based on the gas density. This correction was derived as a byproduct of this project, which exposed a correlation in the TtoD that had not been noticed before. This further reduced the time needed to fully calibrate the detector.
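The idea of a linear correction in gas density can be illustrated with a straight-line fit. The sketch below is hypothetical throughout: it uses synthetic data, takes pressure divided by temperature as an ideal-gas proxy for density, and invents a generic "TtoD parameter" with an assumed linear dependence. It does not reproduce the actual correction derived for the CDC.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic conditions (illustrative values, not GlueX data).
pressure_kpa = rng.uniform(99.0, 103.0, size=50)
temperature_k = rng.uniform(296.0, 300.0, size=50)

# Ideal-gas proxy for density: rho is proportional to P / T.
density_proxy = pressure_kpa / temperature_k

# Hypothetical TtoD calibration parameter with an assumed linear dependence plus noise.
true_slope, true_intercept = 2.0, 0.3
ttod_param = true_slope * density_proxy + true_intercept + rng.normal(0.0, 1e-3, size=50)

# Least-squares straight-line fit; polyfit returns the highest-order coefficient first.
slope, intercept = np.polyfit(density_proxy, ttod_param, deg=1)
predicted = slope * density_proxy + intercept
residual_rms = np.sqrt(np.mean((ttod_param - predicted) ** 2))
print(slope, intercept, residual_rms)
```

Once such a line has been fitted, the calibration parameter for a new run can be estimated from the gas conditions alone, which is what makes this kind of correction fast to apply.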

Figure 7: Gain correction factor (GCF) as a function of time (Run Number) for the PrimEx experiment’s second run during the Fall of 2022. Orange points were taken at a constant 2125 V while blue points were taken with an AI/ML tuned HV setting. The dashed line indicates the ideal GCF while the green box corresponds to ±5% of that.
Figure 8: Data taken during GlueX experiment phase II in Spring of 2023.
TOP: Widths of residual time-to-distance (TtoD) distributions obtained before calibration for runs taken at 2125 V (red) and at an HV setting determined by the AI/ML system (blue). This shows that the adjustments made by the AI/ML model to stabilize the gain did not introduce instability to the TtoD calibration.
BOTTOM: Widths of residual time-to-distance (TtoD) distributions obtained using a linear function dependent only on gas density. This new, fast technique for calibrating the TtoD was developed after noticing an interesting correlation while working on the gain calibrations.

6 Summary

A system utilizing an ML model to automatically control the High Voltage of the GlueX Central Drift Chamber detector has been deployed in production experiments. The system predicts calibrations based on environmental factors available prior to taking data. The predictions are then used to adjust the HV in order to stabilize the gain of the detector. The system was developed in stages to ensure safe, robust operation and to instill confidence and trust in the scientists whose data depended on it.

Acknowledgments

This work is supported by a grant from the U.S. Department of Energy, Office of Science, Office of Nuclear Physics under the LAB-20-2261 FOA.

The Carnegie Mellon Group is supported by the U.S. Department of Energy, Office of Science, Office of Nuclear Physics, DOE Grant No. DE-FG02-87ER40315.

This research used resources of the Thomas Jefferson National Accelerator Facility, which is a DOE Office of Science User Facility supported by the U.S. Department of Energy, Office of Science, Office of Nuclear Physics under contract DE-AC05-06OR23177.

References

  [1] S. Adhikari, C.S. Akondi, H. Al Ghoul, A. Ali, M. Amaryan, E.G. Anassontzis, A. Austregesilo, F. Barbosa, J. Barlow, A. Barnes, et al. The GlueX beamline and detector. Nuclear Instruments and Methods in Physics Research Section A: Accelerators, Spectrometers, Detectors and Associated Equipment, 987:164807, 2021. ISSN 0168-9002. doi: https://doi.org/10.1016/j.nima.2020.164807. URL https://www.sciencedirect.com/science/article/pii/S0168900220312043.
  [2] N.S. Jarvis, C.A. Meyer, B. Zihlmann, M. Staib, A. Austregesilo, F. Barbosa, C. Dickover, V. Razmyslovich, S. Taylor, Y. Van Haarlem, G. Visser, and T. Whitlatch. The Central Drift Chamber for GlueX. Nuclear Instruments and Methods in Physics Research Section A: Accelerators, Spectrometers, Detectors and Associated Equipment, 962:163727, 2020. ISSN 0168-9002. doi: https://doi.org/10.1016/j.nima.2020.163727. URL https://www.sciencedirect.com/science/article/pii/S0168900220302771.
  [3] R.T. Jones, M. Kornicer, A.R. Dzierba, J.L. Gunter, R. Lindenbusch, E. Scott, P. Smith, C. Steffen, S. Teige, P. Rubin, and E.S. Smith. A bootstrap method for gain calibration and resolution determination of a lead-glass calorimeter. Nuclear Instruments and Methods in Physics Research Section A: Accelerators, Spectrometers, Detectors and Associated Equipment, 566(2):366–374, 2006. ISSN 0168-9002. doi: https://doi.org/10.1016/j.nima.2006.07.061. URL https://www.sciencedirect.com/science/article/pii/S0168900206013556.
  [4] D. McSpadden, T. Jeske, N. Jarvis, T. Britton, D. Lawrence, and N. Kalra. Control and Calibration of GlueX Central Drift Chamber Using Gaussian Process Regression, 2022. URL https://ml4physicalsciences.github.io/2022/files/NeurIPS_ML4PS_2022_35.pdf.
  [5] F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, and E. Duchesnay. Scikit-learn: Machine learning in Python. Journal of Machine Learning Research, 12:2825–2830, 2011.
  [6] Fabio Sauli. Principles of Operation of Multiwire Proportional and Drift Chambers. CERN, Geneva, 1977. 92 p. doi: 10.5170/CERN-1977-009. URL https://cds.cern.ch/record/117989. CERN, Geneva, 1975 - 1976.
  [7] Christopher Williams and Carl Rasmussen. Gaussian Processes for Regression. In D. Touretzky, M.C. Mozer, and M. Hasselmo, editors, Advances in Neural Information Processing Systems, volume 8. MIT Press, 1995. URL https://proceedings.neurips.cc/paper_files/paper/1995/file/7cce53cf90577442771720a370c3c723-Paper.pdf.