\affiliation[UCR]{Department of Physics and Astronomy, University of California, Riverside, CA 92521, USA}
\affiliation[LLNLNACS]{Nuclear and Chemical Science Division, Lawrence Livermore National Laboratory, Livermore, CA 94550}
\affiliation[LLNLCED]{Computational Engineering Division, Lawrence Livermore National Laboratory, Livermore, CA 94550}
\affiliation[LBNL]{Physics Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA}
\affiliation[BIDS]{Berkeley Institute for Data Science, University of California, Berkeley, CA 94720, USA}

Design of a SiPM-on-Tile ZDC for the future EIC,
and its Performance with Graph Neural Networks

Ryan Milton, Sebouh J. Paul, Barak Schmookler, Miguel Arratia, Piyush Karande, Aaron Angerami, Fernando Torales Acosta, Benjamin Nachman
Abstract

We present a design for a high-granularity zero-degree calorimeter (ZDC) for the upcoming Electron-Ion Collider (EIC). The design uses SiPM-on-tile technology and features a novel staggered-layer arrangement that improves spatial resolution. To fully leverage the design’s high granularity and non-trivial geometry, we employ graph neural networks (GNNs) for energy and angle regression as well as signal classification. The GNN-boosted performance metrics meet, and in some cases significantly surpass, the requirements set in the EIC Yellow Report, laying the groundwork for enhanced measurements that will facilitate a wide physics program. Our studies show that GNNs can significantly enhance the performance of high-granularity CALICE-style calorimeters by automating and optimizing the software compensation algorithms required for these systems. This improvement holds even for complicated geometries that pose challenges for image-based AI/ML methods.

journal: NIMA

1 Introduction

The Zero-Degree Calorimeter (ZDC) at the future Electron-Ion Collider (EIC) [1] will measure high-energy neutrons, photons, and neutral pions to support a comprehensive physics program with electron-proton ($ep$) and electron-nucleus ($eA$) collisions [2]. For example, the ZDC will be used to measure meson structure with deeply exclusive meson production, i.e., $ep\to e\pi^{+}n$ and $ep\to eK^{+}\Lambda(n\pi^{0})$, which generate a neutron or lambda near beam rapidity. In these reactions, the ZDC’s position resolution plays a crucial role in reconstructing the momentum-transfer variable $t$ [2, 3]. In another application, the ZDC will measure spectator nucleons in scattering off light ions [4, 5] or heavy ions [6, 7], where energy linearity plays an important role. Furthermore, the ZDC must be granular enough to distinguish between photons and neutral pions produced in $u$-channel backward reactions, such as deeply virtual $\pi^{0}$ production ($ep\to e\pi^{0}p$) and deeply virtual Compton scattering ($ep\to e\gamma p$) [8].

The ZDC performance requirements stated in Section 14.5.2 of the EIC Yellow Report mandate a single-hadron energy resolution better than $50\%/\sqrt{E}$ and an angular resolution better than $3~\mathrm{mrad}/\sqrt{E}$ [2]. The required energy range extends up to the beam energy, which will be a maximum of 275 GeV for protons. In $eA$ collisions at maximum energy, the average energy per nucleon for lead-208 nuclei is approximately 110 GeV.
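To make these requirements concrete, the short sketch below evaluates them at a few energies. It is only an illustration: it assumes that $E$ is expressed in GeV (the usual convention, though not spelled out above), and the function names are hypothetical.

```python
import math

def energy_resolution_requirement(energy_gev, stochastic=0.50):
    """Fractional energy resolution allowed by the 50%/sqrt(E) requirement (E in GeV, assumed)."""
    return stochastic / math.sqrt(energy_gev)

def angular_resolution_requirement(energy_gev, stochastic_mrad=3.0):
    """Angular resolution in mrad allowed by the 3 mrad/sqrt(E) requirement (E in GeV, assumed)."""
    return stochastic_mrad / math.sqrt(energy_gev)

# Evaluate at a few illustrative energies up to the maximum proton beam energy.
for energy in (25.0, 100.0, 275.0):
    print(
        f"E = {energy:5.0f} GeV: "
        f"sigma_E/E < {100 * energy_resolution_requirement(energy):.1f}%, "
        f"sigma_theta < {angular_resolution_requirement(energy):.2f} mrad"
    )
```

At 100 GeV, for instance, the requirements correspond to a 5% energy resolution and a 0.3 mrad angular resolution.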

As per the ECCE baseline design for the EIC project detector [9, 10], the ZDC incorporates two sampling calorimeter sections: a 12-layer lead/silicon-pad section (21 cm) and a 30-layer lead/scintillator section [10], resulting in a depth of $7\lambda$. The ZDC is located at about $z=35$ m and nominally covers $\theta<5.5$ mrad ($\eta>6$). This design uses lead absorbers with an approximately 4-to-1 lead-to-scintillator thickness ratio to achieve a compensated response (i.e., a similar response for electromagnetic and hadronic showers). This approach illustrates “hardware compensation”, which improves energy resolution by minimizing the impact of shower-to-shower fluctuations in the electromagnetic fraction of hadronic showers.

An alternative iron-scintillator design offers the advantage of a self-supporting iron structure, easing assembly and maintenance in the ZDC’s limited space. Additionally, employing an iron absorber could reduce neutron production by a factor of about 4 compared to a lead absorber [11], thereby mitigating radiation damage in the SiPMs.\footnote{The neutron fluence expected in the ZDC region is projected to be less than $10^{12}$ 1-MeV neutron equivalent per cm$^{2}$ per year at peak luminosity ($10^{34}$ cm$^{-2}$s$^{-1}$) [12]. This may necessitate mitigation measures in the design, such as annealing the SiPMs between runs [13].} A limitation, however, is that this design cannot be compensated at the hardware level with any practical absorber-to-scintillator thickness ratio [14].

The non-compensated nature of iron-scintillator calorimeters can be corrected with “software compensation” algorithms, which exploit the shower-cell topology to re-weight electromagnetic and hadronic sub-showers, e.g., Refs. [15, 16, 17]. The CALICE collaboration has tested these algorithms with their iron-scintillator SiPM-on-tile prototypes (e.g., Ref. [18]).

The advent of modern machine-learning techniques, such as deep learning, offers the prospect of fully exploiting the power of high-granularity calorimetry and software-compensation techniques in an automated, efficient, and optimal way. Several works have demonstrated this, mostly in studies at the LHC [19, 20, 21, 22, 23, 24, 25, 26, 27]. In particular, graph neural networks (GNNs) (see, e.g., Ref. [28]) offer a promising approach to energy regression for calorimeter showers, even with complex geometries, providing improvements over traditional (non-AI/ML) software compensation methods.

Examples of GNN work for software compensation include work by the ATLAS collaboration [25], by us in an EIC context in Ref. [29], and more recently by the CALICE collaboration [30]. In addition, GNNs are capable of classifying events by particle species, which is needed for several EIC applications, such as distinguishing $\gamma$ from $\pi^{0}$ or neutrons.

In this paper, we present the design of a high-granularity ZDC for the EIC based on SiPM-on-tile technology. We highlight this design for its high energy and angular resolution, leveraging its potential through GNNs. Section 2 showcases the detector design. The simulations and regression approach are described in Section 3, while the performance is outlined in Section 4. Finally, the summary and conclusions are presented in Section 5.

2 Design

Figure 1 displays our design for the ZDC, which is based on the SiPM-on-tile approach [18], following a similar approach to that of Ref. [31]. The sampling structure consists of iron absorber layers and scintillator layers, which are read out with SiPMs.

Figure 1: Foreground: exploded view of a ZDC layer. Background: the entire ZDC with dimensions.

The overall dimensions are $60\times 60\times 162$ cm$^{3}$, driven by limits imposed by the EIC beamlines. The self-supporting iron absorber structure is based on steel blocks with dimensions of $96~\mathrm{mm}\times 98~\mathrm{mm}\times 20~\mathrm{mm}$, which are identical to the ones currently used in the STAR hadronic calorimeter [32, 33] and which will be available for reuse in EIC experiments.

The scintillator cells will be 25 cm$^{2}$ hexagonal tiles with a thickness of 3 mm, featuring a dimple at the center for coupling to a SiPM. The scintillator cells will be positioned within 3D-printed plastic frames [34], which will then be interleaved between reflective foils (ESR by 3M).

Following Ref. [35], the layers will cycle through four different scintillator layouts in order to stagger them, as shown in Fig. 2. This staggering approach enhances the position resolution of the ZDC. In this four-layer cyclical layout, referred to as “H4” in Ref. [35], the overlap between the cells in four consecutive layers defines rhombus-shaped subcells that have area equal to 1/12 of that of the hexagonal cells.
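For orientation, the cell and subcell sizes implied by these numbers can be worked out directly. The sketch below assumes regular hexagons, for which the area is $A = \tfrac{3\sqrt{3}}{2}s^{2}$ with side length $s$; the variable names are illustrative only.

```python
import math

CELL_AREA_CM2 = 25.0    # hexagonal tile area quoted in the text
SUBCELLS_PER_CELL = 12  # H4 staggering: subcell area is 1/12 of the cell area

def hexagon_side(area):
    """Side length of a regular hexagon with the given area, from A = (3*sqrt(3)/2) * s**2."""
    return math.sqrt(2.0 * area / (3.0 * math.sqrt(3.0)))

side_cm = hexagon_side(CELL_AREA_CM2)
subcell_area_cm2 = CELL_AREA_CM2 / SUBCELLS_PER_CELL

print(f"hexagon side length: {side_cm:.2f} cm")
print(f"flat-to-flat width:  {math.sqrt(3.0) * side_cm:.2f} cm")
print(f"subcell area:        {subcell_area_cm2:.2f} cm^2")
```

Under this assumption the tiles have a side length of about 3.1 cm, and the rhombus-shaped subcells have an area of about 2.1 cm$^{2}$.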

Figure 2: Top: Four scintillator layers with different offsets for the tile positions, with an exaggerated spacing between them. Bottom: 3D rendering of the same four layers, as viewed head on.

3 Simulation

The Geant4 [36] (v11.0.p2) simulations of the ZDC geometry were implemented in the DD4HEP framework [37] using the FTFP_BERT physics list. No noise was included in the simulation. Auxiliary simulations of muons and electrons were also generated to define the hit-energy MIP unit and electromagnetic scale of the calorimeter, respectively. We performed studies both with the staggered layout and an unstaggered layout for reference.

Hits were reconstructed assuming a 15-bit ADC with no noise and a dynamic range of 800 MeV. Only reconstructed hits with energies $E>0.5$ MIP, with 1 MIP $=0.5$ MeV estimated from muon simulations, and hit times $t<275$ ns were considered for further analysis.
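The hit selection can be illustrated with the following sketch. The thresholds are those quoted above, but the digitization model (truncation to the nearest lower ADC count and saturation at full scale) and all function names are assumptions made for illustration; this is not the reconstruction code used for this work.

```python
import math

ADC_BITS = 15              # ADC resolution quoted in the text
DYNAMIC_RANGE_MEV = 800.0  # full-scale energy quoted in the text
MIP_MEV = 0.5              # 1 MIP, estimated from muon simulations
ENERGY_CUT_MIP = 0.5       # hit-energy threshold in MIP units
TIME_CUT_NS = 275.0        # hit-time upper limit

def digitize(energy_mev):
    """Quantize an energy with an ideal, noiseless ADC (assumed: floor, then saturate)."""
    n_counts = 2 ** ADC_BITS
    lsb_mev = DYNAMIC_RANGE_MEV / n_counts
    counts = min(max(math.floor(energy_mev / lsb_mev), 0), n_counts - 1)
    return counts * lsb_mev

def select_hits(hits):
    """Keep (energy_mev, time_ns) hits that pass both the energy and the time cut."""
    selected = []
    for energy_mev, time_ns in hits:
        reco_mev = digitize(energy_mev)
        if reco_mev > ENERGY_CUT_MIP * MIP_MEV and time_ns < TIME_CUT_NS:
            selected.append((reco_mev, time_ns))
    return selected

# Toy hits: below threshold, just above threshold, too late, and saturating.
toy_hits = [(0.1, 10.0), (0.3, 10.0), (5.0, 300.0), (1000.0, 20.0)]
selected_hits = select_hits(toy_hits)
print(selected_hits)
```

With these settings, one ADC count corresponds to about 0.024 MeV, well below the 0.25 MeV threshold.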

We defined the reconstructed shower energy at the electromagnetic scale, which was determined using electron simulations. These simulations were used to estimate a sampling fraction of 2.1%.

A standard reference is defined with the simplest energy regression algorithm, referred to as the “strawman”, which defines the reconstructed energy as the sum of the energies in cells that pass both the time and energy cuts, divided by the sampling fraction obtained with electron simulations. For hadronic showers, this algorithm produced an energy scale that was not unity and exhibited non-linearity at lower energies, as expected due to the non-compensated nature of the calorimeter.
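In code, the strawman estimate amounts to a single division. The sketch below uses the 2.1% sampling fraction quoted above and assumes that the hit list has already passed the energy and time cuts; the toy hit energies are made up for illustration.

```python
SAMPLING_FRACTION = 0.021  # electromagnetic-scale sampling fraction from electron simulations

def strawman_energy(hit_energies_gev, sampling_fraction=SAMPLING_FRACTION):
    """Sum of selected hit energies divided by the sampling fraction."""
    return sum(hit_energies_gev) / sampling_fraction

# Toy example: 1.05 GeV of visible energy corresponds to 50 GeV at the electromagnetic scale.
toy_hit_energies_gev = [0.5, 0.3, 0.25]
reconstructed_gev = strawman_energy(toy_hit_energies_gev)
print(f"strawman energy: {reconstructed_gev:.1f} GeV")
```

Because no shower-by-shower correction is applied, any difference between the hadronic and electromagnetic response appears directly as a shift in the energy scale.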

As a reference for the position (angle) reconstruction, we used two algorithms: first, a log-weighted center-of-gravity reconstruction (“baseline”), and second, a modified version of this algorithm known as “HEXPLIT” [35], which takes advantage of the overlapping cells. In HEXPLIT, subcells are defined by the overlap of cells with those in neighboring (and next-to-neighboring) layers, and the relative energy contributions in each subcell are estimated using the energy in the overlapping cells in these neighboring and next-to-neighboring layers. The log-weighted center-of-gravity reconstruction of the shower position is then performed on the subcells instead of on hits in individual cells. This algorithm is described in further detail in Ref. [35].
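A minimal sketch of a log-weighted center-of-gravity calculation is given below. It uses a common form of the weights, $w_i=\max\left(0,\,w_0+\ln(E_i/E_{\rm tot})\right)$; the value of $w_0$ is a tunable parameter and the default used here is a placeholder, not the value used in this work. The sketch covers only the baseline algorithm, not the HEXPLIT subcell splitting of Ref. [35].

```python
import math

def log_weighted_position(hits, w0=4.0):
    """Log-weighted center of gravity for hits given as (x, y, energy) with energy > 0.

    w0 is a hypothetical cutoff parameter: hits carrying less than exp(-w0)
    of the total energy receive zero weight.
    """
    total_energy = sum(energy for _, _, energy in hits)
    weights = [max(0.0, w0 + math.log(energy / total_energy)) for _, _, energy in hits]
    norm = sum(weights)
    if norm == 0.0:
        raise ValueError("all hits fall below the log-weight cutoff; increase w0")
    x = sum(w * hit[0] for w, hit in zip(weights, hits)) / norm
    y = sum(w * hit[1] for w, hit in zip(weights, hits)) / norm
    return x, y

# Two hits with a 3:1 energy ratio: linear weighting would give x = 0.5,
# while log weighting pulls the estimate toward the lower-energy hit.
x_reco, y_reco = log_weighted_position([(0.0, 0.0, 3.0), (2.0, 0.0, 1.0)])
print(f"x = {x_reco:.3f}, y = {y_reco:.3f}")
```

Logarithmic weights reduce the dominance of the highest-energy cell, which is why this form is commonly preferred over a linear energy weighting for position reconstruction.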

3.1 ML-based reconstruction

In addition to the strawman reconstruction algorithm, an ML-based reconstruction using GNNs [29, 25] was also used. This was done using the Graph Nets library [38] in TensorFlow [39]. For training and test data, we used single-particle simulations of neutrons, $\pi^{0}$s, and photons with the “H4” staggering approach for the ZDC’s readout.

The calorimeter data for each event was represented as a graph, with each node containing the energies and positions of cells that pass the energy and timing cuts. The nodes were interconnected using a set of edges, which represent the set of the ten nearest neighbors of a cell. A global node containing the summed node energies divided by the sampling fraction was also included in the graph. All energies in the graph were normalized via a z-score normalization using the mean and standard deviation from a subset of events.
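The graph construction described above can be sketched in plain Python as follows. This is not the Graph Nets input pipeline used for this work: the dictionary layout, the edge direction, the use of separate normalization statistics for node and global energies, and all names are assumptions made for illustration.

```python
import math

def knn_edges(positions, k=10):
    """Directed edges (cell, neighbor) to the k nearest neighbors of each cell."""
    edges = []
    for i, p in enumerate(positions):
        ranked = sorted((math.dist(p, q), j) for j, q in enumerate(positions) if j != i)
        edges.extend((i, j) for _, j in ranked[:k])
    return edges

def z_score(value, mean, std):
    """Standardize a value with a precomputed mean and standard deviation."""
    return (value - mean) / std

def build_graph(hits, node_stats, global_stats, sampling_fraction=0.021, k=10):
    """Build one event graph from hits given as (x, y, z, energy).

    node_stats and global_stats are (mean, std) pairs, assumed to be computed
    beforehand from a subset of events.
    """
    positions = [hit[:3] for hit in hits]
    energies = [hit[3] for hit in hits]
    summed_energy = sum(energies) / sampling_fraction
    return {
        "nodes": [(*pos, z_score(e, *node_stats)) for pos, e in zip(positions, energies)],
        "edges": knn_edges(positions, k),
        "global": z_score(summed_energy, *global_stats),
    }

# Toy event: twelve cells in a row with made-up energies and normalization statistics.
toy_hits = [(float(i), 0.0, 0.0, 0.01 * (i + 1)) for i in range(12)]
graph = build_graph(toy_hits, node_stats=(0.05, 0.03), global_stats=(40.0, 20.0))
print(len(graph["nodes"]), "nodes,", len(graph["edges"]), "edges")
```

Events with fewer than eleven selected cells simply connect every cell to all others in this sketch.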

A dense neural network composed of four dense layers with 64 nodes each was trained to predict the generated energy and polar angle, $E_{\rm truth}$ and $\theta_{\rm truth}$, respectively. Each dense layer used the Rectified Linear Unit (ReLU) activation function [40] and He-normal initialization [41]. The model was trained with a batch size of 256 calorimeter showers for 70 epochs, using the Adam optimizer [42]. The learning rate was initialized to $10^{-3}$ and was halved every 5 epochs down to a minimum of $10^{-6}$.
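The learning-rate schedule can be written as a one-line function. The sketch below assumes zero-indexed epochs with the first halving applied at the start of epoch 5; the exact bookkeeping in the training code may differ.

```python
def learning_rate(epoch, initial=1e-3, factor=0.5, step=5, minimum=1e-6):
    """Step-wise halving of the learning rate with a lower bound (epoch is zero-indexed)."""
    return max(initial * factor ** (epoch // step), minimum)

schedule = [learning_rate(epoch) for epoch in range(70)]
for epoch in (0, 5, 25, 49, 50, 69):
    print(f"epoch {epoch:2d}: learning rate = {schedule[epoch]:.3e}")
```

Under these assumptions the floor is reached after ten halvings, i.e., from epoch 50 onward, so the last 20 of the 70 epochs run at the minimum learning rate.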

3.1.1 Neutrons

To quantify the single-neutron performance of the ZDC, we generated single-neutron events with energies in the range of 10 GeV–300 GeV, polar angles in the range $0<\theta<5$ mrad, and azimuthal angles in the range $0<\phi<360^{\circ}$. We trained a model using 750k events as training data and 250k events as validation data, with each event energy sampled from a continuous log-uniform distribution. Mean absolute error loss was used, with equal weights given to energy and $\theta$. For testing, we used a separate set of simulated events at discrete energies between 10 GeV and 300 GeV, with a total of 500k events. While the model was trained with data having up to $\theta=5$ mrad, the test data was limited to $0<\theta<4$ mrad, which is the fiducial acceptance of the ZDC.
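Sampling from a continuous log-uniform distribution means drawing $\ln E$ uniformly, so that each decade in energy is populated equally. A minimal sketch with the Python standard library is shown below; the seed and sample size are arbitrary and the function name is illustrative.

```python
import math
import random

def sample_log_uniform(rng, low, high):
    """Draw one value whose logarithm is uniform between log(low) and log(high)."""
    return math.exp(rng.uniform(math.log(low), math.log(high)))

rng = random.Random(1234)  # fixed seed, chosen arbitrarily for reproducibility
energies_gev = [sample_log_uniform(rng, 10.0, 300.0) for _ in range(10000)]

# For a log-uniform distribution, half of the samples lie below the geometric mean.
geometric_mean = math.sqrt(10.0 * 300.0)
fraction_below = sum(e < geometric_mean for e in energies_gev) / len(energies_gev)
print(f"fraction below {geometric_mean:.1f} GeV: {fraction_below:.3f}")
```

Compared with a uniform distribution in $E$, this choice gives the network many more low-energy training examples, where the non-compensated response is least linear.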

We also explored multiple-neutron events, which are expected in $eA$ collisions, to test the energy linearity of the ZDC. A model was trained to predict the energy of events with varying numbers of neutrons, $N_{n}$. We trained the model on a mixture of the two types of events the ZDC will measure.

The first data set contained 1 million events with a random number of neutrons between 1 and 10, each with an energy in the range 10 GeV–200 GeV, sampled from a continuous log-uniform distribution. The second set had 0.5 million events with a random number of neutrons between 1 and 10, each with an energy sampled from a Gaussian distribution with $\mu=100$ GeV and $\sigma=5$ GeV. The value of 5 GeV represents the expected smearing caused by nuclear effects in $eA$ collisions, as estimated using the BeAgle event generator [7]. Each neutron in these data sets had random $\theta$ and $\phi$, sampled uniformly in the ranges $0<\theta<5$ mrad and $0<\phi<360^{\circ}$, respectively.

The loss function was again the mean absolute error, with $E_{\rm truth}$ being the sum of the generated neutron energies. For the test data, we used 0.5 million events with 1–10 neutrons, each with an energy of exactly 100 GeV, $0<\theta<4$ mrad, and $0<\phi<360^{\circ}$.
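As an illustration of how one event of the Gaussian-smeared training sample and its regression target could be generated, consider the sketch below. It assumes that the number of neutrons is drawn uniformly from 1 to 10 inclusive, which is not stated explicitly above, and it only produces generator-level kinematics; the detector simulation is not part of the sketch.

```python
import random

def sample_multi_neutron_event(rng, mean_gev=100.0, sigma_gev=5.0,
                               max_neutrons=10, theta_max_mrad=5.0):
    """Generate neutron kinematics for one event and the summed-energy target."""
    n_neutrons = rng.randint(1, max_neutrons)  # assumed uniform, inclusive of both ends
    neutrons = [
        {
            "energy_gev": rng.gauss(mean_gev, sigma_gev),
            "theta_mrad": rng.uniform(0.0, theta_max_mrad),
            "phi_deg": rng.uniform(0.0, 360.0),
        }
        for _ in range(n_neutrons)
    ]
    e_truth_gev = sum(neutron["energy_gev"] for neutron in neutrons)
    return neutrons, e_truth_gev

rng = random.Random(2024)  # arbitrary seed for reproducibility
neutrons, e_truth_gev = sample_multi_neutron_event(rng)
print(f"{len(neutrons)} neutrons, E_truth = {e_truth_gev:.1f} GeV")
```

The regression target is the total energy of all neutrons in the event, so the network is never asked to resolve the individual showers.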

The approximate maximum number of neutrons that will hit the ZDC in the “most-central” collisions is about 50 [7]. However, for this work, we focused on the range 1–10, which captures the bulk of the total $eA$ cross-section and is most relevant for studies such as Ref. [43].

3.1.2 $\pi^{0}/\gamma$

To investigate $\pi^{0}/\gamma$ separation, the model was extended to predict $E_{\rm truth}$, $\theta_{\rm truth}$, and the particle type. The modified loss function follows from Ref. [25]:

\begin{equation}
\pazocal{L}=(1-\alpha)\pazocal{L}_{\rm classification}+\alpha\pazocal{L}_{\rm regression},
\end{equation}

where $\pazocal{L}_{\rm classification}$ is the binary cross-entropy loss, $\pazocal{L}_{\rm regression}$ is the mean absolute error loss, and $\alpha$ is a hyperparameter that sets the relative weight of the regression and classification terms. $\alpha$ was set to 0.75 for these studies. Energy and $\theta$ were given equal weights in $\pazocal{L}_{\rm regression}$.
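For a single event, the combined loss of Eq. (1) can be evaluated as in the sketch below. It assumes that the classifier output has already been converted to a probability and that the regression targets are on comparable (e.g., normalized) scales so that an equal-weight mean is meaningful; neither detail is specified above, and the function names are illustrative.

```python
import math

def binary_cross_entropy(probability, label, eps=1e-7):
    """Binary cross-entropy for one event; label is 1 for pi0 and 0 for gamma (assumed)."""
    p = min(max(probability, eps), 1.0 - eps)  # clip to avoid log(0)
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))

def mean_absolute_error(predicted, truth):
    """Mean absolute error with equal weight for each regression target."""
    return sum(abs(p - t) for p, t in zip(predicted, truth)) / len(predicted)

def combined_loss(probability, label, predicted, truth, alpha=0.75):
    """(1 - alpha) * classification loss + alpha * regression loss, as in Eq. (1)."""
    classification = binary_cross_entropy(probability, label)
    regression = mean_absolute_error(predicted, truth)
    return (1.0 - alpha) * classification + alpha * regression

# Toy event: undecided classifier (p = 0.5), energy off by 1.0, theta exact.
loss = combined_loss(0.5, 1, predicted=(1.0, 2.0), truth=(0.0, 2.0))
print(f"combined loss = {loss:.4f}")
```

With $\alpha=0.75$, the regression term carries three times the weight of the classification term.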

The model’s output for particle type classification was converted to the probability of an event’s incident particle being a $\pi^{0}$ using the sigmoid function. Since the sigmoid function returns a continuous value between 0 and 1, we applied a classification cut of 0.3: events with outputs below the cut were classified as $\gamma$, and events with outputs above it were classified as $\pi^{0}$.
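The classification step can be sketched as follows. The treatment of an output exactly equal to the cut is not specified above; the sketch assigns it to the $\gamma$ class as an arbitrary choice, and the function names are illustrative.

```python
import math

CLASSIFICATION_CUT = 0.3  # cut on the pi0 probability quoted in the text

def sigmoid(x):
    """Map a raw network output to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))

def classify(raw_output, cut=CLASSIFICATION_CUT):
    """Return 'pi0' if the pi0 probability exceeds the cut, otherwise 'gamma'."""
    return "pi0" if sigmoid(raw_output) > cut else "gamma"

for raw_output in (-2.0, -0.5, 0.0, 3.0):
    print(f"raw = {raw_output:+.1f}, p(pi0) = {sigmoid(raw_output):.3f} -> {classify(raw_output)}")
```

A cut below 0.5 labels more borderline events as $\pi^{0}$, trading some photon efficiency for better $\pi^{0}$ rejection.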

We used 880k single-$\pi^{0}$ and 880k single-$\gamma$ events for training the model and 300k for each particle type during validation. The model was trained with a random order of $\pi^{0}$ and $\gamma$ events. The data were again generated with a log-uniform distribution of energies and a polar angle range of $0<\theta<4$ mrad. The model was tested using a separate set of simulated events at discrete energies, with 500k events each for $\pi^{0}$ and $\gamma$.

4 Performance

4.1 Single-neutron energy resolution

We show the single-neutron energy resolution and scale for reconstructed neutrons from the strawman and GNN reconstruction algorithms in Fig. 3. We find that while the strawman reconstruction has an offset in the scale of up to $\approx -30\%$, the offset in the GNN reconstruction is nearly zero. The resolutions for the strawman method are close to the $50\%/\sqrt{E}\oplus 5\%$ requirement of the Yellow Report (YR) at low energy, and are below the YR requirement at higher energy. The resolutions from the GNN method are considerably better, especially at lower energies.

We compare the results of our simulation to those of a CALICE beamtest [17], which included software compensation in the reconstruction. We find that the resolutions from our strawman reconstruction are comparable to those of CALICE without software compensation. The resolution in our simulations with the GNN method outperforms that of CALICE with software compensation by about 30%.

We likewise measured the energy resolution for single-photon showers and found it to be $20\%/\sqrt{E}$ with an energy scale at unity for both the strawman reconstruction and the GNN method, as expected.

Figure 3: Energy resolution (top row) and scale (bottom row) obtained with the strawman (open symbols) and GNN (filled symbols) for simulated single neutrons. The resolutions are compared to those of the CALICE beamtest [17] (orange squares) with (filled) and without (open) software compensation.

To determine if there were any edge effects within the fiducial region of the detector, we show the resolution and bias for the strawman and GNN reconstruction at various ranges in the polar angle $\theta$ in Fig. 4. We find that the resolution does not have a strong dependence on the polar angle within this acceptance region.

Figure 4: Energy resolution (top row) and scale (bottom row) obtained with the strawman (left) and GNN (right), at various ranges in $\theta$. The positions on the face of the detector corresponding to each color are visualized in the inset.

4.2 Position resolution

Figure 5 shows the position resolution as a function of the generated energy of simulated neutrons, while Figure 6 presents it in slices of polar angle. This resolution is defined as the sigma of a Gaussian fit to the distributions of the radial position residuals, defined by

\begin{equation}
\Delta r=r_{\rm reco}-r_{\rm truth},
\end{equation}

where $r_{\rm reco}$ is the radial coordinate of the reconstructed position at which the particle struck the face of the detector, and $r_{\rm truth}$ is the radial coordinate of the position of the truth particle track at the detector face. The angular resolution is then the radial position resolution divided by the distance from the nominal interaction point to the front face of the detector.

We compared the results obtained with the staggered layout proposed in this work to those obtained with an unstaggered layout. Further, we compared the resolutions obtained with the “baseline” and HEXPLIT reconstruction algorithms, as described in Ref. [35], and the GNN reconstruction. We find that the best resolution is obtained using the GNN. At high energies, the HEXPLIT and GNN algorithms produce nearly the same resolutions, while at lower energies, HEXPLIT performs worse. The staggered-layer design and GNN reconstruction easily meet the requirements outlined in the EIC Yellow Report [2] and are close to the more stringent requirement set forth in Ref. [3].

At 100 GeV, the angular resolution with the GNN reconstruction is 63 $\mu$rad, which, when added in quadrature with the beam divergence of 56 $\mu$rad in the high-acceptance configuration [3], corresponds to a $p_{T}$ resolution of 8.4 MeV.
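The arithmetic behind these numbers can be checked with the short sketch below. It assumes the small-angle relation $\sigma_{p_T}\approx p\,\sigma_\theta$ and uses the approximate distance of 35 m quoted in the Introduction; the radial resolution of about 2.2 mm is simply what 63 $\mu$rad corresponds to at that distance and is used here only for illustration.

```python
import math

def quadrature_sum(*terms):
    """Combine independent contributions in quadrature."""
    return math.sqrt(sum(term * term for term in terms))

def angular_resolution_urad(radial_resolution_mm, distance_m):
    """Radial position resolution divided by the distance to the detector face (1 mm/m = 1000 urad)."""
    return radial_resolution_mm / distance_m * 1e3

def pt_resolution_mev(momentum_gev, *angular_terms_urad):
    """Small-angle approximation: sigma_pT = p * sigma_theta, returned in MeV."""
    sigma_theta_rad = quadrature_sum(*angular_terms_urad) * 1e-6
    return momentum_gev * 1e3 * sigma_theta_rad

sigma_theta_urad = angular_resolution_urad(2.205, 35.0)  # illustrative inputs, see lead-in
sigma_pt_mev = pt_resolution_mev(100.0, 63.0, 56.0)
print(f"angular resolution: {sigma_theta_urad:.1f} urad")
print(f"pT resolution:      {sigma_pt_mev:.1f} MeV")
```

The quadrature sum of 63 $\mu$rad and 56 $\mu$rad is about 84 $\mu$rad, which for a 100 GeV neutron gives the quoted 8.4 MeV; the detector and the beam divergence therefore contribute at a similar level.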

Figure 5: Position resolution for neutrons as a function of the generated energy. Results are shown with an unstaggered layout (blue) and staggered layout with the positions reconstructed with the baseline (orange), HEXPLIT (green), and GNN (red) reconstruction algorithms.
Figure 6: Top: same as Fig. 5, except in slices of $\theta$ and only for the GNN method. Bottom: bias in position reconstruction for neutrons as a function of the generated energy. The colored regions in the inset provide a visualization of where these slices of $\theta$ intersect the front face of the detector.

We repeated this exercise for single photons and obtained a resolution of $\sigma_{\theta}=0.19/\sqrt{E}\oplus 0.014$ mrad for the baseline algorithm. For reference, the smallest separation between photons produced in $\pi^{0}$ decay allowed by kinematics is 1.0 mrad, about an order of magnitude larger than the single-photon position resolution. We discuss $\pi^{0}$ and $\gamma$ classification in Section 4.4.
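The minimum opening angle of the two photons from a $\pi^{0}$ decay follows from $\sin(\theta_{\min}/2)=m_{\pi^{0}}/E_{\pi^{0}}$, which holds for the symmetric decay. The sketch below evaluates it under the assumption that the quoted 1.0 mrad refers to the maximum beam energy of 275 GeV; this assumption is ours, and the $\pi^{0}$ mass is the PDG value.

```python
import math

PI0_MASS_GEV = 0.1349768  # neutral pion mass (PDG value)

def min_opening_angle_mrad(pi0_energy_gev):
    """Minimum two-photon opening angle, from sin(theta_min / 2) = m / E."""
    return 2.0 * math.asin(PI0_MASS_GEV / pi0_energy_gev) * 1e3

for energy_gev in (50.0, 150.0, 275.0):
    print(f"E = {energy_gev:5.0f} GeV: theta_min = {min_opening_angle_mrad(energy_gev):.2f} mrad")
```

At lower $\pi^{0}$ energies the minimum opening angle grows as $1/E$, so one of the two photons increasingly falls outside the ZDC acceptance.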

4.3 Multiple-neutron events

We evaluate the ZDC performance for multiple-neutron events (illustrated in Fig. 7) that are expected in $eA$ collisions, where the spectator neutrons mostly have the same energy modulo “nuclear effects” such as Fermi motion, short-range correlations, or fission.\footnote{Upon boosting to the laboratory frame, these nuclear effects lead to a Gaussian smearing with a width of about 5 GeV around the beam momentum of 100 GeV per nucleon, as per the BeAgle event generator [7].}

Figure 8 compares the multi-neutron performance of the strawman approach and the GNN for events with an integer number of neutrons, each with an energy of exactly 100 GeV and a random angle within 4 mrad; we excluded the anticipated smearing from nuclear effects in assessing this performance to isolate purely instrumental effects.

The reconstructed energy peaks for each number of neutrons ($N_{n}$) are clearly separated in both approaches. Similar to the single-neutron case, the GNN improves the energy scale and reduces the width of the peaks compared to the strawman reconstruction. This improvement occurs despite the overlapping showers and event complexity illustrated in Fig. 7.

Figure 7: Examples of four reconstructed 3D shower shapes in the ZDC for events with 1 neutron ($N_{n}=1$), 2 neutrons ($N_{n}=2$), 4 neutrons ($N_{n}=4$), and 9 neutrons ($N_{n}=9$). The color code represents hit energy in units of $E_{\mathrm{MIP}}$. The marker size is proportional to the hit energy for display purposes.

The energy resolutions for these multiple-neutron events are worse than the performance extrapolated from single-neutron events (shown in Fig. 3), by about 10–30% in the range studied; for instance, the width for five 100 GeV neutrons is about 8 GeV, whereas extrapolating from single-neutron performance one would estimate it to be about 7 GeV.

The reconstructed peak widths in neutron measurements in $eA$ collisions, similar to Ref. [43], are expected to receive comparable contributions from instrumental effects and nuclear effects when using the strawman reconstruction, and to be dominated by nuclear effects when using the GNN method.

Figure 8: Energy measured for each integer number of neutrons with energy equal to 100 GeV, reconstructed with GNN (orange) and strawman (blue). The events are weighted by a factor of $e^{-N_{n}}$. The black, vertical lines are drawn at the total true energies of events with different numbers of neutrons. Here we show events containing up to nine neutrons. This performance illustrates just instrumental effects, excluding nuclear effects like smearing caused by Fermi motion and fission. In the laboratory frame, these nuclear effects would contribute in quadrature about 5 GeV per 100 GeV neutron, according to BeAgle [7], which happens to be similar to the widths of the strawman reconstruction shown here.

4.4 $\pi^{0}/\gamma$ separation

We quantify the model’s ability to separate $\pi^{0}$ and $\gamma$ using the efficiency, as shown in Fig. 9. This shows the fraction of events that the model classified as photons, i.e., the fraction of events with outputs below the classification cut of 0.3. The GNN consistently classifies photon events with $99\%$ efficiency between 50 and 250 GeV, and correctly classifies $98\%$ of $\pi^{0}$ events above 150 GeV.

At lower energies, the GNN misclassifies up to $20\%$ of $\pi^{0}$ events. Here, the separation between the decay photons is large enough that only one photon deposits energy in the ZDC, making it difficult for the GNN to differentiate these events from single-photon events. As such, the GNN offers similar separation to that from a simple $\sigma$ cut on the shower width, indicating that no substantial improvement can be achieved in $\pi^{0}/\gamma$ separation at low energies. When both photons consistently reach the ZDC at higher energies, the $\sigma$ cut and GNN achieve a $\pi^{0}$ rejection efficiency greater than $97\%$ and $98\%$, respectively.

Figure 9: Efficiency of classifying $\gamma$ (blue) and $\pi^{0}$ (red) as $\gamma$. The $\pi^{0}$ rejection efficiency has a bottleneck (green) at lower energies due to events with only one photon hitting the ZDC. The green curve quantifies the fraction of events where both decay photons hit the ZDC, and it hence shows the theoretical best classification performance.

5 Summary

We have presented a design and simulated the performance of a high-granularity Zero-Degree Calorimeter for the future Electron-Ion Collider. This design is similar to the CALICE AHCAL prototypes and uses iron blocks as absorbers, with scintillator readout using SiPMs. Unlike CALICE designs, it includes a novel staggered layer design based on hexagonal tessellation patterns, aimed at enhancing its position resolution.

We demonstrate the efficacy of this design using a machine-learning regression based on a graph representation, which provides motivation for choosing this design over traditional low-granularity ones. We show that the GNN can effectively handle and exploit the complex pattern of staggered hexagonal cells for software compensation and particle identification. The GNN improves energy reconstruction, enhancing resolution and scalability compared to other approaches for both single-neutron events and multiple-neutron events. It also provides performance for single photon vs. neutral pion classification that is close to optimal given its acceptance.

The current design meets the expectations outlined in the EIC Yellow Report, especially in terms of improving angular resolution. Additionally, its high granularity enhances the rejection of backgrounds such as beam-gas interactions and SiPM noise. This design could also offer fine time resolution, which could be used to improve energy regression or background rejection. We will explore this in future work after prototype beam testing reveals realistic time performance.

This ZDC design will be capable of delivering performance for the majority of the physics program at the EIC, which necessitates measurements of high-energy neutrons, photons, and neutral pions. Additionally, this design could be complemented with a homogeneous crystal calorimeter designed for measuring low-energy $O(10$--$100)$ MeV photons, which would serve the purpose of tagging backgrounds for coherent $eA$ scattering [44], among other applications.

The design and GNN-based reconstruction presented in this work can serve as a blueprint for guiding the design and optimizing performance for other high-granularity calorimeter systems at the EIC. These include the forward hadronic calorimeter [10], and especially its high-granularity insert [31], the barrel electromagnetic and hadronic calorimeters, the few-degree calorimeter [45], among others.

Code Availability

The code for data processing, model training, and plotting results can be found here: https://github.com/eiccodesign/regressiononly/tree/zdc_classification. The data used in these studies can be found in Refs. [46, 47, 48, 49, 50, 51].

Addendum

While this manuscript was in preparation, this ZDC design was incorporated into the baseline design of ePIC, which is the EIC project detector.

Acknowledgments

We thank members of the California EIC consortium for their feedback on our design, especially Oleg Tsai. Additionally, we extend our gratitude to the ePIC collaboration, specifically Alexander Jentsch and Elke-Caroline Aschenauer, for their numerous discussions about ZDC physics and detector design.

We acknowledge support from DOE grant award number DE-SC0022355. We also acknowledge support by the MRPI program of the University of California Office of the President, award number 00010100. S.P. also acknowledges support from the Jefferson Lab EIC Center Fellowship. This research used resources from the LLNL institutional Computing Grand Challenge program and the National Energy Research Scientific Computing Center, a DOE Office of Science User Facility supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC02-05CH11231 using NERSC award HEP-ERCAP0021099. M.A. acknowledges support through DOE Contract No. DE-AC05-06OR23177 under which Jefferson Science Associates, LLC operates the Thomas Jefferson National Accelerator Facility. This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract No. DE-AC52-07NA27344.

Bibliography