\addbibresource{library.bib}

Stochastic Optimisation Framework using the Core Imaging Library and Synergistic Image Reconstruction Framework for PET Reconstruction\footnote{\url{https://agenda.infn.it/event/36860/contributions/230108/}}

Evangelos Papoutsellis$^{1}$, Casper da Costa-Luis$^{2}$, Daniel Deidda$^{3}$, Claire Delplancke$^{4}$, Margaret Duff$^{2}$, Gemma Fardell$^{2}$, Ashley Gillman$^{5}$, Jakob S. Jørgensen$^{6}$, Zeljko Kereta$^{7}$, Evgueni Ovtchinnikov$^{2}$, Edoardo Pasca$^{2}$, Georg Schramm$^{8}$ and Kris Thielemans$^{9}$
10th Conference on PET, SPECT, and MR Multimodal Technologies, Total Body and Fast Timing in Medical Imaging, 20-23 May 2024, Isola d’Elba, Italy
$^{1}$Finden Ltd, Rutherford Appleton Laboratory, Harwell Campus, UK, $^{2}$Scientific Computing Department, Science \& Technology Facilities Council, Harwell Campus, UK, $^{3}$National Physical Laboratory, UK, $^{4}$Électricité de France, Research and Development, $^{5}$Australian e-Health Res. Ctr., CSIRO, Brisbane, Queensland, Australia, $^{6}$Department of Applied Mathematics and Computer Science, Technical University of Denmark, $^{7}$Department of Computer Science, University College London, UK, $^{8}$Department of Imaging and Pathology, Division of Nuclear Medicine, KU Leuven, Leuven, Belgium, $^{9}$Institute of Nuclear Medicine, University College London, UK.

With thanks for discussions and contributions from: Matthias Ehrhardt, Tang Junqi, Laura Murgatroyd, Sam Porter, Imraj Singh and Robert Twyman. E.P. acknowledges funding through the Innovate UK Analysis for Innovators (A4i) program “Denoising of chemical imaging and tomography data” (Project No. 10060435). The development of CIL is supported by CCPi (EPSRC grant EP/T026677/1) and the Ada Lovelace Centre at STFC. The development of SIRF is funded by CCP SyneRBI (EPSRC grant EP/T026693/1). J.S.J. is supported by The Villum Foundation (Grant No. 25893). Z.K. is supported by the UK EPSRC grant EP/X010740/1. C.D. is supported by the “PET++: Improving Localization, Diagnosis and Quantification in Clinical and Medical PET Imaging with Randomized Optimization” grant EP/S026045/1.
Abstract

We introduce a stochastic framework into the open-source Core Imaging Library (CIL) which enables easy development of stochastic algorithms. Five such algorithms from the literature are implemented: Stochastic Gradient Descent (SGD), Stochastic Average Gradient (SAG), SAG-Amélioré (SAGA), Stochastic Variance Reduced Gradient (SVRG) and Loopless SVRG (LSVRG). We showcase the functionality of the framework with a comparative study against a deterministic algorithm on a simulated 2D PET dataset, with the use of the open-source Synergistic Image Reconstruction Framework (SIRF). We observe that stochastic optimisation methods can converge in fewer passes of the data than a standard deterministic algorithm.

Index Terms:
stochastic algorithms, positron emission tomography, image reconstruction, software and quantification

I Introduction

Iterative reconstruction methods have been applied with great success for solving challenging optimisation problems, such as total variation (TV) regularisation. Since iterative methods are computationally demanding due to the increasingly large data sizes, a range of stochastic optimisation algorithms have been proposed in the literature to reduce the computational effort.

In this work, we extend the optimisation functionality of the Core Imaging Library (CIL) [jorgensen2021core, papoutsellis2021core] with a stochastic framework that enables developing a range of stochastic algorithms found in the literature: Stochastic Gradient Descent (SGD), Stochastic Average Gradient (SAG) [schmidt2017minimizing], SAG-Amélioré (SAGA) [defazio2014saga], Stochastic Variance Reduced Gradient (SVRG) [johnson2013accelerating] and Loopless SVRG (LSVRG) [kovalev2020don]. These add to the Stochastic Primal-Dual Hybrid-Gradient (SPDHG) algorithm currently available in CIL, see [papoutsellis2021core] and references therein. We demonstrate the use of this framework on a Positron Emission Tomography (PET) application, made possible by combining CIL with the Synergistic Image Reconstruction Framework (SIRF) [ovtchinnikov2020sirf].

The developed framework allows for an easy comparison between stochastic gradient estimators. In this summary, we observe that stochastic optimisation methods can converge in fewer passes of the data than a deterministic benchmark.

Figure 1: Simulated 2D $^{18}$F-FDG PET thorax dataset, reference solution $x^{*}$ and error plots for FISTA, Prox-SGD, Prox-LSVRG and SPDHG at 100 data passes.

II Stochastic Framework

We consider optimisation problems of the form

\begin{equation}
x^{*} = \operatorname*{arg\,min}_{x\in\mathbb{X}} \{F(x) := f(x) + g(x)\} \equiv \sum_{i=1}^{n} f_{i}(x) + g(x),
\tag{1}
\end{equation}

where $\mathbb{X}$ is a finite dimensional space. Functions $f_{i}, f, g : \mathbb{X} \rightarrow \mathbb{R}$ for $i \in \{1,\dots,n\}$, are proper and convex, where $f_{i}$ are $L$-smooth and represent the fitness to the data. Regulariser $g$ has a proximal operator which either has a closed-form representation or can be efficiently solved up to some precision.

Proximal gradient descent [beck2009fast] (also known as ISTA, or forward-backward splitting) is a classical deterministic algorithm to solve (1) by the iterations

\begin{equation}
x_{k+1} = \text{prox}_{\gamma_{k} g}\left(x_{k} - \gamma_{k} \nabla f(x_{k})\right), \quad k = 0, 1, 2, \dots
\tag{2}
\end{equation}

for step size $\gamma_{k}$ and initial guess $x_{0}$. When $g \equiv 0$ this reduces to gradient descent. Instead of computing the full gradient $\nabla f(x_{k})$ in each iteration, stochastic optimisation algorithms employ an estimator $\tilde{\nabla} f(x_{k})$, typically using the information of only one randomly selected function $f_{i}$.
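To make iteration (2) and the role of the estimator concrete, the following is a minimal pure-Python sketch, not the CIL implementation. It assumes a toy least-squares problem with $f_{i}(x) = \tfrac{1}{2}(\langle a_{i}, x\rangle - b_{i})^{2}$ and $g = \alpha\|\cdot\|_{1}$, whose proximal operator is soft-thresholding. All names (`soft_threshold`, `grad_fi`, `sgd_gradient`, `proximal_gradient`) and the toy data are hypothetical. Because $f$ in (1) is a sum rather than an average, the single-function gradient is scaled by $n$ under uniform sampling.

```python
import random


def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1: shrink each entry towards zero by t."""
    out = []
    for vi in v:
        shrunk = max(abs(vi) - t, 0.0)
        out.append(shrunk if vi >= 0 else -shrunk)
    return out


def grad_fi(x, a_i, b_i):
    """Gradient of the toy term f_i(x) = 0.5 * (<a_i, x> - b_i)^2."""
    residual = sum(a * xj for a, xj in zip(a_i, x)) - b_i
    return [residual * a for a in a_i]


def full_gradient(x, A, b):
    """Gradient of f = sum_i f_i (one full pass over the data)."""
    g = [0.0] * len(x)
    for a_i, b_i in zip(A, b):
        g = [gj + dj for gj, dj in zip(g, grad_fi(x, a_i, b_i))]
    return g


def sgd_gradient(x, A, b, rng):
    """SGD estimator: n * grad f_i(x) for one uniformly drawn index i."""
    n = len(A)
    i = rng.randrange(n)
    return [n * dj for dj in grad_fi(x, A[i], b[i])]


def proximal_gradient(gradient, x0, step, alpha, iterations):
    """Iteration (2) with a constant step and any gradient (or estimator)."""
    x = list(x0)
    for _ in range(iterations):
        g = gradient(x)
        x = soft_threshold([xj - step * gj for xj, gj in zip(x, g)], step * alpha)
    return x


# Toy data (assumed for illustration only).
A = [[1.0, 0.0], [0.0, 1.0]]
b = [3.0, 0.05]
alpha = 0.1

# Deterministic ISTA: full gradient in every iteration.
x_ista = proximal_gradient(lambda x: full_gradient(x, A, b), [0.0, 0.0], 0.5, alpha, 100)

# Prox-SGD: same loop, with the full gradient swapped for the estimator.
rng = random.Random(0)
x_sgd = proximal_gradient(lambda x: sgd_gradient(x, A, b, rng), [0.0, 0.0], 0.1, alpha, 100)
```

The only difference between the two runs is the callable passed as `gradient`, which is the plug-and-play idea in miniature. With a constant step, the Prox-SGD iterate generally fluctuates around the minimiser rather than converging to it exactly.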

The stochastic framework in CIL consists of four components that can be combined in a plug-and-play fashion: i) functions providing stochastic estimators for the gradient of $f$, ii) sampling methods which take in a set of probabilities $p_{i}$ for choosing each of the functions $f_{i}$, iii) a partitioner to split up the data, defining the $f_{i}$'s, and iv) algorithms to solve (1).
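As an illustration of the sampling component, the sketch below draws indices with replacement according to given probabilities $p_{i}$. It is a hypothetical helper written with the Python standard library, not the CIL sampler API. The `weighted_gradient` function shows the standard importance weighting $\nabla f_{i}(x)/p_{i}$, which keeps the estimator unbiased for $f = \sum_{i} f_{i}$ when the $p_{i}$ are not uniform; whether and how CIL applies such a weighting is not stated here.

```python
import random


class RandomSampler:
    """Draw indices 0..n-1 with replacement, index i having probability p_i.

    Hypothetical helper for illustration; not the CIL API.
    """

    def __init__(self, probabilities, seed=None):
        if any(p <= 0 for p in probabilities):
            raise ValueError("all probabilities must be positive")
        if abs(sum(probabilities) - 1.0) > 1e-9:
            raise ValueError("probabilities must sum to one")
        self.probabilities = list(probabilities)
        self._indices = list(range(len(probabilities)))
        self._rng = random.Random(seed)

    def draw(self):
        """Return the next randomly selected index."""
        return self._rng.choices(self._indices, weights=self.probabilities, k=1)[0]


def weighted_gradient(grad_i, p_i):
    """Importance-weighted gradient: grad f_i(x) / p_i."""
    return [g / p_i for g in grad_i]


sampler = RandomSampler([0.7, 0.2, 0.1], seed=1)
draws = [sampler.draw() for _ in range(10000)]
```

With uniform probabilities $p_{i} = 1/n$, the weighting reduces to multiplying the selected gradient by $n$.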

Thus far we have implemented five functions that provide stochastic estimators for the gradient of $f$. When these are used in combination with GD or ISTA algorithms of CIL, they correspond to SGD (Prox-SGD), SAG (Prox-SAG) [schmidt2017minimizing], SAGA (Prox-SAGA) [defazio2014saga], SVRG (Prox-SVRG) [johnson2013accelerating] and LSVRG (Prox-LSVRG) [kovalev2020don]. Due to our flexible design, the above stochastic estimators can also be combined with Nesterov-type accelerated algorithms, e.g., FISTA [beck2009fast], see [Driggs2020] for more details.
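For concreteness, one common way to write these estimators for the sum-form objective in (1) is given below. This is a sketch under assumptions not stated in the text: the index $i_{k}$ is drawn uniformly from $\{1,\dots,n\}$, $v_{j}$ denotes the stored gradient of $f_{j}$ from the last iterate at which it was evaluated, and $\tilde{x}$ is a snapshot point. The scaling conventions used in the CIL implementation may differ.

\begin{align*}
\text{SGD:} &\quad \tilde{\nabla} f(x_{k}) = n\,\nabla f_{i_{k}}(x_{k}), \\
\text{SAG:} &\quad \tilde{\nabla} f(x_{k}) = \nabla f_{i_{k}}(x_{k}) - v_{i_{k}} + \sum_{j=1}^{n} v_{j}, \\
\text{SAGA:} &\quad \tilde{\nabla} f(x_{k}) = n\left(\nabla f_{i_{k}}(x_{k}) - v_{i_{k}}\right) + \sum_{j=1}^{n} v_{j}, \\
\text{(L)SVRG:} &\quad \tilde{\nabla} f(x_{k}) = n\left(\nabla f_{i_{k}}(x_{k}) - \nabla f_{i_{k}}(\tilde{x})\right) + \nabla f(\tilde{x}).
\end{align*}

After each SAG or SAGA step the stored gradient is refreshed, $v_{i_{k}} \leftarrow \nabla f_{i_{k}}(x_{k})$. SVRG replaces the snapshot $\tilde{x}$ by the current iterate after a fixed number of iterations (the update frequency), whereas LSVRG does so at each iteration with a fixed probability. Under uniform sampling the SGD, SAGA and SVRG estimators are unbiased, while the SAG estimator is biased.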

III Methodology and Results

For the numerical study, we use a simulated 2D $^{18}$F-FDG PET dataset from SIRF\footnote{\url{https://github.com/SyneRBI/SIRF_data/tree/master/examples/PET/thorax_single_slice}}. Simulated Poisson noise is applied to the acquisition data. The data is partitioned into 32 subsets with equidistant projection views. A Kullback-Leibler data-fitting term is used for the $f_{i}$ and $f$. TV with a non-negativity constraint is used for the regulariser, $g = \alpha\,\text{TV}$, where $\alpha = 0.1$. Results are shown in figure 1.
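One way to read "subsets with equidistant projection views" is a staggered partition, in which subset $s$ contains views $s, s + n, s + 2n, \dots$ for $n$ subsets. The sketch below implements that reading in plain Python. The function name `partition_views` and the view counts are illustrative assumptions; this is not the CIL partitioner, and the number of views in the thorax dataset is not assumed.

```python
def partition_views(num_views, num_subsets):
    """Split view indices 0..num_views-1 into staggered (equally spaced) subsets."""
    if not 1 <= num_subsets <= num_views:
        raise ValueError("need 1 <= num_subsets <= num_views")
    # Subset s takes every num_subsets-th view, starting from view s.
    return [list(range(start, num_views, num_subsets)) for start in range(num_subsets)]


# Small illustrative example: 8 views split into 4 subsets.
subsets = partition_views(num_views=8, num_subsets=4)
```

Each subset of acquisition data then defines one function $f_{i}$, so that every $f_{i}$ sees views spread over the full angular range.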

The optimal reconstruction $x^{*}$ was obtained using 500 data passes of SPDHG. All the algorithms are warm-started with one data pass of Prox-SGD. Functions $f_{i}$ are selected randomly with replacement. Algorithmic parameters such as step size, update frequency for Prox-SVRG, and probability for Prox-LSVRG were optimised using a parameter search. In figures 2 and 3, we compare the different stochastic algorithms in CIL and their performance with respect to “data passes”, i.e. how many times the algorithm has processed all the acquisition data in expectation. In this experiment, all the stochastic algorithms considered approach the optimal solution in fewer data passes than the deterministic FISTA.
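The notion of a data pass can be made explicit with a simple cost model. The sketch below is one possible accounting, under assumptions that go beyond the text: a subset gradient costs $1/n$ of a pass, a full gradient costs one pass, stored gradients (SAG, SAGA, and the snapshot gradients of SVRG and LSVRG) cost nothing to reuse, and initialisation is ignored. The function name and arguments are hypothetical, and the accounting used for figures 2 and 3 may differ.

```python
def data_passes_per_iteration(method, num_subsets, update_frequency=None, probability=None):
    """Expected data passes consumed by one iteration, under the stated cost model."""
    subset_cost = 1.0 / num_subsets
    if method == "full":
        # Deterministic methods evaluate the full gradient every iteration.
        return 1.0
    if method in ("sgd", "sag", "saga"):
        # One subset gradient per iteration.
        return subset_cost
    if method == "svrg":
        # One subset gradient, plus a full gradient every update_frequency iterations.
        return subset_cost + 1.0 / update_frequency
    if method == "lsvrg":
        # One subset gradient, plus a full gradient with the given probability.
        return subset_cost + probability
    raise ValueError("unknown method: " + method)


# Number of Prox-SGD iterations matching 100 data passes with 32 subsets.
sgd_iterations = round(100 / data_passes_per_iteration("sgd", 32))
```

Under this model, 100 data passes correspond to 100 iterations of a deterministic method but 3200 iterations of Prox-SGD with 32 subsets.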

Figure 2: Distance from the optimal solution $x^{*}$ with respect to “data passes”.

Figure 3: Distance from the optimal objective value $F(x^{*})$ with respect to “data passes”.

IV Discussion and Future Work

This contribution describes an open-source framework that enables investigating a large variety of optimisation algorithms in many different contexts, including CT, MR, PET and SPECT image reconstruction. Presented results for PET show that stochastic algorithms in CIL can converge in fewer data passes than a deterministic counterpart. However, comparing the stochastic algorithms fairly will require a thorough investigation of algorithmic parameters, such as step size regimes. We leave this comparison for future work.

In addition, work is in progress to validate the stochastic framework empirically on real PET data, and to extend it to a wider array of stochastic algorithms, to other imaging modalities, and to additional methodologies such as acceleration and pre-conditioning.

\printbibliography