-
An Application of Model Reference Adaptive Control for Multi-Agent Synchronization in Drone Networks
Authors:
Miguel F. Arevalo-Castiblanco,
Yeping Wi,
Marzia Cescon,
Cesar A. Uribe
Abstract:
This paper presents the application of a Distributed Model Reference Adaptive Control (DMRAC) strategy for robust multi-agent synchronization of a network of drones. The proposed approach enables the development of controllers capable of accommodating differences in real-life model parameters between agents, thereby enhancing overall network performance. We compare the performance of the adaptive control laws with classical PID controllers for the reference tracking task. Each follower drone has a model reference adaptive controller that continuously updates its parameters based on real-time feedback and reference model information. This adaptability ensures an adequate performance that, compared to conventional non-adaptive techniques, can reduce the amount of energy required and consequently increase the operating duration of the drones. The experimental results, particularly in vertical velocity control, underscore the effectiveness of the proposed approach in achieving synchronized behavior.
Submitted 29 June, 2024;
originally announced July 2024.
-
LLM Critics Help Catch LLM Bugs
Authors:
Nat McAleese,
Rai Michael Pokorny,
Juan Felipe Ceron Uribe,
Evgenia Nitishinskaya,
Maja Trebacz,
Jan Leike
Abstract:
Reinforcement learning from human feedback (RLHF) is fundamentally limited by the capacity of humans to correctly evaluate model output. To improve human evaluation ability and overcome that limitation this work trains "critic" models that help humans to more accurately evaluate model-written code. These critics are themselves LLMs trained with RLHF to write natural language feedback highlighting problems in code from real-world assistant tasks. On code containing naturally occurring LLM errors model-written critiques are preferred over human critiques in 63% of cases, and human evaluation finds that models catch more bugs than human contractors paid for code review. We further confirm that our fine-tuned LLM critics can successfully identify hundreds of errors in ChatGPT training data rated as "flawless", even though the majority of those tasks are non-code tasks and thus out-of-distribution for the critic model. Critics can have limitations of their own, including hallucinated bugs that could mislead humans into making mistakes they might have otherwise avoided, but human-machine teams of critics and contractors catch similar numbers of bugs to LLM critics while hallucinating less than LLMs alone.
Submitted 28 June, 2024;
originally announced July 2024.
-
Efficient Path Planning with Soft Homology Constraints
Authors:
Carlos A. Taveras,
Santiago Segarra,
César A. Uribe
Abstract:
We study the problem of path planning with soft homology constraints on a surface topologically equivalent to a disk with punctures. Specifically, we propose an algorithm, named $H^*$, for the efficient computation of a path homologous to a user-provided reference path. We show that the algorithm can generate a suite of paths in distinct homology classes, from the overall shortest path to the shortest path homologous to the reference path, ordered both by path length and similarity to the reference path. Rollout is shown to improve the results produced by the algorithm. Experiments demonstrate that $H^*$ can be an efficient alternative to optimal methods, especially for configuration spaces with many obstacles.
Submitted 27 June, 2024;
originally announced June 2024.
-
An Optimal Transport Approach for Network Regression
Authors:
Alex G. Zalles,
Kai M. Hung,
Ann E. Finneran,
Lydia Beaudrot,
César A. Uribe
Abstract:
We study the problem of network regression, where one is interested in how the topology of a network changes as a function of Euclidean covariates. We build upon recent developments in generalized regression models on metric spaces based on Fréchet means and propose a network regression method using the Wasserstein metric. We show that when representing graphs as multivariate Gaussian distributions, the network regression problem requires the computation of a Riemannian center of mass (i.e., a Fréchet mean). Computing a Fréchet mean with non-negative weights translates into a barycenter problem, which can be solved efficiently using fixed-point iterations. Although the convergence guarantees of fixed-point iterations for the computation of Wasserstein affine averages remain an open problem, we provide evidence of convergence in a large number of synthetic and real-data scenarios. Extensive numerical results show that the proposed approach improves on existing procedures by accurately accounting for graph size, topology, and sparsity in synthetic experiments. Additionally, real-world experiments using the proposed approach result in higher Coefficient of Determination ($R^{2}$) values and lower mean squared prediction error (MSPE), cementing improved prediction capabilities in practice.
Submitted 17 June, 2024;
originally announced June 2024.
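Note on the fixed-point iteration mentioned in the abstract above: the following is a minimal sketch, not the authors' code, of the classical fixed-point update for the covariance of a Wasserstein barycenter of zero-mean Gaussians with non-negative weights. The function names `psd_sqrt` and `gaussian_barycenter_cov`, the zero-mean assumption, the starting point, and the toy covariances are all assumptions made for illustration; the affine (possibly negative-weight) case discussed in the abstract is not covered.

```python
import numpy as np


def psd_sqrt(mat):
    """Symmetric square root of a positive semidefinite matrix."""
    vals, vecs = np.linalg.eigh(mat)
    vals = np.clip(vals, 0.0, None)  # guard against tiny negative round-off
    return (vecs * np.sqrt(vals)) @ vecs.T


def gaussian_barycenter_cov(covs, weights, iters=50):
    """Fixed-point iteration S <- S^{-1/2} (sum_i w_i (S^{1/2} C_i S^{1/2})^{1/2})^2 S^{-1/2}.

    Assumes non-negative weights summing to one and positive definite inputs.
    """
    S = np.mean(covs, axis=0)  # hypothetical starting point: arithmetic mean
    for _ in range(iters):
        root = psd_sqrt(S)
        inv_root = np.linalg.inv(root)
        mid = sum(w * psd_sqrt(root @ C @ root) for w, C in zip(weights, covs))
        S = inv_root @ mid @ mid @ inv_root
    return S


# Toy example with commuting (diagonal) covariances, where the barycenter
# has the closed form (sum_i w_i C_i^{1/2})^2.
covs = [np.diag([1.0, 4.0]), np.diag([4.0, 9.0])]
weights = [0.5, 0.5]
bary = gaussian_barycenter_cov(covs, weights)
```

For commuting covariances the iteration reaches the closed-form answer after a single update, which makes this toy case a convenient sanity check before trying general inputs.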
-
RT-utils: A Minimal Python Library for RT-struct Manipulation
Authors:
Asim Shrestha,
Adam Watkins,
Fereshteh Yousefirizi,
Arman Rahmim,
Carlos F. Uribe
Abstract:
Towards the need for automated and precise AI-based analysis of medical images, we present RT-utils, a specialized Python library tuned for the manipulation of radiotherapy (RT) structures stored in DICOM format. RT-utils excels in converting the polygon contours into binary masks, ensuring accuracy and efficiency. By converting DICOM RT structures into standardized formats such as NumPy arrays and SimpleITK Images, RT-utils optimizes inputs for computational solutions such as AI-based automated segmentation techniques or radiomics analysis. Since its inception in 2020, RT-utils has been used extensively with a focus on simplifying complex data processing tasks. RT-utils offers researchers a powerful solution to enhance workflows and drive significant advancements in medical imaging.
Submitted 9 May, 2024;
originally announced May 2024.
-
Novel Method to Estimate Kinetic Microparameters from Dynamic Whole-Body Imaging in Regular-Axial Field-of-View PET Scanners
Authors:
Kyung-Nam Lee,
Arman Rahmim,
Carlos Uribe
Abstract:
For whole-body (WB) kinetic modeling based on a typical PET scanner, a multi-pass multi-bed scanning protocol is necessary given the limited axial field-of-view. Such a protocol introduces loss of early dynamics in time-activity curves (TACs) and sparsity in TAC measurements, inducing uncertainty in parameter estimation when using least-squares estimation (LSE) (i.e., the common standard), especially for kinetic microparameters. We present a method to reliably estimate microparameters, enabling accurate parametric imaging, on regular-axial field-of-view PET scanners. Our method, denoted parameter combination-driven estimation (PCDE), relies on generating a reference-truth TAC database and subsequently selecting the best parameter combination as the one yielding the TAC with the highest total similarity score (TSS), focusing on general image quality, overall visibility, and tumor detectability metrics. Our technique has two distinctive characteristics: 1) improved probability of having a one-to-one mapping between early and late dynamics in TACs (the former missing from typical protocols), and 2) use of multiple aspects of TACs in the selection of best fits. To evaluate our method against conventional LSE, we plotted tradeoff curves for noise and bias. In addition, the overall SNR and spatial noise were calculated and compared. Furthermore, CNR and TBR were also calculated. We also tested our proposed method on patient data (18F-DCFPyL PET scans) to further verify clinical applicability. Significantly improved general image quality performance was verified in microparametric images (e.g., noise-bias performance). The overall visibility and tumor detectability were also improved. Finally, for our patient studies, improved overall visibility and tumor detectability were demonstrated in microparametric images, compared to the use of conventional parameter estimation.
Submitted 13 May, 2024; v1 submitted 15 April, 2024;
originally announced April 2024.
-
A Moreau Envelope Approach for LQR Meta-Policy Estimation
Authors:
Ashwin Aravind,
Mohammad Taha Toghani,
César A. Uribe
Abstract:
We study the problem of policy estimation for the Linear Quadratic Regulator (LQR) in discrete-time linear time-invariant uncertain dynamical systems. We propose a Moreau Envelope-based surrogate LQR cost, built from a finite set of realizations of the uncertain system, to define a meta-policy efficiently adjustable to new realizations. Moreover, we design an algorithm to find an approximate first-order stationary point of the meta-LQR cost function. Numerical results show that the proposed approach outperforms naive averaging of controllers on new realizations of the linear system. We also provide empirical evidence that our method has better sample complexity than Model-Agnostic Meta-Learning (MAML) approaches.
Submitted 26 March, 2024;
originally announced March 2024.
-
A slice classification neural network for automated classification of axial PET/CT slices from a multi-centric lymphoma dataset
Authors:
Shadab Ahamed,
Yixi Xu,
Ingrid Bloise,
Joo H. O,
Carlos F. Uribe,
Rahul Dodhia,
Juan L. Ferres,
Arman Rahmim
Abstract:
Automated slice classification is clinically relevant since it can be incorporated into medical image segmentation workflows as a preprocessing step that would flag slices with a higher probability of containing tumors, thereby directing physicians' attention to the important slices. In this work, we train a ResNet-18 network to classify axial slices of lymphoma PET/CT images (collected from two institutions) depending on whether the slice intercepted a tumor (positive slice) in the 3D image or if the slice did not (negative slice). Various instances of the network were trained on 2D axial datasets created in different ways: (i) slice-level split and (ii) patient-level split; inputs of different types were used: (i) only PET slices and (ii) concatenated PET and CT slices; and different training strategies were employed: (i) center-aware (CAW) and (ii) center-agnostic (CAG). Model performances were compared using the area under the receiver operating characteristic curve (AUROC), the area under the precision-recall curve (AUPRC), and various binary classification metrics. We observe and describe a performance overestimation in the case of slice-level split as compared to the patient-level split training. The model trained using patient-level split data with the network input containing only PET slices in the CAG training regime was the best performing/generalizing model on a majority of metrics. Our models were additionally compared more closely using the sensitivity metric on the positive slices from their respective test sets.
Submitted 11 March, 2024;
originally announced March 2024.
-
A cascaded deep network for automated tumor detection and segmentation in clinical PET imaging of diffuse large B-cell lymphoma
Authors:
Shadab Ahamed,
Natalia Dubljevic,
Ingrid Bloise,
Claire Gowdy,
Patrick Martineau,
Don Wilson,
Carlos F. Uribe,
Arman Rahmim,
Fereshteh Yousefirizi
Abstract:
Accurate detection and segmentation of diffuse large B-cell lymphoma (DLBCL) from PET images has important implications for estimation of total metabolic tumor volume, radiomics analysis, surgical intervention and radiotherapy. Manual segmentation of tumors in whole-body PET images is time-consuming, labor-intensive and operator-dependent. In this work, we develop and validate a fast and efficient three-step cascaded deep learning model for automated detection and segmentation of DLBCL tumors from PET images. As compared to a single end-to-end network for segmentation of tumors in whole-body PET images, our three-step model is more effective (improves 3D Dice score from 58.9% to 78.1%) since each of its specialized modules, namely the slice classifier, the tumor detector and the tumor segmentor, can be trained independently to a high degree of skill to carry out a specific task, rather than a single network with suboptimal performance on overall segmentation.
Submitted 11 March, 2024;
originally announced March 2024.
-
Decentralized and Equitable Optimal Transport
Authors:
Ivan Lau,
Shiqian Ma,
César A. Uribe
Abstract:
This paper considers the decentralized (discrete) optimal transport (D-OT) problem. In this setting, a network of agents seeks to design a transportation plan jointly, where the cost function is the sum of privately held costs for each agent. We reformulate the D-OT problem as a constraint-coupled optimization problem and propose a single-loop decentralized algorithm with an iteration complexity of O(1/ε) that matches existing centralized first-order approaches. Moreover, we propose the decentralized equitable optimal transport (DE-OT) problem. In DE-OT, in addition to cooperatively designing a transportation plan that minimizes transportation costs, agents seek to ensure equity in their individual costs. The iteration complexity of the proposed method to solve DE-OT is also O(1/ε). This rate improves existing centralized algorithms, where the best iteration complexity obtained is O(1/ε^2).
Submitted 12 March, 2024; v1 submitted 7 March, 2024;
originally announced March 2024.
-
PIDformer: Transformer Meets Control Theory
Authors:
Tam Nguyen,
César A. Uribe,
Tan M. Nguyen,
Richard G. Baraniuk
Abstract:
In this work, we address two main shortcomings of transformer architectures: input corruption and rank collapse in their output representation. We unveil self-attention as an autonomous state-space model that inherently promotes smoothness in its solutions, leading to lower-rank outputs and diminished representation capacity. Moreover, the steady-state solution of the model is sensitive to input perturbations. We incorporate a Proportional-Integral-Derivative (PID) closed-loop feedback control system with a reference point into the model to improve robustness and representation capacity. This integration aims to preserve high-frequency details while bolstering model stability, rendering it more noise-resilient. The resulting controlled state-space model is theoretically proven robust and adept at addressing the rank collapse. Motivated by this control framework, we derive a novel class of transformers, PID-controlled Transformer (PIDformer), aimed at improving robustness and mitigating the rank-collapse issue inherent in softmax transformers. We empirically evaluate the model for advantages and robustness against baseline transformers across various practical tasks, including object classification, image segmentation, and language modeling.
Submitted 25 February, 2024;
originally announced February 2024.
-
Sparse factorization of the square all-ones matrix of arbitrary order
Authors:
Xin Jiang,
Edward Duc Hien Nguyen,
César A. Uribe,
Bicheng Ying
Abstract:
In this paper, we study sparse factorization of the (scaled) square all-ones matrix $J$ of arbitrary order. We introduce the concept of hierarchically banded matrices and propose two types of hierarchically banded factorization of $J$: the reduced hierarchically banded (RHB) factorization and the doubly stochastic hierarchically banded (DSHB) factorization. Based on the DSHB factorization, we propose the sequential doubly stochastic (SDS) factorization, in which $J$ is decomposed as a product of sparse, doubly stochastic matrices. Finally, we discuss the application of the proposed sparse factorizations to the decentralized average consensus problem and decentralized optimization.
Submitted 25 January, 2024;
originally announced January 2024.
-
Investigating heterogeneous PSMA ligand uptake inside parotid glands
Authors:
Caleb Sample,
Carlos Uribe,
Arman Rahmim,
François Bénard,
Jonn Wu,
Haley Clark
Abstract:
The purpose was to investigate the spatial heterogeneity of prostate-specific membrane antigen (PSMA) positron emission tomography (PET) uptake within parotid glands. We aim to quantify patterns in well-defined regions to facilitate further investigations. Furthermore, we investigate whether uptake is correlated with computed tomography (CT) texture features. Parotid glands from [18F]DCFPyL PSMA PET/CT images of 30 prostate cancer patients were analyzed. Thresholding was used to define high-uptake regions, and uptake statistics were computed within various divisions. Spearman's rank correlation coefficient was calculated between PSMA PET uptake and the Grey Level Run Length Matrix (GLRLM) using a long and short run length emphasis (GLRLML and GLRLMS) in subregions of parotid glands. PSMA PET uptake was significantly higher (p < 0.001) in lateral/posterior regions of the glands than anterior/medial regions. Maximum uptake was found in the lateral half of parotid glands in 50 out of 60 glands. The difference in SUV between parotid halves is greatest when parotids are divided by a plane separating the anterior/medial and posterior/lateral halves symmetrically. PSMA PET uptake was significantly correlated with CT GLRLML (p < 0.001), and anti-correlated with CT GLRLMS (p < 0.001). Uptake of PSMA PET is heterogeneous within parotid glands, with uptake biased towards lateral and posterior regions. Uptake patterns within parotid glands were found to be strongly correlated with CT texture features, suggesting the possible future use of CT texture features as a proxy for inferring PSMA PET uptake in salivary glands.
Submitted 4 January, 2024;
originally announced January 2024.
-
Improving Denoising Diffusion Probabilistic Models via Exploiting Shared Representations
Authors:
Delaram Pirhayatifard,
Mohammad Taha Toghani,
Guha Balakrishnan,
César A. Uribe
Abstract:
In this work, we address the challenge of multi-task image generation with limited data for denoising diffusion probabilistic models (DDPM), a class of generative models that produce high-quality images by reversing a noisy diffusion process. We propose a novel method, SR-DDPM, that leverages representation-based techniques from few-shot learning to effectively learn from fewer samples across different tasks. Our method consists of a core meta architecture with shared parameters, i.e., task-specific layers with exclusive parameters. By exploiting the similarity between diverse data distributions, our method can scale to multiple tasks without compromising the image quality. We evaluate our method on standard image datasets and show that it outperforms both unconditional and conditional DDPM in terms of FID and SSIM metrics.
Submitted 27 November, 2023;
originally announced November 2023.
-
Observer study-based evaluation of TGAN architecture used to generate oncological PET images
Authors:
Roberto Fedrigo,
Fereshteh Yousefirizi,
Ziping Liu,
Abhinav K. Jha,
Robert V. Bergen,
Jean-Francois Rajotte,
Raymond T. Ng,
Ingrid Bloise,
Sara Harsini,
Dan J. Kadrmas,
Carlos Uribe,
Arman Rahmim
Abstract:
The application of computer-vision algorithms in medical imaging has increased rapidly in recent years. However, algorithm training is challenging due to limited sample sizes, lack of labeled samples, as well as privacy concerns regarding data sharing. To address these issues, we previously developed (Bergen et al. 2022) a synthetic PET dataset for Head and Neck (H&N) cancer using the temporal generative adversarial network (TGAN) architecture and evaluated its performance segmenting lesions and identifying radiomics features in synthesized images. In this work, a two-alternative forced-choice (2AFC) observer study was performed to quantitatively evaluate the ability of human observers to distinguish between real and synthesized oncological PET images. In the study, eight trained readers, including two board-certified nuclear medicine physicians, read 170 real/synthetic image pairs presented as 2D transaxial images using a dedicated web app. For each image pair, the observer was asked to identify the real image and input their confidence level with a 5-point Likert scale. P-values were computed using the binomial test and Wilcoxon signed-rank test. A heat map was used to compare the response accuracy distribution for the signed-rank test. Response accuracy for all observers ranged from 36.2% [27.9-44.4] to 63.1% [54.8-71.3]. Six out of eight observers did not identify the real image with statistical significance, indicating that the synthetic dataset was reasonably representative of oncological PET images. Overall, this study adds validity to the realism of our simulated H&N cancer dataset, which may be implemented in the future to train AI algorithms while favoring patient confidentiality and privacy protection.
Submitted 27 November, 2023; v1 submitted 27 November, 2023;
originally announced November 2023.
-
Comprehensive Evaluation and Insights into the Use of Deep Neural Networks to Detect and Quantify Lymphoma Lesions in PET/CT Images
Authors:
Shadab Ahamed,
Yixi Xu,
Claire Gowdy,
Joo H. O,
Ingrid Bloise,
Don Wilson,
Patrick Martineau,
François Bénard,
Fereshteh Yousefirizi,
Rahul Dodhia,
Juan M. Lavista,
William B. Weeks,
Carlos F. Uribe,
Arman Rahmim
Abstract:
This study performs a comprehensive evaluation of four neural network architectures (UNet, SegResNet, DynUNet, and SwinUNETR) for lymphoma lesion segmentation from PET/CT images. These networks were trained, validated, and tested on a diverse, multi-institutional dataset of 611 cases. Internal testing (88 cases; total metabolic tumor volume (TMTV) range [0.52, 2300] ml) showed SegResNet as the top performer with a median Dice similarity coefficient (DSC) of 0.76 and median false positive volume (FPV) of 4.55 ml; all networks had a median false negative volume (FNV) of 0 ml. On the unseen external test set (145 cases with TMTV range: [0.10, 2480] ml), SegResNet achieved the best median DSC of 0.68 and FPV of 21.46 ml, while UNet had the best FNV of 0.41 ml. We assessed reproducibility of six lesion measures, calculated their prediction errors, and examined DSC performance in relation to these lesion measures, offering insights into segmentation accuracy and clinical relevance. Additionally, we introduced three lesion detection criteria, addressing the clinical need for identifying lesions, counting them, and segmenting based on metabolic characteristics. We also performed expert intra-observer variability analysis revealing the challenges in segmenting "easy" vs. "hard" cases, to assist in the development of more resilient segmentation algorithms. Finally, we performed inter-observer agreement assessment underscoring the importance of a standardized ground truth segmentation protocol involving multiple expert annotators. Code is available at: https://github.com/microsoft/lymphoma-segmentation-dnn
Submitted 16 November, 2023;
originally announced November 2023.
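Note on the Dice similarity coefficient (DSC) reported in the abstract above: a minimal sketch of the standard definition, DSC = 2|A ∩ B| / (|A| + |B|), for two binary masks. This is not taken from the linked repository; the function name `dice_coefficient`, the toy masks, and the convention of returning 1.0 when both masks are empty are assumptions made for illustration.

```python
import numpy as np


def dice_coefficient(pred, truth):
    """Dice similarity coefficient between two binary masks of equal shape."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    denom = pred.sum() + truth.sum()
    if denom == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    overlap = np.logical_and(pred, truth).sum()
    return 2.0 * overlap / denom


# Toy 2x3 masks: each has 3 foreground voxels and they share 2 of them.
pred = np.array([[1, 1, 0], [0, 1, 0]])
truth = np.array([[1, 0, 0], [0, 1, 1]])
score = dice_coefficient(pred, truth)  # 2 * 2 / (3 + 3)
```

The same function works unchanged on 3D volumes, since it only counts voxels; false positive and false negative volumes as used in the paper are defined separately and are not reproduced here.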
-
Frequentist Guarantees of Distributed (Non)-Bayesian Inference
Authors:
Bohan Wu,
César A. Uribe
Abstract:
Motivated by the need to analyze large, decentralized datasets, distributed Bayesian inference has become a critical research area across multiple fields, including statistics, electrical engineering, and economics. This paper establishes Frequentist properties, such as posterior consistency, asymptotic normality, and posterior contraction rates, for the distributed (non-)Bayes Inference problem among agents connected via a communication network. Our results show that, under appropriate assumptions on the communication graph, distributed Bayesian inference retains parametric efficiency while enhancing robustness in uncertainty quantification. We also explore the trade-off between statistical efficiency and communication efficiency by examining how the design and size of the communication graph impact the posterior contraction rate. Furthermore, we extend our analysis to time-varying graphs and apply our results to exponential family models, distributed logistic regression, and decentralized detection models.
Submitted 15 April, 2024; v1 submitted 14 November, 2023;
originally announced November 2023.
-
On Graphs with Finite-Time Consensus and Their Use in Gradient Tracking
Authors:
Edward Duc Hien Nguyen,
Xin Jiang,
Bicheng Ying,
César A. Uribe
Abstract:
This paper studies sequences of graphs satisfying the finite-time consensus property (i.e., iterating through such a finite sequence is equivalent to performing global or exact averaging) and their use in Gradient Tracking. We provide an explicit weight matrix representation of the studied sequences and prove their finite-time consensus property. Moreover, we incorporate the studied finite-time consensus topologies into Gradient Tracking and present a new algorithmic scheme called Gradient Tracking for Finite-Time Consensus Topologies (GT-FT). We analyze the new scheme for nonconvex problems with stochastic gradient estimates. Our analysis shows that the convergence rate of GT-FT does not depend on the heterogeneity of the agents' functions or the connectivity of any individual graph in the topology sequence. Furthermore, owing to the sparsity of the graphs, GT-FT requires lower communication costs than Gradient Tracking using the static counterpart of the topology sequence.
Submitted 14 November, 2023; v1 submitted 2 November, 2023;
originally announced November 2023.
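The finite-time consensus property described in the abstract above can be checked numerically on a small case. The sketch below uses the one-peer hypercube sequence on n = 2^k nodes, a standard textbook example chosen here for illustration; it is an assumption, not necessarily the construction analyzed in the paper. The function name `one_peer_hypercube` is hypothetical.

```python
from functools import reduce

import numpy as np


def one_peer_hypercube(n):
    """Return the log2(n) mixing matrices of a one-peer hypercube on n = 2**k nodes.

    In round `bit`, node i averages only with the node whose index differs
    from i in that single bit, so every matrix is sparse and doubly stochastic.
    (Illustrative construction; not taken from the paper.)
    """
    k = n.bit_length() - 1
    if n < 2 or 2 ** k != n:
        raise ValueError("n must be a power of two and at least 2")
    mats = []
    for bit in range(k):
        W = np.zeros((n, n))
        for i in range(n):
            j = i ^ (1 << bit)  # partner differs from i in exactly one bit
            W[i, i] = 0.5
            W[i, j] = 0.5
        mats.append(W)
    return mats


n = 8
mats = one_peer_hypercube(n)
product = reduce(np.matmul, mats)        # one pass through the sequence
exact_average = np.full((n, n), 1.0 / n)  # the global averaging matrix

# Iterating through the whole sequence equals exact averaging,
# even though no individual matrix does.
print(np.allclose(product, exact_average))
```

Each matrix has only two nonzeros per row, which is the kind of sparsity that keeps per-round communication low compared with a dense static averaging matrix.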
-
A Discrete-time Networked Competitive Bivirus SIS Model
Authors:
Sebin Gracy,
Ji Liu,
Tamer Basar,
Cesar A. Uribe
Abstract:
The paper deals with the analysis of a discrete-time networked competitive bivirus susceptible-infected-susceptible (SIS) model. More specifically, we suppose that virus 1 and virus 2 are circulating in the population and are in competition with each other. We show that the model is strongly monotone, and that, under certain assumptions, it does not admit any periodic orbit. We identify a sufficient condition for exponential convergence to the disease-free equilibrium (DFE). Assuming only virus 1 (resp. virus 2) is alive, we establish a condition for global asymptotic convergence to the single-virus endemic equilibrium of virus 1 (resp. virus 2) -- our proof does not rely on the construction of a Lyapunov function. Assuming both virus 1 and virus 2 are alive, we establish a condition which ensures local exponential convergence to the single-virus equilibrium of virus 1 (resp. virus 2). Finally, we provide a sufficient (resp. necessary) condition for the existence of a coexistence equilibrium.
Submitted 20 October, 2023;
originally announced October 2023.
-
Competitive Networked Bivirus SIS spread over Hypergraphs
Authors:
Sebin Gracy,
Brian D. O. Anderson,
Mengbin Ye,
Cesar A. Uribe
Abstract:
The paper deals with the spread of two competing viruses over a network of population nodes, accounting for pairwise interactions and higher-order interactions (HOI) within and between the population nodes. We study the competitive networked bivirus susceptible-infected-susceptible (SIS) model on a hypergraph introduced in Cui et al. [1]. We show that the system has, in a generic sense, a finite number of equilibria, and the Jacobian associated with each equilibrium point is nonsingular; the key tool is the Parametric Transversality Theorem of differential topology. Since the system is also monotone, it turns out that the typical behavior of the system is convergence to some equilibrium point. Thereafter, we exhibit a tri-stable domain with three locally exponentially stable equilibria. For different parameter regimes, we establish conditions for the existence of a coexistence equilibrium (both viruses infect separate fractions of each population node).
Submitted 25 September, 2023;
originally announced September 2023.
-
PyTomography: A Python Library for Quantitative Medical Image Reconstruction
Authors:
Lucas Polson,
Roberto Fedrigo,
Chenguang Li,
Maziar Sabouri,
Obed Dzikunu,
Shadab Ahamed,
Nikolaos Karakatsanis,
Arman Rahmim,
Carlos Uribe
Abstract:
There is a need for open-source libraries in emission tomography that (i) use modern and popular backend code to encourage community contributions and (ii) offer support for the multitude of reconstruction algorithms available in recent literature, such as those that employ artificial intelligence. The purpose of this research was to create and evaluate a GPU-accelerated, open-source, and user-friendly image reconstruction library, designed to serve as a central platform for the development, validation, and deployment of various tomographic reconstruction algorithms. PyTomography was developed using Python and inherits the GPU-accelerated functionality of PyTorch and parallelproj for fast computations. Its flexible and modular design decouples system matrices, likelihoods, and reconstruction algorithms, simplifying the process of integrating new imaging modalities using various python tools. Example use cases demonstrate the software capabilities in parallel hole SPECT and listmode PET imaging. Overall, we have developed and publicly share PyTomography, a highly optimized and user-friendly software for medical image reconstruction, with a class hierarchy that fosters the development of novel imaging applications.
Submitted 7 July, 2024; v1 submitted 5 September, 2023;
originally announced September 2023.
-
Neural blind deconvolution for deblurring and supersampling PSMA PET
Authors:
Caleb Sample,
Arman Rahmim,
Carlos Uribe,
François Bénard,
Jonn Wu,
Roberto Fedrigo,
Haley Clark
Abstract:
Objective: To simultaneously deblur and supersample prostate specific membrane antigen (PSMA) positron emission tomography (PET) images using neural blind deconvolution. Approach: Blind deconvolution is a method of estimating the hypothetical "deblurred" image along with the blur kernel (related to the point spread function) simultaneously. Traditional maximum a posteriori blind deconvolution methods require stringent assumptions and suffer from convergence to a trivial solution. A method of modelling the deblurred image and kernel with independent neural networks, called "neural blind deconvolution", demonstrated success for deblurring 2D natural images in 2020. In this work, we adapt neural blind deconvolution for partial volume effect (PVE) correction of PSMA PET images with simultaneous supersampling. We compare this methodology with several interpolation methods, using blind image quality metrics, and test the model's ability to predict kernels by re-running the model after applying artificial "pseudokernels" to deblurred images. The methodology was tested on a retrospective set of 30 prostate patients as well as phantom images containing spherical lesions of various volumes. Results: Neural blind deconvolution led to improvements in image quality over other interpolation methods in terms of blind image quality metrics, recovery coefficients, and visual assessment. Predicted kernels were similar between patients, and the model accurately predicted several artificially-applied pseudokernels. Localization of activity in phantom spheres was improved after deblurring, allowing small lesions to be more accurately defined. Significance: The intrinsically low spatial resolution of PSMA PET leads to PVEs which negatively impact uptake quantification in small regions. The proposed method can be used to mitigate this issue, and can be straightforwardly adapted for other imaging modalities.
Submitted 2 March, 2024; v1 submitted 1 September, 2023;
originally announced September 2023.
-
Adaptive Federated Learning with Auto-Tuned Clients
Authors:
Junhyung Lyle Kim,
Mohammad Taha Toghani,
César A. Uribe,
Anastasios Kyrillidis
Abstract:
Federated learning (FL) is a distributed machine learning framework where the global model of a central server is trained via multiple collaborative steps by participating clients without sharing their data. While being a flexible framework, where the distribution of local data, participation rate, and computing power of each client can greatly vary, such flexibility gives rise to many new challenges, especially in the hyperparameter tuning on the client side. We propose $Δ$-SGD, a simple step size rule for SGD that enables each client to use its own step size by adapting to the local smoothness of the function each client is optimizing. We provide theoretical and empirical results where the benefit of the client adaptivity is shown in various FL scenarios.
Submitted 2 May, 2024; v1 submitted 19 June, 2023;
originally announced June 2023.
-
Examination of Supernets to Facilitate International Trade for Indian Exports to Brazil
Authors:
Evan Winter,
Anupam Shah,
Ujjwal Gupta,
Anshul Kumar,
Deepayan Mohanty,
Juan Carlos Uribe,
Aishwary Gupta,
Mini P. Thomas
Abstract:
The objective of this paper is to investigate a more efficient cross-border payment and document handling process for the export of Indian goods to Brazil. The paper is structured into two sections: first, to explain the problems unique to the India-Brazil international trade corridor by highlighting the obstacles of compliance, speed, and payments; and second, to propose a digital solution for India-Brazil trade utilizing Supernets, focusing on the use case of Indian exports. The solution assumes that stakeholders will be onboarded as permissioned actors (i.e. nodes) on a Polygon Supernet. By engaging trade and banking stakeholders, we ensure that the digital solution results in export benefits for Indian exporters, and a lawful channel to receive hard currency payments. The involvement of Brazilian and Indian banks ensures that Letter of Credit (LC) processing time and document handling occur at the speed of blockchain technology. The ultimate goal is to achieve a faster settlement and negotiation period while maintaining a regulatory-compliant outcome, so that the end result is faster and easier, yet otherwise identical to the real-world process in terms of export benefits and compliance.
Submitted 1 June, 2023;
originally announced June 2023.
-
On First-Order Meta-Reinforcement Learning with Moreau Envelopes
Authors:
Mohammad Taha Toghani,
Sebastian Perez-Salazar,
César A. Uribe
Abstract:
Meta-Reinforcement Learning (MRL) is a promising framework for training agents that can quickly adapt to new environments and tasks. In this work, we study the MRL problem under the policy gradient formulation, where we propose a novel algorithm that uses Moreau envelope surrogate regularizers to jointly learn a meta-policy that is adjustable to the environment of each individual task. Our algorithm, called Moreau Envelope Meta-Reinforcement Learning (MEMRL), learns a meta-policy that can adapt to a distribution of tasks by efficiently updating the policy parameters using a combination of gradient-based optimization and Moreau Envelope regularization. Moreau Envelopes provide a smooth approximation of the policy optimization problem, which enables us to apply standard optimization techniques and converge to an appropriate stationary point. We provide a detailed analysis of the MEMRL algorithm, where we show a sublinear convergence rate to a first-order stationary point for non-convex policy gradient optimization. We finally show the effectiveness of MEMRL on a multi-task 2D-navigation problem.
Submitted 20 May, 2023;
originally announced May 2023.
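The smoothing role of Moreau envelopes mentioned in the abstract above can be seen on a one-dimensional example. The sketch below is not the MEMRL algorithm; it only illustrates the standard definition M_λ f(x) = min_y { f(y) + (x − y)² / (2λ) } for the assumed test function f(y) = |y|, whose envelope has the well-known Huber closed form. All function names are hypothetical.

```python
import numpy as np


def moreau_envelope_abs(x, lam):
    """Closed-form Moreau envelope of f(y) = |y| (the Huber function)."""
    x = np.asarray(x, dtype=float)
    return np.where(np.abs(x) <= lam,
                    x ** 2 / (2.0 * lam),      # quadratic near the kink
                    np.abs(x) - lam / 2.0)     # shifted |x| away from it


def moreau_envelope_numeric(f, x, lam, grid):
    """Brute-force evaluation of min_y f(y) + (x - y)^2 / (2 lam) on a grid."""
    return float(np.min(f(grid) + (grid - x) ** 2 / (2.0 * lam)))


lam = 1.0
grid = np.linspace(-5.0, 5.0, 100001)
points = [-3.0, -0.2, 0.0, 0.5, 2.0]

closed_form = [float(moreau_envelope_abs(x, lam)) for x in points]
numeric = [moreau_envelope_numeric(np.abs, x, lam, grid) for x in points]

# The two evaluations agree up to grid resolution.
print(np.allclose(closed_form, numeric, atol=1e-6))
```

The envelope is differentiable at 0 even though |y| is not, which is the property that lets gradient-based methods be applied to the regularized problem.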
-
Towards Understanding the Endemic Behavior of a Competitive Tri-Virus SIS Networked Model
Authors:
Sebin Gracy,
Mengbin Ye,
Brian D. O. Anderson,
Cesar A. Uribe
Abstract:
This paper studies the endemic behavior of a multi-competitive networked susceptible-infected-susceptible (SIS) model. Specifically, the paper deals with three competing virus systems (i.e., tri-virus systems). First, we show that a tri-virus system, unlike a bi-virus system, is not a monotone dynamical system. Using the Parametric Transversality Theorem, we show that, generically, a tri-virus system has a finite number of equilibria and that the Jacobian matrices associated with each equilibrium are nonsingular. The endemic equilibria of this system can be classified as follows: a) single-virus endemic equilibria (also referred to as the boundary equilibria), where precisely one of the three viruses is alive; b) 2-coexistence equilibria, where exactly two of the three viruses are alive; and c) 3-coexistence equilibria, where all three viruses survive in the network. We provide a necessary and sufficient condition that guarantees local exponential convergence to a boundary equilibrium. Further, we secure conditions for the nonexistence of 3-coexistence equilibria (resp. for various forms of 2-coexistence equilibria). We also identify sufficient conditions for the existence of a 2-coexistence (resp. 3-coexistence) equilibrium. We identify conditions on the model parameters that give rise to a continuum of coexistence equilibria. More specifically, we establish i) a scenario that admits the existence and local exponential attractivity of a line of coexistence equilibria; and ii) scenarios that admit the existence of, and, in the case of one such scenario, global convergence to, a plane of 3-coexistence equilibria.
Submitted 29 March, 2023;
originally announced March 2023.
-
Multi-Competitive Virus Spread over a Time-Varying Networked SIS Model with an Infrastructure Network
Authors:
Sebin Gracy,
Yuan Wang,
Philip E. Pare,
Cesar A Uribe
Abstract:
We study the spread of multi-competitive viruses over a (possibly) time-varying network of individuals accounting for the presence of shared infrastructure networks that further enables transmission of the virus. We establish a sufficient condition for exponentially fast eradication of a virus for: 1) time-invariant graphs, 2) time-varying graphs with symmetric interactions between individuals and homogeneous virus spread across the network (same healing and infection rate for all individuals), and 3) directed and slowly varying graphs with heterogeneous virus spread (not necessarily same healing and infection rates for all individuals) across the network. Numerical examples illustrate our theoretical results and indicate that, for the time-varying case, violation of the aforementioned sufficient conditions could lead to the persistence of a virus.
Submitted 15 March, 2023;
originally announced March 2023.
-
GPT-4 Technical Report
Authors:
OpenAI,
Josh Achiam,
Steven Adler,
Sandhini Agarwal,
Lama Ahmad,
Ilge Akkaya,
Florencia Leoni Aleman,
Diogo Almeida,
Janko Altenschmidt,
Sam Altman,
Shyamal Anadkat,
Red Avila,
Igor Babuschkin,
Suchir Balaji,
Valerie Balcom,
Paul Baltescu,
Haiming Bao,
Mohammad Bavarian,
Jeff Belgum,
Irwan Bello,
Jake Berdine,
Gabriel Bernadett-Shapiro,
Christopher Berner,
Lenny Bogdonoff,
Oleg Boiko
, et al. (256 additional authors not shown)
Abstract:
We report the development of GPT-4, a large-scale, multimodal model which can accept image and text inputs and produce text outputs. While less capable than humans in many real-world scenarios, GPT-4 exhibits human-level performance on various professional and academic benchmarks, including passing a simulated bar exam with a score around the top 10% of test takers. GPT-4 is a Transformer-based model pre-trained to predict the next token in a document. The post-training alignment process results in improved performance on measures of factuality and adherence to desired behavior. A core component of this project was developing infrastructure and optimization methods that behave predictably across a wide range of scales. This allowed us to accurately predict some aspects of GPT-4's performance based on models trained with no more than 1/1,000th the compute of GPT-4.
Submitted 4 March, 2024; v1 submitted 15 March, 2023;
originally announced March 2023.
-
$^{177}$Lu SPECT Imaging in the Presence of $^{90}$Y: Does $^{90}$Y Degrade Image Quantification? A Simulation Study
Authors:
Cassandra Miller,
Carlos Uribe,
Xinchi Hou,
Arman Rahmim,
Anna Celler
Abstract:
This work aims to investigate the accuracy of quantitative SPECT imaging of $^{177}$Lu in the presence of $^{90}$Y, which occurs in dual-isotope radiopharmaceutical therapy (RPT) involving both isotopes. We used the GATE Monte Carlo simulation toolkit to conduct a phantom study, simulating spheres filled with $^{177}$Lu and $^{90}$Y placed in a cylindrical water phantom that was also filled with activity of both radionuclides. We simulated multiple phantom configurations and activity combinations by varying the location of the spheres, the concentrations of $^{177}$Lu and $^{90}$Y in the spheres, and the amount of background activity. We investigated two different scatter window widths to be used with triple energy window (TEW) scatter correction. We also created multiple realizations of each configuration to improve our assessment, leading to a total of 540 simulations. Each configuration was imaged using a simulated Siemens SPECT camera. The projections were reconstructed using the standard 3D OSEM algorithm, and errors associated with $^{177}$Lu activity quantification and contrast-to-noise ratios (CNRs) were determined. In all configurations, the quantification error was within $\pm$6% of the no-$^{90}$Y case, and we found that quantitative accuracy may slightly improve when $^{90}$Y is present because of reduction of errors associated with TEW scatter correction. The CNRs were not significantly impacted by the presence of $^{90}$Y, but they were increased when a wider scatter window width was used for TEW scatter correction. The width of the scatter windows made a small but statistically significant difference of 1-2% on the recovered $^{177}$Lu activity. Based on these results, we can conclude that activity quantification of $^{177}$Lu and lesion detectability is not degraded by the presence of $^{90}$Y.
Submitted 25 January, 2023;
originally announced January 2023.
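For readers unfamiliar with triple energy window (TEW) scatter correction, which the abstract above relies on, the following sketch shows the commonly used trapezoidal estimate: scatter in the photopeak window is approximated from the count densities in two adjacent narrow windows. The function names and all numbers are hypothetical illustration values, not settings or results from the study.

```python
def tew_scatter_estimate(c_lower, c_upper, w_lower, w_upper, w_peak):
    """Trapezoidal TEW estimate of scattered counts in the photopeak window.

    c_lower, c_upper: counts in the lower and upper scatter windows
    w_lower, w_upper, w_peak: window widths in keV
    """
    return (c_lower / w_lower + c_upper / w_upper) * w_peak / 2.0


def tew_primary_counts(c_peak, c_lower, c_upper, w_lower, w_upper, w_peak):
    """Photopeak counts with the TEW scatter estimate subtracted (floored at 0)."""
    scatter = tew_scatter_estimate(c_lower, c_upper, w_lower, w_upper, w_peak)
    return max(c_peak - scatter, 0.0)


# Hypothetical single-pixel example (values chosen only for easy arithmetic).
scatter = tew_scatter_estimate(c_lower=100, c_upper=50,
                               w_lower=10, w_upper=10, w_peak=40)
primary = tew_primary_counts(c_peak=1000, c_lower=100, c_upper=50,
                             w_lower=10, w_upper=10, w_peak=40)
print(scatter, primary)
```

Because the estimate divides by the scatter window widths, changing those widths changes both the estimate and its noise, which is why the scatter window width is a natural parameter to vary in such a study.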
-
Semi-supervised learning towards automated segmentation of PET images with limited annotations: Application to lymphoma patients
Authors:
Fereshteh Yousefirizi,
Isaac Shiri,
Joo Hyun O,
Ingrid Bloise,
Patrick Martineau,
Don Wilson,
François Bénard,
Laurie H. Sehn,
Kerry J. Savage,
Habib Zaidi,
Carlos F. Uribe,
Arman Rahmim
Abstract:
The time-consuming task of manual segmentation challenges routine systematic quantification of disease burden. Convolutional neural networks (CNNs) hold significant promise to reliably identify locations and boundaries of tumors from PET scans. We aimed to leverage the need for annotated data via semi-supervised approaches, with application to PET images of diffuse large B-cell lymphoma (DLBCL) and primary mediastinal large B-cell lymphoma (PMBCL). We analyzed 18F-FDG PET images of 292 patients with PMBCL (n=104) and DLBCL (n=188) (n=232 for training and validation, and n=60 for external testing). We employed FCM and MS losses for training a 3D U-Net with different levels of supervision: i) fully supervised methods with labeled FCM (LFCM) as well as Unified focal and Dice loss functions, ii) unsupervised methods with Robust FCM (RFCM) and Mumford-Shah (MS) loss functions, and iii) Semi-supervised methods based on FCM (RFCM+LFCM), as well as MS loss in combination with supervised Dice loss (MS+Dice). Unified loss function yielded higher Dice score (mean +/- standard deviation (SD)) (0.73 +/- 0.03; 95% CI, 0.67-0.8) compared to Dice loss (p-value<0.01). Semi-supervised (RFCM+alpha*LFCM) with alpha=0.3 showed the best performance, with a Dice score of 0.69 +/- 0.03 (95% CI, 0.45-0.77) outperforming (MS+alpha*Dice) for any supervision level (any alpha) (p<0.01). The best performer among (MS+alpha*Dice) semi-supervised approaches with alpha=0.2 showed a Dice score of 0.60 +/- 0.08 (95% CI, 0.44-0.76) compared to another supervision level in this semi-supervised approach (p<0.01). Semi-supervised learning via FCM loss (RFCM+alpha*LFCM) showed improved performance compared to supervised approaches. Considering the time-consuming nature of expert manual delineations and intra-observer variabilities, semi-supervised approaches have significant potential for automated segmentation workflows.
Submitted 25 March, 2024; v1 submitted 19 December, 2022;
originally announced December 2022.
-
An energy management system model with power quality constraints for unbalanced multi-microgrids interacting in a local energy market
Authors:
Johanna Castellanos,
Carlos Adrian Correa-Florez,
Alejandro Garcés,
Gabriel Ordóñez-Plata,
César A. Uribe,
Diego Patino
Abstract:
As multi-microgrids become readily available, some limited models have been proposed that study operational and power quality constraints with local energy markets independently. This paper proposes a convex optimization model of an energy management system with operational and power quality constraints and interactions in a Local Energy Market (LEM) for unbalanced microgrids (MGs). The LEM consists of a pre-dispatch step and an energy transactions step (ETS). The ETS combines the MGs' objectives while considering two strategies: minimize the cost of buyers or maximize the revenue of sellers. Our proposed model considers harmonic distortion and voltage limit power quality constraints in both steps. Moreover, we model operational constraints such as power flow, power balance, and distributed energy resources behaviors and capacities. We numerically evaluate the proposed model using three unbalanced MGs with residential, industrial, and commercial load profiles, where each microgrid manages its resources locally. Furthermore, we create two groups of cases to analyze the interactions in the local energy market. In the first group, the price of the DSO energy and the surplus from MGs to DSO are the same. The numerical results show that using the increasing revenue strategy promotes MGs to interact more while encouraging them to have high energy prices. When the reducing cost strategy is used, fewer energy interactions occur, and the price of MGs energy is encouraged to be lower.
Submitted 4 December, 2022;
originally announced December 2022.
-
On the Performance of Gradient Tracking with Local Updates
Authors:
Edward Duc Hien Nguyen,
Sulaiman A. Alghunaim,
Kun Yuan,
César A. Uribe
Abstract:
We study the decentralized optimization problem where a network of $n$ agents seeks to minimize the average of a set of heterogeneous non-convex cost functions distributedly. State-of-the-art decentralized algorithms like Exact Diffusion (ED) and Gradient Tracking (GT) involve communicating every iteration. However, communication is expensive, resource intensive, and slow. In this work, we analyze a locally updated GT method (LU-GT), where agents perform local recursions before interacting with their neighbors. While local updates have been shown to reduce communication overhead in practice, their theoretical influence has not been fully characterized. We show LU-GT has the same communication complexity as the Federated Learning setting but allows arbitrary network topologies. In addition, we prove that the number of local updates does not degrade the quality of the solution achieved by LU-GT. Numerical examples reveal that local updates can lower communication costs in certain regimes (e.g., well-connected graphs).
Submitted 12 October, 2022; v1 submitted 10 October, 2022;
originally announced October 2022.
-
A State Feedback Controller for Mitigation of Continuous-Time Networked SIS Epidemics
Authors:
Yuan Wang,
Sebin Gracy,
César A. Uribe,
Hideaki Ishii,
Karl Henrik Johansson
Abstract:
The paper considers continuous-time networked susceptible-infected-susceptible (SIS) diseases spreading over a population. Each agent represents a sub-population and has its own healing rate and infection rate; the state of the agent at a time instant denotes what fraction of the said sub-population is infected with the disease at the said time instant. By taking account of the changes in behaviors of the agents in response to the infection rates in real-time, our goal is to devise a feedback strategy such that the infection level for each agent strictly stays below a pre-specified value. Furthermore, we are also interested in ensuring that the closed-loop system converges either to the disease-free equilibrium or, when it exists, to the endemic equilibrium. The upshot of devising such a strategy is that it allows health administration officials to ensure that there is sufficient capacity in the healthcare system to treat the most severe cases. We demonstrate the effectiveness of our controller via numerical examples.
Submitted 9 October, 2022;
originally announced October 2022.
-
PersA-FL: Personalized Asynchronous Federated Learning
Authors:
Mohammad Taha Toghani,
Soomin Lee,
César A. Uribe
Abstract:
We study the personalized federated learning problem under asynchronous updates. In this problem, each client seeks to obtain a personalized model that simultaneously outperforms local and global models. We consider two optimization-based frameworks for personalization: (i) Model-Agnostic Meta-Learning (MAML) and (ii) Moreau Envelope (ME). MAML involves learning a joint model adapted for each client through fine-tuning, whereas ME requires a bi-level optimization problem with implicit gradients to enforce personalization via regularized losses. We focus on improving the scalability of personalized federated learning by removing the synchronous communication assumption. Moreover, we extend the studied function class by removing boundedness assumptions on the gradient norm. Our main technical contribution is a unified proof for asynchronous federated learning with bounded staleness that we apply to MAML and ME personalization frameworks. For the smooth and non-convex functions class, we show the convergence of our method to a first-order stationary point. We illustrate the performance of our method and its tolerance to staleness through experiments for classification tasks over heterogeneous datasets.
Submitted 4 October, 2023; v1 submitted 3 October, 2022;
originally announced October 2022.
-
Unbounded Gradients in Federated Learning with Buffered Asynchronous Aggregation
Authors:
Mohammad Taha Toghani,
César A. Uribe
Abstract:
Synchronous updates may compromise the efficiency of cross-device federated learning once the number of active clients increases. The \textit{FedBuff} algorithm (Nguyen et al., 2022) alleviates this problem by allowing asynchronous updates (staleness), which enhances the scalability of training while preserving privacy via secure aggregation. We revisit the \textit{FedBuff} algorithm for asynchronous federated learning and extend the existing analysis by removing the boundedness assumptions from the gradient norm. This paper presents a theoretical analysis of the convergence rate of this algorithm when heterogeneity in data, batch size, and delay are considered.
Submitted 3 October, 2022;
originally announced October 2022.
-
On the Endemic Behavior of a Competitive Tri-Virus SIS Networked Model
Authors:
Sebin Gracy,
Mengbin Ye,
Brian DO Anderson,
Cesar A. Uribe
Abstract:
This paper studies the endemic behavior of a multi-competitive networked susceptible-infected-susceptible (SIS) model. In particular, we focus on the case where there are three competing viruses (i.e., the tri-virus system). First, we show that the tri-virus system is not a monotone system. Thereafter, we provide a condition that guarantees local exponential convergence to a boundary equilibrium (exactly one virus is endemic, the other two are dead), and identify a special case that admits the existence and local exponential attractivity of a line of coexistence equilibria (at least two viruses are active). Finally, we identify a particular case (subsumed by the aforementioned special case) such that, for all nonzero initial infection levels, the dynamics of the tri-virus system converge to a plane of coexistence equilibria.
Submitted 23 September, 2022;
originally announced September 2022.
-
Convolutional neural network with a hybrid loss function for fully automated segmentation of lymphoma lesions in FDG PET images
Authors:
Fereshteh Yousefirizi,
Natalia Dubljevic,
Shadab Ahamed,
Ingrid Bloise,
Claire Gowdy,
Joo Hyun O,
Youssef Farag,
Rodrigue de Schaetzen,
Patrick Martineau,
Don Wilson,
Carlos F. Uribe,
Arman Rahmim
Abstract:
Segmentation of lymphoma lesions is challenging due to their varied sizes and locations in whole-body PET scans. This work presents a fully automated segmentation technique using a multi-center dataset of diffuse large B-cell lymphoma (DLBCL) with heterogeneous characteristics. We utilized a dataset of [18F]FDG-PET scans (n=194) from two different imaging centers, including cases with primary mediastinal large B-cell lymphoma (PMBCL) (n=104). Automated brain and bladder removal approaches were utilized as preprocessing steps to tackle false positives caused by normal hypermetabolic uptake in these organs. Our segmentation model is a convolutional neural network (CNN) based on a 3D U-Net architecture that includes squeeze and excitation (SE) modules. Hybrid distribution-, region-, and boundary-based losses (Unified Focal and Mumford-Shah (MS)) were utilized and showed the best performance compared to other combinations (p<0.05). Cross-validation between different centers, DLBCL and PMBCL cases, and three random splits were applied on train/validation data. The ensemble of these six models achieved a Dice similarity coefficient (DSC) of 0.77 ± 0.08 and Hausdorff distance (HD) of 16.5 ± 12.5. Our 3D U-Net model with SE modules and hybrid loss performed significantly better (p<0.05) than the 3D U-Net without SE modules using the same loss function (Unified Focal and MS loss) (DSC = 0.64 ± 0.21 and HD = 26.3 ± 18.7). Our model can facilitate a fully automated quantification pipeline in a multi-center context that opens the possibility for routine reporting of total metabolic tumor volume (TMTV) and other metrics shown useful for the management of lymphoma.
Submitted 10 August, 2022; v1 submitted 30 July, 2022;
originally announced August 2022.
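
For readers unfamiliar with the Dice similarity coefficient (DSC) reported in the abstract above, the following is a minimal sketch of the standard definition, DSC = 2|A ∩ B| / (|A| + |B|), for two binary masks. The function name `dice_coefficient`, the flat 0/1 list representation, and the empty-mask convention are illustrative assumptions, not taken from the paper.

```python
def dice_coefficient(pred, truth):
    """Dice similarity coefficient between two binary masks (flat 0/1 sequences)."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    # Voxels labeled 1 in both masks.
    intersection = sum(p * t for p, t in zip(pred, truth))
    total = sum(pred) + sum(truth)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


predicted = [1, 1, 0, 0]
reference = [1, 0, 0, 0]
print(dice_coefficient(predicted, reference))  # 2*1 / (2+1), about 0.667
```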
-
Consensus ADMM-Based Distributed Simultaneous Imaging & Communication
Authors:
Nishant Mehrotra,
Ashutosh Sabharwal,
César A. Uribe
Abstract:
This paper takes the first steps toward enabling wireless networks to perform both imaging and communication in a distributed manner. We propose Distributed Simultaneous Imaging and Symbol Detection (DSISD), a provably convergent distributed simultaneous imaging and communication scheme based on the alternating direction method of multipliers. We show that DSISD achieves similar imaging and communication performance as centralized schemes, with order-wise reduction in computational complexity. We evaluate the performance of DSISD via 2.4 GHz Wi-Fi simulations.
Submitted 20 June, 2022;
originally announced June 2022.
-
Distributed Generalized Wirtinger Flow for Interferometric Imaging on Networks
Authors:
Sean M. Farrell,
Ashok Veeraraghavan,
Ashutosh Sabharwal,
César A. Uribe
Abstract:
We study the problem of decentralized interferometric imaging over networks, where agents have access to a subset of local radar measurements and can compute pair-wise correlations with their neighbors. We propose a primal-dual distributed algorithm named Distributed Generalized Wirtinger Flow (DGWF). We use the theory of low rank matrix recovery to show when the interferometric imaging problem satisfies the Regularity Condition, which implies the Polyak-Lojasiewicz inequality. Moreover, we show that DGWF converges geometrically for smooth functions. Numerical simulations for single-scattering radar interferometric imaging demonstrate that DGWF can achieve the same mean-squared error image reconstruction quality as its centralized counterpart for various network connectivity and size.
Submitted 8 June, 2022;
originally announced June 2022.
-
FlowNet-PET: Unsupervised Learning to Perform Respiratory Motion Correction in PET Imaging
Authors:
Teaghan O'Briain,
Carlos Uribe,
Kwang Moo Yi,
Jonas Teuwen,
Ioannis Sechopoulos,
Magdalena Bazalova-Carter
Abstract:
To correct for respiratory motion in PET imaging, an interpretable and unsupervised deep learning technique, FlowNet-PET, was constructed. The network was trained to predict the optical flow between two PET frames from different breathing amplitude ranges. The trained model aligns different retrospectively-gated PET images, providing a final image with similar counting statistics as a non-gated image, but without the blurring effects. FlowNet-PET was applied to anthropomorphic digital phantom data, which provided the possibility to design robust metrics to quantify the corrections. When comparing the predicted optical flows to the ground truths, the median absolute error was found to be smaller than the pixel and slice widths. The improvements were illustrated by comparing against images without motion and computing the intersection over union (IoU) of the tumors as well as the enclosed activity and coefficient of variation (CoV) within the no-motion tumor volume before and after the corrections were applied. The average relative improvements provided by the network were 64%, 89%, and 75% for the IoU, total activity, and CoV, respectively. FlowNet-PET achieved similar results as the conventional retrospective phase binning approach, but only required one sixth of the scan duration. The code and data have been made publicly available (https://github.com/teaghan/FlowNet_PET).
Submitted 2 August, 2022; v1 submitted 27 May, 2022;
originally announced May 2022.
-
On Arbitrary Compression for Decentralized Consensus and Stochastic Optimization over Directed Networks
Authors:
Mohammad Taha Toghani,
César A. Uribe
Abstract:
We study the decentralized consensus and stochastic optimization problems with compressed communications over static directed graphs. We propose an iterative gradient-based algorithm that compresses messages according to a desired compression ratio. The proposed method provably reduces the communication overhead on the network at every communication round. Contrary to existing literature, we allow for arbitrary compression ratios in the communicated messages. We show a linear convergence rate for the proposed method on the consensus problem. Moreover, we provide explicit convergence rates for decentralized stochastic optimization problems on smooth functions that are either (i) strongly convex, (ii) convex, or (iii) non-convex. Finally, we provide numerical experiments to illustrate convergence under arbitrary compression ratios and the communication efficiency of our algorithm.
Submitted 18 April, 2022;
originally announced April 2022.
-
On Acceleration of Gradient-Based Empirical Risk Minimization using Local Polynomial Regression
Authors:
Ekaterina Trimbach,
Edward Duc Hien Nguyen,
César A. Uribe
Abstract:
We study the acceleration of the Local Polynomial Interpolation-based Gradient Descent method (LPI-GD) recently proposed for the approximate solution of empirical risk minimization problems (ERM). We focus on loss functions that are strongly convex and smooth with condition number $σ$. We additionally assume the loss function is $η$-Hölder continuous with respect to the data. The oracle complexity of LPI-GD is $\tilde{O}\left(σ m^d \log(1/\varepsilon)\right)$ for a desired accuracy $\varepsilon$, where $d$ is the dimension of the parameter space, and $m$ is the cardinality of an approximation grid. The factor $m^d$ can be shown to scale as $O((1/\varepsilon)^{d/2η})$. LPI-GD has been shown to have better oracle complexity than gradient descent (GD) and stochastic gradient descent (SGD) for certain parameter regimes. We propose two accelerated methods for the ERM problem based on LPI-GD and show an oracle complexity of $\tilde{O}\left(\sqrt{σ} m^d \log(1/\varepsilon)\right)$. Moreover, we provide the first empirical study on local polynomial interpolation-based gradient methods and corroborate that LPI-GD has better performance than GD and SGD in some scenarios, and the proposed methods achieve acceleration.
Submitted 15 April, 2022;
originally announced April 2022.
-
On Distributed Exact Sparse Linear Regression over Networks
Authors:
Tu Anh-Nguyen,
César A. Uribe
Abstract:
In this work, we propose an algorithm for solving exact sparse linear regression problems over a network in a distributed manner. Particularly, we consider the problem where data is stored among different computers or agents that seek to collaboratively find a common regressor with a specified sparsity k, i.e., the L0-norm is less than or equal to k. Contrary to existing literature that uses L1 regularization to approximate sparseness, we solve the problem with exact sparsity k. The main novelty in our proposal lies in showing a problem formulation with zero duality gap for which we adopt a dual approach to solve the problem in a decentralized way. This sets a foundational approach for the study of distributed optimization with explicit sparsity constraints. We show theoretically and empirically that, under appropriate assumptions, where each agent solves smaller and local integer programming problems, all agents will eventually reach a consensus on the same sparse optimal regressor.
Submitted 1 April, 2022;
originally announced April 2022.
-
Local Stochastic Factored Gradient Descent for Distributed Quantum State Tomography
Authors:
Junhyung Lyle Kim,
Mohammad Taha Toghani,
César A. Uribe,
Anastasios Kyrillidis
Abstract:
We propose a distributed Quantum State Tomography (QST) protocol, named Local Stochastic Factored Gradient Descent (Local SFGD), to learn the low-rank factor of a density matrix over a set of local machines. QST is the canonical procedure to characterize the state of a quantum system, which we formulate as a stochastic nonconvex smooth optimization problem. Physically, the estimation of a low-rank density matrix helps characterize the amount of noise introduced by quantum computation. Theoretically, we prove the local convergence of Local SFGD for a general class of restricted strongly convex/smooth loss functions, i.e., Local SFGD converges locally to a small neighborhood of the global optimum at a linear rate with a constant step size, while it locally converges exactly at a sub-linear rate with diminishing step sizes. With a proper initialization, local convergence results imply global convergence. We validate our theoretical findings with numerical simulations of QST on the Greenberger-Horne-Zeilinger (GHZ) state.
Submitted 1 June, 2022; v1 submitted 22 March, 2022;
originally announced March 2022.
-
The Role of Local Steps in Local SGD
Authors:
Tiancheng Qin,
S. Rasoul Etesami,
César A. Uribe
Abstract:
We consider the distributed stochastic optimization problem where $n$ agents want to minimize a global function given by the sum of agents' local functions, and focus on the heterogeneous setting when agents' local functions are defined over non-i.i.d. data sets. We study the Local SGD method, where agents perform a number of local stochastic gradient steps and occasionally communicate with a central node to improve their local optimization tasks. We analyze the effect of local steps on the convergence rate and the communication complexity of Local SGD. In particular, instead of assuming a fixed number of local steps across all communication rounds, we allow the number of local steps during the $i$-th communication round, $H_i$, to be different and arbitrary numbers. Our main contribution is to characterize the convergence rate of Local SGD as a function of $\{H_i\}_{i=1}^R$ under various settings of strongly convex, convex, and nonconvex local functions, where $R$ is the total number of communication rounds. Based on this characterization, we provide sufficient conditions on the sequence $\{H_i\}_{i=1}^R$ such that Local SGD can achieve linear speed-up with respect to the number of workers. Furthermore, we propose a new communication strategy with increasing local steps superior to existing communication strategies for strongly convex local functions. On the other hand, for convex and nonconvex local functions, we argue that fixed local steps are the best communication strategy for Local SGD and recover state-of-the-art convergence rate results. Finally, we justify our theoretical results through extensive numerical experiments.
Submitted 29 September, 2022; v1 submitted 13 March, 2022;
originally announced March 2022.
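
The Local SGD loop described in the abstract above (agents take $H_i$ local gradient steps in round $i$, then a central node averages their models) can be sketched as follows. This is a toy illustration under stated assumptions, not the authors' implementation: the scalar quadratic losses $f_j(x) = \frac{1}{2}(x - a_j)^2$, the function name `local_sgd`, the step size, and the Gaussian gradient-noise model are all hypothetical. Because every toy loss here has the same curvature, local steps cause no client drift; that does not hold for general heterogeneous problems.

```python
import random


def local_sgd(targets, local_steps, lr=0.5, noise_std=0.0, seed=0):
    """Local SGD on toy losses f_j(x) = 0.5 * (x - targets[j])**2.

    local_steps[i] is the number of local steps H_i used in round i,
    so the schedule may differ from round to round.
    """
    rng = random.Random(seed)
    x_server = 0.0  # shared model held by the central node
    for h in local_steps:
        local_models = []
        for a in targets:
            x = x_server  # each agent starts the round from the shared model
            for _ in range(h):
                # Gradient of the agent's loss plus optional Gaussian noise.
                grad = (x - a) + rng.gauss(0.0, noise_std)
                x -= lr * grad
            local_models.append(x)
        # Communication round: the central node averages the local models.
        x_server = sum(local_models) / len(local_models)
    return x_server


# Increasing local steps across four communication rounds (noise disabled).
x_final = local_sgd(targets=[1.0, 2.0, 6.0], local_steps=[1, 2, 4, 8])
print(x_final)  # close to 3.0, the minimizer of the summed toy losses
```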
-
Tensor Radiomics: Paradigm for Systematic Incorporation of Multi-Flavoured Radiomics Features
Authors:
Arman Rahmim,
Amirhosein Toosi,
Mohammad R. Salmanpour,
Natalia Dubljevic,
Ian Janzen,
Isaac Shiri,
Ren Yuan,
Cheryl Ho,
Habib Zaidi,
Calum MacAulay,
Carlos Uribe,
Fereshteh Yousefirizi
Abstract:
Radiomics features extract quantitative information from medical images, towards the derivation of biomarkers for clinical tasks, such as diagnosis, prognosis, or treatment response assessment. Different image discretization parameters (e.g. bin number or size), convolutional filters, segmentation perturbation, or multi-modality fusion levels can be used to generate radiomics features and ultimately signatures. Commonly, only one set of parameters is used, resulting in only one value or flavour for a given radiomics feature (RF). We propose tensor radiomics (TR) where tensors of features calculated with multiple combinations of parameters (i.e. flavours) are utilized to optimize the construction of radiomics signatures. We present examples of TR as applied to PET/CT, MRI, and CT imaging invoking machine learning or deep learning solutions, and reproducibility analyses: (1) TR via varying bin sizes on CT images of lung cancer and PET-CT images of head & neck cancer (HNC) for overall survival prediction. A hybrid deep neural network, referred to as TR-Net, along with two ML-based flavour fusion methods showed improved accuracy compared to regular radiomics features. (2) TR built from different segmentation perturbations and different bin sizes for classification of late-stage lung cancer response to first-line immunotherapy using CT images. TR improved predicted patient responses. (3) TR via multi-flavour generated radiomics features in MR imaging showed improved reproducibility when compared to many single-flavour features. (4) TR via multiple PET/CT fusions in HNC. Flavours were built from different fusions using methods such as Laplacian pyramids and wavelet transforms. TR improved overall survival prediction. Our results suggest that the proposed TR paradigm has the potential to improve performance capabilities in different medical imaging tasks.
Submitted 24 October, 2022; v1 submitted 11 March, 2022;
originally announced March 2022.
-
Faster Convergence of Local SGD for Over-Parameterized Models
Authors:
Tiancheng Qin,
S. Rasoul Etesami,
César A. Uribe
Abstract:
Modern machine learning architectures are often highly expressive. They are usually over-parameterized and can interpolate the data by driving the empirical loss close to zero. We analyze the convergence of Local SGD (or FedAvg) for such over-parameterized models in the heterogeneous data setting and improve upon the existing literature by establishing the following convergence rates. For general convex loss functions, we establish an error bound of $O(1/T)$ under a mild data similarity assumption and an error bound of $O(K/T)$ otherwise, where $K$ is the number of local steps and $T$ is the total number of iterations. For non-convex loss functions, we prove an error bound of $O(K/T)$. These bounds improve upon the best previous bound of $O(1/\sqrt{nT})$ in both cases, where $n$ is the number of nodes, when no assumption on the model being over-parameterized is made. We complete our results by providing problem instances in which our established convergence rates are tight to a constant factor with a reasonably small stepsize. Finally, we validate our theoretical results by performing large-scale numerical experiments that reveal the convergence behavior of Local SGD for practical over-parameterized deep learning models, in which the $O(1/T)$ convergence rate of Local SGD is clearly shown.
Submitted 10 June, 2024; v1 submitted 29 January, 2022;
originally announced January 2022.
-
Approximate Wasserstein Attraction Flows for Dynamic Mass Transport over Networks
Authors:
Ferran Arqué,
César A. Uribe,
Carlos Ocampo-Martinez
Abstract:
This paper presents a Wasserstein attraction approach for solving dynamic mass transport problems over networks. In the transport problem over networks, we start with a distribution over the set of nodes that needs to be "transported" to a target distribution accounting for the network topology. We exploit the specific structure of the problem, characterized by the computation of implicit gradient steps, and formulate an approach based on discretized flows. As a result, our proposed algorithm relies on the iterative computation of constrained Wasserstein barycenters. We show how the proposed method finds approximate solutions to the network transport problem, taking into account the topology of the network, the capacity of the communication channels, and the capacity of the individual nodes. Finally, we show the performance of this approach applied to large-scale water transportation networks.
Submitted 26 April, 2022; v1 submitted 19 September, 2021;
originally announced September 2021.
-
Scalable Average Consensus with Compressed Communications
Authors:
Mohammad Taha Toghani,
César A. Uribe
Abstract:
We propose a new decentralized average consensus algorithm with compressed communication that scales linearly with the network size n. We prove that the proposed method converges to the average of the initial values held locally by the agents of a network when agents are allowed to communicate with compressed messages. The proposed algorithm works for a broad class of compression operators (possibly biased), where agents interact over arbitrary static, undirected, and connected networks. We further present numerical experiments that confirm our theoretical results and illustrate the scalability and communication efficiency of our algorithm.
Submitted 14 September, 2021;
originally announced September 2021.
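
As background for the abstract above, the sketch below shows plain average consensus without any compression: each agent repeatedly averages its value with those of its neighbors until all agents agree on the mean of the initial values. It is not the compressed algorithm proposed in the paper. The ring topology, the uniform 1/3 mixing weights, and the function name `ring_average_consensus` are illustrative assumptions.

```python
def ring_average_consensus(values, iterations=200):
    """Plain (uncompressed) average consensus on an undirected ring of n >= 3 nodes."""
    x = list(values)
    n = len(x)
    for _ in range(iterations):
        # Each node mixes its value with its two ring neighbors using weight 1/3.
        # The weights are symmetric, so the network-wide average is preserved.
        x = [(x[i - 1] + x[i] + x[(i + 1) % n]) / 3.0 for i in range(n)]
    return x


states = ring_average_consensus([0.0, 3.0, 6.0, 9.0, 12.0])
print(states)  # every entry approaches the initial average, 6.0
```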
-
Role of AI in Theranostics: Towards Routine Personalized Radiopharmaceutical Therapies
Authors:
Julia Brosch-Lenz,
Fereshteh Yousefirizi,
Katherine Zukotynski,
Jean-Mathieu Beauregard,
Vincent Gaudet,
Babak Saboury,
Arman Rahmim,
Carlos Uribe
Abstract:
We highlight emerging uses of artificial intelligence (AI) in the field of theranostics, focusing on its significant potential to enable routine and reliable personalization of radiopharmaceutical therapies (RPTs). Personalized RPTs require patient-individual dosimetry calculations accompanying therapy. Image-based dosimetry needs: 1) quantitative imaging; 2) co-registration and organ/tumor identification on serial and multimodality images; 3) determination of the time-integrated activity; and 4) absorbed dose determination. AI models that facilitate these steps are reviewed. Additionally, we discuss the potential to exploit biological information from diagnostic and therapeutic molecular images to derive biomarkers for absorbed dose and outcome prediction, towards personalization of therapies. We try to motivate the nuclear medicine community to expand and align efforts into making routine and reliable personalization of RPTs a reality.
Submitted 31 July, 2021; v1 submitted 29 July, 2021;
originally announced July 2021.