-
Semantic-Aware Resource Allocation in Constrained Networks with Limited User Participation
Authors:
Ouiame Marnissi,
Hajar EL Hammouti,
El Houcine Bergou
Abstract:
Semantic communication has gained attention as a key enabler for intelligent and context-aware communication. However, one of the key challenges of semantic communications is the need to tailor the resource allocation to meet the specific requirements of semantic transmission. In this paper, we focus on networks with limited resources where devices are constrained to transmit with limited bandwidt…
▽ More
Semantic communication has gained attention as a key enabler for intelligent and context-aware communication. However, one of the key challenges of semantic communications is the need to tailor the resource allocation to meet the specific requirements of semantic transmission. In this paper, we focus on networks with limited resources where devices are constrained to transmit with limited bandwidth and power over large distance. Specifically, we devise an efficient strategy to select the most pertinent semantic features and participating users, taking into account the channel quality, the transmission time, and the recovery accuracy. To this end, we formulate an optimization problem with the goal of selecting the most relevant and accurate semantic features over devices while satisfying constraints on transmission time and quality of the channel. This involves optimizing communication resources, identifying participating users, and choosing specific semantic information for transmission. The underlying problem is inherently complex due to its non-convex nature and combinatorial constraints. To overcome this challenge, we efficiently approximate the optimal solution by solving a series of integer linear programming problems. Our numerical findings illustrate the effectiveness and efficiency of our approach in managing semantic communications in networks with limited resources.
△ Less
Submitted 19 January, 2024;
originally announced January 2024.
-
Joint Probability Selection and Power Allocation for Federated Learning
Authors:
Ouiame Marnissi,
Hajar EL Hammouti,
El Houcine Bergou
Abstract:
In this paper, we study the performance of federated learning over wireless networks, where devices with a limited energy budget train a machine learning model. The federated learning performance depends on the selection of the clients participating in the learning at each round. Most existing studies suggest deterministic approaches for the client selection, resulting in challenging optimization…
▽ More
In this paper, we study the performance of federated learning over wireless networks, where devices with a limited energy budget train a machine learning model. The federated learning performance depends on the selection of the clients participating in the learning at each round. Most existing studies suggest deterministic approaches for the client selection, resulting in challenging optimization problems that are usually solved using heuristics, and therefore without guarantees on the quality of the final solution. We formulate a new probabilistic approach to jointly select clients and allocate power optimally so that the expected number of participating clients is maximized. To solve the problem, a new alternating algorithm is proposed, where at each step, the closed-form solutions for user selection probabilities and power allocations are obtained. Our numerical results show that the proposed approach achieves a significant performance in terms of energy consumption, completion time and accuracy as compared to the studied benchmarks.
△ Less
Submitted 15 January, 2024;
originally announced January 2024.
-
Age-of-Information in UAV-assisted Networks: a Decentralized Multi-Agent Optimization
Authors:
Mouhamed Naby Ndiaye,
El Houcine Bergou,
Hajar El Hammouti
Abstract:
Unmanned aerial vehicles (UAVs) are a highly promising technology with diverse applications in wireless networks. One of their primary uses is the collection of time-sensitive data from Internet of Things (IoT) devices. In UAV-assisted networks, the Age-of-Information (AoI) serves as a fundamental metric for quantifying data timeliness and freshness. In this work, we are interested in a generalize…
▽ More
Unmanned aerial vehicles (UAVs) are a highly promising technology with diverse applications in wireless networks. One of their primary uses is the collection of time-sensitive data from Internet of Things (IoT) devices. In UAV-assisted networks, the Age-of-Information (AoI) serves as a fundamental metric for quantifying data timeliness and freshness. In this work, we are interested in a generalized AoI formulation, where each packet's age is weighted based on its generation time. Our objective is to find the optimal UAVs' trajectories and the subsets of selected devices such that the weighted AoI is minimized. To address this challenge, we formulate the problem as a Mixed-Integer Nonlinear Programming (MINLP), incorporating time and quality of service constraints. To efficiently tackle this complex problem and minimize communication overhead among UAVs, we propose a distributed approach. This approach enables drones to make independent decisions based on locally acquired data. Specifically, we reformulate our problem such that our objective function is easily decomposed into individual rewards. The reformulated problem is solved using a distributed implementation of Multi-Agent Reinforcement Learning (MARL). Our empirical results show that the proposed decentralized approach achieves results that are nearly equivalent to a centralized implementation with a notable reduction in communication overhead.
△ Less
Submitted 25 December, 2023;
originally announced December 2023.
-
Demystifying the Myths and Legends of Nonconvex Convergence of SGD
Authors:
Aritra Dutta,
El Houcine Bergou,
Soumia Boucherouite,
Nicklas Werge,
Melih Kandemir,
Xin Li
Abstract:
Stochastic gradient descent (SGD) and its variants are the main workhorses for solving large-scale optimization problems with nonconvex objective functions. Although the convergence of SGDs in the (strongly) convex case is well-understood, their convergence for nonconvex functions stands on weak mathematical foundations. Most existing studies on the nonconvex convergence of SGD show the complexity…
▽ More
Stochastic gradient descent (SGD) and its variants are the main workhorses for solving large-scale optimization problems with nonconvex objective functions. Although the convergence of SGDs in the (strongly) convex case is well-understood, their convergence for nonconvex functions stands on weak mathematical foundations. Most existing studies on the nonconvex convergence of SGD show the complexity results based on either the minimum of the expected gradient norm or the functional sub-optimality gap (for functions with extra structural property) by searching the entire range of iterates. Hence the last iterations of SGDs do not necessarily maintain the same complexity guarantee. This paper shows that an $ε$-stationary point exists in the final iterates of SGDs, given a large enough total iteration budget, $T$, not just anywhere in the entire range of iterates -- a much stronger result than the existing one. Additionally, our analyses allow us to measure the density of the $ε$-stationary points in the final iterates of SGD, and we recover the classical $O(\frac{1}{\sqrt{T}})$ asymptotic rate under various existing assumptions on the objective function and the bounds on the stochastic gradient. As a result of our analyses, we addressed certain myths and legends related to the nonconvex convergence of SGD and posed some thought-provoking questions that could set new directions for research.
△ Less
Submitted 19 October, 2023;
originally announced October 2023.
-
Ensemble DNN for Age-of-Information Minimization in UAV-assisted Networks
Authors:
Mouhamed Naby Ndiaye,
El Houcine Bergou,
Hajar El Hammouti
Abstract:
This paper addresses the problem of Age-of-Information (AoI) in UAV-assisted networks. Our objective is to minimize the expected AoI across devices by optimizing UAVs' stop** locations and device selection probabilities. To tackle this problem, we first derive a closed-form expression of the expected AoI that involves the probabilities of selection of devices. Then, we formulate the problem as a…
▽ More
This paper addresses the problem of Age-of-Information (AoI) in UAV-assisted networks. Our objective is to minimize the expected AoI across devices by optimizing UAVs' stop** locations and device selection probabilities. To tackle this problem, we first derive a closed-form expression of the expected AoI that involves the probabilities of selection of devices. Then, we formulate the problem as a non-convex minimization subject to quality of service constraints. Since the problem is challenging to solve, we propose an Ensemble Deep Neural Network (EDNN) based approach which takes advantage of the dual formulation of the studied problem. Specifically, the Deep Neural Networks (DNNs) in the ensemble are trained in an unsupervised manner using the Lagrangian function of the studied problem. Our experiments show that the proposed EDNN method outperforms traditional DNNs in reducing the expected AoI, achieving a remarkable reduction of $29.5\%$.
△ Less
Submitted 6 September, 2023;
originally announced September 2023.
-
A Note on Randomized Kaczmarz Algorithm for Solving Doubly-Noisy Linear Systems
Authors:
El Houcine Bergou,
Soumia Boucherouite,
Aritra Dutta,
Xin Li,
Anna Ma
Abstract:
Large-scale linear systems, $Ax=b$, frequently arise in practice and demand effective iterative solvers. Often, these systems are noisy due to operational errors or faulty data-collection processes. In the past decade, the randomized Kaczmarz (RK) algorithm has been studied extensively as an efficient iterative solver for such systems. However, the convergence study of RK in the noisy regime is li…
▽ More
Large-scale linear systems, $Ax=b$, frequently arise in practice and demand effective iterative solvers. Often, these systems are noisy due to operational errors or faulty data-collection processes. In the past decade, the randomized Kaczmarz (RK) algorithm has been studied extensively as an efficient iterative solver for such systems. However, the convergence study of RK in the noisy regime is limited and considers measurement noise in the right-hand side vector, $b$. Unfortunately, in practice, that is not always the case; the coefficient matrix $A$ can also be noisy. In this paper, we analyze the convergence of RK for noisy linear systems when the coefficient matrix, $A$, is corrupted with both additive and multiplicative noise, along with the noisy vector, $b$. In our analyses, the quantity $\tilde R=\| \tilde A^{\dagger} \|_2^2 \|\tilde A \|_F^2$ influences the convergence of RK, where $\tilde A$ represents a noisy version of $A$. We claim that our analysis is robust and realistically applicable, as we do not require information about the noiseless coefficient matrix, $A$, and considering different conditions on noise, we can control the convergence of RK. We substantiate our theoretical findings by performing comprehensive numerical experiments.
△ Less
Submitted 31 August, 2023;
originally announced August 2023.
-
Muti-Agent Proximal Policy Optimization For Data Freshness in UAV-assisted Networks
Authors:
Mouhamed Naby Ndiaye,
El Houcine Bergou,
Hajar El Hammouti
Abstract:
Unmanned aerial vehicles (UAVs) are seen as a promising technology to perform a wide range of tasks in wireless communication networks. In this work, we consider the deployment of a group of UAVs to collect the data generated by IoT devices. Specifically, we focus on the case where the collected data is time-sensitive, and it is critical to maintain its timeliness. Our objective is to optimally de…
▽ More
Unmanned aerial vehicles (UAVs) are seen as a promising technology to perform a wide range of tasks in wireless communication networks. In this work, we consider the deployment of a group of UAVs to collect the data generated by IoT devices. Specifically, we focus on the case where the collected data is time-sensitive, and it is critical to maintain its timeliness. Our objective is to optimally design the UAVs' trajectories and the subsets of visited IoT devices such as the global Age-of-Updates (AoU) is minimized. To this end, we formulate the studied problem as a mixed-integer nonlinear programming (MINLP) under time and quality of service constraints. To efficiently solve the resulting optimization problem, we investigate the cooperative Multi-Agent Reinforcement Learning (MARL) framework and propose an RL approach based on the popular on-policy Reinforcement Learning (RL) algorithm: Policy Proximal Optimization (PPO). Our approach leverages the centralized training decentralized execution (CTDE) framework where the UAVs learn their optimal policies while training a centralized value function. Our simulation results show that the proposed MAPPO approach reduces the global AoU by at least a factor of 1/2 compared to conventional off-policy reinforcement learning approaches.
△ Less
Submitted 15 March, 2023;
originally announced March 2023.
-
Linear Scalarization for Byzantine-robust learning on non-IID data
Authors:
Latifa Errami,
El Houcine Bergou
Abstract:
In this work we study the problem of Byzantine-robust learning when data among clients is heterogeneous. We focus on poisoning attacks targeting the convergence of SGD. Although this problem has received great attention; the main Byzantine defenses rely on the IID assumption causing them to fail when data distribution is non-IID even with no attack. We propose the use of Linear Scalarization (LS)…
▽ More
In this work we study the problem of Byzantine-robust learning when data among clients is heterogeneous. We focus on poisoning attacks targeting the convergence of SGD. Although this problem has received great attention; the main Byzantine defenses rely on the IID assumption causing them to fail when data distribution is non-IID even with no attack. We propose the use of Linear Scalarization (LS) as an enhancing method to enable current defenses to circumvent Byzantine attacks in the non-IID setting. The LS method is based on the incorporation of a trade-off vector that penalizes the suspected malicious clients. Empirical analysis corroborates that the proposed LS variants are viable in the IID setting. For mild to strong non-IID data splits, LS is either comparable or outperforming current approaches under state-of-the-art Byzantine attack scenarios.
△ Less
Submitted 15 October, 2022;
originally announced October 2022.
-
Minibatch Stochastic Three Points Method for Unconstrained Smooth Minimization
Authors:
Soumia Boucherouite,
Grigory Malinovsky,
Peter Richtárik,
EL Houcine Bergou
Abstract:
In this paper, we propose a new zero order optimization method called minibatch stochastic three points (MiSTP) method to solve an unconstrained minimization problem in a setting where only an approximation of the objective function evaluation is possible. It is based on the recently proposed stochastic three points (STP) method (Bergou et al., 2020). At each iteration, MiSTP generates a random se…
▽ More
In this paper, we propose a new zero order optimization method called minibatch stochastic three points (MiSTP) method to solve an unconstrained minimization problem in a setting where only an approximation of the objective function evaluation is possible. It is based on the recently proposed stochastic three points (STP) method (Bergou et al., 2020). At each iteration, MiSTP generates a random search direction in a similar manner to STP, but chooses the next iterate based solely on the approximation of the objective function rather than its exact evaluations. We also analyze our method's complexity in the nonconvex and convex cases and evaluate its performance on multiple machine learning tasks.
△ Less
Submitted 16 September, 2022;
originally announced September 2022.
-
Age-of-Updates Optimization for UAV-assisted Networks
Authors:
Mouhamed Naby Ndiaye,
El Houcine Bergou,
Mounir Ghogho,
Hajar El Hammouti
Abstract:
Unmanned aerial vehicles (UAVs) have been proposed as a promising technology to collect data from IoT devices and relay it to the network. In this work, we are interested in scenarios where the data is updated periodically, and the collected updates are time-sensitive. In particular, the data updates may lose their value if they are not collected and analyzed timely. To maximize the data freshness…
▽ More
Unmanned aerial vehicles (UAVs) have been proposed as a promising technology to collect data from IoT devices and relay it to the network. In this work, we are interested in scenarios where the data is updated periodically, and the collected updates are time-sensitive. In particular, the data updates may lose their value if they are not collected and analyzed timely. To maximize the data freshness, we optimize a new performance metric, namely the Age-of-Updates (AoU). Our objective is to carefully schedule the UAVs hovering positions and the users' association so that the AoU is minimized. Unlike existing works where the association parameters are considered as binary variables, we assume that devices send their updates according to a probability distribution. As a consequence, instead of optimizing a deterministic objective function, the objective function is replaced by an expectation over the probability distribution. The expected AoU is therefore optimized under quality of service and energy constraints. The original problem being non-convex, we propose an equivalent convex optimization that we solve using an interior-point method. Our simulation results show the performance of the proposed approach against a binary association.
△ Less
Submitted 12 September, 2022;
originally announced September 2022.
-
Personalized Federated Learning with Communication Compression
Authors:
El Houcine Bergou,
Konstantin Burlachenko,
Aritra Dutta,
Peter Richtárik
Abstract:
In contrast to training traditional machine learning (ML) models in data centers, federated learning (FL) trains ML models over local datasets contained on resource-constrained heterogeneous edge devices. Existing FL algorithms aim to learn a single global model for all participating devices, which may not be helpful to all devices participating in the training due to the heterogeneity of the data…
▽ More
In contrast to training traditional machine learning (ML) models in data centers, federated learning (FL) trains ML models over local datasets contained on resource-constrained heterogeneous edge devices. Existing FL algorithms aim to learn a single global model for all participating devices, which may not be helpful to all devices participating in the training due to the heterogeneity of the data across the devices. Recently, Hanzely and Richtárik (2020) proposed a new formulation for training personalized FL models aimed at balancing the trade-off between the traditional global model and the local models that could be trained by individual devices using their private data only. They derived a new algorithm, called Loopless Gradient Descent (L2GD), to solve it and showed that this algorithms leads to improved communication complexity guarantees in regimes when more personalization is required. In this paper, we equip their L2GD algorithm with a bidirectional compression mechanism to further reduce the communication bottleneck between the local devices and the server. Unlike other compression-based algorithms used in the FL-setting, our compressed L2GD algorithm operates on a probabilistic communication protocol, where communication does not happen on a fixed schedule. Moreover, our compressed L2GD algorithm maintains a similar convergence rate as vanilla SGD without compression. To empirically validate the efficiency of our algorithm, we perform diverse numerical experiments on both convex and non-convex problems and using various compression techniques.
△ Less
Submitted 12 September, 2022;
originally announced September 2022.
-
Client Selection in Federated Learning based on Gradients Importance
Authors:
Ouiame Marnissi,
Hajar El Hammouti,
El Houcine Bergou
Abstract:
Federated learning (FL) enables multiple devices to collaboratively learn a global model without sharing their personal data. In real-world applications, the different parties are likely to have heterogeneous data distribution and limited communication bandwidth. In this paper, we are interested in improving the communication efficiency of FL systems. We investigate and design a device selection s…
▽ More
Federated learning (FL) enables multiple devices to collaboratively learn a global model without sharing their personal data. In real-world applications, the different parties are likely to have heterogeneous data distribution and limited communication bandwidth. In this paper, we are interested in improving the communication efficiency of FL systems. We investigate and design a device selection strategy based on the importance of the gradient norms. In particular, our approach consists of selecting devices with the highest norms of gradient values at each communication round. We study the convergence and the performance of such a selection technique and compare it to existing ones. We perform several experiments with non-iid set-up. The results show the convergence of our method with a considerable increase of test accuracy comparing to the random selection.
△ Less
Submitted 19 November, 2021;
originally announced November 2021.
-
On the Discrepancy between the Theoretical Analysis and Practical Implementations of Compressed Communication for Distributed Deep Learning
Authors:
Aritra Dutta,
El Houcine Bergou,
Ahmed M. Abdelmoniem,
Chen-Yu Ho,
Atal Narayan Sahu,
Marco Canini,
Panos Kalnis
Abstract:
Compressed communication, in the form of sparsification or quantization of stochastic gradients, is employed to reduce communication costs in distributed data-parallel training of deep neural networks. However, there exists a discrepancy between theory and practice: while theoretical analysis of most existing compression methods assumes compression is applied to the gradients of the entire model,…
▽ More
Compressed communication, in the form of sparsification or quantization of stochastic gradients, is employed to reduce communication costs in distributed data-parallel training of deep neural networks. However, there exists a discrepancy between theory and practice: while theoretical analysis of most existing compression methods assumes compression is applied to the gradients of the entire model, many practical implementations operate individually on the gradients of each layer of the model. In this paper, we prove that layer-wise compression is, in theory, better, because the convergence rate is upper bounded by that of entire-model compression for a wide range of biased and unbiased compression methods. However, despite the theoretical bound, our experimental study of six well-known methods shows that convergence, in practice, may or may not be better, depending on the actual trained model and compression ratio. Our findings suggest that it would be advantageous for deep learning frameworks to include support for both layer-wise and entire-model compression.
△ Less
Submitted 19 November, 2019;
originally announced November 2019.
-
A Stochastic Derivative Free Optimization Method with Momentum
Authors:
Eduard Gorbunov,
Adel Bibi,
Ozan Sener,
El Houcine Bergou,
Peter Richtárik
Abstract:
We consider the problem of unconstrained minimization of a smooth objective function in $\mathbb{R}^d$ in setting where only function evaluations are possible. We propose and analyze stochastic zeroth-order method with heavy ball momentum. In particular, we propose, SMTP, a momentum version of the stochastic three-point method (STP) \cite{Bergou_2018}. We show new complexity results for non-convex…
▽ More
We consider the problem of unconstrained minimization of a smooth objective function in $\mathbb{R}^d$ in setting where only function evaluations are possible. We propose and analyze stochastic zeroth-order method with heavy ball momentum. In particular, we propose, SMTP, a momentum version of the stochastic three-point method (STP) \cite{Bergou_2018}. We show new complexity results for non-convex, convex and strongly convex functions. We test our method on a collection of learning to continuous control tasks on several MuJoCo \cite{Todorov_2012} environments with varying difficulty and compare against STP, other state-of-the-art derivative-free optimization algorithms and against policy gradient methods. SMTP significantly outperforms STP and all other methods that we considered in our numerical experiments. Our second contribution is SMTP with importance sampling which we call SMTP_IS. We provide convergence analysis of this method for non-convex, convex and strongly convex objectives.
△ Less
Submitted 16 February, 2020; v1 submitted 30 May, 2019;
originally announced May 2019.
-
Direct Nonlinear Acceleration
Authors:
Aritra Dutta,
El Houcine Bergou,
Yunming Xiao,
Marco Canini,
Peter Richtárik
Abstract:
Optimization acceleration techniques such as momentum play a key role in state-of-the-art machine learning algorithms. Recently, generic vector sequence extrapolation techniques, such as regularized nonlinear acceleration (RNA) of Scieur et al., were proposed and shown to accelerate fixed point iterations. In contrast to RNA which computes extrapolation coefficients by (approximately) setting the…
▽ More
Optimization acceleration techniques such as momentum play a key role in state-of-the-art machine learning algorithms. Recently, generic vector sequence extrapolation techniques, such as regularized nonlinear acceleration (RNA) of Scieur et al., were proposed and shown to accelerate fixed point iterations. In contrast to RNA which computes extrapolation coefficients by (approximately) setting the gradient of the objective function to zero at the extrapolated point, we propose a more direct approach, which we call direct nonlinear acceleration (DNA). In DNA, we aim to minimize (an approximation of) the function value at the extrapolated point instead. We adopt a regularized approach with regularizers designed to prevent the model from entering a region in which the functional approximation is less precise. While the computational cost of DNA is comparable to that of RNA, our direct approach significantly outperforms RNA on both synthetic and real-world datasets. While the focus of this paper is on convex problems, we obtain very encouraging results in accelerating the training of neural networks.
△ Less
Submitted 28 May, 2019;
originally announced May 2019.
-
Stochastic Three Points Method for Unconstrained Smooth Minimization
Authors:
El Houcine Bergou,
Eduard Gorbunov,
Peter Richtárik
Abstract:
In this paper we consider the unconstrained minimization problem of a smooth function in ${\mathbb{R}}^n$ in a setting where only function evaluations are possible. We design a novel randomized derivative-free algorithm --- the stochastic three points (STP) method --- and analyze its iteration complexity. At each iteration, STP generates a random search direction according to a certain fixed proba…
▽ More
In this paper we consider the unconstrained minimization problem of a smooth function in ${\mathbb{R}}^n$ in a setting where only function evaluations are possible. We design a novel randomized derivative-free algorithm --- the stochastic three points (STP) method --- and analyze its iteration complexity. At each iteration, STP generates a random search direction according to a certain fixed probability law. Our assumptions on this law are very mild: roughly speaking, all laws which do not concentrate all measure on any halfspace passing through the origin will work. For instance, we allow for the uniform distribution on the sphere and also distributions that concentrate all measure on a positive spanning set.
Given a current iterate $x$, STP compares the objective function at three points: $x$, $x+αs$ and $x-αs$, where $α>0$ is a stepsize parameter and $s$ is the random search direction. The best of these three points is the next iterate. We analyze the method STP under several stepsize selection schemes (fixed, decreasing, estimated through finite differences, etc). We study non-convex, convex and strongly convex cases.
△ Less
Submitted 7 May, 2019; v1 submitted 10 February, 2019;
originally announced February 2019.
-
A Stochastic Derivative-Free Optimization Method with Importance Sampling: Theory and Learning to Control
Authors:
Adel Bibi,
El Houcine Bergou,
Ozan Sener,
Bernard Ghanem,
Peter Richtárik
Abstract:
We consider the problem of unconstrained minimization of a smooth objective function in $\R^n$ in a setting where only function evaluations are possible. While importance sampling is one of the most popular techniques used by machine learning practitioners to accelerate the convergence of their models when applicable, there is not much existing theory for this acceleration in the derivative-free s…
▽ More
We consider the problem of unconstrained minimization of a smooth objective function in $\R^n$ in a setting where only function evaluations are possible. While importance sampling is one of the most popular techniques used by machine learning practitioners to accelerate the convergence of their models when applicable, there is not much existing theory for this acceleration in the derivative-free setting. In this paper, we propose the first derivative free optimization method with importance sampling and derive new improved complexity results on non-convex, convex and strongly convex functions. We conduct extensive experiments on various synthetic and real LIBSVM datasets confirming our theoretical results. We further test our method on a collection of continuous control tasks on MuJoCo environments with varying difficulty. Experiments suggest that our algorithm is practical for high dimensional continuous control problems where importance sampling results in a significant sample complexity improvement.
△ Less
Submitted 2 April, 2020; v1 submitted 4 February, 2019;
originally announced February 2019.
-
A Line-Search Algorithm Inspired by the Adaptive Cubic Regularization Framework and Complexity Analysis
Authors:
El houcine Bergou,
Youssef Diouane,
Serge Gratton
Abstract:
Adaptive regularized framework using cubics has emerged as an alternative to line-search and trust-region algorithms for smooth nonconvex optimization, with an optimal complexity amongst second-order methods. In this paper, we propose and analyze the use of an iteration dependent scaled norm in the adaptive regularized framework using cubics. Within such scaled norm, the obtained method behaves as…
▽ More
Adaptive regularized framework using cubics has emerged as an alternative to line-search and trust-region algorithms for smooth nonconvex optimization, with an optimal complexity amongst second-order methods. In this paper, we propose and analyze the use of an iteration dependent scaled norm in the adaptive regularized framework using cubics. Within such scaled norm, the obtained method behaves as a line-search algorithm along the quasi-Newton direction with a special backtracking strategy. Under appropriate assumptions, the new algorithm enjoys the same convergence and complexity properties as adaptive regularized algorithm using cubics. The complexity for finding an approximate first-order stationary point can be improved to be optimal whenever a second order version of the proposed algorithm is regarded. In a similar way, using the same scaled norm to define the trust-region neighborhood, we show that the trust-region algorithm behaves as a line-search algorithm. The good potential of the obtained algorithms is shown on a set of large scale optimization problems.
△ Less
Submitted 29 May, 2018;
originally announced May 2018.
-
On the Convergence of a Non-linear Ensemble Kalman Smoother
Authors:
El houcine Bergou,
Serge Gratton,
Jan Mandel
Abstract:
Ensemble methods, such as the ensemble Kalman filter (EnKF), the local ensemble transform Kalman filter (LETKF), and the ensemble Kalman smoother (EnKS) are widely used in sequential data assimilation, where state vectors are of huge dimension. Little is known, however, about the asymptotic behavior of ensemble methods. In this paper, we prove convergence in L^p of ensemble Kalman smoother to the…
▽ More
Ensemble methods, such as the ensemble Kalman filter (EnKF), the local ensemble transform Kalman filter (LETKF), and the ensemble Kalman smoother (EnKS) are widely used in sequential data assimilation, where state vectors are of huge dimension. Little is known, however, about the asymptotic behavior of ensemble methods. In this paper, we prove convergence in L^p of ensemble Kalman smoother to the Kalman smoother in the large-ensemble limit, as well as the convergence of EnKS-4DVAR, which is a Levenberg-Marquardt-like algorithm with EnKS as the linear solver, to the classical Levenberg-Marquardt algorithm in which the linearized problem is solved exactly.
△ Less
Submitted 17 November, 2014;
originally announced November 2014.