-
Probabilistic Approach to Black-Box Binary Optimization with Budget Constraints: Application to Sensor Placement
Authors:
Ahmed Attia
Abstract:
We present a fully probabilistic approach for solving binary optimization problems with black-box objective functions and with budget constraints. In the probabilistic approach, the optimization variable is viewed as a random variable and is associated with a parametric probability distribution. The original optimization problem is replaced with an optimization over the expected value of the original objective, which is then optimized over the probability distribution parameters. The resulting optimal parameter (optimal policy) is used to sample the binary space to produce estimates of the optimal solution(s) of the original binary optimization problem. The probability distribution is chosen from the family of Bernoulli models because the optimization variable is binary. The optimization constraints generally restrict the feasibility region. This can be achieved by modeling the random variable with a conditional distribution given satisfiability of the constraints. Thus, in this work we develop conditional Bernoulli distributions to model the random variable conditioned by the total number of nonzero entries, that is, the budget constraint. This approach (a) is generally applicable to binary optimization problems with nonstochastic black-box objective functions and budget constraints; (b) accounts for budget constraints by employing conditional probabilities that sample only the feasible region and thus considerably reduces the computational cost compared with employing soft constraints; and (c) does not employ soft constraints and thus does not require tuning of a regularization parameter, for example to promote sparsity, which is challenging in sensor placement optimization problems. The proposed approach is verified numerically by using an idealized bilinear binary optimization problem and is validated by using a sensor placement experiment in a parameter identification setup.
Submitted 9 June, 2024;
originally announced June 2024.
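To make the conditional Bernoulli idea in the abstract above concrete, here is a minimal Python sketch. It assumes independent Bernoulli entries with success probabilities strictly between 0 and 1 and a budget that fixes the exact number of nonzero entries. The function names `conditional_bernoulli_pmf` and `expected_objective` are hypothetical, and brute-force enumeration is used only for illustration; it is not the paper's algorithm and does not scale beyond small dimensions.

```python
from itertools import combinations
import math


def conditional_bernoulli_pmf(p, budget):
    """Distribution of x ~ Bernoulli(p) conditioned on sum(x) == budget.

    Returns a dict mapping each feasible support (tuple of active indices)
    to its conditional probability. Assumes 0 < p[i] < 1 for every i.
    """
    n = len(p)
    weights = {}
    for active in combinations(range(n), budget):
        chosen = set(active)
        # Unconditional probability of this particular binary vector.
        weights[active] = math.prod(
            p[i] if i in chosen else 1.0 - p[i] for i in range(n)
        )
    total = sum(weights.values())  # probability that the budget is met
    return {active: w / total for active, w in weights.items()}


def expected_objective(objective, p, budget):
    """Expected value of a black-box objective over the feasible region only."""
    n = len(p)
    pmf = conditional_bernoulli_pmf(p, budget)
    return sum(
        prob * objective([1 if i in active else 0 for i in range(n)])
        for active, prob in pmf.items()
    )
```

Because every sampled vector already satisfies the budget, no penalty term or regularization parameter appears in the expectation, which is the point made in items (b) and (c) of the abstract.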
-
Continued Pretraining for Domain Adaptation of Wav2vec2.0 in Automatic Speech Recognition for Elementary Math Classroom Settings
Authors:
Ahmed Adel Attia,
Dorottya Demszky,
Tolulope Ogunremi,
Jing Liu,
Carol Espy-Wilson
Abstract:
Creating Automatic Speech Recognition (ASR) systems that are robust and resilient to classroom conditions is paramount to the development of AI tools to aid teachers and students. In this work, we study the efficacy of continued pretraining (CPT) in adapting Wav2vec2.0 to the classroom domain. We show that CPT is a powerful tool in that regard and reduces the Word Error Rate (WER) of Wav2vec2.0-based models by upwards of 10%. More specifically, CPT improves the model's robustness to different noises, microphones, classroom conditions as well as classroom demographics. Our CPT models show improved ability to generalize to different demographics unseen in the labeled finetuning data.
Submitted 15 May, 2024;
originally announced May 2024.
-
New upper bound of muon neutrino mass in a short-baseline experiment
Authors:
A. M. Attia,
I. G. Márián,
B. Ujvári
Abstract:
In the paper Int.J.Mod.Phys.E 23 (2014) 1450004, the potential of short-baseline experiments was proposed to measure the mass (and parameters of Lorentz-violating effects) of the muon neutrino, where a roughly estimated upper bound of 420 eV was given as a possibility with large unknown uncertainties. In the present work, we improve upon this study by focusing on a feasible and improved experimental setup with today's technology, eliminating most large uncertainties, with the use of the Geant4 simulation toolkit. High-energy protons collide with a tungsten target, producing a variety of particles, most importantly pions that decay into muon neutrinos. The detector records the time of flight for both muon and anti-muon neutrinos, utilizing light as a reference signal. Additionally, it captures the energy deposited by neutrinos. By applying the dispersion relation, we determine the muon and/or anti-muon neutrino mass. Our improved results reveal a less optimistic but more accurate and realistic estimated upper bound of the muon neutrino mass, providing a new limit of about 150 keV. Notably, this finding is a factor of three lower than the best upper bound previously established in the literature originating from pion decay in flight.
Submitted 2 May, 2024;
originally announced May 2024.
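The abstract above says only that the mass is obtained "by applying the dispersion relation". As a hedged illustration, the standard relativistic relations below show how a time-of-flight delay relative to light translates into a mass estimate. The symbols $L$ (baseline length), $E$ (neutrino energy), and $\Delta t$ (delay relative to the light signal) are notation assumed here, and the exact estimator used by the authors may differ.

```latex
\begin{align}
  E^2 &= p^2 c^2 + m^2 c^4, \\
  \frac{v}{c} &= \frac{p c}{E} = \sqrt{1 - \frac{m^2 c^4}{E^2}}, \\
  \Delta t &= \frac{L}{v} - \frac{L}{c}
            \approx \frac{L}{c}\,\frac{m^2 c^4}{2 E^2}
            \quad (m c^2 \ll E), \\
  m c^2 &\approx E \sqrt{\frac{2 c\, \Delta t}{L}} .
\end{align}
```

For a fixed timing resolution, the last line shows that the attainable mass bound scales linearly with the neutrino energy and improves with the square root of the baseline length.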
-
Benchmarking formalisms for dynamic structure system Modeling and Simulation
Authors:
Aya Attia,
Clément Foucher,
Luiz Fernando Lavado Villa
Abstract:
Modeling and simulation of complex systems is key to exploring system dynamics. Many scientific approaches have been developed to represent dynamic structure systems, but most of these approaches are efficient for some kinds of systems and inefficient for others. Which approach can be adopted for different categories of dynamic structure systems is a topic of interest for many researchers and has not yet been fully resolved. Therefore it is essential to explore the existing approaches, understand them, and identify gaps. To fulfil this goal, we identified criteria at stake for a smooth flow from model creation to its simulation for dynamic structure systems. Using these criteria, we benchmark the existing modeling formalisms, with a focus on DEVS extensions, and use the results to identify and discuss gaps in these approaches.
Submitted 25 January, 2024;
originally announced April 2024.
-
A Note on High-Probability Analysis of Algorithms with Exponential, Sub-Gaussian, and General Light Tails
Authors:
Amit Attia,
Tomer Koren
Abstract:
This short note describes a simple technique for analyzing probabilistic algorithms that rely on a light-tailed (but not necessarily bounded) source of randomization. We show that the analysis of such an algorithm can be reduced, in a black-box manner and with only a small loss in logarithmic factors, to an analysis of a simpler variant of the same algorithm that uses bounded random variables and is often easier to analyze. This approach simultaneously applies to any light-tailed randomization, including exponential, sub-Gaussian, and more general fast-decaying distributions, without needing to appeal to specialized concentration inequalities. Analyses of a generalized Azuma inequality and stochastic optimization with general light-tailed noise are provided to illustrate the technique.
Submitted 5 March, 2024;
originally announced March 2024.
-
How Free is Parameter-Free Stochastic Optimization?
Authors:
Amit Attia,
Tomer Koren
Abstract:
We study the problem of parameter-free stochastic optimization, inquiring whether, and under what conditions, do fully parameter-free methods exist: these are methods that achieve convergence rates competitive with optimally tuned methods, without requiring significant knowledge of the true problem parameters. Existing parameter-free methods can only be considered ``partially'' parameter-free, as they require some non-trivial knowledge of the true problem parameters, such as a bound on the stochastic gradient norms, a bound on the distance to a minimizer, etc. In the non-convex setting, we demonstrate that a simple hyperparameter search technique results in a fully parameter-free method that outperforms more sophisticated state-of-the-art algorithms. We also provide a similar result in the convex setting with access to noisy function values under mild noise assumptions. Finally, assuming only access to stochastic gradients, we establish a lower bound that renders fully parameter-free stochastic convex optimization infeasible, and provide a method which is (partially) parameter-free up to the limit indicated by our lower bound.
Submitted 18 March, 2024; v1 submitted 5 February, 2024;
originally announced February 2024.
-
Centralized calibration of power system dynamic models using variational data assimilation
Authors:
Ahmed Attia,
D. Adrian Maldonado,
Emil Constantinescu,
Mihai Anitescu
Abstract:
This paper presents a novel centralized, variational data assimilation approach for calibrating transient dynamic models in electrical power systems, focusing on load model parameters. With the increasing importance of inverter-based resources, assessing power systems' dynamic performance under disturbances has become challenging, necessitating robust model calibration methods. The proposed approach expands on previous Bayesian frameworks by establishing a posterior distribution of parameters using an approximation around the maximum a posteriori value. We illustrate the efficacy of our method by generating events of varying intensity, highlighting its ability to capture the systems' evolution accurately and with associated uncertainty estimates. This research improves the precision of dynamic performance assessments in modern power systems, with potential applications in managing uncertainties and optimizing system operations.
Submitted 13 November, 2023;
originally announced November 2023.
-
Heuristic Algorithms for Placing Geomagnetically Induced Current Blocking Devices
Authors:
Minseok Ryu,
Ahmed Attia,
Arthur Barnes,
Russell Bent,
Sven Leyffer,
Adam Mate
Abstract:
We propose a new heuristic approach for solving the challenge of determining optimal placements for geomagnetically induced current blocking devices on electrical grids. Traditionally, these determinations are approached by formulating the problem as mixed-integer nonlinear programming models and solving them using optimization solvers based on the spatial branch-and-bound algorithm. However, computing an optimal solution using the solvers often demands substantial computational time due to their inability to leverage the inherent problem structure. Therefore, in this work we propose a new heuristic approach based on a three-block alternating direction method of multipliers algorithm, and we compare it with an existing stochastic learning algorithm. Both heuristics exploit the structure of the problem of interest. We test these heuristic approaches through extensive numerical experiments conducted on the EPRI-21 and UIUC-150 test systems. The outcomes showcase the superior performance of our methodologies in terms of both solution quality and computational speed when compared with conventional solvers.
Submitted 13 October, 2023;
originally announced October 2023.
-
Improving Speech Inversion Through Self-Supervised Embeddings and Enhanced Tract Variables
Authors:
Ahmed Adel Attia,
Yashish M. Siriwardena,
Carol Espy-Wilson
Abstract:
The performance of deep learning models depends significantly on their capacity to encode input features efficiently and decode them into meaningful outputs. Better input and output representation has the potential to boost models' performance and generalization. In the context of acoustic-to-articulatory speech inversion (SI) systems, we study the impact of utilizing speech representations acquired via self-supervised learning (SSL) models, such as HuBERT compared to conventional acoustic features. Additionally, we investigate the incorporation of novel tract variables (TVs) through an improved geometric transformation model. By combining these two approaches, we improve the Pearson product-moment correlation (PPMC) scores which evaluate the accuracy of TV estimation of the SI system from 0.7452 to 0.8141, a 6.9% increase. Our findings underscore the profound influence of rich feature representations from SSL models and improved geometric transformations with target TVs on the enhanced functionality of SI systems.
Submitted 17 September, 2023;
originally announced September 2023.
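The PPMC score quoted in the abstract above is the ordinary Pearson correlation between an estimated and a reference trajectory. A minimal sketch follows; the function name `pearson_correlation` is hypothetical, and how scores are averaged over tract variables and utterances is not stated in the abstract, so it is left out.

```python
import math


def pearson_correlation(estimated, reference):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(estimated)
    if n != len(reference) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_e = sum(estimated) / n
    mean_r = sum(reference) / n
    cov = sum((e - mean_e) * (r - mean_r) for e, r in zip(estimated, reference))
    var_e = sum((e - mean_e) ** 2 for e in estimated)
    var_r = sum((r - mean_r) ** 2 for r in reference)
    if var_e == 0.0 or var_r == 0.0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_e * var_r)
```

For reference, moving from 0.7452 to 0.8141 is an absolute gain of 0.0689 in correlation, which matches the quoted 6.9 figure when read as percentage points; relative to the baseline it is a gain of about 9.2%.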
-
Kid-Whisper: Towards Bridging the Performance Gap in Automatic Speech Recognition for Children VS. Adults
Authors:
Ahmed Adel Attia,
Jing Liu,
Wei Ai,
Dorottya Demszky,
Carol Espy-Wilson
Abstract:
Recent advancements in Automatic Speech Recognition (ASR) systems, exemplified by Whisper, have demonstrated the potential of these systems to approach human-level performance given sufficient data. However, this progress doesn't readily extend to ASR for children due to the limited availability of suitable child-specific databases and the distinct characteristics of children's speech. A recent study investigated leveraging the My Science Tutor (MyST) children's speech corpus to enhance Whisper's performance in recognizing children's speech. They were able to demonstrate some improvement on a limited testset. This paper builds on these findings by enhancing the utility of the MyST dataset through more efficient data preprocessing. We reduce the Word Error Rate (WER) on the MyST testset from 13.93% to 9.11% with Whisper-Small and from 13.23% to 8.61% with Whisper-Medium and show that this improvement can be generalized to unseen datasets. We also highlight important challenges towards improving children's ASR performance. The results showcase the viable and efficient integration of Whisper for effective children's speech recognition.
Submitted 15 May, 2024; v1 submitted 12 September, 2023;
originally announced September 2023.
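For readers unfamiliar with the metric in the abstract above, the following sketch computes WER as the word-level edit distance divided by the number of reference words. The function names are hypothetical, and the text normalization applied in the paper (casing, punctuation, number formatting) is not described in the abstract, so this version simply splits on whitespace.

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the processed reference prefix and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1        # reference word missing from hypothesis
            insertion = curr[j - 1] + 1   # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(before, after):
    """Fractional reduction of an error rate, e.g. 0.25 for a 25% relative drop."""
    return (before - after) / before
```

With the figures quoted in the abstract, `relative_reduction(13.93, 9.11)` and `relative_reduction(13.23, 8.61)` both come out at roughly 0.35, that is, about a 35% relative WER reduction for each model size.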
-
Enhancing Speech Articulation Analysis using a Geometric Transformation of the X-ray Microbeam Dataset
Authors:
Ahmed Adel Attia,
Mark Tiede,
Carol Y. Espy-Wilson
Abstract:
Accurate analysis of speech articulation is crucial for speech analysis. However, X-Y coordinates of articulators strongly depend on the anatomy of the speakers and the variability of pellet placements, and existing methods for mapping anatomical landmarks in the X-ray Microbeam Dataset (XRMB) fail to capture the entire anatomy of the vocal tract. In this paper, we propose a new geometric transformation that improves the accuracy of these measurements. Our transformation maps anatomical landmarks' X-Y coordinates along the midsagittal plane onto six relative measures: Lip Aperture (LA), Lip Protrusion (LP), Tongue Body Constriction Location (TBCL) and Degree (TBCD), and Tongue Tip Constriction Location (TTCL) and Degree (TTCD). Our novel contribution is the extension of the palate trace towards the inferred anterior pharyngeal line, which improves measurements of tongue body constriction.
Submitted 28 September, 2023; v1 submitted 18 May, 2023;
originally announced May 2023.
-
Robust A-Optimal Experimental Design for Bayesian Inverse Problems
Authors:
Ahmed Attia,
Sven Leyffer,
Todd Munson
Abstract:
Optimal design of experiments for Bayesian inverse problems has recently gained wide popularity and attracted much attention, especially in the computational science and Bayesian inversion communities. An optimal design maximizes a predefined utility function that is formulated in terms of the elements of an inverse problem, an example being optimal sensor placement for parameter identification. The state-of-the-art algorithmic approaches following this simple formulation generally overlook misspecification of the elements of the inverse problem, such as the prior or the measurement uncertainties. This work presents an efficient algorithmic approach for designing optimal experimental design schemes for Bayesian inverse problems such that the optimal design is robust to misspecification of elements of the inverse problem. Specifically, we consider a worst-case scenario approach for the uncertain or misspecified parameters, formulate robust objectives, and propose an algorithmic approach for optimizing such objectives. Both relaxation and stochastic solution approaches are discussed with detailed analysis and insight into the interpretation of the problem and the proposed algorithmic approach. Extensive numerical experiments to validate and analyze the proposed approach are carried out for sensor placement in a parameter identification problem.
Submitted 5 May, 2023;
originally announced May 2023.
-
SGD with AdaGrad Stepsizes: Full Adaptivity with High Probability to Unknown Parameters, Unbounded Gradients and Affine Variance
Authors:
Amit Attia,
Tomer Koren
Abstract:
We study Stochastic Gradient Descent with AdaGrad stepsizes: a popular adaptive (self-tuning) method for first-order stochastic optimization. Despite being well studied, existing analyses of this method suffer from various shortcomings: they either assume some knowledge of the problem parameters, impose strong global Lipschitz conditions, or fail to give bounds that hold with high probability. We provide a comprehensive analysis of this basic method without any of these limitations, in both the convex and non-convex (smooth) cases, that additionally supports a general ``affine variance'' noise model and provides sharp rates of convergence in both the low-noise and high-noise regimes.
Submitted 11 June, 2023; v1 submitted 17 February, 2023;
originally announced February 2023.
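As an illustration of the method named in the abstract above, the sketch below implements SGD with a scalar AdaGrad stepsize, often called AdaGrad-norm, where the step at iteration t is `eta / sqrt(gamma**2 + sum of squared gradient norms so far)`. That this scalar form matches the variant analyzed in the paper is an assumption here, and the names `sgd_adagrad_norm`, `eta`, and `gamma` are illustrative.

```python
import math


def sgd_adagrad_norm(grad_fn, x0, eta=1.0, gamma=1e-8, steps=200):
    """Run SGD where the stepsize adapts to the accumulated gradient norms.

    grad_fn(x) returns a (possibly stochastic) gradient as a list of floats.
    gamma > 0 keeps the first stepsize finite when the first gradient is zero.
    """
    x = list(x0)
    sum_sq = gamma ** 2
    for _ in range(steps):
        g = grad_fn(x)
        sum_sq += sum(gi * gi for gi in g)   # accumulate ||g_t||^2
        step = eta / math.sqrt(sum_sq)       # no smoothness or noise constants needed
        x = [xi - step * gi for xi, gi in zip(x, g)]
    return x
```

The stepsize shrinks automatically as gradients accumulate, which is why the method needs no prior knowledge of the smoothness constant or the noise level.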
-
PyOED: An Extensible Suite for Data Assimilation and Model-Constrained Optimal Design of Experiments
Authors:
Abhijit Chowdhary,
Shady E. Ahmed,
Ahmed Attia
Abstract:
This paper describes PyOED, a highly extensible scientific package that enables developing and testing model-constrained optimal experimental design (OED) for inverse problems. Specifically, PyOED aims to be a comprehensive Python toolkit for model-constrained OED. The package targets scientists and researchers interested in understanding the details of OED formulations and approaches. It is also meant to enable researchers to experiment with standard and innovative OED technologies with a wide range of test problems (e.g., simulation models). OED, inverse problems (e.g., Bayesian inversion), and data assimilation (DA) are closely related research fields, and their formulations overlap significantly. Thus, PyOED is continuously being expanded with a plethora of Bayesian inversion, DA, and OED methods as well as new scientific simulation models, observation error models, and observation operators. These pieces are added such that they can be permuted to enable testing OED methods in various settings of varying complexities. The PyOED core is completely written in Python and utilizes the inherent object-oriented capabilities; however, the current version of PyOED is meant to be extensible rather than scalable. Specifically, PyOED is developed to enable rapid development and benchmarking of OED methods with minimal coding effort and to maximize code reutilization. This paper provides a brief description of the PyOED layout and philosophy and provides a set of exemplary test cases and tutorials to demonstrate the potential of the package.
Submitted 19 December, 2023; v1 submitted 19 January, 2023;
originally announced January 2023.
-
Masked Autoencoders Are Articulatory Learners
Authors:
Ahmed Adel Attia,
Carol Espy-Wilson
Abstract:
Articulatory recordings track the positions and motion of different articulators along the vocal tract and are widely used to study speech production and to develop speech technologies such as articulatory-based speech synthesizers and speech inversion systems. The University of Wisconsin X-Ray microbeam (XRMB) dataset is one of various datasets that provide articulatory recordings synced with audio recordings. The XRMB articulatory recordings employ pellets placed on a number of articulators which can be tracked by the microbeam. However, a significant portion of the articulatory recordings are mistracked, and have been so far unusable. In this work, we present a deep learning based approach using Masked Autoencoders to accurately reconstruct the mistracked articulatory recordings for 41 out of 47 speakers of the XRMB dataset. Our model is able to reconstruct articulatory trajectories that closely match ground truth, even when three out of eight articulators are mistracked, and retrieve 3.28 out of 3.4 hours of previously unusable recordings.
Submitted 18 May, 2023; v1 submitted 27 October, 2022;
originally announced October 2022.
-
Uniform Stability for First-Order Empirical Risk Minimization
Authors:
Amit Attia,
Tomer Koren
Abstract:
We consider the problem of designing uniformly stable first-order optimization algorithms for empirical risk minimization. Uniform stability is often used to obtain generalization error bounds for optimization algorithms, and we are interested in a general approach to achieve it. For Euclidean geometry, we suggest a black-box conversion which, given a smooth optimization algorithm, produces a uniformly stable version of the algorithm while maintaining its convergence rate up to logarithmic factors. Using this reduction we obtain a (nearly) optimal algorithm for smooth optimization with convergence rate $\widetilde{O}(1/T^2)$ and uniform stability $O(T^2/n)$, resolving an open problem of Chen et al. (2018); Attia and Koren (2021). For more general geometries, we develop a variant of Mirror Descent for smooth optimization with convergence rate $\widetilde{O}(1/T)$ and uniform stability $O(T/n)$, leaving open the question of devising a general conversion method as in the Euclidean case.
Submitted 17 July, 2022;
originally announced July 2022.
-
Audio Data Augmentation for Acoustic-to-articulatory Speech Inversion using Bidirectional Gated RNNs
Authors:
Yashish M. Siriwardena,
Ahmed Adel Attia,
Ganesh Sivaraman,
Carol Espy-Wilson
Abstract:
Data augmentation has proven to be a promising prospect in improving the performance of deep learning models by adding variability to training data. In previous work on developing a noise-robust acoustic-to-articulatory speech inversion system, we have shown the importance of noise augmentation to improve the performance of speech inversion in noisy speech. In this work, we compare and contrast different ways of doing data augmentation and show how this technique improves the performance of articulatory speech inversion not only on noisy speech, but also on clean speech data. We also propose a Bidirectional Gated Recurrent Neural Network as the speech inversion system instead of the previously used feed forward neural network. The inversion system uses mel-frequency cepstral coefficients (MFCCs) as the input acoustic features and six vocal tract-variables (TVs) as the output articulatory features. The performance of the system was measured by computing the correlation between estimated and actual TVs on the U. Wisc. X-ray Microbeam database. The proposed speech inversion system shows a 5% relative improvement in correlation over the baseline noise robust system for clean speech data. The pre-trained model, when adapted to each unseen speaker in the test set, improves the average correlation by another 6%.
Submitted 31 May, 2023; v1 submitted 25 May, 2022;
originally announced May 2022.
-
A Hybrid Genetic-Fuzzy Controller for a 14-inches Astronomical Telescope Tracking
Authors:
Doaa Eid,
Abdel-Fattah Attia,
Said Elmasry,
Islam Helmy
Abstract:
The performance of a telescope depends strongly on its operating conditions. During pointing, the telescope can move at a relatively high velocity, and the system can tolerate trajectory position errors higher than during tracking. On the contrary, during tracking, Alt-Az telescopes generally move slower but still in a large dynamic range. In this case, the position errors must be as close to zero as possible. Tracking is one of the essential factors that affect the quality of astronomical observations. In this paper, a hybrid Genetic-Fuzzy approach to control the movement of a two-link direct-drive Celestron telescope is introduced. The proposed controller uses the Genetic algorithm (GA) for optimizing a fuzzy logic controller (FLC) to improve the tracking of the 14-inches Celestron telescope of the Kottamia Astronomical Observatory (KAO). The fuzzy logic input is a vector of the position error and its rate of change, and the output is torque. The GA objective function used here is the Integral Time Absolute Error (ITAE). The proposed method is compared with a conventional Proportional-Differential (PD) controller, an optimized PD controller with a GA, and a Fuzzy controller. The results show the effectiveness of the proposed controller to improve the dynamic response of the overall system.
Submitted 22 June, 2021;
originally announced June 2021.
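The ITAE criterion named in the abstract above is the integral of time multiplied by the absolute error, so late errors cost more than early ones. A minimal sketch of a discrete approximation follows; the trapezoidal rule and the function name `itae` are assumptions made here, since the abstract does not say how the integral is evaluated.

```python
def itae(times, errors):
    """Trapezoidal approximation of the integral of t * |e(t)| over the samples.

    times  -- increasing sample times in seconds
    errors -- tracking error e(t) at each sample time
    """
    if len(times) != len(errors) or len(times) < 2:
        raise ValueError("need at least two matching samples")
    total = 0.0
    for k in range(1, len(times)):
        dt = times[k] - times[k - 1]
        f_prev = times[k - 1] * abs(errors[k - 1])
        f_curr = times[k] * abs(errors[k])
        total += 0.5 * dt * (f_prev + f_curr)
    return total
```

A genetic algorithm that minimizes this value over controller parameters favors responses that settle quickly, because any residual error late in the run is weighted by a large t.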
-
Algorithmic Instabilities of Accelerated Gradient Descent
Authors:
Amit Attia,
Tomer Koren
Abstract:
We study the algorithmic stability of Nesterov's accelerated gradient method. For convex quadratic objectives, Chen et al. (2018) proved that the uniform stability of the method grows quadratically with the number of optimization steps, and conjectured that the same is true for the general convex and smooth case. We disprove this conjecture and show, for two notions of algorithmic stability (including uniform stability), that the stability of Nesterov's accelerated method in fact deteriorates exponentially fast with the number of gradient steps. This stands in sharp contrast to the bounds in the quadratic case, but also to known results for non-accelerated gradient methods where stability typically grows linearly with the number of steps.
Submitted 19 June, 2021; v1 submitted 3 February, 2021;
originally announced February 2021.
-
Stochastic Learning Approach to Binary Optimization for Optimal Design of Experiments
Authors:
Ahmed Attia,
Sven Leyffer,
Todd Munson
Abstract:
We present a novel stochastic approach to binary optimization for optimal experimental design (OED) for Bayesian inverse problems governed by mathematical models such as partial differential equations. The OED utility function, namely, the regularized optimality criterion, is cast into a stochastic objective function in the form of an expectation over a multivariate Bernoulli distribution. The probabilistic objective is then solved by using a stochastic optimization routine to find an optimal observational policy. The proposed approach is analyzed from an optimization perspective and also from a machine learning perspective with correspondence to policy gradient reinforcement learning. The approach is demonstrated numerically by using an idealized two-dimensional Bayesian linear inverse problem, and validated by extensive numerical experiments carried out for sensor placement in a parameter identification setup.
Submitted 14 January, 2021;
originally announced January 2021.
-
Optimal Experimental Design for Inverse Problems in the Presence of Observation Correlations
Authors:
Ahmed Attia,
Emil Constantinescu
Abstract:
Optimal experimental design (OED) is the general formalism of sensor placement and decisions about the data collection strategy for engineered or natural experiments. This approach is prevalent in many critical fields such as battery design, numerical weather prediction, geosciences, and environmental and urban studies. State-of-the-art computational methods for experimental design, however, do not accommodate correlation structure in observational errors produced by many expensive-to-operate devices such as X-ray machines or radar and satellite retrievals. Discarding evident data correlations leads to biased results, poor data collection decisions, and waste of valuable resources. We present a general formulation of the OED formalism for model-constrained large-scale Bayesian linear inverse problems, where measurement errors are generally correlated. The proposed approach utilizes the Hadamard product of matrices to formulate the weighted likelihood and is valid for both finite- and infinite-dimensional Bayesian inverse problems. We also discuss widely used approaches for relaxation of the binary OED problem, in light of the proposed pointwise weighting approach, and present a clear interpretation of the relaxed design and its effect on the observational error covariance. Extensive numerical experiments are carried out for empirical verification of the proposed approach by using an advection-diffusion model, where the objective is to optimally place a small set of sensors, under a limited budget, to predict the concentration of a contaminant in a bounded domain.
Submitted 25 June, 2022; v1 submitted 28 July, 2020;
originally announced July 2020.
-
Asymmetric Leaky Private Information Retrieval
Authors:
Islam Samy,
Mohamed A. Attia,
Ravi Tandon,
Loukas Lazos
Abstract:
Information-theoretic formulations of the private information retrieval (PIR) problem have been investigated under a variety of scenarios. Symmetric private information retrieval (SPIR) is a variant where a user is able to privately retrieve one out of $K$ messages from $N$ non-colluding replicated databases without learning anything about the remaining $K-1$ messages. However, the goal of perfect privacy can be too taxing for certain applications. In this paper, we investigate whether the information-theoretic capacity of SPIR (equivalently, the inverse of the minimum download cost) can be increased by relaxing both user and DB privacy definitions. Such relaxation is relevant in applications where privacy can be traded for communication efficiency. We introduce and investigate the Asymmetric Leaky PIR (AL-PIR) model with different privacy leakage budgets in each direction. For user privacy leakage, we bound the probability ratios between all possible realizations of DB queries by a function of a non-negative constant $ε$. For DB privacy, we bound the mutual information between the undesired messages, the queries, and the answers, by a function of a non-negative constant $δ$. We propose a general AL-PIR scheme that achieves an upper bound on the optimal download cost for arbitrary $ε$ and $δ$. First, we show that the optimal download cost of AL-PIR is upper-bounded as $D^{*}(ε,δ)\leq 1+\frac{1}{N-1}-\frac{δe^ε}{N^{K-1}-1}$. Second, we obtain an information-theoretic lower bound on the download cost as $D^{*}(ε,δ)\geq 1+\frac{1}{Ne^ε-1}-\frac{δ}{(Ne^ε)^{K-1}-1}$. The gap analysis between the two bounds shows that our AL-PIR scheme is optimal when $ε=0$, i.e., under perfect user privacy, and that it is optimal within a maximum multiplicative gap of $\frac{N-e^{-ε}}{N-1}$ for any $(ε,δ)$.
Submitted 4 June, 2020;
originally announced June 2020.
-
Latent-variable Private Information Retrieval
Authors:
Islam Samy,
Mohamed A. Attia,
Ravi Tandon,
Loukas Lazos
Abstract:
In many applications, content accessed by users (movies, videos, news articles, etc.) can leak sensitive latent attributes, such as religious and political views, sexual orientation, ethnicity, gender, and others. To prevent such information leakage, the goal of classical PIR is to hide the identity of the content/message being accessed, which subsequently also hides the latent attributes. This solution, while private, can be too costly, particularly when perfect (information-theoretic) privacy constraints are imposed. For instance, for a single database holding $K$ messages, privately retrieving one message is possible if and only if the user downloads the entire database of $K$ messages. Retrieving content privately, however, may not be necessary to perfectly hide the latent attributes.
Motivated by the above, we formulate and study the problem of latent-variable private information retrieval (LV-PIR), which aims at allowing the user to efficiently retrieve one out of $K$ messages (indexed by $θ$) without revealing any information about the latent variable (modeled by $S$). We focus on the practically relevant setting of a single database and show that one can significantly reduce the download cost of LV-PIR (compared to the classical PIR) based on the correlation between $θ$ and $S$. We present a general scheme for LV-PIR as a function of the statistical relationship between $θ$ and $S$, and also provide new results on the capacity/download cost of LV-PIR. Several open problems and new directions are also discussed.
Submitted 14 May, 2020; v1 submitted 16 January, 2020;
originally announced January 2020.
-
An Optimal Experimental Design Framework for Adaptive Inflation and Covariance Localization for Ensemble Filters
Authors:
Ahmed Attia,
Emil Constantinescu
Abstract:
We develop an optimal experimental design framework for adapting the covariance inflation and localization in data assimilation problems. Covariance inflation and localization are ubiquitously employed to alleviate the effect of using ensembles of finite sizes in all practical data assimilation systems. The choice of both the inflation factor and the localization radius can have a significant impact on the performance of the assimilation scheme. These parameters are generally tuned by trial and error, rendering them expensive to optimize in practice. Spatially and temporally varying inflation parameters and localization radii have been recently proposed and have been shown empirically to enhance the performance of the employed assimilation filter. In this study, we present a variational framework for adaptive tuning of the inflation and localization parameters. Each of these parameters is optimized independently, with the objective of minimizing the uncertainty in the posterior state. The proposed framework does not assume uncorrelated observations or prior errors and can in principle be applied without expert knowledge about the model and the observations. Thus, it is adequate for handling dense as well as sparse observational networks. We present the mathematical formulation, an algorithmic description of the approach, and numerical experiments using the two-layer Lorenz-96 model.
Submitted 24 March, 2019; v1 submitted 27 June, 2018;
originally announced June 2018.
-
The Capacity of Private Information Retrieval from Uncoded Storage Constrained Databases
Authors:
Mohamed Adel Attia,
Deepak Kumar,
Ravi Tandon
Abstract:
Private information retrieval (PIR) allows a user to retrieve a desired message from a set of databases without revealing the identity of the desired message. The replicated databases scenario was considered by Sun and Jafar, 2016, where $N$ databases can store the same $K$ messages completely. A PIR scheme was developed to achieve the optimal download cost given by $\left(1+ \frac{1}{N}+ \frac{1}{N^{2}}+ \cdots + \frac{1}{N^{K-1}}\right)$. In this work, we consider the problem of PIR from storage constrained databases. Each database has a storage capacity of $μKL$ bits, where $L$ is the size of each message in bits, and $μ\in [1/N, 1]$ is the normalized storage. On one extreme, $μ=1$ is the replicated databases case. On the other hand, when $μ= 1/N$, then in order to retrieve a message privately, the user has to download all the messages from the databases achieving a download cost of $1/K$. We aim to characterize the optimal download cost versus storage trade-off for any storage capacity in the range $μ\in [1/N, 1]$. For any $(N,K)$, we show that the optimal trade-off between storage, $μ$, and the download cost, $D(μ)$, is given by the lower convex hull of the $N$ pairs $\left(μ= \frac{t}{N},D(μ) = \left(1+ \frac{1}{t}+ \frac{1}{t^{2}}+ \cdots + \frac{1}{t^{K-1}}\right)\right)$ for $t=1,2,\ldots, N$. To prove this result, we first present the storage constrained PIR scheme for any $(N,K)$. We next obtain a general lower bound on the download cost for PIR, which is valid for the following storage scenarios: replicated or storage constrained, coded or uncoded, and fixed or optimized. We then specialize this bound using the uncoded storage assumption to obtain lower bounds matching the achievable download cost of the storage constrained PIR scheme for any value of the available storage.
Submitted 23 October, 2018; v1 submitted 10 May, 2018;
originally announced May 2018.
-
Goal-Oriented Optimal Design of Experiments for Large-Scale Bayesian Linear Inverse Problems
Authors:
Ahmed Attia,
Alen Alexanderian,
Arvind K. Saibaba
Abstract:
We develop a framework for goal-oriented optimal design of experiments (GOODE) for large-scale Bayesian linear inverse problems governed by PDEs. This framework differs from classical Bayesian optimal design of experiments (ODE) in the following sense: we seek experimental designs that minimize the posterior uncertainty in the experiment end-goal, e.g., a quantity of interest (QoI), rather than the estimated parameter itself. This is suitable for scenarios in which the solution of an inverse problem is an intermediate step and the estimated parameter is then used to compute a QoI. In such problems, a GOODE approach has two benefits: the designs can avoid wastage of experimental resources by a targeted collection of data, and the resulting design criteria are computationally easier to evaluate because the QoIs are often low-dimensional. We present two modified design criteria, A-GOODE and D-GOODE, which are natural analogues of classical Bayesian A- and D-optimal criteria. We analyze the connections to other ODE criteria, and provide interpretations for the GOODE criteria by using tools from information theory. Then, we develop an efficient gradient-based optimization framework for solving the GOODE optimization problems. Additionally, we present comprehensive numerical experiments testing the various aspects of the presented approach. The driving application is the optimal placement of sensors to identify the source of contaminants in a diffusion and transport problem. We enforce sparsity of the sensor placements using an $\ell_1$-norm penalty approach, and propose a practical strategy for specifying the associated penalty parameter.
Submitted 11 June, 2018; v1 submitted 18 February, 2018;
originally announced February 2018.
-
Detecting and counting tiny faces
Authors:
Alexandre Attia,
Sharone Dayan
Abstract:
Finding Tiny Faces (by Hu and Ramanan) proposes a novel approach to find small objects in an image. Our contribution consists of analyzing in depth the choices made in that paper and of applying and extending a similar method to a real-world task: counting people in a public demonstration.
Submitted 24 January, 2018; v1 submitted 19 January, 2018;
originally announced January 2018.
-
Global overview of Imitation Learning
Authors:
Alexandre Attia,
Sharone Dayan
Abstract:
Imitation Learning is a sequential task where the learner tries to mimic an expert's actions in order to achieve the best performance. Several algorithms have been proposed recently for this task. In this project, we aim to provide a broad review of these algorithms, presenting their main features and comparing their performance and their regret bounds.
Submitted 19 January, 2018;
originally announced January 2018.
-
Detection and segmentation of the Left Ventricle in Cardiac MRI using Deep Learning
Authors:
Alexandre Attia,
Sharone Dayan
Abstract:
Manual segmentation of the Left Ventricle (LV) is a tedious and meticulous task that can vary depending on the patient, the Magnetic Resonance Images (MRI) cuts and the experts. Still today, manual delineation done by experts is considered the ground truth for cardiac diagnosticians. We therefore review the paper by Avendi et al., which presents a combined approach with Convolutional Neural Networks, Stacked Auto-Encoders and Deformable Models to try to automate the segmentation while performing more accurately. Furthermore, we have implemented parts of the paper (around three quarters) and experimented with both the original method and slightly modified versions in which the architecture and the parameters are changed.
Submitted 7 January, 2018;
originally announced January 2018.
-
Near Optimal Coded Data Shuffling for Distributed Learning
Authors:
Mohamed A. Attia,
Ravi Tandon
Abstract:
Data shuffling across a distributed cluster of nodes is one of the critical steps in implementing large-scale learning algorithms. Randomly shuffling the dataset among a cluster of workers allows different nodes to obtain fresh data assignments at each learning epoch. This process has been shown to provide improvements in the learning process. However, the statistical benefits of distributed data shuffling come at the cost of extra communication overhead from the master node to worker nodes, and can act as one of the major bottlenecks in the overall time for computation. There has been significant recent interest in devising approaches to minimize this communication overhead. One approach is to provision for extra storage at the computing nodes. The other emerging approach is to leverage coded communication to minimize the overall communication overhead.
The focus of this work is to understand the fundamental trade-off between the amount of storage and the communication overhead for distributed data shuffling. In this work, we first present an information theoretic formulation for the data shuffling problem, accounting for the underlying problem parameters (number of workers, $K$, number of data points, $N$, and the available storage, $S$ per node). We then present an information theoretic lower bound on the communication overhead for data shuffling as a function of these parameters. We next present a novel coded communication scheme and show that the resulting communication overhead of the proposed scheme is within a multiplicative factor of at most $\frac{K}{K-1}$ from the information-theoretic lower bound. Furthermore, we present the aligned coded shuffling scheme for some storage values, which achieves the optimal storage vs communication trade-off for $K<5$, and further reduces the maximum multiplicative gap down to $\frac{K-\frac{1}{3}}{K-1}$, for $K\geq 5$.
Submitted 5 January, 2018;
originally announced January 2018.
-
A Machine Learning Approach to Adaptive Covariance Localization
Authors:
Azam Moosavi,
Ahmed Attia,
Adrian Sandu
Abstract:
Data assimilation plays a key role in large-scale atmospheric weather forecasting, where the state of the physical system is estimated from model outputs and observations, and is then used as the initial condition to produce accurate future forecasts. The Ensemble Kalman Filter (EnKF) provides a practical implementation of the statistical solution of the data assimilation problem and has gained wide popularity. This success can be attributed to its simple formulation and ease of implementation. EnKF is a Monte-Carlo algorithm that solves the data assimilation problem by sampling the probability distributions involved in Bayes' theorem. Because of this, all flavors of EnKF are fundamentally prone to sampling errors when the ensemble size is small. In typical weather forecasting applications, the model state space has dimension $10^{9}$ to $10^{12}$, while the ensemble size typically ranges between $30$ and $100$ members. Sampling errors manifest themselves as long-range spurious correlations and have been shown to cause filter divergence. To alleviate this effect, covariance localization dampens spurious correlations between state variables located at a large distance in the physical space, via an empirical distance-dependent function. The quality of the resulting analysis and forecast is greatly influenced by the choice of the localization function parameters, e.g., the radius of influence. The localization radius is generally tuned empirically to yield desirable results. This work proposes two adaptive algorithms for covariance localization in the EnKF framework, both based on a machine learning approach. The first algorithm adapts the localization radius in time, while the second algorithm tunes the localization radius in both time and space. Numerical experiments carried out with the Lorenz-96 model and a quasi-geostrophic model reveal the potential of the proposed machine learning approaches.
Submitted 10 February, 2018; v1 submitted 1 January, 2018;
originally announced January 2018.
-
Combating Computational Heterogeneity in Large-Scale Distributed Computing via Work Exchange
Authors:
Mohamed A. Attia,
Ravi Tandon
Abstract:
Owing to data-intensive large-scale applications, distributed computation systems have gained significant recent interest because of their ability to run such tasks over a large number of commodity nodes in a time-efficient manner. One of the major bottlenecks that adversely impacts the time efficiency is the computational heterogeneity of distributed nodes, which often causes the task completion time to be limited by the slowest worker.
In this paper, we first present a lower bound on the expected computation time based on the work-conservation principle. We then present our approach of work exchange to combat the latency problem, in which faster workers can be reassigned additional leftover computations that were originally assigned to slower workers. We present two variations of the work exchange approach: a) when the computational heterogeneity is known a priori; and b) when heterogeneity is unknown and is estimated in an online manner to assign tasks to distributed workers. As a baseline, we also present and analyze the use of an optimized Maximum Distance Separable (MDS) coded distributed computation scheme over heterogeneous nodes. Simulation results compare the proposed approach of work exchange, the baseline MDS coded scheme, and the lower bound obtained via the work-conservation principle. We show that the work exchange scheme achieves a computation time that is very close to the lower bound with limited coordination and communication overhead, even when knowledge about heterogeneity levels is not available.
Submitted 22 November, 2017;
originally announced November 2017.
-
DATeS: A Highly-Extensible Data Assimilation Testing Suite v1.0
Authors:
Ahmed Attia,
Adrian Sandu
Abstract:
A flexible and highly-extensible data assimilation testing suite, named DATeS, is described in this paper. DATeS aims to offer a unified testing environment that allows researchers to compare different data assimilation methodologies and understand their performance in various settings. The core of DATeS is implemented in Python and takes advantage of its object-oriented capabilities. The main components of the package (the numerical models, the data assimilation algorithms, the linear algebra solvers, and the time discretization routines) are independent of each other, which offers great flexibility to configure data assimilation applications. DATeS can interface easily with large third-party numerical models written in Fortran or in C, and with a plethora of external solvers.
Submitted 1 July, 2018; v1 submitted 18 April, 2017;
originally announced April 2017.
-
Cluster Sampling Filters for Non-Gaussian Data Assimilation
Authors:
Ahmed Attia,
Azam Moosavi,
Adrian Sandu
Abstract:
This paper presents a fully non-Gaussian version of the Hamiltonian Monte Carlo (HMC) sampling filter. The Gaussian prior assumption in the original HMC filter is relaxed. Specifically, a clustering step is introduced after the forecast phase of the filter, and the prior density function is estimated by fitting a Gaussian Mixture Model (GMM) to the prior ensemble. Using the data likelihood function, the posterior density is then formulated as a mixture density, and is sampled using an HMC approach (or any other scheme capable of sampling multimodal densities in high-dimensional subspaces). The main filter developed herein is named "cluster HMC sampling filter" (ClHMC). A multi-chain version of the ClHMC filter, namely MC-ClHMC, is also proposed to guarantee that samples are taken from the vicinities of all probability modes of the formulated posterior. The new methodologies are tested using a quasi-geostrophic (QG) model with double-gyre wind forcing and bi-harmonic friction. Numerical results demonstrate the usefulness of using GMMs to relax the Gaussian prior assumption in the HMC filtering paradigm.
Submitted 18 August, 2016; v1 submitted 13 July, 2016;
originally announced July 2016.
-
The Reduced-Order Hybrid Monte Carlo Sampling Smoother
Authors:
Ahmed Attia,
Razvan Stefanescu,
Adrian Sandu
Abstract:
The Hybrid Monte-Carlo (HMC) sampling smoother is a fully non-Gaussian four-dimensional data assimilation algorithm that works by directly sampling the posterior distribution formulated in the Bayesian framework. The smoother in its original formulation is computationally expensive due to the intrinsic requirement of running the forward and adjoint models repeatedly. Here we present computationally efficient versions of the HMC sampling smoother based on reduced-order approximations of the underlying model dynamics. The schemes developed herein are tested numerically using the shallow-water equations model on Cartesian coordinates. The results reveal that the reduced-order versions of the smoother are capable of accurately capturing the posterior probability density, while being significantly faster than the original full-order formulation.
Submitted 1 January, 2016;
originally announced January 2016.
-
A Hybrid Monte-Carlo Sampling Smoother for Four Dimensional Data Assimilation
Authors:
Ahmed Attia,
Vishwas Rao,
Adrian Sandu
Abstract:
This paper constructs an ensemble-based sampling smoother for four-dimensional data assimilation using a Hybrid/Hamiltonian Monte-Carlo approach. The smoother samples efficiently from the posterior probability density of the solution at the initial time. Unlike the well-known ensemble Kalman smoother, which is optimal only in the linear Gaussian case, the proposed methodology naturally accommodates non-Gaussian errors and non-linear model dynamics and observation operators. Unlike the four-dimensional variational method, which only finds a mode of the posterior distribution, the smoother provides an estimate of the posterior uncertainty. One can use the ensemble mean as the minimum variance estimate of the state, or can use the ensemble in conjunction with the variational approach to estimate the background errors for subsequent assimilation windows. Numerical results demonstrate the advantages of the proposed method compared to the traditional variational and ensemble-based smoothing methods.
Submitted 18 May, 2015;
originally announced May 2015.
-
A Sampling Filter for Non-Gaussian Data Assimilation
Authors:
Ahmed Attia,
Adrian Sandu
Abstract:
Data assimilation combines information from models, measurements, and priors to estimate the state of a dynamical system such as the atmosphere. The Ensemble Kalman filter (EnKF) is a family of ensemble-based data assimilation approaches that has gained wide popularity due to its simple formulation, ease of implementation, and good practical results. Most EnKF algorithms assume that the underlying probability distributions are Gaussian. Although this assumption is well accepted, it is too restrictive when applied to large nonlinear models, nonlinear observation operators, and large levels of uncertainty. Several approaches have been proposed in order to avoid the Gaussianity assumption. One of the most successful strategies is the maximum likelihood ensemble filter (MLEF), which computes a maximum a posteriori estimate of the state assuming the posterior distribution is Gaussian. MLEF is designed to work with nonlinear and even non-differentiable observation operators, and shows good practical performance. However, there are limits to the degree of nonlinearity that MLEF can handle. This paper proposes a new ensemble-based data assimilation method, named the "sampling filter", which obtains the analysis by sampling directly from the posterior distribution. The sampling strategy is based on a Hybrid Monte Carlo (HMC) approach that can handle non-Gaussian probability distributions. Numerical experiments are carried out using the Lorenz-96 model and observation operators with different levels of non-linearity and differentiability. The proposed filter is also tested with a shallow-water model on a sphere with a linear observation operator. The results show that the sampling filter can perform well even in highly nonlinear situations where the EnKF and MLEF filters diverge.
Submitted 5 December, 2014; v1 submitted 27 March, 2014;
originally announced March 2014.
-
Study of 3He inelastic scattering on 13C and 14C at 37.9 MeV
Authors:
Marwa N. El-Hammamy,
Ali Attia,
Fayza. A. El-Akkad,
Ali M. Abdel-Moneim
Abstract:
The differential cross sections of elastic and inelastic scattering of 3He ions on 13C and 14C have been studied at an energy of 37.9 MeV with a double folding model based on the M3Y-Reid effective nucleon-nucleon interaction. The resulting parameters have been used for standard Distorted Wave Born Approximation calculations of the angular distributions corresponding to different excitation levels of 13C and 14C, and deformation parameters have been deduced.
Submitted 14 November, 2013;
originally announced November 2013.