
Probabilistic Approach to Black-Box Binary Optimization with Budget Constraints: Application to Sensor Placement

Abstract: We present a fully probabilistic approach for solving binary optimization problems with black-box objective functions and with budget constraints. In the probabilistic approach, the optimization variable is viewed as a random variable and is associated with a parametric probability distribution. The original optimization problem is replaced with an optimization over the expected value of the original objective, which is then optimized over the probability distribution parameters. The resulting optimal parameter (optimal policy) is used to sample the binary space to produce estimates of the optimal solution(s) of the original binary optimization problem. The probability distribution is chosen from the family of Bernoulli models because the optimization variable is binary. The optimization constraints generally restrict the feasibility region. This can be achieved by modeling the random variable with a conditional distribution given satisfiability of the constraints. Thus, in this work we develop conditional Bernoulli distributions to model the random variable conditioned on the total number of nonzero entries, that is, the budget constraint.
This approach (a) is generally applicable to binary optimization problems with nonstochastic black-box objective functions and budget constraints; (b) accounts for budget constraints by employing conditional probabilities that sample only the feasible region and thus considerably reduces the computational cost compared with employing soft constraints; and (c) does not employ soft constraints and thus does not require tuning of a regularization parameter, for example to promote sparsity, which is challenging in sensor placement optimization problems. The proposed approach is verified numerically by using an idealized bilinear binary optimization problem and is validated by using a sensor placement experiment in a parameter identification setup.
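One standard way to draw binary vectors from independent Bernoulli parameters, conditioned on a fixed number of nonzero entries, is a dynamic program over suffix sums. The sketch below is illustrative only and is not taken from the paper; the function names `suffix_sum_probs` and `sample_conditional_bernoulli` and the example probabilities are assumptions.

```python
import random


def suffix_sum_probs(p, k):
    """R[i][j] = P(x_i + ... + x_{n-1} == j) for independent x_i ~ Bernoulli(p[i])."""
    n = len(p)
    R = [[0.0] * (k + 1) for _ in range(n + 1)]
    R[n][0] = 1.0  # an empty suffix sums to zero with probability one
    for i in range(n - 1, -1, -1):
        for j in range(k + 1):
            R[i][j] = (1.0 - p[i]) * R[i + 1][j]
            if j > 0:
                R[i][j] += p[i] * R[i + 1][j - 1]
    return R


def sample_conditional_bernoulli(p, k, rng):
    """Sample x in {0,1}^n from the Bernoulli(p) product law conditioned on sum(x) == k."""
    R = suffix_sum_probs(p, k)
    if R[0][k] == 0.0:
        raise ValueError("budget k has zero probability under p")
    x = []
    need = k  # number of ones still to be placed
    for i in range(len(p)):
        if need == 0:
            prob_one = 0.0
        else:
            # P(x_i = 1 | the remaining entries must sum to `need`)
            prob_one = p[i] * R[i + 1][need - 1] / R[i][need]
        bit = 1 if rng.random() < prob_one else 0
        x.append(bit)
        need -= bit
    return x


if __name__ == "__main__":
    rng = random.Random(0)
    probs = [0.2, 0.5, 0.7, 0.4, 0.9]  # hypothetical activation probabilities
    print(sample_conditional_bernoulli(probs, 2, rng))
```

Every sample satisfies the budget exactly, so no penalty term or rejection step is needed; this mirrors the motivation in point (b) above, under the stated assumptions.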

Submitted 9 June, 2024; originally announced June 2024.

Comments: 54 pages, 20 figures, 6 sections, 2 appendices

MSC Class: 90C27; 60C05; 62K05; 35R30; 35Q93; 65C60; 93E35

arXiv:2405.13018 [pdf, other]

Continued Pretraining for Domain Adaptation of Wav2vec2.0 in Automatic Speech Recognition for Elementary Math Classroom Settings

Authors: Ahmed Adel Attia, Dorottya Demszky, Tolulope Ogunremi, Jing Liu, Carol Espy-Wilson

Abstract: Creating Automatic Speech Recognition (ASR) systems that are robust and resilient to classroom conditions is paramount to the development of AI tools to aid teachers and students. In this work, we study the efficacy of continued pretraining (CPT) in adapting Wav2vec2.0 to the classroom domain. We show that CPT is a powerful tool in that regard and reduces the Word Error Rate (WER) of Wav2vec2.0-based models by upwards of 10%. More specifically, CPT improves the model's robustness to different noises, microphones, classroom conditions as well as classroom demographics. Our CPT models show improved ability to generalize to different demographics unseen in the labeled finetuning data.

Submitted 15 May, 2024; originally announced May 2024.

arXiv:2404.03661 [pdf, other]

Benchmarking formalisms for dynamic structure system Modeling and Simulation

Authors: Aya Attia, Clément Foucher, Luiz Fernando Lavado Villa

Abstract: Modeling and simulation of complex systems is key to exploring system dynamics. Many scientific approaches were developed to represent dynamic structure systems, but most of these approaches are efficient for some kinds of systems and inefficient for others. Which approach can be adopted for different categories of dynamic structure systems is a topic of interest for many researchers and until now has not been fully resolved. Therefore it is essential to explore the existing approaches, understand them, and identify gaps. To fulfil this goal, we identified criteria at stake for a smooth flow from model creation to its simulation for dynamic structure systems. Using these criteria, we benchmark the existing modeling formalisms, focusing more on DEVS extensions, and use the results to identify gaps in these approaches and discuss them.

Submitted 25 January, 2024; originally announced April 2024.

arXiv:2403.02873 [pdf, other]

A Note on High-Probability Analysis of Algorithms with Exponential, Sub-Gaussian, and General Light Tails

Authors: Amit Attia, Tomer Koren

Abstract: This short note describes a simple technique for analyzing probabilistic algorithms that rely on a light-tailed (but not necessarily bounded) source of randomization. We show that the analysis of such an algorithm can be reduced, in a black-box manner and with only a small loss in logarithmic factors, to an analysis of a simpler variant of the same algorithm that uses bounded random variables and is often easier to analyze. This approach simultaneously applies to any light-tailed randomization, including exponential, sub-Gaussian, and more general fast-decaying distributions, without needing to appeal to specialized concentration inequalities. Analyses of a generalized Azuma inequality and stochastic optimization with general light-tailed noise are provided to illustrate the technique.

Submitted 5 March, 2024; originally announced March 2024.

Comments: 9 pages

arXiv:2402.03126 [pdf, other]

How Free is Parameter-Free Stochastic Optimization?

Authors: Amit Attia, Tomer Koren

Abstract: We study the problem of parameter-free stochastic optimization, inquiring whether, and under what conditions, fully parameter-free methods exist: these are methods that achieve convergence rates competitive with optimally tuned methods, without requiring significant knowledge of the true problem parameters. Existing parameter-free methods can only be considered "partially" parameter-free, as they require some non-trivial knowledge of the true problem parameters, such as a bound on the stochastic gradient norms, a bound on the distance to a minimizer, etc. In the non-convex setting, we demonstrate that a simple hyperparameter search technique results in a fully parameter-free method that outperforms more sophisticated state-of-the-art algorithms. We also provide a similar result in the convex setting with access to noisy function values under mild noise assumptions. Finally, assuming only access to stochastic gradients, we establish a lower bound that renders fully parameter-free stochastic convex optimization infeasible, and provide a method which is (partially) parameter-free up to the limit indicated by our lower bound.

Submitted 18 March, 2024; v1 submitted 5 February, 2024; originally announced February 2024.

Comments: 28 pages

arXiv:2311.07676 [pdf, other]

Centralized calibration of power system dynamic models using variational data assimilation

Authors: Ahmed Attia, D. Adrian Maldonado, Emil Constantinescu, Mihai Anitescu

Abstract: This paper presents a novel centralized, variational data assimilation approach for calibrating transient dynamic models in electrical power systems, focusing on load model parameters. With the increasing importance of inverter-based resources, assessing power systems' dynamic performance under disturbances has become challenging, necessitating robust model calibration methods. The proposed approach expands on previous Bayesian frameworks by establishing a posterior distribution of parameters using an approximation around the maximum a posteriori value. We illustrate the efficacy of our method by generating events of varying intensity, highlighting its ability to capture the systems' evolution accurately and with associated uncertainty estimates. This research improves the precision of dynamic performance assessments in modern power systems, with potential applications in managing uncertainties and optimizing system operations.

Submitted 13 November, 2023; originally announced November 2023.

Comments: 9 pages, 8 figures, and 1 table

arXiv:2309.09220 [pdf, other]

Improving Speech Inversion Through Self-Supervised Embeddings and Enhanced Tract Variables

Authors: Ahmed Adel Attia, Yashish M. Siriwardena, Carol Espy-Wilson

Abstract: The performance of deep learning models depends significantly on their capacity to encode input features efficiently and decode them into meaningful outputs. Better input and output representation has the potential to boost models' performance and generalization. In the context of acoustic-to-articulatory speech inversion (SI) systems, we study the impact of utilizing speech representations acquired via self-supervised learning (SSL) models, such as HuBERT, compared to conventional acoustic features. Additionally, we investigate the incorporation of novel tract variables (TVs) through an improved geometric transformation model. By combining these two approaches, we improve the Pearson product-moment correlation (PPMC) scores, which evaluate the accuracy of TV estimation of the SI system, from 0.7452 to 0.8141, a 6.9% increase. Our findings underscore the profound influence of rich feature representations from SSL models and improved geometric transformations with target TVs on the enhanced functionality of SI systems.

Submitted 17 September, 2023; originally announced September 2023.

arXiv:2309.07927 [pdf, ps, other]

Kid-Whisper: Towards Bridging the Performance Gap in Automatic Speech Recognition for Children VS. Adults

Authors: Ahmed Adel Attia, Jing Liu, Wei Ai, Dorottya Demszky, Carol Espy-Wilson

Abstract: Recent advancements in Automatic Speech Recognition (ASR) systems, exemplified by Whisper, have demonstrated the potential of these systems to approach human-level performance given sufficient data. However, this progress doesn't readily extend to ASR for children due to the limited availability of suitable child-specific databases and the distinct characteristics of children's speech. A recent study investigated leveraging the My Science Tutor (MyST) children's speech corpus to enhance Whisper's performance in recognizing children's speech. They were able to demonstrate some improvement on a limited test set. This paper builds on these findings by enhancing the utility of the MyST dataset through more efficient data preprocessing. We reduce the Word Error Rate (WER) on the MyST test set from 13.93% to 9.11% with Whisper-Small and from 13.23% to 8.61% with Whisper-Medium and show that this improvement can be generalized to unseen datasets. We also highlight important challenges towards improving children's ASR performance. The results showcase the viable and efficient integration of Whisper for effective children's speech recognition.

Submitted 15 May, 2024; v1 submitted 12 September, 2023; originally announced September 2023.

arXiv:2305.10775 [pdf, other]

doi 10.21437/Interspeech.2023-2211

Enhancing Speech Articulation Analysis using a Geometric Transformation of the X-ray Microbeam Dataset

Authors: Ahmed Adel Attia, Mark Tiede, Carol Y. Espy-Wilson

Abstract: Accurate analysis of speech articulation is crucial for speech analysis. However, X-Y coordinates of articulators strongly depend on the anatomy of the speakers and the variability of pellet placements, and existing methods for mapping anatomical landmarks in the X-ray Microbeam Dataset (XRMB) fail to capture the entire anatomy of the vocal tract. In this paper, we propose a new geometric transformation that improves the accuracy of these measurements. Our transformation maps anatomical landmarks' X-Y coordinates along the midsagittal plane onto six relative measures: Lip Aperture (LA), Lip Protrusion (LP), Tongue Body Constriction Location (TBCL) and Degree (TBCD), Tongue Tip Constriction Location (TTCL) and Degree (TTCD). Our novel contribution is the extension of the palate trace towards the inferred anterior pharyngeal line, which improves measurements of tongue body constriction.

Submitted 28 September, 2023; v1 submitted 18 May, 2023; originally announced May 2023.

arXiv:2305.03855 [pdf, other]

Robust A-Optimal Experimental Design for Bayesian Inverse Problems

Authors: Ahmed Attia, Sven Leyffer, Todd Munson

Abstract: Optimal design of experiments for Bayesian inverse problems has recently gained wide popularity and attracted much attention, especially in the computational science and Bayesian inversion communities. An optimal design maximizes a predefined utility function that is formulated in terms of the elements of an inverse problem, an example being optimal sensor placement for parameter identification. The state-of-the-art algorithmic approaches following this simple formulation generally overlook misspecification of the elements of the inverse problem, such as the prior or the measurement uncertainties. This work presents an efficient algorithmic approach for designing optimal experimental design schemes for Bayesian inverse problems such that the optimal design is robust to misspecification of elements of the inverse problem. Specifically, we consider a worst-case scenario approach for the uncertain or misspecified parameters, formulate robust objectives, and propose an algorithmic approach for optimizing such objectives. Both relaxation and stochastic solution approaches are discussed with detailed analysis and insight into the interpretation of the problem and the proposed algorithmic approach. Extensive numerical experiments to validate and analyze the proposed approach are carried out for sensor placement in a parameter identification problem.

Submitted 5 May, 2023; originally announced May 2023.

Comments: 25 pages, 11 figures

MSC Class: 62K05; 35Q62; 62F15; 35R30; 35Q93; 65C60; 93E35

arXiv:2302.08783 [pdf, ps, other]

SGD with AdaGrad Stepsizes: Full Adaptivity with High Probability to Unknown Parameters, Unbounded Gradients and Affine Variance

Authors: Amit Attia, Tomer Koren

Abstract: We study Stochastic Gradient Descent with AdaGrad stepsizes: a popular adaptive (self-tuning) method for first-order stochastic optimization. Despite being well studied, existing analyses of this method suffer from various shortcomings: they either assume some knowledge of the problem parameters, impose strong global Lipschitz conditions, or fail to give bounds that hold with high probability. We provide a comprehensive analysis of this basic method without any of these limitations, in both the convex and non-convex (smooth) cases, that additionally supports a general "affine variance" noise model and provides sharp rates of convergence in both the low-noise and high-noise regimes.
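For orientation, here is a minimal sketch of SGD with a scalar (norm-based) AdaGrad stepsize, in the commonly used form $η_t = η / \sqrt{γ^2 + \sum_{s \le t} \|g_s\|^2}$. The abstract does not spell out the exact variant analyzed, so this form, the toy noisy quadratic objective, and all function and parameter names are assumptions made for illustration.

```python
import math
import random


def adagrad_norm_sgd(grad_fn, w0, eta=1.0, gamma=1e-8, steps=500):
    """SGD whose stepsize shrinks with the accumulated squared gradient norms."""
    w = list(w0)
    sum_sq = gamma ** 2  # gamma keeps the first stepsize finite
    for _ in range(steps):
        g = grad_fn(w)
        sum_sq += sum(gi * gi for gi in g)
        step = eta / math.sqrt(sum_sq)  # one scalar stepsize shared by all coordinates
        w = [wi - step * gi for wi, gi in zip(w, g)]
    return w


def make_noisy_quadratic_grad(rng, noise=0.1):
    """Stochastic gradient of f(w) = 0.5 * ||w||^2 with additive Gaussian noise."""
    def grad(w):
        return [wi + noise * rng.gauss(0.0, 1.0) for wi in w]
    return grad


def norm(w):
    return math.sqrt(sum(wi * wi for wi in w))


if __name__ == "__main__":
    rng = random.Random(0)
    w_final = adagrad_norm_sgd(make_noisy_quadratic_grad(rng), [5.0, -3.0])
    print(norm(w_final))
```

The stepsize is set from observed gradients only, so no Lipschitz constant or noise level is supplied by the user; that is the "self-tuning" aspect the abstract refers to.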

Submitted 11 June, 2023; v1 submitted 17 February, 2023; originally announced February 2023.

Comments: 27 pages

arXiv:2301.08336 [pdf, other]

PyOED: An Extensible Suite for Data Assimilation and Model-Constrained Optimal Design of Experiments

Authors: Abhijit Chowdhary, Shady E. Ahmed, Ahmed Attia

Abstract: This paper describes PyOED, a highly extensible scientific package that enables developing and testing model-constrained optimal experimental design (OED) for inverse problems. Specifically, PyOED aims to be a comprehensive Python toolkit for model-constrained OED. The package targets scientists and researchers interested in understanding the details of OED formulations and approaches. It is also meant to enable researchers to experiment with standard and innovative OED technologies with a wide range of test problems (e.g., simulation models). OED, inverse problems (e.g., Bayesian inversion), and data assimilation (DA) are closely related research fields, and their formulations overlap significantly. Thus, PyOED is continuously being expanded with a plethora of Bayesian inversion, DA, and OED methods as well as new scientific simulation models, observation error models, and observation operators. These pieces are added such that they can be permuted to enable testing OED methods in various settings of varying complexities. The PyOED core is completely written in Python and utilizes the inherent object-oriented capabilities; however, the current version of PyOED is meant to be extensible rather than scalable. Specifically, PyOED is developed to enable rapid development and benchmarking of OED methods with minimal coding effort and to maximize code reutilization. This paper provides a brief description of the PyOED layout and philosophy and provides a set of exemplary test cases and tutorials to demonstrate the potential of the package.

Submitted 19 December, 2023; v1 submitted 19 January, 2023; originally announced January 2023.

Comments: 22 pages, 8 figures

MSC Class: 68Vxx

arXiv:2210.15195 [pdf, other]

doi 10.1109/ICASSP49357.2023.10096209

Masked Autoencoders Are Articulatory Learners

Authors: Ahmed Adel Attia, Carol Espy-Wilson

Abstract: Articulatory recordings track the positions and motion of different articulators along the vocal tract and are widely used to study speech production and to develop speech technologies such as articulatory-based speech synthesizers and speech inversion systems. The University of Wisconsin X-Ray microbeam (XRMB) dataset is one of various datasets that provide articulatory recordings synced with audio recordings. The XRMB articulatory recordings employ pellets placed on a number of articulators which can be tracked by the microbeam. However, a significant portion of the articulatory recordings are mistracked, and have been so far unusable. In this work, we present a deep learning based approach using Masked Autoencoders to accurately reconstruct the mistracked articulatory recordings for 41 out of 47 speakers of the XRMB dataset. Our model is able to reconstruct articulatory trajectories that closely match ground truth, even when three out of eight articulators are mistracked, and retrieve 3.28 out of 3.4 hours of previously unusable recordings.

Submitted 18 May, 2023; v1 submitted 27 October, 2022; originally announced October 2022.

arXiv:2207.08257 [pdf, ps, other]

Uniform Stability for First-Order Empirical Risk Minimization

Authors: Amit Attia, Tomer Koren

Abstract: We consider the problem of designing uniformly stable first-order optimization algorithms for empirical risk minimization. Uniform stability is often used to obtain generalization error bounds for optimization algorithms, and we are interested in a general approach to achieve it. For Euclidean geometry, we suggest a black-box conversion which, given a smooth optimization algorithm, produces a uniformly stable version of the algorithm while maintaining its convergence rate up to logarithmic factors. Using this reduction we obtain a (nearly) optimal algorithm for smooth optimization with convergence rate $\widetilde{O}(1/T^2)$ and uniform stability $O(T^2/n)$, resolving an open problem of Chen et al. (2018); Attia and Koren (2021). For more general geometries, we develop a variant of Mirror Descent for smooth optimization with convergence rate $\widetilde{O}(1/T)$ and uniform stability $O(T/n)$, leaving open the question of devising a general conversion method as in the Euclidean case.

Submitted 17 July, 2022; originally announced July 2022.

Comments: 18 pages, Proceedings of Thirty Fifth Conference on Learning Theory, PMLR 178:3313-3332, 2022

arXiv:2102.02167 [pdf, other]

Algorithmic Instabilities of Accelerated Gradient Descent

Authors: Amit Attia, Tomer Koren

Abstract: We study the algorithmic stability of Nesterov's accelerated gradient method. For convex quadratic objectives, Chen et al. (2018) proved that the uniform stability of the method grows quadratically with the number of optimization steps, and conjectured that the same is true for the general convex and smooth case. We disprove this conjecture and show, for two notions of algorithmic stability (including uniform stability), that the stability of Nesterov's accelerated method in fact deteriorates exponentially fast with the number of gradient steps. This stands in sharp contrast to the bounds in the quadratic case, but also to known results for non-accelerated gradient methods where stability typically grows linearly with the number of steps.

Submitted 19 June, 2021; v1 submitted 3 February, 2021; originally announced February 2021.

Comments: 37 pages

arXiv:2101.05958 [pdf, other]

doi 10.1137/21M1404363

Stochastic Learning Approach to Binary Optimization for Optimal Design of Experiments

Authors: Ahmed Attia, Sven Leyffer, Todd Munson

Abstract: We present a novel stochastic approach to binary optimization for optimal experimental design (OED) for Bayesian inverse problems governed by mathematical models such as partial differential equations. The OED utility function, namely, the regularized optimality criterion, is cast into a stochastic objective function in the form of an expectation over a multivariate Bernoulli distribution. The probabilistic objective is then solved by using a stochastic optimization routine to find an optimal observational policy. The proposed approach is analyzed from an optimization perspective and also from a machine learning perspective with correspondence to policy gradient reinforcement learning. The approach is demonstrated numerically by using an idealized two-dimensional Bayesian linear inverse problem, and validated by extensive numerical experiments carried out for sensor placement in a parameter identification setup.

Submitted 14 January, 2021; originally announced January 2021.

Comments: 34 pages, 12 figures

arXiv:2006.03048 [pdf, other]

Asymmetric Leaky Private Information Retrieval

Authors: Islam Samy, Mohamed A. Attia, Ravi Tandon, Loukas Lazos

Abstract: Information-theoretic formulations of the private information retrieval (PIR) problem have been investigated under a variety of scenarios. Symmetric private information retrieval (SPIR) is a variant where a user is able to privately retrieve one out of $K$ messages from $N$ non-colluding replicated databases without learning anything about the remaining $K-1$ messages. However, the goal of perfect privacy can be too taxing for certain applications. In this paper, we investigate if the information-theoretic capacity of SPIR (equivalently, the inverse of the minimum download cost) can be increased by relaxing both user and DB privacy definitions. Such relaxation is relevant in applications where privacy can be traded for communication efficiency. We introduce and investigate the Asymmetric Leaky PIR (AL-PIR) model with different privacy leakage budgets in each direction. For user privacy leakage, we bound the probability ratios between all possible realizations of DB queries by a function of a non-negative constant $ε$. For DB privacy, we bound the mutual information between the undesired messages, the queries, and the answers, by a function of a non-negative constant $δ$. We propose a general AL-PIR scheme that achieves an upper bound on the optimal download cost for arbitrary $ε$ and $δ$. We show that the optimal download cost of AL-PIR is upper-bounded as $D^{*}(ε,δ)\leq 1+\frac{1}{N-1}-\frac{δe^{ε}}{N^{K-1}-1}$. Second, we obtain an information-theoretic lower bound on the download cost as $D^{*}(ε,δ)\geq 1+\frac{1}{Ne^{ε}-1}-\frac{δ}{(Ne^{ε})^{K-1}-1}$.
The gap analysis between the two bounds shows that our AL-PIR scheme is optimal when $ε=0$, i.e., under perfect user privacy, and it is optimal within a maximum multiplicative gap of $\frac{N-e^{-ε}}{N-1}$ for any $(ε,δ)$.
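As a quick numerical illustration, the two bounds and the gap factor stated in the abstract can be evaluated directly. The sketch below only transcribes those three formulas; the function names and the sample values of $N$, $K$, $ε$, and $δ$ are assumptions, and no claim is made about which $(ε,δ)$ ranges the paper admits.

```python
import math


def _check(n_db, k_msgs):
    # Both denominators N - 1 and N^(K-1) - 1 must be positive.
    if n_db < 2 or k_msgs < 2:
        raise ValueError("need N >= 2 databases and K >= 2 messages")


def alpir_upper_bound(n_db, k_msgs, eps, delta):
    """Upper bound: 1 + 1/(N-1) - delta*e^eps / (N^(K-1) - 1)."""
    _check(n_db, k_msgs)
    return 1.0 + 1.0 / (n_db - 1) - delta * math.exp(eps) / (n_db ** (k_msgs - 1) - 1)


def alpir_lower_bound(n_db, k_msgs, eps, delta):
    """Lower bound: 1 + 1/(N e^eps - 1) - delta / ((N e^eps)^(K-1) - 1)."""
    _check(n_db, k_msgs)
    scaled = n_db * math.exp(eps)
    return 1.0 + 1.0 / (scaled - 1.0) - delta / (scaled ** (k_msgs - 1) - 1.0)


def alpir_gap_factor(n_db, eps):
    """Maximum multiplicative gap stated in the abstract: (N - e^(-eps)) / (N - 1)."""
    return (n_db - math.exp(-eps)) / (n_db - 1)


if __name__ == "__main__":
    N, K, eps, delta = 2, 3, 0.5, 0.1  # hypothetical sample values
    print(alpir_lower_bound(N, K, eps, delta), alpir_upper_bound(N, K, eps, delta))
    print(alpir_gap_factor(N, eps))
```

Setting `eps = 0` makes the two expressions coincide term by term, which matches the abstract's statement that the scheme is optimal under perfect user privacy.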

Submitted 4 June, 2020; originally announced June 2020.

arXiv:2001.05998 [pdf, other]

Latent-variable Private Information Retrieval

Authors: Islam Samy, Mohamed A. Attia, Ravi Tandon, Loukas Lazos

Abstract: In many applications, content accessed by users (movies, videos, news articles, etc.) can leak sensitive latent attributes, such as religious and political views, sexual orientation, ethnicity, gender, and others. To prevent such information leakage, the goal of classical PIR is to hide the identity of the content/message being accessed, which subsequently also hides the latent attributes. This solution, while private, can be too costly, particularly when perfect (information-theoretic) privacy constraints are imposed. For instance, for a single database holding $K$ messages, privately retrieving one message is possible if and only if the user downloads the entire database of $K$ messages. Retrieving content privately, however, may not be necessary to perfectly hide the latent attributes. Motivated by the above, we formulate and study the problem of latent-variable private information retrieval (LV-PIR), which aims at allowing the user to efficiently retrieve one out of $K$ messages (indexed by $θ$) without revealing any information about the latent variable (modeled by $S$). We focus on the practically relevant setting of a single database and show that one can significantly reduce the download cost of LV-PIR (compared to the classical PIR) based on the correlation between $θ$ and $S$. We present a general scheme for LV-PIR as a function of the statistical relationship between $θ$ and $S$, and also provide new results on the capacity/download cost of LV-PIR. Several open problems and new directions are also discussed.

Submitted 14 May, 2020; v1 submitted 16 January, 2020; originally announced January 2020.

arXiv:1806.10655 [pdf, other]

An Optimal Experimental Design Framework for Adaptive Inflation and Covariance Localization for Ensemble Filters

Authors: Ahmed Attia, Emil Constantinescu

Abstract: We develop an optimal experimental design framework for adapting the covariance inflation and localization in data assimilation problems. Covariance inflation and localization are ubiquitously employed to alleviate the effect of using ensembles of finite sizes in all practical data assimilation systems. The choice of both the inflation factor and the localization radius can have a significant impact on the performance of the assimilation scheme. These parameters are generally tuned by trial and error, rendering them expensive to optimize in practice. Spatially and temporally varying inflation parameter and localization radii have been recently proposed and have been empirically proven to enhance the performance of the employed assimilation filter. In this study, we present a variational framework for adaptive tuning of the inflation and localization parameters. Each of these parameters is optimized independently, with an objective to minimize the uncertainty in the posterior state. The proposed framework does not assume uncorrelated observations or prior errors and can in principle be applied without expert knowledge about the model and the observations. Thus, it is adequate for handling dense as well as sparse observational networks. We present the mathematical formulation, algorithmic description of the approach, and numerical experiments using the two-layer Lorenz-96 model.

Submitted 24 March, 2019; v1 submitted 27 June, 2018; originally announced June 2018.

Comments: 31 pages, 15 figures

arXiv:1805.04104 [pdf, other]

The Capacity of Private Information Retrieval from Uncoded Storage Constrained Databases

Authors: Mohamed Adel Attia, Deepak Kumar, Ravi Tandon

Abstract: Private information retrieval (PIR) allows a user to retrieve a desired message from a set of databases without revealing the identity of the desired message. The replicated databases scenario was considered by Sun and Jafar, 2016, where $N$ databases can store the same $K$ messages completely. A PIR scheme was developed to achieve the optimal download cost given by $\left(1+ \frac{1}{N}+ \frac{1}{N^{2}}+ \cdots + \frac{1}{N^{K-1}}\right)$. In this work, we consider the problem of PIR from storage constrained databases. Each database has a storage capacity of $μKL$ bits, where $L$ is the size of each message in bits, and $μ\in [1/N, 1]$ is the normalized storage. On one extreme, $μ=1$ is the replicated databases case. On the other hand, when $μ= 1/N$, then in order to retrieve a message privately, the user has to download all the messages from the databases achieving a download cost of $1/K$. We aim to characterize the optimal download cost versus storage trade-off for any storage capacity in the range $μ\in [1/N, 1]$. For any $(N,K)$, we show that the optimal trade-off between storage, $μ$, and the download cost, $D(μ)$, is given by the lower convex hull of the $N$ pairs $\left(μ= \frac{t}{N},D(μ) = \left(1+ \frac{1}{t}+ \frac{1}{t^{2}}+ \cdots + \frac{1}{t^{K-1}}\right)\right)$ for $t=1,2,\ldots, N$. To prove this result, we first present the storage constrained PIR scheme for any $(N,K)$. We next obtain a general lower bound on the download cost for PIR, which is valid for the following storage scenarios: replicated or storage constrained, coded or uncoded, and fixed or optimized. We then specialize this bound using the uncoded storage assumption to obtain lower bounds matching the achievable download cost of the storage constrained PIR scheme for any value of the available storage.

Submitted 23 October, 2018; v1 submitted 10 May, 2018; originally announced May 2018.

arXiv:1802.06517 [pdf, other]

doi 10.1088/1361-6420/aad210

Goal-Oriented Optimal Design of Experiments for Large-Scale Bayesian Linear Inverse Problems

Authors: Ahmed Attia, Alen Alexanderian, Arvind K. Saibaba

Abstract: We develop a framework for goal-oriented optimal design of experiments (GOODE) for large-scale Bayesian linear inverse problems governed by PDEs. This framework differs from classical Bayesian optimal design of experiments (ODE) in the following sense: we seek experimental designs that minimize the posterior uncertainty in the experiment end-goal, e.g., a quantity of interest (QoI), rather than the estimated parameter itself. This is suitable for scenarios in which the solution of an inverse problem is an intermediate step and the estimated parameter is then used to compute a QoI. In such problems, a GOODE approach has two benefits: the designs can avoid wastage of experimental resources by a targeted collection of data, and the resulting design criteria are computationally easier to evaluate due to the often low-dimensionality of the QoIs. We present two modified design criteria, A-GOODE and D-GOODE, which are natural analogues of classical Bayesian A- and D-optimal criteria. We analyze the connections to other ODE criteria, and provide interpretations for the GOODE criteria by using tools from information theory. Then, we develop an efficient gradient-based optimization framework for solving the GOODE optimization problems. Additionally, we present comprehensive numerical experiments testing the various aspects of the presented approach. The driving application is the optimal placement of sensors to identify the source of contaminants in a diffusion and transport problem. We enforce sparsity of the sensor placements using an $\ell_1$-norm penalty approach, and propose a practical strategy for specifying the associated penalty parameter.

Submitted 11 June, 2018; v1 submitted 18 February, 2018; originally announced February 2018.

Comments: 25 pages, 13 figures

arXiv:1801.06504 [pdf, other]

Detecting and counting tiny faces

Authors: Alexandre Attia, Sharone Dayan

Abstract: Finding Tiny Faces (by Hu and Ramanan) proposes a novel approach to find small objects in an image. Our contribution consists in deeply understanding the choices of the paper together with applying and extending a similar method to a real world subject which is the counting of people in a public demonstration.

Submitted 24 January, 2018; v1 submitted 19 January, 2018; originally announced January 2018.

Comments: 4 pages, 10 figures, 2 appendix pages

arXiv:1801.06503 [pdf, other]

Global overview of Imitation Learning

Authors: Alexandre Attia, Sharone Dayan

Abstract: Imitation Learning is a sequential task where the learner tries to mimic an expert's action in order to achieve the best performance. Several algorithms have been proposed recently for this task. In this project, we aim at proposing a wide review of these algorithms, presenting their main features and comparing them on their performance and their regret bounds.

Submitted 19 January, 2018; originally announced January 2018.

Comments: 9 pages, 5 figures, 5 appendix pages

arXiv:1801.02171 [pdf, other]

Detection and segmentation of the Left Ventricle in Cardiac MRI using Deep Learning

Authors: Alexandre Attia, Sharone Dayan

Abstract: Manual segmentation of the Left Ventricle (LV) is a tedious and meticulous task that can vary depending on the patient, the Magnetic Resonance Images (MRI) cuts and the experts. Still today, we consider manual delineation done by experts as being the ground truth for cardiac diagnosticians. Thus, we are reviewing the paper written by Avendi et al., which presents a combined approach with Convolutional Neural Networks, Stacked Auto-Encoders and Deformable Models, to try and automate the segmentation while performing more accurately. Furthermore, we have implemented parts of the paper (around three quarters) and experimented with both the original method and slightly modified versions when changing the architecture and the parameters.

Submitted 7 January, 2018; originally announced January 2018.

arXiv:1801.01875 [pdf, other]

Near Optimal Coded Data Shuffling for Distributed Learning

Authors: Mohamed A. Attia, Ravi Tandon

Abstract: Data shuffling between distributed cluster of nodes is one of the critical steps in implementing large-scale learning algorithms. Randomly shuffling the data-set among a cluster of workers allows different nodes to obtain fresh data assignments at each learning epoch. This process has been shown to provide improvements in the learning process. However, the statistical benefits of distributed data shuffling come at the cost of extra communication overhead from the master node to worker nodes, and can act as one of the major bottlenecks in the overall time for computation. There has been significant recent interest in devising approaches to minimize this communication overhead. One approach is to provision for extra storage at the computing nodes. The other emerging approach is to leverage coded communication to minimize the overall communication overhead. The focus of this work is to understand the fundamental trade-off between the amount of storage and the communication overhead for distributed data shuffling. In this work, we first present an information theoretic formulation for the data shuffling problem, accounting for the underlying problem parameters (number of workers, $K$, number of data points, $N$, and the available storage, $S$ per node). We then present an information theoretic lower bound on the communication overhead for data shuffling as a function of these parameters. We next present a novel coded communication scheme and show that the resulting communication overhead of the proposed scheme is within a multiplicative factor of at most $\frac{K}{K-1}$ from the information-theoretic lower bound. Furthermore, we present the aligned coded shuffling scheme for some storage values, which achieves the optimal storage vs communication trade-off for $K<5$, and further reduces the maximum multiplicative gap down to $\frac{K-\frac{1}{3}}{K-1}$, for $K\geq 5$.

Submitted 5 January, 2018; originally announced January 2018.

arXiv:1801.00548 [pdf, other]

A Machine Learning Approach to Adaptive Covariance Localization

Authors: Azam Moosavi, Ahmed Attia, Adrian Sandu

Abstract: Data assimilation plays a key role in large-scale atmospheric weather forecasting, where the state of the physical system is estimated from model outputs and observations, and is then used as initial condition to produce accurate future forecasts. The Ensemble Kalman Filter (EnKF) provides a practical implementation of the statistical solution of the data assimilation problem and has gained wide popularity. This success can be attributed to its simple formulation and ease of implementation. EnKF is a Monte-Carlo algorithm that solves the data assimilation problem by sampling the probability distributions involved in Bayes theorem. Because of this, all flavors of EnKF are fundamentally prone to sampling errors when the ensemble size is small. In typical weather forecasting applications, the model state space has dimension $10^{9}-10^{12}$, while the ensemble size typically ranges between $30-100$ members. Sampling errors manifest themselves as long-range spurious correlations and have been shown to cause filter divergence. To alleviate this effect, covariance localization dampens spurious correlations between state variables located at a large distance in the physical space, via an empirical distance-dependent function. The quality of the resulting analysis and forecast is greatly influenced by the choice of the localization function parameters, e.g., the radius of influence. The localization radius is generally tuned empirically to yield desirable results. This work proposes two adaptive algorithms for covariance localization in the EnKF framework, both based on a machine learning approach. The first algorithm adapts the localization radius in time, while the second algorithm tunes the localization radius in both time and space. Numerical experiments carried out with the Lorenz-96 model and a quasi-geostrophic model reveal the potential of the proposed machine learning approaches.

Submitted 10 February, 2018; v1 submitted 1 January, 2018; originally announced January 2018.

Comments: 23 pages, 12 figures

Report number: CSTR-01

arXiv:1711.08452 [pdf, other]

Combating Computational Heterogeneity in Large-Scale Distributed Computing via Work Exchange

Authors: Mohamed A. Attia, Ravi Tandon

Abstract: Owing to data-intensive large-scale applications, distributed computation systems have gained significant recent interest, due to their ability of running such tasks over a large number of commodity nodes in a time efficient manner. One of the major bottlenecks that adversely impacts the time efficiency is the computational heterogeneity of distributed nodes, often limiting the task completion time due to the slowest worker. In this paper, we first present a lower bound on the expected computation time based on the work-conservation principle. We then present our approach of work exchange to combat the latency problem, in which faster workers can be reassigned additional leftover computations that were originally assigned to slower workers. We present two variations of the work exchange approach: a) when the computational heterogeneity knowledge is known a priori; and b) when heterogeneity is unknown and is estimated in an online manner to assign tasks to distributed workers. As a baseline, we also present and analyze the use of an optimized Maximum Distance Separable (MDS) coded distributed computation scheme over heterogeneous nodes. Simulation results also compare the proposed approach of work exchange, the baseline MDS coded scheme and the lower bound obtained via work-conservation principle. We show that the work exchange scheme achieves time for computation which is very close to the lower bound with limited coordination and communication overhead even when the knowledge about heterogeneity levels is not available.

Submitted 22 November, 2017; originally announced November 2017.

arXiv:1704.05594 [pdf, other]

DATeS: A Highly-Extensible Data Assimilation Testing Suite v1.0

Authors: Ahmed Attia, Adrian Sandu

Abstract: A flexible and highly-extensible data assimilation testing suite, named DATeS, is described in this paper. DATeS aims to offer a unified testing environment that allows researchers to compare different data assimilation methodologies and understand their performance in various settings. The core of DATeS is implemented in Python and takes advantage of its object-oriented capabilities. The main components of the package (the numerical models, the data assimilation algorithms, the linear algebra solvers, and the time discretization routines) are independent of each other, which offers great flexibility to configure data assimilation applications. DATeS can interface easily with large third-party numerical models written in Fortran or in C, and with a plethora of external solvers.

Submitted 1 July, 2018; v1 submitted 18 April, 2017; originally announced April 2017.

Report number: CSTR-5/2017

arXiv:1403.7137 [pdf, other]

A Sampling Filter for Non-Gaussian Data Assimilation

Authors: Ahmed Attia, Adrian Sandu

Abstract: Data assimilation combines information from models, measurements, and priors to estimate the state of a dynamical system such as the atmosphere. The Ensemble Kalman filter (EnKF) is a family of ensemble-based data assimilation approaches that has gained wide popularity due to its simple formulation, ease of implementation, and good practical results. Most EnKF algorithms assume that the underlying probability distributions are Gaussian. Although this assumption is well accepted, it is too restrictive when applied to large nonlinear models, nonlinear observation operators, and large levels of uncertainty. Several approaches have been proposed in order to avoid the Gaussianity assumption. One of the most successful strategies is the maximum likelihood ensemble filter (MLEF) which computes a maximum a posteriori estimate of the state assuming the posterior distribution is Gaussian. MLEF is designed to work with nonlinear and even non-differentiable observation operators, and shows good practical performance. However, there are limits to the degree of nonlinearity that MLEF can handle. This paper proposes a new ensemble-based data assimilation method, named the "sampling filter", which obtains the analysis by sampling directly from the posterior distribution. The sampling strategy is based on a Hybrid Monte Carlo (HMC) approach that can handle non-Gaussian probability distributions. Numerical experiments are carried out using the Lorenz-96 model and observation operators with different levels of non-linearity and differentiability. The proposed filter is also tested with a shallow water model on a sphere with a linear observation operator. The results show that the sampling filter can perform well even in highly nonlinear situations where EnKF and MLEF filters diverge.

Submitted 5 December, 2014; v1 submitted 27 March, 2014; originally announced March 2014.

Comments: 52 pages, 24 figures, 4 tables

Report number: CSTR-4/2014

Showing 1–29 of 29 results for author: Attia, A