Search | arXiv e-print repository

Fairness Concerns in App Reviews: A Study on AI-based Mobile Apps

Authors: Ali Rezaei Nasab, Maedeh Dashti, Mojtaba Shahin, Mansooreh Zahedi, Hourieh Khalajzadeh, Chetan Arora, Peng Liang

Abstract: Fairness is one of the socio-technical concerns that must be addressed in AI-based systems. Unfair AI-based systems, particularly unfair AI-based mobile apps, can pose difficulties for a significant proportion of the global population. This paper aims to analyze fairness concerns in AI-based app reviews. We first manually constructed a ground-truth dataset, including 1,132 fairness and 1,473 non-fairness reviews. Leveraging the ground-truth dataset, we developed and evaluated a set of machine learning and deep learning models that distinguish fairness reviews from non-fairness reviews. Our experiments show that our best-performing model can detect fairness reviews with a precision of 94%. We then applied the best-performing model on approximately 9.5M reviews collected from 108 AI-based apps and identified around 92K fairness reviews. Next, applying the K-means clustering technique to the 92K fairness reviews, followed by manual analysis, led to the identification of six distinct types of fairness concerns (e.g., 'receiving different quality of features and services in different platforms and devices' and 'lack of transparency and fairness in dealing with user-generated content'). Finally, the manual analysis of 2,248 app owners' responses to the fairness reviews identified six root causes (e.g., 'copyright issues') that app owners report to justify fairness concerns.

Submitted 20 June, 2024; v1 submitted 15 January, 2024; originally announced January 2024.

Comments: 30 pages, 5 images, 6 tables, Manuscript submitted to a Journal (2024)

arXiv:2307.14536 [pdf, ps, other]

Stability of particle trajectories of scalar conservation laws and applications in Bayesian inverse problems

Authors: Masoumeh Dashti, Duc-Lam Duong

Abstract: We consider the scalar conservation law in one space dimension with a genuinely nonlinear flux. We assume that an appropriate velocity function depending on the entropy solution of the conservation law is given for the comprising particles, and study their corresponding trajectories under the flow. The differential equation that each of these trajectories satisfies depends on the entropy solution of the conservation law which is typically discontinuous in both time and space variables. The existence and uniqueness of these trajectories are guaranteed by the Filippov theory of differential equations. We show that such a Filippov solution is compatible with the front tracking and vanishing viscosity approximations in the sense that the approximate trajectories given by either of these methods converge uniformly to the trajectories corresponding to the entropy solution of the scalar conservation law. For certain classes of flux functions, illustrated by traffic flow, we prove the Hölder continuity of the particle trajectories with respect to the initial field or the flux function. We then consider the inverse problem of recovering the initial field or the flux function of the scalar conservation law from discrete pointwise measurements of the particle trajectories. We show that the above continuity properties translate to the stability of the Bayesian regularised solutions of these inverse problems with respect to appropriate approximations of the forward map. We also discuss the limitations of the situation where the same inverse problems are considered with pointwise observations made from the entropy solution itself.

Submitted 26 July, 2023; originally announced July 2023.

MSC Class: 35L65; 35R30; 62F15; 76A30

arXiv:2302.06407 [pdf]

doi 10.1093/llc/fqx054

Correcting Real-Word Spelling Errors: A New Hybrid Approach

Authors: Seyed MohammadSadegh Dashti, Amid Khatibi Bardsiri, Vahid Khatibi Bardsiri

Abstract: Spelling correction is one of the main tasks in the field of Natural Language Processing. Contrary to common spelling errors, real-word errors cannot be detected by conventional spelling correction methods. The real-word correction model proposed by Mays, Damerau and Mercer showed a great performance in different evaluations. In this research, however, a new hybrid approach is proposed which relies on statistical and syntactic knowledge to detect and correct real-word errors. In this model, Constraint Grammar (CG) is used to discriminate among sets of correction candidates in the search space. Mays, Damerau and Mercer's trigram approach is manipulated to estimate the probability of syntactically well-formed correction candidates. The approach proposed here is tested on the Wall Street Journal corpus. The model can prove to be more practical than some other models, such as WordNet-based method of Hirst and Budanitsky and fixed windows size method of Wilcox-O'Hearn and Hirst.

Submitted 9 February, 2023; originally announced February 2023.

Journal ref: Digital Scholarship in the Humanities. 2018 Sep 1;33(3)

arXiv:2302.04096 [pdf]

doi 10.1007/s10579-017-9397-4

Real-Word Error Correction with Trigrams: Correcting Multiple Errors in a Sentence

Authors: Seyed MohammadSadegh Dashti

Abstract: Spelling correction is a fundamental task in Text Mining. In this study, we assess the real-word error correction model proposed by Mays, Damerau and Mercer and describe several drawbacks of the model. We propose a new variation which focuses on detecting and correcting multiple real-word errors in a sentence, by manipulating a Probabilistic Context-Free Grammar (PCFG) to discriminate between items in the search space. We test our approach on the Wall Street Journal corpus and show that it outperforms Hirst and Budanitsky's WordNet-based method and Wilcox-O'Hearn, Hirst, and Budanitsky's fixed windows size method.

Submitted 7 February, 2023; originally announced February 2023.

Journal ref: Language Resources and Evaluation. 2018 Jun;52:485-502

arXiv:2302.03625 [pdf]

doi 10.2174/1875036202013010057

An Expert System to Diagnose Spinal Disorders

Authors: Seyed Mohammad Sadegh Dashti, Seyedeh Fatemeh Dashti

Abstract: Objective: Until now, traditional invasive approaches have been the only means being leveraged to diagnose spinal disorders. Traditional manual diagnostics require a high workload, and diagnostic errors are likely to occur due to the prolonged work of physicians. In this research, we develop an expert system based on a hybrid inference algorithm and comprehensive integrated knowledge for assisting the experts in the fast and high-quality diagnosis of spinal disorders. Methods: First, for each spinal anomaly, the accurate and integrated knowledge was acquired from related experts and resources. Second, based on probability distributions and dependencies between symptoms of each anomaly, a unique numerical value known as certainty effect value was assigned to each symptom. Third, a new hybrid inference algorithm was designed to obtain excellent performance, which was an incorporation of the Backward Chaining Inference and Theory of Uncertainty. Results: The proposed expert system was evaluated in two different phases, real-world samples, and medical records evaluation. Evaluations show that in terms of real-world samples analysis, the system achieved excellent accuracy. Application of the system on the sample with anomalies revealed the degree of severity of disorders and the risk of development of abnormalities in unhealthy and healthy patients. In the case of medical records analysis, our expert system proved to have promising performance, which was very close to those of experts. Conclusion: Evaluations suggest that the proposed expert system provides promising performance, helping specialists to validate the accuracy and integrity of their diagnosis. It can also serve as an intelligent educational software for medical students to gain familiarity with spinal disorder diagnosis process, and related symptoms.

Submitted 7 February, 2023; originally announced February 2023.

arXiv:2111.05369 [pdf, other]

Probabilistic predictions of SIS epidemics on networks based on population-level observations

Authors: Tanja Zerenner, Francesco Di Lauro, Masoumeh Dashti, Luc Berthouze, Istvan Z. Kiss

Abstract: We predict the future course of ongoing susceptible-infected-susceptible (SIS) epidemics on regular, Erdős-Rényi and Barabási-Albert networks. It is known that the contact network influences the spread of an epidemic within a population. Therefore, observations of an epidemic, in this case at the population-level, contain information about the underlying network. This information, in turn, is useful for predicting the future course of an ongoing epidemic. To exploit this in a prediction framework, the exact high-dimensional stochastic model of an SIS epidemic on a network is approximated by a lower-dimensional surrogate model. The surrogate model is based on a birth-and-death process; the effect of the underlying network is described by a parametric model for the birth rates. We demonstrate empirically that the surrogate model captures the intrinsic stochasticity of the epidemic once it reaches a point from which it will not die out. Bayesian parameter inference allows for uncertainty about the model parameters and the class of the underlying network to be incorporated directly into probabilistic predictions. An evaluation of a number of scenarios shows that in most cases the resulting prediction intervals adequately quantify the prediction uncertainty. As long as the population-level data is available over a long-enough period, even if not sampled frequently, the model leads to excellent predictions where the underlying network is correctly identified and prediction uncertainty mainly reflects the intrinsic stochasticity of the spreading epidemic. For predictions inferred from shorter observational periods, uncertainty about parameters and network class dominate prediction uncertainty. The proposed method relies on minimal data and is numerically efficient, which makes it attractive either as a standalone inference and prediction scheme or in conjunction with other methods.

Submitted 9 November, 2021; originally announced November 2021.

arXiv:2101.01874 [pdf]

doi 10.29252/jsdp.15.2.69

Smile and Laugh Expressions Detection Based on Local Minimum Key Points

Authors: Mina Mohammadi Dashti, Majid Harouni

Abstract: In this paper, a smile and laugh facial expression is presented based on dimension reduction and description process of the key points. The paper has two main objectives; the first is to extract the local critical points in terms of their apparent features, and the second is to reduce the system's dependence on training inputs. To achieve these objectives, three different scenarios on extracting the features are proposed. First of all, the discrete parts of a face are detected by local binary pattern method that is used to extract a set of global feature vectors for texture classification considering various regions of an input-image face. Then, in the first scenario and with respect to the correlation changes of adjacent pixels on the texture of a mouth area, a set of local key points are extracted using the Harris corner detector. In the second scenario, the dimension reduction of the extracted points of first scenario provided by principal component analysis algorithm leading to reduction in computational costs and overall complexity without loss of performance and flexibility, etc.

Submitted 6 January, 2021; originally announced January 2021.

Comments: 20 pages, in Farsi, 11 figures, 7 tables, 21 equations, journal, 2 authors

MSC Class: 65D19 ACM Class: I.4.6; I.2.10; I.5.0; I.3.0; E.0

arXiv:2012.06281 [pdf, other]

Trash Talk: Accelerating Garbage Collection on Integrated GPUs is Worthless

Authors: Mohammad Dashti, Alexandra Fedorova

Abstract: Systems integrating heterogeneous processors with unified memory provide seamless integration among these processors with minimal development complexity. These systems integrate accelerators such as GPUs on the same die with CPU cores to accommodate running parallel applications with varying levels of parallelism. Such integration is becoming very common on modern chip architectures, and it places a burden (or opportunity) on application and system programmers to utilize the full potential of such integrated chips. In this paper we evaluate whether we can obtain any performance benefits from running garbage collection on integrated GPU systems, and discuss how difficult it would be to realize these gains for the programmer. Proliferation of garbage-collected languages running on a variety of platforms from handheld mobile devices to data centers makes garbage collection an interesting target to examine on such platforms and can offer valuable lessons for other applications. We present our analysis of running garbage collection on integrated systems and find that the current state of these systems does not provide an advantage for accelerating such a task. We build a framework that allows us to offload garbage collection tasks on integrated GPU systems from within the JVM. We identify dominant phases of garbage collection and study the viability of offloading them to the integrated GPU. We show that performance advantages are limited, partly because an integrated GPU has limited advantage in memory bandwidth over the CPU, and partly because of costly atomic operations.

Submitted 11 December, 2020; originally announced December 2020.

arXiv:2006.10387 [pdf, ps, other]

A Theory of Black-Box Tests

Authors: Mohammad Torabi Dashti, David Basin

Abstract: The purpose of testing a system with respect to a requirement is to refute the hypothesis that the system satisfies the requirement. We build a theory of tests and refutation based on the elementary notions of satisfaction and refinement. We use this theory to characterize the requirements that can be refuted through black-box testing and, dually, verified through such tests. We consider refutation in finite time and obtain the finite falsifiability of hyper-safety temporal requirements as a special case. We extend our theory with computational constraints and separate refutation from enforcement in the context of temporal hyper-properties. Overall, our theory provides a basis to analyze the scope and reach of black-box tests and to bridge results from diverse areas including testing, verification, and enforcement.

Submitted 18 June, 2020; originally announced June 2020.

arXiv:2004.04636 [pdf, other]

Nonparametric Bayesian inference of discretely observed diffusions

Authors: Jean-Charles Croix, Masoumeh Dashti, Istvàn Zoltàn Kiss

Abstract: We consider the problem of the Bayesian inference of drift and diffusion coefficient functions in a stochastic differential equation given discrete observations of a realisation of its solution. We give conditions for the well-posedness and stable approximations of the posterior measure. These conditions in particular allow for priors with unbounded support. Our proof relies on the explicit construction of transition probability densities using the parametrix method for general parabolic equations. We then study an application of these results in inferring the rates of Birth-and-Death processes.

Submitted 9 April, 2020; originally announced April 2020.

Comments: 25 pages, 1 figure

MSC Class: 62G05; 62F15; 60J60; 65N21; 35K20

arXiv:1906.10966 [pdf, ps, other]

Network Inference from Population-Level Observation of Epidemics

Authors: F. Di Lauro, J. -C. Croix, M. Dashti, L. Berthouze, I. Z. Kiss

Abstract: Using the continuous-time susceptible-infected-susceptible (SIS) model on networks, we investigate the problem of inferring the class of the underlying network when epidemic data is only available at population-level (i.e. the number of infected individuals at a finite set of discrete times of a single realisation of the epidemic), the only information likely to be available in real world settings. To tackle this, epidemics on networks are approximated by a Birth-and-Death process which keeps track of the number of infected nodes at population level. The rates of this surrogate model encode both the structure of the underlying network and disease dynamics. We use extensive simulations over Regular, Erdős-Rényi and Barabási-Albert networks to build network class-specific priors for these rates. We then use Bayesian model selection to recover the most likely underlying network class, based only on a single realisation of the epidemic. We show that the proposed methodology yields good results on both synthetic and real-world networks.

Submitted 4 December, 2019; v1 submitted 26 June, 2019; originally announced June 2019.

Comments: 19 pages, 15 figures

arXiv:1811.12244 [pdf, ps, other]

Rates of contraction of posterior distributions based on $p$-exponential priors

Authors: Sergios Agapiou, Masoumeh Dashti, Tapio Helin

Abstract: We consider a family of infinite dimensional product measures with tails between Gaussian and exponential, which we call $p$-exponential measures. We study their measure-theoretic properties and in particular their concentration. Our findings are used to develop a general contraction theory of posterior distributions on nonparametric models with $p$-exponential priors in separable Banach parameter spaces. Our approach builds on the general contraction theory for Gaussian process priors in van der Vaart and van Zanten 2008, namely we use prior concentration to verify prior mass and entropy conditions sufficient for posterior contraction. However, the specific concentration properties of $p$-exponential priors lead to a more complex entropy bound which can influence negatively the obtained rate of contraction, depending on the topology of the parameter space. Subject to the more complex entropy bound, we show that the rate of contraction depends on the position of the true parameter relative to a certain Banach space associated to $p$-exponential measures and on the small ball probabilities of these measures. For example, we apply our theory in the white noise model under Besov regularity of the truth and obtain minimax rates of contraction using (rescaled) $α$-regular $p$-exponential priors. In particular, our results suggest that when interested in spatially inhomogeneous unknown functions, in terms of posterior contraction, it is preferable to use Laplace rather than Gaussian priors.

Submitted 8 October, 2020; v1 submitted 29 November, 2018; originally announced November 2018.

MSC Class: 62G20; 62G05; 60G50

arXiv:1807.09887 [pdf, other]

Compiling Database Application Programs

Authors: Mohammad Dashti, Sachin Basil John, Thierry Coppey, Amir Shaikhha, Vojin Jovanovic, Christoph Koch

Abstract: There is a trend towards increased specialization of data management software for performance reasons. In this paper, we study the automatic specialization and optimization of database application programs -- sequences of queries and updates, augmented with control flow constructs as they appear in database scripts, UDFs, transactional workloads and triggers in languages such as PL/SQL. We show how to build an optimizing compiler for database application programs using generative programming and state-of-the-art compiler technology. We evaluate a hand-optimized low-level implementation of TPC-C, and identify the key optimization techniques that account for its good performance. Our compiler fully automates these optimizations and, applied to this benchmark, outperforms the manually optimized baseline by a factor of two. By selectively disabling some of the optimizations in the compiler, we derive a clinical and precise way of obtaining insight into their individual performance contributions.

Submitted 25 July, 2018; originally announced July 2018.

Comments: 16 pages

ACM Class: H.2.4

arXiv:1805.02260 [pdf]

Active disturbance rejection for precise positioning of dual-stage hard disk drives

Authors: Mohammad Amin Dashti, Ali Chaibakhsh

Abstract: This paper presents an application of adaptive control algorithm in order to reject the external disturbances in dual-stage hard disk drives. For this purpose, a dual PID controller is first designed without the plant exposure to external disturbances. Then, an adaptive control approach based on recursive least squares adaptive (RLS) algorithm was employed to identify and reject disturbances. The performance of the proposed technique was evaluated for hard disk track-seeking through simulation experiments. Results show the feasibility and precise tracking of the designed control system.

Submitted 6 May, 2018; originally announced May 2018.

Comments: 4 pages

arXiv:1711.09263 [pdf]

Optimal design with EGM approach in conjugate natural convection with surface radiation in a two-dimensional enclosure

Authors: Mohammad Amin Dashti, Ali Safavinejad

Abstract: Analysis of conjugate natural convection with surface radiation in a two-dimensional enclosure is carried out in order to search the optimal location of the heat source with entropy generation minimization (EGM) approach and conventional heat transfer parameters. The air as an incompressible fluid and transparent media is considered the fluid filling the enclosure with the steady and laminar regime. The enclosure internal surfaces are also gray, opaque and diffuse. The governing equations with stream function and vorticity formulation are solved using finite difference approach. Results include the effect of Rayleigh number and emissivity on the dimensionless average rate of entropy generation and its optimum location. The optimum location search with conventional heat transfer parameters including maximum temperature and Nusselt numbers are also examined.

Submitted 25 November, 2017; originally announced November 2017.

Comments: 22 pages, 13 figures

arXiv:1705.03286 [pdf, ps, other]

doi 10.1088/1361-6420/aaacac

Sparsity-promoting and edge-preserving maximum a posteriori estimators in non-parametric Bayesian inverse problems

Authors: Sergios Agapiou, Martin Burger, Masoumeh Dashti, Tapio Helin

Abstract: We consider the inverse problem of recovering an unknown functional parameter $u$ in a separable Banach space, from a noisy observation $y$ of its image through a known possibly non-linear ill-posed map ${\mathcal G}$. The data $y$ is finite-dimensional and the noise is Gaussian. We adopt a Bayesian approach to the problem and consider Besov space priors (see Lassas et al. 2009), which are well-known for their edge-preserving and sparsity-promoting properties and have recently attracted wide attention especially in the medical imaging community. Our key result is to show that in this non-parametric setup the maximum a posteriori (MAP) estimates are characterized by the minimizers of a generalized Onsager--Machlup functional of the posterior. This is done independently for the so-called weak and strong MAP estimates, which as we show coincide in our context. In addition, we prove a form of weak consistency for the MAP estimators in the infinitely informative data limit. Our results are remarkable for two reasons: first, the prior distribution is non-Gaussian and does not meet the smoothness conditions required in previous research on non-parametric MAP estimates. Second, the result analytically justifies existing uses of the MAP estimate in finite but high dimensional discretizations of Bayesian inverse problems with the considered Besov priors.

Submitted 23 May, 2017; v1 submitted 9 May, 2017; originally announced May 2017.

Comments: 36 pages, some typos corrected, acknowledgements added

MSC Class: 49N45; 62C10; 62G05; 62G20

arXiv:1610.09166 [pdf, other]

Push vs. Pull-Based Loop Fusion in Query Engines

Authors: Amir Shaikhha, Mohammad Dashti, Christoph Koch

Abstract: Database query engines use pull-based or push-based approaches to avoid the materialization of data across query operators. In this paper, we study these two types of query engines in depth and present the limitations and advantages of each engine. Similarly, the programming languages community has developed loop fusion techniques to remove intermediate collections in the context of collection programming. We draw parallels between the DB and PL communities by demonstrating the connection between pipelined query engines and loop fusion techniques. Based on this connection, we propose a new type of pull-based engine, inspired by a loop fusion technique, which combines the benefits of both approaches. Then we experimentally evaluate the various engines, in the context of query compilation, for the first time in a fair environment, eliminating the biasing impact of ancillary optimizations that have traditionally only been used with one of the approaches. We show that for realistic analytical workloads, there is no considerable advantage for either form of pipelined query engine, as opposed to what recent research suggests. Also, by using microbenchmarks we show that our proposed engine dominates the existing engines by combining the benefits of both.

Submitted 28 October, 2016; originally announced October 2016.
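
The pull/push distinction named in the abstract above can be illustrated with a minimal Python sketch. It is not the engine proposed in the paper: the operator names (`pull_select`, `push_select`, and so on) and the toy query are assumptions made for illustration only. In the pull style the consumer drives the pipeline by requesting rows from its child; in the push style the producer drives it by handing rows to a consumer callback. Neither version materializes an intermediate collection.

```python
# Pull-based (iterator-style) operators: each operator asks its child for rows.
def pull_scan(rows):
    for row in rows:
        yield row

def pull_select(child, pred):
    for row in child:
        if pred(row):
            yield row

def pull_project(child, fn):
    for row in child:
        yield fn(row)

# Push-based operators: each operator is a callback that forwards rows downstream.
def push_select(pred, consume):
    def on_row(row):
        if pred(row):
            consume(row)
    return on_row

def push_project(fn, consume):
    def on_row(row):
        consume(fn(row))
    return on_row

def push_scan(rows, consume):
    for row in rows:
        consume(row)

def run_pull(rows):
    # The final consumer (list) pulls rows through project <- select <- scan.
    plan = pull_project(pull_select(pull_scan(rows), lambda x: x % 2 == 0),
                        lambda x: x * 10)
    return list(plan)

def run_push(rows):
    # The scan pushes rows through select -> project -> result sink.
    out = []
    plan = push_select(lambda x: x % 2 == 0,
                       push_project(lambda x: x * 10, out.append))
    push_scan(rows, plan)
    return out
```

Both `run_pull([1, 2, 3, 4])` and `run_push([1, 2, 3, 4])` evaluate the same hypothetical query (keep even values, multiply by ten); they differ only in which side of the pipeline controls the loop.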

arXiv:1605.01769 [pdf, other]

Access Control Synthesis for Physical Spaces

Authors: Petar Tsankov, Mohammad Torabi Dashti, David Basin

Abstract: Access-control requirements for physical spaces, like office buildings and airports, are best formulated from a global viewpoint in terms of system-wide requirements. For example, "there is an authorized path to exit the building from every room." In contrast, individual access-control components, such as doors and turnstiles, can only enforce local policies, specifying when the component may open. In practice, the gap between the system-wide, global requirements and the many local policies is bridged manually, which is tedious, error-prone, and scales poorly. We propose a framework to automatically synthesize local access control policies from a set of global requirements for physical spaces. Our framework consists of an expressive language to specify both global requirements and physical spaces, and an algorithm for synthesizing local, attribute-based policies from the global specification. We empirically demonstrate the framework's effectiveness on three substantial case studies. The studies demonstrate that access control synthesis is practical even for complex physical spaces, such as airports, with many interrelated security requirements.

Submitted 5 May, 2016; originally announced May 2016.

arXiv:1603.00542 [pdf, other]

Repairing Conflicts among MVCC Transactions

Authors: Mohammad Dashti, Sachin Basil John, Amir Shaikhha, Christoph Koch

Abstract: The optimistic variants of MVCC (Multi-Version Concurrency Control) avoid blocking concurrent transactions at the cost of having a validation phase. Upon failure in the validation phase, the transaction is usually aborted and restarted from scratch. The "abort and restart" approach becomes a performance bottleneck for the use cases with high contention objects or long running transactions. In addition, restarting from scratch creates a negative feedback loop in the system, because the system incurs additional overhead that may create even further conflicts. In this paper, we propose a novel approach for conflict resolution in MVCC for in-memory databases. This low overhead approach summarizes the transaction programs in the form of a dependency graph. The dependency graph also contains the constructs used in the validation phase of the MVCC algorithm. Then, in the case of encountering conflicts among transactions, the conflict locations in the program are quickly detected, and the conflicting transactions are partially re-executed. This approach maximizes the reuse of the computations done in the initial execution round, and increases the transaction processing throughput.

Submitted 1 March, 2016; originally announced March 2016.

Comments: 12 pages, 9 figures

ACM Class: H.2.4

arXiv:1303.4795 [pdf, other]

doi 10.1088/0266-5611/29/9/095017

MAP Estimators and Their Consistency in Bayesian Nonparametric Inverse Problems

Authors: Masoumeh Dashti, Kody J. H. Law, Andrew M. Stuart, Jochen Voss

Abstract: We consider the inverse problem of estimating an unknown function $u$ from noisy measurements $y$ of a known, possibly nonlinear, map $\mathcal{G}$ applied to $u$. We adopt a Bayesian approach to the problem and work in a setting where the prior measure is specified as a Gaussian random field $μ_0$. We work under a natural set of conditions on the likelihood which imply the existence of a well-posed posterior measure, $μ^y$. Under these conditions we show that the {\em maximum a posteriori} (MAP) estimator is well-defined as the minimiser of an Onsager-Machlup functional defined on the Cameron-Martin space of the prior; thus we link a problem in probability with a problem in the calculus of variations. We then consider the case where the observational noise vanishes and establish a form of Bayesian posterior consistency. We also prove a similar result for the case where the observation of $\mathcal{G}(u)$ can be repeated as many times as desired with independent identically distributed noise. The theory is illustrated with examples from an inverse problem for the Navier-Stokes equation, motivated by problems arising in weather forecasting, and from the theory of conditioned diffusions, motivated by problems arising in molecular dynamics.

Submitted 23 August, 2013; v1 submitted 19 March, 2013; originally announced March 2013.

Comments: changed title, minor fixes

MSC Class: 49N45 60G35 62F15

arXiv:1302.6989 [pdf, ps, other]

The Bayesian Approach To Inverse Problems

Authors: Masoumeh Dashti, Andrew M. Stuart

Abstract: These lecture notes highlight the mathematical and computational structure relating to the formulation of, and development of algorithms for, the Bayesian approach to inverse problems in differential equations. This approach is fundamental in the quantification of uncertainty within applications involving the blending of mathematical models with data.

Submitted 2 July, 2015; v1 submitted 27 February, 2013; originally announced February 2013.

Comments: Lecture notes to appear in Handbook of Uncertainty Quantification, Editors R. Ghanem, D. Higdon and H. Owhadi, Springer, 2016

arXiv:1301.2185 [pdf, other]

doi 10.1364/OL.38.000887

Reconstructing the Poynting vector skew angle and wave-front of optical vortex beams via two-channel moiré deflectometery

Authors: Mohammad Yeganeh, Saifollah Rasouli, Mohsen Dashti, Sergei Slussarenko, Enrico Santamato, Ebrahim Karimi

Abstract: A novel approach based on the two-channel moiré deflectometry has been used to measure both wave-front and transverse component of the Poynting vector of an optical vortex beam. Generated vortex beam by the q-plate, an inhomogeneous liquid crystal cell, has been analyzed with such technique. The measured topological charge of generated beams are in an excellent agreement with theoretical prediction.

Submitted 8 March, 2013; v1 submitted 10 January, 2013; originally announced January 2013.

Comments: 3 pages, 2 figures

Journal ref: Optics Letters 38, 887 (2013)

arXiv:1105.0889 [pdf, ps, other]

Besov priors for Bayesian inverse problems

Authors: Masoumeh Dashti, Stephen Harris, Andrew Stuart

Abstract: We consider the inverse problem of estimating a function $u$ from noisy, possibly nonlinear, observations. We adopt a Bayesian approach to the problem. This approach has a long history for inversion, dating back to 1970, and has, over the last decade, gained importance as a practical tool. However most of the existing theory has been developed for Gaussian prior measures. Recently Lassas, Saksman and Siltanen (Inv. Prob. Imag. 2009) showed how to construct Besov prior measures, based on wavelet expansions with random coefficients, and used these prior measures to study linear inverse problems. In this paper we build on this development of Besov priors to include the case of nonlinear measurements. In doing so a key technical tool, established here, is a Fernique-like theorem for Besov measures. This theorem enables us to identify appropriate conditions on the forward solution operator which, when matched to properties of the prior Besov measure, imply the well-definedness and well-posedness of the posterior measure. We then consider the application of these results to the inverse problem of finding the diffusion coefficient of an elliptic partial differential equation, given noisy measurements of its solution.

Submitted 15 November, 2011; v1 submitted 4 May, 2011; originally announced May 2011.

Comments: 18 pages

arXiv:1102.3072 [pdf, ps, other]

doi 10.1007/s00205-011-0401-7

The motion of a fluid-rigid disc system at the zero limit of the rigid disc radius

Authors: Masoumeh Dashti, James C. Robinson

Abstract: We consider the two-dimensional motion of the coupled system of a viscous incompressible fluid and a rigid disc moving with the fluid, in the whole plane. The fluid motion is described by the Navier-Stokes equations and the motion of the rigid body by conservation laws of linear and angular momentum. We show that, assuming that the rigid disc is not allowed to rotate, as the radius of the disc goes to zero, the solution of this system converges, in an appropriate sense, to the solution of the Navier-Stokes equations describing the motion of only fluid in the whole plane. We also prove that the trajectory of the centre of the disc, at the zero limit of its radius, coincides with a fluid particle trajectory.

Submitted 15 February, 2011; originally announced February 2011.

Comments: 29 pages, 0 figures

arXiv:1102.0143 [pdf, ps, other]

Uncertainty quantification and weak approximation of an elliptic inverse problem

Authors: Masoumeh Dashti, Andrew M. Stuart

Abstract: We consider the inverse problem of determining the permeability from the pressure in a Darcy model of flow in a porous medium. Mathematically the problem is to find the diffusion coefficient for a linear uniformly elliptic partial differential equation in divergence form, in a bounded domain in dimension $d \le 3$, from measurements of the solution in the interior. We adopt a Bayesian approach to the problem. We place a prior random field measure on the log permeability, specified through the Karhunen-Loève expansion of its draws. We consider Gaussian measures constructed this way, and study the regularity of functions drawn from them. We also study the Lipschitz properties of the observation operator mapping the log permeability to the observations. Combining these regularity and continuity estimates, we show that the posterior measure is well-defined on a suitable Banach space. Furthermore the posterior measure is shown to be Lipschitz with respect to the data in the Hellinger metric, giving rise to a form of well-posedness of the inverse problem. Determining the posterior measure, given the data, solves the problem of uncertainty quantification for this inverse problem. In practice the posterior measure must be approximated in a finite dimensional space. We quantify the errors incurred by employing a truncated Karhunen-Loève expansion to represent this measure. In particular we study weak convergence of a general class of locally Lipschitz functions of the log permeability, and apply this general theory to estimate errors in the posterior mean of the pressure and the pressure covariance, under refinement of the finite dimensional Karhunen-Loève truncation.

Submitted 1 February, 2011; originally announced February 2011.

Comments: 19 pages, 0 figures, submitted to SIAM Journal on Numerical Analysis

arXiv:0909.2126 [pdf, ps, other]

Approximation of Bayesian Inverse Problems for PDEs

Authors: S. L. Cotter, M. Dashti, A. M. Stuart

Abstract: Inverse problems are often ill-posed, with solutions that depend sensitively on data. In any numerical approach to the solution of such problems, regularization of some form is needed to counteract the resulting instability. This paper is based on an approach to regularization, employing a Bayesian formulation of the problem, which leads to a notion of well-posedness for inverse problems, at the level of probability measures. The stability which results from this well-posedness may be used as the basis for quantifying the approximation, in finite dimensional spaces, of inverse problems for functions. This paper contains a theory which utilizes the stability to estimate the distance between the true and approximate posterior distributions, in the Hellinger metric, in terms of error estimates for approximation of the underlying forward problem. This is potentially useful as it allows for the transfer of estimates from the numerical analysis of forward problems into estimates for the solution of the related inverse problem. In particular controlling differences in the Hellinger metric leads to control on the differences between expected values of polynomially bounded functions and operators, including the mean and covariance operator. The ideas are illustrated with the classical inverse problem for the heat equation, and then applied to some more complicated non-Gaussian inverse problems arising in data assimilation, involving determination of the initial condition for the Stokes or Navier-Stokes equation from Lagrangian and Eulerian observations respectively.

Submitted 11 September, 2009; originally announced September 2009.

Comments: 35 pages, 3 figures

MSC Class: 65P99; 65C05

arXiv:0903.4358 [pdf, ps, other]

Sums of powers via integration

Authors: M. Torabi Dashti

Abstract: Sum of powers 1^p+...+n^p, with n and p being natural numbers and n>=1, can be expressed as a polynomial function of n of degree p+1. Such representations are often called Faulhaber formulae. A simple recursive algorithm for computing coefficients of Faulhaber formulae is presented. The correctness of the algorithm is proved by giving a recurrence relation on Faulhaber formulae.

Submitted 25 March, 2009; originally announced March 2009.

Comments: 4 pages
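
To make the claim in the abstract above concrete, here is a minimal Python sketch that computes the coefficients of the degree p+1 polynomial for 1^p+...+n^p. It is not the algorithm from the paper, whose details are not given in the abstract. Instead it assumes the standard binomial identity (n+1)^(p+1) - 1 = sum over j from 0 to p of C(p+1, j) * S_j(n), where S_j(n) = 1^j+...+n^j, and solves it for S_p using the lower-order sums. The function names are hypothetical.

```python
from fractions import Fraction
from math import comb

def faulhaber_coefficients(p):
    """Return c[0..p+1] such that 1^p + ... + n^p == sum(c[i] * n**i)."""
    polys = []  # polys[j] is the coefficient list of S_j(n), lowest power first
    for q in range(p + 1):
        # Expand (n + 1)^(q + 1) - 1 with the binomial theorem.
        coeffs = [Fraction(comb(q + 1, i)) for i in range(q + 2)]
        coeffs[0] -= 1
        # Subtract C(q + 1, j) * S_j(n) for every lower-order sum j < q.
        for j in range(q):
            for i, c in enumerate(polys[j]):
                coeffs[i] -= comb(q + 1, j) * c
        # What remains equals (q + 1) * S_q(n); divide to isolate S_q.
        polys.append([c / (q + 1) for c in coeffs])
    return polys[p]

def evaluate(coeffs, n):
    """Evaluate the polynomial with the given coefficients at n."""
    return sum(c * n**i for i, c in enumerate(coeffs))
```

For p = 2 the coefficients are 0, 1/6, 1/2, 1/3, i.e. the familiar n/6 + n^2/2 + n^3/3. Exact rational arithmetic via `Fraction` is used because the coefficients are generally not integers.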

arXiv:0710.5708 [pdf, ps, other]

doi 10.1088/0951-7715/22/4/003

A simple proof of uniqueness of the particle trajectories for solutions of the Navier-Stokes equations

Authors: Masoumeh Dashti, James C. Robinson

Abstract: We give a simple proof of the uniqueness of fluid particle trajectories corresponding to: 1) the solution of the two-dimensional Navier Stokes equations with an initial condition that is only square integrable, and 2) the local strong solution of the three-dimensional equations with an $H^{1/2}$-regular initial condition i.e. with the minimal Sobolev regularity known to guarantee uniqueness. This result was proved by Chemin & Lerner (J Diff Eq 121 (1995) 314-328) using the Littlewood-Paley theory for the flow in the whole space $\R^d$, $d\ge 2$. We first show that the solutions of the differential equation $\dot{X}=u(X,t)$ are unique if $u\in L^p(0,T;H^{(d/2)-1})$ for some $p>1$ and $\sqrt{t}\,u\in L^2(0,T;H^{(d/2)+1})$. We then prove, using standard energy methods, that the solution of the Navier-Stokes equations with initial condition in $H^{(d/2)-1}$ satisfies these conditions. This proof is also valid for the more physically relevant case of bounded domains.

Submitted 15 February, 2011; v1 submitted 30 October, 2007; originally announced October 2007.

Comments: 13 pages, 0 figures

Journal ref: Nonlinearity 22, 735-746, 2009

arXiv:math/0701341 [pdf, ps, other]

An a posteriori condition on the numerical approximations of the Navier-Stokes equations for the existence of a strong solution

Authors: Masoumeh Dashti, James C. Robinson

Abstract: In their 2006 paper, Chernyshenko et al prove that a sufficiently smooth strong solution of the 3d Navier-Stokes equations is robust with respect to small enough changes in initial conditions and forcing function. They also show that if a regular enough strong solution exists then Galerkin approximations converge to it. They then use these results to conclude that the existence of a sufficiently regular strong solution can be verified using sufficiently refined numerical computations. In this paper we study the solutions with minimal required regularity to be strong, which are less regular than those considered in Chernyshenko et al (2006). We prove a similar robustness result and show the validity of the results relating convergent numerical computations and the existence of the strong solutions.

Submitted 12 January, 2007; originally announced January 2007.

Showing 1–29 of 29 results for author: Dashti, M