-
Multivariate Power Series in Maple
Authors:
Mohammadali Asadi,
Alexander Brandt,
Mahsa Kazemi,
Marc Moreno Maza,
Erik Postma
Abstract:
We present MultivariatePowerSeries, a Maple library introduced in Maple 2021, providing a variety of methods to study formal multivariate power series and univariate polynomials over such series. This library offers a simple and easy-to-use user interface. Its implementation relies on lazy evaluation techniques and takes advantage of Maple's features for object-oriented programming. The exposed methods include the Weierstrass Preparation Theorem and factorization via Hensel's lemma. The computational performance is demonstrated by means of an experimental comparison with software counterparts.
Submitted 21 June, 2021;
originally announced June 2021.
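The lazy evaluation idea mentioned in the abstract above can be illustrated with a minimal sketch. This is not the MultivariatePowerSeries API: it is a univariate Python toy, and the class name `LazySeries` and its methods are hypothetical names chosen for this illustration. It only shows how coefficients can be computed on demand and cached, so that a product or an inverse is never expanded further than requested.

```python
from fractions import Fraction


class LazySeries:
    """Univariate formal power series with coefficients computed on demand.

    Hypothetical illustration only; not the Maple library's interface.
    """

    def __init__(self, coeff_fn):
        self._coeff_fn = coeff_fn  # maps (series, k) -> coefficient of x^k
        self._cache = []           # coefficients computed so far

    @classmethod
    def from_poly(cls, coeffs):
        """Series whose first coefficients are `coeffs`, then zeros."""
        data = [Fraction(c) for c in coeffs]
        return cls(lambda _s, k: data[k] if k < len(data) else Fraction(0))

    def coeff(self, k):
        # Extend the cache up to degree k; earlier terms are never recomputed.
        while len(self._cache) <= k:
            self._cache.append(self._coeff_fn(self, len(self._cache)))
        return self._cache[k]

    def __mul__(self, other):
        # Cauchy product: c_k = sum_{i=0..k} a_i * b_{k-i}
        return LazySeries(
            lambda _s, k: sum(self.coeff(i) * other.coeff(k - i)
                              for i in range(k + 1))
        )

    def inverse(self):
        a0 = self.coeff(0)
        if a0 == 0:
            raise ZeroDivisionError("series with zero constant term is not a unit")

        def gen(s, k):
            if k == 0:
                return 1 / a0
            # b_k = -(1/a_0) * sum_{i=1..k} a_i * b_{k-i}; b_{k-i} is already cached
            return -sum(self.coeff(i) * s.coeff(k - i)
                        for i in range(1, k + 1)) / a0

        return LazySeries(gen)


# Usage: the inverse of 1 - x is the geometric series 1 + x + x^2 + ...
one_minus_x = LazySeries.from_poly([1, -1])
geometric = one_minus_x.inverse()
first_terms = [geometric.coeff(k) for k in range(5)]
```

Only the coefficients that are actually requested are ever computed, which is the property a lazy design is meant to provide.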
-
On the Complexity and Parallel Implementation of Hensel's Lemma and Weierstrass Preparation
Authors:
Alexander Brandt,
Marc Moreno Maza
Abstract:
Hensel's lemma, combined with repeated applications of the Weierstrass preparation theorem, allows for the factorization of polynomials with multivariate power series coefficients. We present a complexity analysis for this method and leverage those results to guide the load-balancing of a parallel implementation to concurrently update all factors. In particular, the factorization creates a pipeline where the terms of degree k of the first factor are computed simultaneously with the terms of degree k-1 of the second factor, etc. An implementation challenge is the inherent irregularity of computational work between factors, as our complexity analysis reveals. Additional resource utilization and load-balancing are achieved through the parallelization of Weierstrass preparation. Experimental results show the efficacy of this mixed parallel scheme, achieving up to 9x parallel speedup on 12 cores.
Submitted 2 July, 2021; v1 submitted 22 May, 2021;
originally announced May 2021.
-
KLARAPTOR: A Tool for Dynamically Finding Optimal Kernel Launch Parameters Targeting CUDA Programs
Authors:
Alexander Brandt,
Davood Mohajerani,
Marc Moreno Maza,
Jeeva Paudel,
Linxiao Wang
Abstract:
In this paper we present KLARAPTOR (Kernel LAunch parameters RAtional Program estimaTOR), a new tool built on top of the LLVM Pass Framework and NVIDIA CUPTI API to dynamically determine the optimal values of kernel launch parameters of a CUDA program P. To be precise, we describe a novel technique to statically build (at the compile time of P) a so-called rational program R. Using a performance prediction model, and knowing particular data and hardware parameters of P at runtime, the program R can automatically and dynamically determine the values of launch parameters of P that will yield optimal performance. Our technique can be applied to parallel programs in general, as well as to generic performance prediction models which account for program and hardware parameters. We are particularly interested in programs targeting manycore accelerators. We have implemented and successfully tested our technique in the context of GPU kernels written in CUDA using the MWP-CWP performance prediction model.
Submitted 4 November, 2019;
originally announced November 2019.
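As a rough illustration of the idea described above (a program R that evaluates a performance estimate at runtime and selects launch parameters), here is a minimal Python sketch. Everything in it is an assumption made for illustration: the function names, the candidate block sizes, and in particular the toy rational function, which is invented here and is not the MWP-CWP model or anything produced by KLARAPTOR.

```python
from fractions import Fraction


def estimated_time(n, block_size):
    """Hypothetical rational function of the data size n and the launch
    parameter block_size (arbitrary units; invented for this sketch)."""
    # One term shrinks as blocks get larger, the other grows with block size.
    return Fraction(5 * n, block_size) + Fraction(n * block_size, 1024)


def best_block_size(n, max_threads_per_block,
                    candidates=(32, 64, 128, 256, 512, 1024)):
    """Pick the feasible candidate with the smallest estimated time.

    n is a data parameter and max_threads_per_block a hardware parameter,
    both known only at runtime.
    """
    feasible = [b for b in candidates if b <= max_threads_per_block]
    if not feasible:
        raise ValueError("no candidate block size fits the hardware limit")
    return min(feasible, key=lambda b: estimated_time(n, b))


# Usage: choose a block size for n = 2^20 elements on a device
# allowing up to 1024 threads per block.
choice = best_block_size(1 << 20, 1024)
```

The point of the sketch is only the division of labour: the estimate is fixed ahead of time, while the data and hardware parameters are supplied when the choice is made.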
-
A Technique for Finding Optimal Program Launch Parameters Targeting Manycore Accelerators
Authors:
Alexander Brandt,
Davood Mohajerani,
Marc Moreno Maza,
Jeeva Paudel,
Lin-Xiao Wang
Abstract:
In this paper, we present a new technique to dynamically determine the values of program parameters in order to optimize the performance of a multithreaded program P. To be precise, we describe a novel technique to statically build another program, say, R, that can dynamically determine the optimal values of program parameters to yield the best program performance for P given values for its data and hardware parameters. While this technique can be applied to parallel programs in general, we are particularly interested in programs targeting manycore accelerators. Our technique has successfully been employed for GPU kernels using the MWP-CWP performance model for CUDA.
Submitted 31 May, 2019;
originally announced June 2019.
-
On the Parallelization of Triangular Decomposition of Polynomial Systems
Authors:
Mohammadali Asadi,
Alexander Brandt,
Robert H. C. Moir,
Marc Moreno Maza,
Yuzhen Xie
Abstract:
We discuss the parallelization of algorithms for solving polynomial systems symbolically by way of triangular decomposition. Algorithms for solving polynomial systems combine low-level routines for performing arithmetic operations on polynomials and high-level procedures which produce the different components (points, curves, surfaces) of the solution set. The latter "component-level" parallelization of triangular decompositions, our focus here, belongs to the class of dynamic irregular parallel applications. Possible speedup factors depend on geometrical properties of the solution set (number of components, their dimensions and degrees); these algorithms do not scale with the number of processors. In this paper we combine two different concurrency schemes, the fork-join model and producer-consumer patterns, to better capture opportunities for component-level parallelization. We report on our implementation with the publicly available BPAS library. Our experimentation with 340 systems yields promising results.
Submitted 31 May, 2019;
originally announced June 2019.
-
Symbolic-Numeric Integration of Rational Functions
Authors:
Robert M. Corless,
Robert H. C. Moir,
Marc Moreno Maza,
Ning Xie
Abstract:
We consider the problem of symbolic-numeric integration of symbolic functions, focusing on rational functions. Using a hybrid method allows the stable yet efficient computation of symbolic antiderivatives while avoiding issues of ill-conditioning to which numerical methods are susceptible. We propose two alternative methods for exact input that compute the rational part of the integral using Hermite reduction and then compute the transcendental part in two different ways using a combination of exact integration and efficient numerical computation of roots. The symbolic computation is done within BPAS, or Basic Polynomial Algebra Subprograms, which is a highly optimized environment for polynomial computation on parallel architectures, while the numerical computation is done using the highly optimized multiprecision rootfinding package MPSolve. We show that both methods are forward and backward stable in a structured sense and that, away from singularities, tolerance proportionality is achieved by adjusting the precision of the rootfinding tasks.
Submitted 25 October, 2018; v1 submitted 5 December, 2017;
originally announced December 2017.
-
Parallel Integer Polynomial Multiplication
Authors:
Changbo Chen,
Svyatoslav Covanov,
Farnam Mansouri,
Marc Moreno Maza,
Ning Xie,
Yuzhen Xie
Abstract:
We propose a new algorithm for multiplying dense polynomials with integer coefficients in a parallel fashion, targeting multi-core processor architectures. Complexity estimates and experimental comparisons demonstrate the advantages of this new approach.
Submitted 17 December, 2016;
originally announced December 2016.
-
Problem formulation for truth-table invariant cylindrical algebraic decomposition by incremental triangular decomposition
Authors:
Matthew England,
Russell Bradford,
Changbo Chen,
James H. Davenport,
Marc Moreno Maza,
David Wilson
Abstract:
Cylindrical algebraic decompositions (CADs) are a key tool for solving problems in real algebraic geometry and beyond. We recently presented a new CAD algorithm combining two advances: truth-table invariance, making the CAD invariant with respect to the truth of logical formulae rather than the signs of polynomials; and CAD construction by regular chains technology, where first a complex decomposition is constructed by refining a tree incrementally by constraint. We here consider how best to formulate problems for input to this algorithm. We focus on a choice (not relevant for other CAD algorithms) about the order in which constraints are presented. We develop new heuristics to help make this choice and thus allow the best use of the algorithm in practice. We also consider other choices of problem formulation for CAD, as discussed in CICM 2013, revisiting these in the context of the new algorithm.
Submitted 25 April, 2014;
originally announced April 2014.
-
A Many-core Machine Model for Designing Algorithms with Minimum Parallelism Overheads
Authors:
Sardar Anisul Haque,
Marc Moreno Maza,
Ning Xie
Abstract:
We present a model of multithreaded computation, combining fork-join and single-instruction-multiple-data parallelisms, with an emphasis on estimating parallelism overheads of programs written for modern many-core architectures. We establish a Graham-Brent theorem for this model so as to estimate execution time of programs running on a given number of streaming multiprocessors. We evaluate the benefits of our model with four fundamental algorithms from scientific computing. In each case, our model is used to minimize parallelism overheads by determining an appropriate value range for a given program parameter; moreover experimentation confirms the model's prediction.
Submitted 2 February, 2014;
originally announced February 2014.
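For orientation, the classical bound after which the Graham-Brent theorem in the abstract above is named can be written as follows. This is the textbook fork-join statement for a greedy scheduler, not the paper's many-core variant; the symbols $T_1$ (work), $T_\infty$ (span) and $T_P$ (running time on $P$ processors) are the usual ones and are assumptions of this note rather than the paper's notation.

```latex
\[
  T_P \;\le\; \frac{T_1}{P} + T_\infty
\]
```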
-
Truth Table Invariant Cylindrical Algebraic Decomposition by Regular Chains
Authors:
R. Bradford,
C. Chen,
J. H. Davenport,
M. England,
M. Moreno Maza,
D. Wilson
Abstract:
A new algorithm to compute cylindrical algebraic decompositions (CADs) is presented, building on two recent advances. Firstly, the output is truth table invariant (a TTICAD) meaning given formulae have constant truth value on each cell of the decomposition. Secondly, the computation uses regular chains theory to first build a cylindrical decomposition of complex space (CCD) incrementally by polynomial. Significant modification of the regular chains technology was used to achieve the more sophisticated invariance criteria. Experimental results on an implementation in the RegularChains Library for Maple verify that combining these advances gives an algorithm superior to its individual components and competitive with the state of the art.
Submitted 10 June, 2014; v1 submitted 24 January, 2014;
originally announced January 2014.
-
An Algorithm for Computing the Limit Points of the Quasi-component of a Regular Chain
Authors:
Parisa Alvandi,
Changbo Chen,
Marc Moreno Maza
Abstract:
For a regular chain $R$, we propose an algorithm which computes the (non-trivial) limit points of the quasi-component of $R$, that is, the set $\overline{W(R)} \setminus W(R)$. Our procedure relies on Puiseux series expansions and does not require computing a system of generators of the saturated ideal of $R$. We focus on the case where this saturated ideal has dimension one and we discuss extensions of this work in higher dimensions. We provide experimental results illustrating the benefits of our algorithms.
Submitted 19 February, 2013;
originally announced February 2013.
-
An Incremental Algorithm for Computing Cylindrical Algebraic Decompositions
Authors:
Changbo Chen,
Marc Moreno Maza
Abstract:
In this paper, we propose an incremental algorithm for computing cylindrical algebraic decompositions. The algorithm consists of two parts: computing a complex cylindrical tree and refining this complex tree into a cylindrical tree in real space. The incrementality comes from the first part of the algorithm, where a complex cylindrical tree is constructed by refining a previous complex cylindrical tree with a polynomial constraint. We have implemented our algorithm in Maple. The experimentation shows that the proposed algorithm outperforms existing ones for many examples taken from the literature.
Submitted 19 October, 2012;
originally announced October 2012.
-
Generating Program Invariants via Interpolation
Authors:
Marc Moreno Maza,
Rong Xiao
Abstract:
This article focuses on automatically generating polynomial equations that are inductive loop invariants of computer programs. We propose a new algorithm for this task, which is based on polynomial interpolation. Though the proposed algorithm is not complete, it is efficient and can be applied to a broader range of problems compared to existing methods targeting similar problems. The efficiency of our approach is testified by experiments on a large collection of programs. The current implementation of our method is based on dense interpolation, for which a total degree bound is needed. On the theoretical front, we study the degree and dimension of the invariant ideal of loops which have no branches and where the assignments define a P-solvable recurrence. In addition, we obtain sufficient conditions for non-trivial polynomial equation invariants to exist (resp. not to exist).
Submitted 23 April, 2012; v1 submitted 24 January, 2012;
originally announced January 2012.
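The dense-interpolation idea with a total degree bound, as described in the abstract above, can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' algorithm: the example loop, the function names, and the choice of an exact nullspace computation over the rationals are all made up for this sketch. It samples loop states, evaluates every monomial up to the degree bound at each state, and returns the polynomials vanishing on all samples as candidate invariants, which would still need to be checked for inductiveness.

```python
from fractions import Fraction
from itertools import product


def monomials(num_vars, degree_bound):
    """Exponent tuples of total degree <= degree_bound, in a fixed order."""
    exps = [e for e in product(range(degree_bound + 1), repeat=num_vars)
            if sum(e) <= degree_bound]
    return sorted(exps, key=lambda e: (sum(e), e))


def evaluate(exp, point):
    value = Fraction(1)
    for e, v in zip(exp, point):
        value *= Fraction(v) ** e
    return value


def nullspace(rows, ncols):
    """Basis of the right nullspace via exact Gauss-Jordan elimination."""
    rows = [list(r) for r in rows]
    pivots = []  # pivots[i] is the pivot column of reduced row i
    r = 0
    for c in range(ncols):
        if r == len(rows):
            break
        p = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if p is None:
            continue
        rows[r], rows[p] = rows[p], rows[r]
        inv = 1 / rows[r][c]
        rows[r] = [x * inv for x in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                f = rows[i][c]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
    basis = []
    for free in range(ncols):
        if free in pivots:
            continue
        vec = [Fraction(0)] * ncols
        vec[free] = Fraction(1)
        for row_idx, pc in enumerate(pivots):
            vec[pc] = -rows[row_idx][free]
        basis.append(vec)
    return basis


def sample_states(steps):
    """Hypothetical example loop: x, y = 0, 0; repeat y += 2*x + 1; x += 1."""
    x, y = 0, 0
    states = [(x, y)]
    for _ in range(steps):
        y = y + 2 * x + 1
        x = x + 1
        states.append((x, y))
    return states


def candidate_invariants(states, degree_bound):
    mons = monomials(len(states[0]), degree_bound)
    rows = [[evaluate(m, s) for m in mons] for s in states]
    return mons, nullspace(rows, len(mons))


# Usage: degree bound 2 on the sampled states of the example loop.
mons, basis = candidate_invariants(sample_states(7), 2)
```

Each basis vector lists the coefficients of one candidate polynomial equation in the monomial order returned by `monomials`; for the example loop the single candidate corresponds to $x^2 - y = 0$.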
-
Algorithms for Computing Triangular Decompositions of Polynomial Systems
Authors:
Changbo Chen,
Marc Moreno Maza
Abstract:
We propose new algorithms for computing triangular decompositions of polynomial systems incrementally. With respect to previous works, our improvements are based on a weakened notion of a polynomial GCD modulo a regular chain, which greatly simplifies and optimizes the sub-algorithms. Extracting common work from similar expensive computations is also a key feature of our algorithms. In our experimental results the implementation of our new algorithms, realized with the RegularChains library in Maple, outperforms solvers with similar specifications by several orders of magnitude on sufficiently difficult problems.
Submitted 4 April, 2011;
originally announced April 2011.
-
Triangular Decomposition of Semi-algebraic Systems
Authors:
Changbo Chen,
James H. Davenport,
John P. May,
Marc Moreno Maza,
Bican Xia,
Rong Xiao
Abstract:
Regular chains and triangular decompositions are fundamental and well-developed tools for describing the complex solutions of polynomial systems. This paper proposes adaptations of these tools focusing on solutions of the real analogue: semi-algebraic systems. We show that any such system can be decomposed into finitely many regular semi-algebraic systems. We propose two specifications of such a decomposition and present corresponding algorithms. Under some assumptions, one type of decomposition can be computed in singly exponential time with respect to the number of variables. We implement our algorithms and the experimental results illustrate their effectiveness.
Submitted 13 May, 2010; v1 submitted 25 February, 2010;
originally announced February 2010.
-
Computing Cylindrical Algebraic Decomposition via Triangular Decomposition
Authors:
Changbo Chen,
Marc Moreno Maza,
Bican Xia,
Lu Yang
Abstract:
Cylindrical algebraic decomposition is one of the most important tools for computing with semi-algebraic sets, while triangular decomposition is among the most important approaches for manipulating constructible sets. In this paper, for an arbitrary finite set $F \subset \mathbb{R}[y_1, \ldots, y_n]$ we apply comprehensive triangular decomposition in order to obtain an $F$-invariant cylindrical decomposition of the $n$-dimensional complex space, from which we extract an $F$-invariant cylindrical algebraic decomposition of the $n$-dimensional real space. We report on an implementation of this new approach for constructing cylindrical algebraic decompositions.
Submitted 30 March, 2009;
originally announced March 2009.
-
Computations modulo regular chains
Authors:
Xin Li,
Marc Moreno Maza,
Wei Pan
Abstract:
The computation of triangular decompositions is based on two fundamental operations: polynomial GCDs modulo regular chains and regularity test modulo saturated ideals. We propose new algorithms for these core operations relying on modular methods and fast polynomial arithmetic. Our strategies also take advantage of the context in which these operations are performed. We report on extensive experimentation, comparing our code to pre-existing Maple implementations, as well as more optimized Magma functions. In most cases, our new code outperforms the other packages by several orders of magnitude.
Submitted 24 July, 2009; v1 submitted 21 March, 2009;
originally announced March 2009.
-
Bounds for algorithms in differential algebra
Authors:
Oleg Golubitsky,
Marina Kondratieva,
Marc Moreno Maza,
Alexey Ovchinnikov
Abstract:
We consider the Rosenfeld-Groebner algorithm for computing a regular decomposition of a radical differential ideal generated by a set of ordinary differential polynomials in n indeterminates. For a set of ordinary differential polynomials F, let M(F) be the sum of maximal orders of differential indeterminates occurring in F. We propose a modification of the Rosenfeld-Groebner algorithm, in which for every intermediate polynomial system F, the bound M(F) is less than or equal to (n-1)!M(G), where G is the initial set of generators of the radical ideal. In particular, the resulting regular systems satisfy the bound. Since regular ideals can be decomposed into characterizable components algebraically, the bound also holds for the orders of derivatives occurring in a characteristic decomposition of a radical differential ideal.
We also give an algorithm for converting a characteristic decomposition of a radical differential ideal from one ranking into another. This algorithm performs all differentiations in the beginning and then uses a purely algebraic decomposition algorithm.
Submitted 15 February, 2007;
originally announced February 2007.