Multiparameter Fuss–Catalan numbers with application to algebraic equations

S. R. Mane, Convergent Computing Inc., P. O. Box 561, Shoreham, NY 11786, USA

Abstract

We present an exposition on the Fuss–Catalan numbers, which are a generalization of the well-known Catalan numbers. The literature on the subject is scattered (especially for the case of multiple independent parameters, as will be explained in the text), with overlapping definitions by different authors and duplication of proofs. This paper collects the main theorems and identities, with a consistent notation. Contact is made with the works of numerous authors, including the early works of Lambert and Euler. We demonstrate the application of the formalism to solve algebraic equations by infinite series. Our main result in this context is a new necessary and sufficient formula for the domain of absolute convergence of the series solutions of algebraic equations, which corrects and extends previous work in the field. Some historical material is placed in an Appendix.

keywords:

Fuss–Catalan numbers , generating functions , solutions of algebraic equations by infinite series , complete Reinhardt domain , domain of absolute convergence

MSC:

[2010] primary 05-02 , 32A05; secondary 30B10 , 32A07

Journal: Expositiones Mathematicae

1 Introduction

We employ the standard notation $\mathbb{C}$ for the complex numbers, $\mathbb{R}$ for the reals and $\mathbb{N}$ for the natural numbers $\{0,1,2,\dots\}$ . The Catalan numbers are defined, for $t\in\mathbb{N}$ , as

C_{t}=\frac{1}{t+1}\binom{2t}{t}=\frac{(2t)!}{(t+1)!t!}\,.

(1.1)

(It is more usual to write $C_{n}$ instead of $C_{t}$, but there are too many other meanings for $n$ later in this paper, so to avoid confusion I shall employ $t$ not $n$.) Catalan numbers have been claimed to be the most ubiquitous numbers in combinatorics, second only to the binomial coefficients themselves, e.g. see the text by Stanley [41]. It is also shown in [41] that Catalan numbers are the solutions to numerous counting problems. For example, Euler showed that $C_{t}$ gives the number of triangulations of a convex $(t+2)$-gon. (See [41] for an extensive historical description, including quotes from the correspondence of Euler and other authors.) The Fuss–Catalan numbers, a generalization of the Catalan numbers, are the principal objects of interest in this paper. (They are named after Nicolas Fuss and Eugène Charles Catalan; see the text by Graham et al. [23] for a historical discussion.) First let $m,t\in\mathbb{N}$ and define

A_{t}(m)=\frac{1}{(m-1)t+1}\binom{mt}{t}\,.

(1.2)

The Catalan numbers are the special case where $m=2$. Then $A_{t}(m+1)$ counts the number of dissections of a convex $(mt+2)$-gon into regions that are $(m+2)$-gons [41, exercise A14]. The term ‘dissection’ means that the diagonals joining the vertices of the $(mt+2)$-gon, to form the $(m+2)$-gons, do not intersect in their interiors. See [41] for details. However, our interest extends beyond combinatorics. We require a definition not restricted to integers. We define the Fuss–Catalan numbers, for $\mu,r\in\mathbb{C}$ and $t\in\mathbb{N}$, as $A_{0}(\mu,r):=1$ and for $t\geq 1$ via

A_{t}(\mu,r):=\frac{r}{t!}\prod_{j=1}^{t-1}(t\mu+r-j)\,.

(1.3)

The above expression is well-defined for all $\mu,r\in\mathbb{C}$ . We can employ the Gamma function to write

A_{t}(\mu,r)=r\,\frac{\Gamma(t\mu+r)}{\Gamma(t+1)\Gamma(t(\mu-1)+r+1)}\,.

(1.4)

However, this expression contains potential $0/0$ problems if the arguments of the Gamma functions equal zero or a negative integer. We shall employ eq. (1.3) in this paper. There are other equivalent definitions of the Fuss–Catalan numbers; for example the text by Graham et al. [23] employs generalized binomial coefficients. All of the applications in this paper will in fact treat only $\mu,r\in\mathbb{R}$ . Note that eq. (1.2) is the special case $\mu=m$ and $r=1$ .
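The product in eq. (1.3) can be evaluated directly. The following Python sketch is illustrative only: the function name `fuss_catalan` is hypothetical, and the use of exact rational arithmetic (`fractions.Fraction`) is an assumption made here so that the comparison with eq. (1.1) is exact. It therefore accepts only rational $\mu$ and $r$.

```python
from fractions import Fraction
from math import comb, factorial


def fuss_catalan(t, mu, r):
    """A_t(mu, r) from eq. (1.3): (r / t!) * prod_{j=1}^{t-1} (t*mu + r - j).

    Hypothetical helper; assumes t is a nonnegative integer and mu, r are
    rational (int or Fraction), so the result is an exact Fraction.
    """
    if t == 0:
        return Fraction(1)  # A_0(mu, r) := 1 by definition
    value = Fraction(r)
    for j in range(1, t):  # empty product when t == 1
        value *= t * mu + r - j
    return value / factorial(t)


# mu = 2, r = 1 should reproduce the Catalan numbers of eq. (1.1).
catalan = [comb(2 * t, t) // (t + 1) for t in range(8)]
fuss = [fuss_catalan(t, 2, 1) for t in range(8)]
print(fuss == catalan)
```

The same function evaluates eq. (1.2) when called with $\mu=m$ and $r=1$.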

Concomitant with the Fuss–Catalan numbers is their generating function, and in fact we shall mostly work with the generating function below (here $z\in\mathbb{C}$ )

B_{\mu}(r;z)=\sum_{t=0}^{\infty}A_{t}(\mu,r)z^{t}\,.

(1.5)

It is proved in [23] that $B_{\mu}(r;z)$ has the remarkable property $B_{\mu}(1;z)^{r}=B_{\mu}(r;z)$ . Also, and very importantly, $B_{\mu}(1;z)$ satisfies the following equation for $f(z)$ (again, see [23])

f=1+zf^{\mu}\,.

(1.6)

Variations of this equation were solved, using power series, by Lambert [28, 29] and Euler [16]. In both cases, their solutions are now known to be Fuss–Catalan series; this will be shown below. (I shall use the term ‘Fuss–Catalan series’ as a shorthand for ‘power series whose coefficients are Fuss–Catalan numbers.’) It is then very natural to extend eq. (1.6) to functions of multiple $k>1$ complex variables

f=1+z_{1}f^{\mu_{1}}+\cdots+z_{k}f^{\mu_{k}}\,.

(1.7)

Here $z_{1},\dots,z_{k}\in\mathbb{C}$ and also $\mu_{1},\dots,\mu_{k}\in\mathbb{C}$ . Analogous to eq. (1.6), the solution of eq. (1.7) is also given by a generating function, a multinomial power series in $z_{1},\dots,z_{k}$ , where the series coefficients are ‘multiparameter Fuss–Catalan numbers.’

This brings us to the heart of this paper. The multiparameter Fuss–Catalan numbers will be defined below. However, it turns out that the literature on the multiparameter Fuss–Catalan numbers is scattered. As can be seen from above, there are two broad threads, i.e. combinatorics and the theory of several complex variables. Different authors have published overlapping (not always equivalent) definitions, with duplication of theorems and proofs. It is the purpose of this paper to collect together the literature on the multiparameter Fuss–Catalan numbers, with a consistent notation and references to the various theorems and proofs by diverse authors. In particular, consider the general algebraic equation of degree $n$, with $x\in\mathbb{C}$ and complex coefficients $a_{0},\dots,a_{n}$

a_{0}+a_{1}x+\cdots+a_{n}x^{n}=0\,.

(1.8)

It is known that eq. (1.8) can be solved by expressing $x$ in a multivariate series (more accurately, a Laurent–Puiseux series) in the coefficients $a_{0},\dots,a_{n}$. This can be accomplished by an application of the Lagrange Inversion Theorem; indeed Lagrange himself did so as a demonstration of his theorem, for the special case of the trinomial [27]. Clearly, eq. (1.8) can be cast in the form of eq. (1.7), and the solution is a (multiparameter) Fuss–Catalan series. This highlights two broad themes in this paper: the literature on combinatorics treats integer-valued parameters, whereas that on complex variables treats algebraic equations (polynomials), but both are subsumed into a general framework of Fuss–Catalan series. Note that the exponents $\mu_{1},\dots,\mu_{k}$ in eq. (1.7) are arbitrary real (or in principle complex) numbers, and are not restricted to be integers (or rational numbers). Some authors, such as Mohanty [34], recognize this fact, but most do not. Mohanty’s work will be important below. Contact will also be made with the works of numerous other authors such as Euler, Lagrange and Lambert (mentioned above), Klein (solution of the quintic), Gould, Mellin and Raney, to name a few. Mellin [33] employed his eponymous transform to solve eq. (1.8); it will be shown below that his solution is a Fuss–Catalan series. Ramanujan [5] also published briefly on the subject; the equation and his series solution will be cited below, and it is of course a Fuss–Catalan series. Significantly, Ramanujan derived a bound for the radius of convergence of his series (many other authors did not). We shall derive two bounds for the absolute convergence of the series solution of eq. (1.7). The first is necessary but in general not sufficient and the second is sufficient but in general not necessary. A necessary and sufficient bound for absolute convergence is not known at this time. However, for the special case of an algebraic equation, we shall present a new necessary and sufficient bound for the absolute convergence of the series solution of eq. (1.8) in Sec. 8. The new bound is based on earlier work by Passare and Tsikh [36]. Some counterexamples to their results will be displayed below; this indicates the need for a more careful treatment of the problem.

On a more personal level, in a recent paper [13], Dilworth and this author derived the analytical solution for the probability mass function of the geometric distribution of order $k$ [18]. The roots of the associated recurrence relation were obtained as series in Fuss–Catalan numbers. It was recognized that Fuss–Catalan series are a potentially powerful tool to solve related problems, and in a follow-up paper [14], they were applied to solve additional problems for success runs of Bernoulli trials. The title of [14] was deliberately worded “Applications of Fuss–Catalan Numbers to Success Runs of Bernoulli Trials.” This paper will not treat problems of probability and statistics, but it is this author’s personal belief that (multiparameter) Fuss–Catalan series offer great promise to solve problems in numerous subfields of mathematics. This motivates the desire to collect the literature on the subject in one place, with a consistent notation and to assemble together the various duplicated theorems and proofs.

The structure of this paper is as follows. The basic definitions of Fuss–Catalan numbers, their generating functions and relevant theorems are presented in Sec. 2. Bounds for the absolute convergence of Fuss–Catalan series are derived in Sec. 3. The application to algebraic equations is presented in Sec. 4. The quintic is sufficiently important that it is treated separately in Sec. 5, as is the trinomial equation in Sec. 6. The domain of absolute convergence for the solutions of algebraic equations by infinite series is discussed in Sec. 7, where it is shown that a new, more careful treatment is required, and a new necessary and sufficient bound is presented in Sec. 8. A sample nontrivial application of the new bound is presented in Sec. 9, for the principal and Brioschi quintics. Some material, including historical material, is relegated to Appendix A. In Appendix B, contact is made with the work of Sturmfels [44] on the solutions of algebraic equations via so-called $\mathscr{A}$-hypergeometric series.

A few disclaimers and words of caution follow. First, it is important to note that there are complex roots in many of the series, hence branch cuts are required to obtain well-defined expressions. Overall, this detail is not clearly (or explicitly) addressed in the literature, but it is important. The claimed series ‘solution’ of eq. (1.8) may be erroneous (or meaningless) if an appropriate branch cut is not specified. The series may converge, but not to the root of the original equation. The subject of branch cuts will be discussed below.

Next, no claim is made here that the use of a series to solve eq. (1.7) is a computationally efficient algorithm, nor that a series solution of the algebraic equation eq. (1.8) converges rapidly to a root of the polynomial. McClintock did make such a claim [32], but in 1895 modern digital computers and the concomitant numerical algorithms did not exist. Indeed, a power series will not converge rapidly close to its circle of convergence. Nevertheless, an analytical expression can indicate properties of a function not evident from a purely numerical solution. For example, no alternative analytical expression is known, at the present time, for the probability mass function of the geometric distribution of order $k$ [13].

Finally, this paper is not intended to be an encyclopedia. There is a vast literature on the solution of algebraic equations by infinite series, as well as on combinatorics using Catalan and Fuss–Catalan numbers. Any omissions are inadvertent and not deliberate. For example, the text by Appell and Kampé de Fériet [1] derives solutions of algebraic equations using generalized hypergeometric functions. The general sextic equation can be solved using Kampé de Fériet functions. It is beyond the scope of this paper to discuss such functions. The paper by Kamber [25] contains interesting material on the coefficients of certain inverse power series, but is also beyond the scope of this paper.

2 Basic definitions and theorems

For ease of reference, some of the equations displayed in the introduction will be repeated below. The Fuss–Catalan numbers are defined, for $\mu,r\in\mathbb{C}$ and $t\in\mathbb{N}$ , as $A_{0}(\mu,r):=1$ and for $t\geq 1$ via

A_{t}(\mu,r):=\frac{r}{t!}\prod_{j=1}^{t-1}(t\mu+r-j)\,.

(2.1)

As stated in the introduction, all of the applications in this paper will treat $\mu,r\in\mathbb{R}$ . The above numbers are also known as Raney numbers, at least when $\mu$ and $r$ are nonnegative integers, in which case $A_{t}(\mu,r)$ is itself a nonnegative integer. Raney’s work [37] will be cited below. The generating function of the Fuss–Catalan numbers is (where $z\in\mathbb{C}$ )

B_{\mu}(r;z)=\sum_{t=0}^{\infty}A_{t}(\mu,r)z^{t}\,.

(2.2)

The following results are known:

Theorem 2.1.

(a) The generating function $B_{\mu}(1;z)$ satisfies the following equation for $f(z)$

f=1+zf^{\mu}\,.

(2.3)

(b) The generating function $B_{\mu}(r;z)$ also has the property

B_{\mu}(1;z)^{r}=B_{\mu}(r;z)\,.

(2.4)

Let $s\in\mathbb{C}$. Using $B_{\mu}(1;z)^{r+s}=B_{\mu}(1;z)^{r}B_{\mu}(1;z)^{s}$, eq. (2.4) is equivalent to the statement

B_{\mu}(r+s;z)=B_{\mu}(r;z)B_{\mu}(s;z)\,.

(2.5)

(c) The Fuss–Catalan numbers satisfy the following convolution identity

A_{t}(\mu,s+r)=\sum_{u=0}^{t}A_{u}(\mu,r)A_{t-u}(\mu,s)\,.

(2.6)

(d) The Fuss–Catalan numbers satisfy the recurrence relation (there are other equivalent ways to express the recurrence)

A_{t}(\mu,r+1)=A_{t}(\mu,r)+A_{t-1}(\mu,r+\mu)\,.

(2.7)

Note that Theorems 2.1(b) and (c) are equivalent. Write out eq. (2.5) in full, then

\sum_{t=0}^{\infty}A_{t}(\mu,r+s)z^{t}=\biggl(\sum_{t^{\prime}=0}^{\infty}A_{t^{\prime}}(\mu,r)z^{t^{\prime}}\biggr)\biggl(\sum_{t^{\prime\prime}=0}^{\infty}A_{t^{\prime\prime}}(\mu,s)z^{t^{\prime\prime}}\biggr)\,.

(2.8)

Selecting a particular value of $t$ on the left-hand side and equating the coefficients of $z^{t}$, we obtain a sum over the terms with $t^{\prime}+t^{\prime\prime}=t$ on the right-hand side, and eq. (2.6) follows. Reversing the steps proves the converse. Also, using eqs. (2.2) and (2.4), eq. (2.7) can easily be employed to show that

B_{\mu}(1;z)^{r+1}=B_{\mu}(1;z)^{r}+zB_{\mu}(1;z)^{r+\mu}\,.

(2.9)

This is simply eq. (2.3) multiplied through by $B_{\mu}(1;z)^{r}$ .
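The convolution identity eq. (2.6) and the recurrence eq. (2.7) are polynomial identities in $\mu$, $r$ and $s$, so they can be spot-checked in exact arithmetic. The sketch below is illustrative only: the helper names are hypothetical, and the parameters are assumed rational so that `fractions.Fraction` gives exact equality rather than a floating-point comparison.

```python
from fractions import Fraction
from math import factorial


def fuss_catalan(t, mu, r):
    """A_t(mu, r) from eq. (2.1), exact for rational mu and r (hypothetical helper)."""
    if t == 0:
        return Fraction(1)
    value = Fraction(r)
    for j in range(1, t):
        value *= t * mu + r - j
    return value / factorial(t)


def convolution_holds(t, mu, r, s):
    """Check eq. (2.6): A_t(mu, r+s) = sum_u A_u(mu, r) A_{t-u}(mu, s)."""
    lhs = fuss_catalan(t, mu, r + s)
    rhs = sum(fuss_catalan(u, mu, r) * fuss_catalan(t - u, mu, s)
              for u in range(t + 1))
    return lhs == rhs


def recurrence_holds(t, mu, r):
    """Check eq. (2.7) for t >= 1: A_t(mu, r+1) = A_t(mu, r) + A_{t-1}(mu, r+mu)."""
    return (fuss_catalan(t, mu, r + 1)
            == fuss_catalan(t, mu, r) + fuss_catalan(t - 1, mu, r + mu))


# Arbitrary non-integer test values, chosen here purely for illustration.
mu, r, s = Fraction(5, 2), Fraction(1, 3), Fraction(-7, 4)
print(all(convolution_holds(t, mu, r, s) for t in range(8)))
print(all(recurrence_holds(t, mu, r) for t in range(1, 8)))
```

A finite check of this kind does not replace the proofs in [23]; it only guards against transcription errors in the formulas.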

To generalize to $k\geq 1$ multiple parameters, we employ a vector notation and introduce the $k$-tuples $\bm{t}=(t_{1},\dots,t_{k})\in\mathbb{N}^{k}$, $\bm{\mu}=(\mu_{1},\dots,\mu_{k})\in\mathbb{C}^{k}$ and $\bm{z}=(z_{1},\dots,z_{k})\in\mathbb{C}^{k}$ (and recall $r,s\in\mathbb{C}$). Also define $|\bm{t}|=|t_{1}|+\cdots+|t_{k}|$. For brevity we shall frequently write $t=|\bm{t}|$ below. Also, for $|\bm{t}|>0$, define the ‘unit vector’ $\hat{\bm{t}}=\bm{t}/|\bm{t}|$. We also define the zero vector $\bm{0}=(0,\dots,0)$ and the ‘basis vectors’ $\bm{e}_{j}=(0,\dots,0,1,0,\dots,0)=(\delta_{1j},\dots,\delta_{kj})$.

Definition 2.2 (multiparameter Fuss–Catalan numbers).

We define the multiparameter Fuss–Catalan numbers $\mathscr{A}_{\bm{t}}(\bm{\mu},r)$ via $\mathscr{A}_{\bm{0}}(\bm{\mu},r):=1$ for $\bm{t}=\bm{0}$ and for $|\bm{t}|>0$ via

\mathscr{A}_{\bm{t}}(\bm{\mu},r):=\frac{r}{t_{1}!\cdots t_{k}!}\prod_{j=1}^{|\bm{t}|-1}(\bm{t}\cdot\bm{\mu}+r-j)\,.

(2.10)

If $k=1$ this reduces to the single-parameter definition eq. (2.1). Equivalently, for all $|\bm{t}|\geq 0$ ,

\mathscr{A}_{\bm{t}}(\bm{\mu},r)=\binom{t}{t_{1},\dots,t_{k}}\,A_{t}(\hat{\bm{t}}\cdot\bm{\mu},r)\,.

(2.11)

Note that a $0/0$ indeterminate expression for $\hat{\bm{t}}$ does not arise in eq. (2.11) because of the definition $A_{0}(\cdot):=1$ .
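The equivalence of eqs. (2.10) and (2.11) can be checked on sample values. The following sketch is illustrative only: the function names are hypothetical, and rational parameters are assumed so that both forms can be compared exactly with `fractions.Fraction`.

```python
from fractions import Fraction
from math import factorial


def fuss_catalan(t, mu, r):
    """Single-parameter A_t(mu, r), eq. (2.1) (hypothetical helper)."""
    if t == 0:
        return Fraction(1)
    value = Fraction(r)
    for j in range(1, t):
        value *= t * mu + r - j
    return value / factorial(t)


def multi_fuss_catalan(ts, mus, r):
    """Multiparameter number of eq. (2.10) for tuples ts (integers) and mus."""
    t = sum(ts)
    if t == 0:
        return Fraction(1)
    dot = sum(tj * mj for tj, mj in zip(ts, mus))  # t . mu
    value = Fraction(r)
    for j in range(1, t):
        value *= dot + r - j
    for tj in ts:
        value /= factorial(tj)
    return value


def multi_via_single(ts, mus, r):
    """Same number via eq. (2.11): multinomial times A_t(t_hat . mu, r)."""
    t = sum(ts)
    if t == 0:
        return Fraction(1)  # avoids forming t_hat = t/|t| when |t| = 0
    multinomial = factorial(t)
    for tj in ts:
        multinomial //= factorial(tj)
    mu_hat = Fraction(sum(tj * mj for tj, mj in zip(ts, mus))) / t
    return multinomial * fuss_catalan(t, mu_hat, r)


# Sample values chosen here for illustration only.
ts = (2, 1, 3)
mus = (Fraction(1, 2), 3, Fraction(-2, 3))
r = Fraction(5, 4)
print(multi_fuss_catalan(ts, mus, r) == multi_via_single(ts, mus, r))
```

For $k=1$ the first function reduces to the single-parameter definition, e.g. `multi_fuss_catalan((4,), (2,), 1)` is the Catalan number $C_{4}$.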

Definition 2.3 (multiparameter generating function).

The multiparameter Fuss–Catalan generating function is defined as

\mathcal{B}(\bm{\mu};r;\bm{z}):=\sum_{\bm{t}\in\mathbb{N}^{k}}\mathscr{A}_{\bm{t}}(\bm{\mu},r)\,z_{1}^{t_{1}}\cdots z_{k}^{t_{k}}\,.

(2.12)

Technically, the above expression is not well-defined because the answer can depend on the order of summation. In all the applications in this paper, we collect the terms in level sets in $t=|\bm{t}|$

\mathcal{B}(\bm{\mu};r;\bm{z})=\sum_{t=0}^{\infty}\sum_{t_{1}+\cdots+t_{k}=t}\binom{t}{t_{1},\dots,t_{k}}A_{t}(\hat{\bm{t}}\cdot\bm{\mu},r)\,z_{1}^{t_{1}}\cdots z_{k}^{t_{k}}\,.

(2.13)

However, to obtain rigorous results, we must specify a domain of absolute convergence. Then the answer will not depend on the order of summation. The topic of absolute convergence will be discussed below.

Theorem 2.4.

(a) The generating function $\mathcal{B}(\bm{\mu};1;\bm{z})$ satisfies the following equation for $f(\bm{z})$

f=1+z_{1}f^{\mu_{1}}+\cdots+z_{k}f^{\mu_{k}}\,.

(2.14)

(b) Analogous to eq. (2.4), the generating function $\mathcal{B}(\bm{\mu};r;\bm{z})$ has the property

\mathcal{B}(\bm{\mu};1;\bm{z})^{r}=\mathcal{B}(\bm{\mu};r;\bm{z})\,.

(2.15)

Analogous to eq. (2.5), it follows that

\mathcal{B}(\bm{\mu};r+s;\bm{z})=\mathcal{B}(\bm{\mu};r;\bm{z})\mathcal{B}(\bm{\mu};s;\bm{z})\,.

(2.16)

(c) The multiparameter convolution identity analogous to eq. (2.6) is (the sum extends over those $\bm{u}$ with $0\leq u_{j}\leq t_{j}$ for $j=1,\dots,k$)

\mathscr{A}_{\bm{t}}(\bm{\mu},r+s)=\sum_{\bm{u}\in\mathbb{N}^{k}}\mathscr{A}_{\bm{u}}(\bm{\mu},r)\mathscr{A}_{\bm{t}-\bm{u}}(\bm{\mu},s)\,.

(2.17)

(d) Analogous to eq. (2.7), the multiparameter recurrence is (again, there are other equivalent ways to express the recurrence)

\mathscr{A}_{\bm{t}}(\bm{\mu},r+1)=\mathscr{A}_{\bm{t}}(\bm{\mu},r)+\sum_{j=1}^{k}\mathscr{A}_{\bm{t}-\bm{e}_{j}}(\bm{\mu},r+\mu_{j})\,.

(2.18)
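The first two statements of Theorem 2.4 can be checked numerically by summing the series in level sets, as in eq. (2.13). The sketch below is illustrative only: the function names, the sample values $\bm{\mu}=(2,3)$ and $\bm{z}=(0.05,0.02)$, and the truncation at $t\leq 60$ are all assumptions made here, with $\bm{z}$ chosen small enough that the truncated series is accurate to near machine precision.

```python
from math import factorial


def multi_fuss_catalan(ts, mus, r):
    """Multiparameter number of eq. (2.10), in floating point (hypothetical helper)."""
    t = sum(ts)
    if t == 0:
        return 1.0
    dot = sum(tj * mj for tj, mj in zip(ts, mus))
    value = float(r)
    for j in range(1, t):
        value *= dot + r - j
    for tj in ts:
        value /= factorial(tj)
    return value


def compositions(t, k):
    """Yield all k-tuples of nonnegative integers summing to t (one level set)."""
    if k == 1:
        yield (t,)
        return
    for first in range(t + 1):
        for rest in compositions(t - first, k - 1):
            yield (first,) + rest


def generating_function(mus, r, zs, t_max):
    """Truncation of eq. (2.13): sum over level sets |t| = 0, 1, ..., t_max."""
    total = 0.0
    for t in range(t_max + 1):
        for ts in compositions(t, len(mus)):
            term = multi_fuss_catalan(ts, mus, r)
            for zj, tj in zip(zs, ts):
                term *= zj ** tj
            total += term
    return total


mus, zs = (2, 3), (0.05, 0.02)  # illustrative values, not from the text
f = generating_function(mus, 1, zs, 60)
# Residual of eq. (2.14): f - (1 + z_1 f^mu_1 + z_2 f^mu_2)
print(abs(f - (1 + zs[0] * f ** mus[0] + zs[1] * f ** mus[1])))
# Residual of eq. (2.15) for a non-integer r
print(abs(f ** 2.5 - generating_function(mus, 2.5, zs, 60)))
```

Both residuals should be at the level of rounding error for these inputs. For noninteger $\mu_{j}$ and complex $z_{j}$ the powers would additionally require a consistent choice of branch.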

Theorems 2.4(b) and (c) are equivalent; the proof follows the same steps as for the case $k=1$ . Also, using eqs. (2.12) (or (2.13)) and (2.15), eq. (2.18) yields

\mathcal{B}(\bm{\mu};1;\bm{z})^{r+1}=\mathcal{B}(\bm{\mu};1;\bm{z})^{r}+\sum_{j=1}^{k}z_{j}\mathcal{B}(\bm{\mu};1;\bm{z})^{r+\mu_{j}}\,.

(2.19)

This is eq. (2.14) multiplied through by $\mathcal{B}(\bm{\mu};1;\bm{z})^{r}$. A search of the literature revealed that all of the results in Theorem 2.4 have already been proved. Unfortunately, the proofs are scattered (and rediscovered) in the literature. Unlike the single-parameter case, where the relations are explicitly stated as properties of Fuss–Catalan numbers (see Theorem 2.1), for $k>1$ there is a variety of notations and not all authors mention Fuss and Catalan. (This should not be misinterpreted as a criticism; see the comment at the beginning of Appendix A.) For the multiparameter case, the most comprehensive references I have found were by Raney [37], Chu [11] and Mohanty [34]. I summarize their works in turn.

Raney [37] presented proofs of all the results in Theorem 2.4. Raney’s expression is as follows. Let $a_{1},a_{2},\dots$ be an infinite sequence of natural numbers of which at most a finite number of terms are different from zero. Then define $m=n+\sum_{i=1}^{\infty}ia_{i}$ and $a_{0}=n+\sum_{i=1}^{\infty}(i-1)a_{i}$. Raney defined the multinomial coefficient

M(a_{0},a_{1},a_{2},\dots)=\frac{(a_{0}+a_{1}+a_{2}+\cdots)!}{a_{0}!a_{1}!a_{2}!\cdots}\,.

(2.20)

Then [37, Theorem 2.2] states

mL(n;a_{1},a_{2},\dots)=nM(a_{0},a_{1},a_{2},\dots)\,.

(2.21)

Let us reexpress this in our notation. We know only finitely many of the $a_{i}$ are nonzero. Suppose there are $k$ nonzero $a_{i}$ and they are indexed by the set $(\mu_{1},\dots,\mu_{k})$. Also define $t_{j}=a_{\mu_{j}}$ and replace $n$ by $r$, then $m=r+\bm{t}\cdot\bm{\mu}$ and $a_{0}=r+\bm{t}\cdot\bm{\mu}-|\bm{t}|=m-|\bm{t}|$. Then

\begin{split}L(n;a_{1},a_{2},\dots)&=\frac{n}{m}\,\frac{(a_{0}+a_{1}+a_{2}+\cdots)!}{a_{0}!a_{1}!a_{2}!\cdots}\\ &=\frac{r}{r+\bm{t}\cdot\bm{\mu}}\,\frac{(r+\bm{t}\cdot\bm{\mu})!}{(r+\bm{t}\cdot\bm{\mu}-|\bm{t}|)!\,t_{1}!\cdots t_{k}!}\\ &=\frac{r}{t_{1}!\cdots t_{k}!}\,\prod_{j=1}^{|\bm{t}|-1}(\bm{t}\cdot\bm{\mu}+r-j)\\ &=\mathscr{A}_{\bm{t}}(\bm{\mu},r)\,.\end{split}

(2.22)

Notice that Raney’s expression for $L$ is a solution, not a definition. Raney posed and solved many combinatorial problems in [37]. Then [37, Theorems 2.3, 2.4, 4.1] yield respectively eqs. (2.17), (2.18) and (2.15). Also [37, eqs. (6.1) and (6.2)] yield eq. (2.14). Note that Raney [37] took the $\mu_{j}$ (in my notation) to be integers; this is common also in the derivations by other authors (see below). However, it is straightforward to generalize from integer to complex-valued parameters. The relevant steps are given by Graham et al. [23] (for the single-parameter case $k=1$, but the same reasoning works also for multiple parameters $k>1$).

Chu [11] also published a proof of the solution of eq. (2.14) (citing Raney [37]). Chu remarked that eq. (2.14) can also be derived using the multi-variable version of the Lagrange inversion formula [19]. (Numerous authors have stated that eq. (2.14) can be derived using Lagrange inversion. Raney gave an example of Lagrange inversion in [37].) Chu treated only integer-valued parameters. Chu defined ‘higher Catalan numbers’ and ‘generalized Catalan numbers’ as follows. Chu employed vectors $\vec{v}$ and $\vec{n}$, which are $k$-tuples of integers. The ‘higher Catalan numbers’ are [11, eq. (1)]

C_{k}(n)=\frac{1}{nk+1}\binom{nk+1}{n}\,.

(2.23)

This is equivalent to $A_{n}(k,1)$ in eq. (2.1). The ‘generalized Catalan numbers’ are [11, eq. (2)]

C_{\vec{v}}(\vec{n})=\frac{1}{\sum_{i=1}^{k}n_{i}v_{i}+1}\binom{\sum_{i=1}^{k}n_{i}v_{i}+1}{n_{1},n_{2},\dots,n_{k},1+\sum_{i=1}^{k}n_{i}(v_{i}-1)}\,.

(2.24)

This is equivalent to $\mathscr{A}_{\bm{n}}(\bm{v},1)$ in eq. (2.10). Beware of the slightly inconsistent use of $k$ by Chu, as quoted in eqs. (2.23) and (2.24). Curiously, Chu [11] restricted his definitions only to $r=1$, even though unnamed expressions with $r>1$ appear in his paper (Chu wrote $t$ for what I call $r$).

Mohanty [34] explicitly treated complex-valued parameters. He defined the multinomial coefficient as follows [34, eq. (3)]

\binom{x}{j_{1},\dots,j_{k}}=\frac{x(x-1)\cdots(x-\sum_{i=1}^{k}j_{i}+1)}{\prod_{i=1}^{k}j_{i}!}\,.

(2.25)

Then Mohanty defined (without assigning a name) [34, eq. (4)]

A(a;b_{1},\dots,b_{k};n_{1},\dots,n_{k})=\frac{a}{a+\sum_{i=1}^{k}b_{i}n_{i}}\binom{a+\sum_{i=1}^{k}b_{i}n_{i}}{n_{1},\dots,n_{k}}\,.

(2.26)

Here $a,b_{1},\dots,b_{k}\in\mathbb{C}$ are all complex. This is equivalent to $\mathscr{A}_{\bm{n}}(\bm{b},a)$ in eq. (2.10). Mohanty proved several multiparameter convolution identities in [34], in particular eq. (2.17). (Additional results are given in Appendix A below.) Mohanty defined a generating function [34, unnumbered before eq. (13)] and proved that it satisfies the following equation for $z$ [34, eq. (22)]

sz^{b}+tz^{d}-z+1=0\,.

(2.27)

Here all of $b$, $d$, $s$ and $t$ are complex. Note that Mohanty displayed explicit derivations for the case $k=2$ and pointed out that the extension to more parameters merely requires additional bookkeeping, hence eq. (2.27) generalizes to $k>2$ parameters and is effectively eq. (2.14). Similarly [34, eq. (25)] yields the identity eq. (2.16).

Strehl [43] also gave a proof of the solution of eq. (2.14), where [43, eq. (21)] is an algebraic equation with complex coefficients. In fact [43, eq. (21)] is the equation solved by Mellin [33] and displayed in eq. (4.12) below. Strehl also provides some historical background, citing both Chu [11] and Raney [37]. More recently, eq. (2.14) was solved by Schuetz and Whieldon [40, Theorem 4.2], who treated integer-valued coefficients and exponents only. The series coefficients were identified as Fuss–Catalan numbers, multiplied by multinomial coefficients (see eq. (2.11)); this is the only reference I have found to mention Fuss–Catalan explicitly for the multiparameter case (but see Chu [11] above). Banderier and Drmota [3] also derived a series solution for an algebraic equation where [3, Theorem 3.3] is termed the ‘Flajolet-Soria formula for coefficients of an algebraic function.’ See [3, eq. (3.3)].

Remark 2.5.

The exponents $\mu_{j}$ in eq. (2.14) need not be distinct, although from a practical viewpoint it may be pointless if they are not. Consider the extreme case where they are all equal $\mu_{1}=\cdots=\mu_{k}=\mu$ so $\hat{\bm{t}}\cdot\bm{\mu}=\mu$ . Then eq. (2.14) simplifies to

f=1+(z_{1}+\cdots+z_{k})f^{\mu}\,.

(2.28)

This is simply eq. (2.3) with $z=\sum_{j=1}^{k}z_{j}$. Then in the series of eq. (2.13) with $r=1$, the coefficient $A_{t}(\hat{\bm{t}}\cdot\bm{\mu},1)=A_{t}(\mu,1)$ does not depend on the individual $t_{j}$, so

\begin{split}f&=\sum_{t=0}^{\infty}A_{t}(\mu,1)\biggl[\sum_{t_{1}+\cdots+t_{k}=t}\binom{t}{t_{1},\dots,t_{k}}z_{1}^{t_{1}}\cdots z_{k}^{t_{k}}\biggr]\\ &=\sum_{t=0}^{\infty}A_{t}(\mu,1)(z_{1}+\cdots+z_{k})^{t}\,.\end{split}

(2.29)

This is precisely the Fuss–Catalan series which is the known solution of eq. (2.28).

Remark 2.6 (branch cuts).

If some of the $\mu_{j}$ in eq. (2.14) are nonintegers, a branch cut is required in the complex plane. The classic example is the square root $\mu_{j}=\frac{1}{2}$ and $f^{1/2}$ . A specific sheet of the complex plane must be selected, to render equations such as eq. (2.14) well defined (although, as pointed out above, the domain of absolute convergence will not depend on branch cuts). In all of the numerical work reported in this paper, the branch cut was placed along the positive real axis, so $0\leq\arg(z_{j})<2\pi$ for $j=1,\dots,k$ and similarly for $f$ and all other complex variables to appear below. This is necessary to obtain meaningful sums for the various series in this paper. Mellin [33] placed the branch cut along the negative real axis. The essential fact is that a branch cut is required; one must make a specific choice and adhere to it consistently.
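The branch convention described above differs from the principal branch used by most numerical libraries. A minimal Python sketch of a power function with the cut along the positive real axis follows; the function names are hypothetical, $\mu$ is assumed real and $w\neq 0$, and the standard `cmath.phase` (which returns the principal argument in $(-\pi,\pi]$) is shifted into $[0,2\pi)$.

```python
import cmath
from math import pi


def arg_0_2pi(w):
    """Argument of w in [0, 2*pi): branch cut along the positive real axis."""
    theta = cmath.phase(w)  # principal value in (-pi, pi]
    return theta + 2 * pi if theta < 0 else theta


def power_cut_positive_axis(w, mu):
    """w**mu with arg(w) taken in [0, 2*pi); assumes w != 0 and real mu."""
    return abs(w) ** mu * cmath.exp(1j * mu * arg_0_2pi(w))


# Below the real axis the two conventions pick different sheets:
w = -1j
print(power_cut_positive_axis(w, 0.5))  # uses arg(w) = 3*pi/2
print(w ** 0.5)                         # principal branch, arg(w) = -pi/2
```

For the square root the two results differ by an overall sign, which is exactly the kind of discrepancy that can make a series converge to a value that is not the intended root.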

3 Domain of convergence

In general, the convergence of an infinite series depends on the order of summation. In this paper, we take ‘convergence’ to mean exclusively absolute convergence. In that case, the answer does not depend on the order of summation. In general, the series in eq. (2.13) has a finite domain of absolute convergence. We present two sets of conditions for the series in eq. (2.13) to converge absolutely. The first is necessary, but in general not sufficient, and the second is sufficient, but in general not necessary. A more detailed analysis for the special case of algebraic equations will be presented in Secs. 7 and 8. The derivations below assume the $\mu_{j}$ are real and are ordered $\mu_{1}\leq\mu_{2}\leq\cdots\leq\mu_{k}$ . We begin with the following lemma for the asymptotic value of the Fuss–Catalan numbers.

Lemma 3.1.

Asymptotically for $t\gg 1$ and real $\mu,r$ ,

A_{t}(\mu,r)\sim\frac{r}{\sqrt{2\pi}\,t^{3/2}}\,\frac{|\mu|^{r-\frac{1}{2}}}{|1-\mu|^{r+\frac{1}{2}}}\,(|\mu|^{\mu}|1-\mu|^{1-\mu})^{t}\,.

(3.1)

The above is an application of Stirling’s formula and the proof is omitted. We require $\mu\neq 0$ and $\mu\neq 1$ to justify the intermediate steps in the derivation. To determine the radius of convergence using d’Alembert’s ratio test, note that asymptotically

\frac{A_{t}(\mu,r)}{A_{t-1}(\mu,r)}\sim|\mu|^{\mu}|1-\mu|^{1-\mu}\,.

(3.2)
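Since the proof of Lemma 3.1 is omitted, a numerical sanity check may be useful. The sketch below is illustrative only: the function names are hypothetical, the work is done with logarithms of absolute values to avoid overflow at large $t$, and it assumes that no factor in the product of eq. (2.1) vanishes and that $\mu\notin\{0,1\}$.

```python
from math import exp, lgamma, log, pi


def log_abs_fuss_catalan(t, mu, r):
    """log|A_t(mu, r)| for t >= 1 from eq. (2.1); assumes no factor is zero."""
    total = log(abs(r)) - lgamma(t + 1)  # log|r| - log(t!)
    for j in range(1, t):
        total += log(abs(t * mu + r - j))
    return total


def log_asymptotic(t, mu, r):
    """Logarithm of the right-hand side of eq. (3.1); assumes mu not in {0, 1}."""
    return (log(abs(r)) - 0.5 * log(2 * pi) - 1.5 * log(t)
            + (r - 0.5) * log(abs(mu)) - (r + 0.5) * log(abs(1 - mu))
            + t * (mu * log(abs(mu)) + (1 - mu) * log(abs(1 - mu))))


def growth_rate(mu):
    """|mu|^mu |1 - mu|^(1 - mu), the limiting ratio in eq. (3.2)."""
    return abs(mu) ** mu * abs(1 - mu) ** (1 - mu)


t = 2000
# Catalan case (mu = 2, r = 1): difference of logs should be small, O(1/t).
print(log_abs_fuss_catalan(t, 2, 1) - log_asymptotic(t, 2, 1))
# Ratio test for a non-integer example (values chosen for illustration).
ratio = exp(log_abs_fuss_catalan(t, 2.5, 1.5) - log_abs_fuss_catalan(t - 1, 2.5, 1.5))
print(ratio / growth_rate(2.5))
```

The ratio approaches its limit only like $1-3/(2t)$, a consequence of the $t^{-3/2}$ prefactor in eq. (3.1), so agreement to a few parts in $10^{3}$ is all that should be expected at $t=2000$.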

Proposition 3.2 (necessary, not sufficient).

For the series in eq. (2.13) to converge absolutely, it is necessary that

|z_{j}|\leq|z_{j}|_{\max}\equiv\frac{1}{|\mu_{j}|^{\mu_{j}}|1-\mu_{j}|^{1-\mu_{j}}}\qquad(j=1,\dots,k)\,.

(3.3)

Then all points of the following form lie in the domain of convergence

\tilde{\bm{z}}_{j}=(0,\dots,0,|z_{j}|=|z_{j}|_{\max},0,\dots,0)\qquad(j=1,\dots,k)\,.

(3.4)

Proof.

Fix a value of $j$ and set all the other $z_{j^{\prime}}$ to zero, where $j^{\prime}\neq j$. Then the series in eq. (2.13) reduces to a sum in powers of a single variable $z_{j}$. Then eq. (3.3) follows from eq. (3.2) and d’Alembert’s ratio test. From the asymptotic form of the Fuss–Catalan numbers in eq. (3.1), the terms on the circle of convergence decay like $t^{-3/2}$, so the series converges absolutely there also, justifying the ‘$\leq$’ in eq. (3.3). Then eq. (3.4) follows immediately. ∎

Proposition 3.3 (sufficient, not necessary).

The series in eq. (2.13) converges absolutely if

\sum_{j=1}^{k}|z_{j}|\leq\min\biggl(\frac{1}{|\mu_{1}|^{\mu_{1}}|1-\mu_{1}|^{1-\mu_{1}}}\,,\frac{1}{|\mu_{k}|^{\mu_{k}}|1-\mu_{k}|^{1-\mu_{k}}}\biggr)\,.

(3.5)

The above condition is sufficient, but in general not necessary.

Proof.

We employ eq. (2.13) and eq. (2.14). Let us define $\alpha=\sum_{j=1}^{k}|z_{j}|$ and $p_{j}=|z_{j}|/\alpha$ , for $j=1,\dots,k$ . Then $0\leq p_{j}\leq 1$ and $\sum_{j=1}^{k}p_{j}=1$ . Then from eq. (2.14)

|f|\leq\sum_{t=0}^{\infty}\alpha^{t}\biggl\{\sum_{t_{1}+\cdots+t_{k}=t}|A_{t}(\hat{\bm{t}}\cdot\bm{\mu},1)|\binom{t}{t_{1},\dots,t_{k}}p_{1}^{t_{1}}\cdots p_{k}^{t_{k}}\biggr\}\,.

(3.6)

Let us suppose that $|A_{t}(\hat{\bm{t}}\cdot\bm{\mu},1)|$ is majorized by setting $\hat{\bm{t}}\cdot\bm{\mu}=\mu_{*}$ , where $\mu_{*}$ does not depend on the $t_{j}$ . (This will be discussed in more detail below.) Actually, to establish convergence of the series, it is sufficient if $|A_{t}(\hat{\bm{t}}\cdot\bm{\mu},1)|<|A_{t}(\mu_{*},1)|$ only asymptotically, say for $t\geq T$ . Then

\begin{split}|f|&\leq\textrm{const}+\sum_{t=T}^{\infty}|A_{t}(\mu_{*},1)|\,\alpha^{t}\biggl\{\sum_{t_{1}+\cdots+t_{k}=t}\binom{t}{t_{1},\dots,t_{k}}p_{1}^{t_{1}}\cdots p_{k}^{t_{k}}\biggr\}\\ &=\textrm{const}+\sum_{t=T}^{\infty}|A_{t}(\mu_{*},1)|\,\alpha^{t}\,.\end{split}

(3.7)

Using eq. (3.2) and d’Alembert’s ratio test, we obtain the following sufficient (but not always necessary) condition for convergence:

\sum_{j=1}^{k}|z_{j}|\leq\frac{1}{|\mu_{*}|^{\mu_{*}}|1-\mu_{*}|^{1-\mu_{*}}}\,.

(3.8)

The essential step to complete the proof is to specify the value of $\mu_{*}$ . Since the $\mu_{j}$ are ordered, $\mu_{1}\leq\hat{\bm{t}}\cdot\bm{\mu}\leq\mu_{k}$ . Now the graph of $|\mu_{*}|^{\mu_{*}}|1-\mu_{*}|^{1-\mu_{*}}$ attains a minimum at $\mu_{*}=\frac{1}{2}$ (and is symmetric around $\mu_{*}=\frac{1}{2}$ ) and increases monotonically in either direction away from the minimum. Hence the value of $|\mu_{*}|^{\mu_{*}}|1-\mu_{*}|^{1-\mu_{*}}$ is maximized by setting $\mu_{*}=\mu_{1}$ or $\mu_{*}=\mu_{k}$ . Either value will do if they are equidistant from $\frac{1}{2}$ . This proves eq. (3.5). Admittedly, this may not be an optimal criterion: it is sufficient, but may not be necessary. Numerical tests indicate that the domain of convergence using the above value of $\mu_{*}$ can be very conservative. ∎
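The two criteria of Propositions 3.2 and 3.3 are simple to evaluate for given exponents. The following sketch is illustrative only: the function names are hypothetical, and the $\mu_{j}$ are assumed real, as in the derivations of this section.

```python
def growth_rate(mu):
    """|mu|^mu |1 - mu|^(1 - mu); its reciprocal is the radius in eq. (3.3)."""
    return abs(mu) ** mu * abs(1 - mu) ** (1 - mu)


def necessary_radii(mus):
    """|z_j|_max of eq. (3.3) for each exponent mu_j."""
    return [1.0 / growth_rate(mu) for mu in mus]


def sufficient_radius(mus):
    """Right-hand side of eq. (3.5), using the smallest and largest exponent."""
    return min(1.0 / growth_rate(min(mus)), 1.0 / growth_rate(max(mus)))


def satisfies_necessary(zs, mus):
    """True if every |z_j| obeys eq. (3.3)."""
    return all(abs(z) <= radius for z, radius in zip(zs, necessary_radii(mus)))


def satisfies_sufficient(zs, mus):
    """True if sum_j |z_j| obeys eq. (3.5)."""
    return sum(abs(z) for z in zs) <= sufficient_radius(mus)


mus = (2, 3)  # illustrative exponents, not from the text
print(necessary_radii(mus), sufficient_radius(mus))
# A point that passes the necessary test but not the sufficient one:
print(satisfies_necessary((0.2, 0.1), mus), satisfies_sufficient((0.2, 0.1), mus))
```

A point of the second kind lies in the gap between the two criteria, where neither proposition decides whether the series converges absolutely.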

Corollary 3.4 (trinomial).

For the special case $k=1$ , there is only one summand, and so $\mu_{*}=\mu_{1}=\mu$ and we may write $z_{1}=z$ . Then the criterion for absolute convergence is necessary and sufficient. The series in eq. (2.2) converges if and only if

|z|\leq\frac{1}{|\mu|^{\mu}|1-\mu|^{1-\mu}}\,.

(3.9)

The proof is immediate from taking Propositions 3.2 and 3.3 together. From Proposition 3.2, the series converges everywhere on its circle of convergence. This corollary will be important below.
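Convergence on the circle itself can be illustrated with the Catalan case $\mu=2$, $r=1$, where eq. (3.9) gives $|z|\leq\frac{1}{4}$ and eq. (2.3) with $z=\frac{1}{4}$ has the double root $f=2$. The sketch below is an illustration under these assumptions; the function name is hypothetical, and it uses the standard Catalan ratio $C_{t}/C_{t-1}=2(2t-1)/(t+1)$, which follows from eq. (1.1), in place of the general product formula.

```python
def catalan_partial_sum(z, t_max):
    """Partial sum of sum_t C_t z^t up to t = t_max (Catalan case mu = 2, r = 1)."""
    term = 1.0  # C_0 z^0
    total = term
    for t in range(1, t_max + 1):
        term *= z * 2 * (2 * t - 1) / (t + 1)  # C_t / C_{t-1} = 2(2t-1)/(t+1)
        total += term
    return total


# Inside the circle the sum matches the closed form (1 - sqrt(1 - 4z)) / (2z).
z = 0.2
print(catalan_partial_sum(z, 200), (1 - (1 - 4 * z) ** 0.5) / (2 * z))
# On the circle |z| = 1/4 the partial sums creep up toward 2.
print(catalan_partial_sum(0.25, 10000))
```

The approach to the limit on the circle is slow, because the terms decay only like $t^{-3/2}$ there, which is consistent with the earlier caution that a power series does not converge rapidly close to its circle of convergence.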

Remark 3.5.

It is clear that the conditions in eq. (3.3) are individually necessary, but, even taken together, they are not sufficient to guarantee absolute convergence of the full sum in eq. (2.13). Hence an upper bound for the measure of the domain of absolute convergence, for $(|z_{1}|,\dots,|z_{k}|)\in\mathbb{R}^{k}$ , is given by the finite product

\mu\leq\prod_{j=1}^{k}\frac{1}{|\mu_{j}|^{\mu_{j}}|1-\mu_{j}|^{1-\mu_{j}}}<\infty\,.

(3.10)

The use of $\mu$ on the left hand side to denote measure should not be confused with other uses of $\mu$ in this paper. The true domain of absolute convergence is a set of smaller measure. This justifies the claim at the beginning of this section that the series in eq. (2.13) has a ‘finite domain’ of absolute convergence, i.e. finite measure. Similarly, using the sufficient condition in eq. (3.5) and $\mu_{*}$ from eq. (3.8), a lower bound for the measure of the domain of absolute convergence is

\mu\geq\frac{1}{k!}\biggl{(}\frac{1}{|\mu_{*}|^{\mu_{*}}|1-\mu_{*}|^{1-\mu_{*}}}\biggr{)}^{k}>0\,.

(3.11)

The measure is positive: for sufficiently small $|z_{j}|$ , $j=1,\dots,k$ , the series in eq. (2.13) converges in an open neighborhood of the origin for $\bm{z}\in\mathbb{C}^{k}$ . Of course this latter fact could be deduced directly using eq. (2.14), but eq. (3.11) supplies a quantitative lower bound. For the special case of a trinomial, where $k=1$ , the two bounds in eqs. (3.10) and (3.11) coincide.

Remark 3.6.

A complete derivation of a necessary and sufficient condition for the absolute convergence of the series in eq. (2.13) has not yet been discovered. However, the situation is different for an algebraic equation. As stated in the introduction, the necessary and sufficient bound for the convergence of the series solution of eq. (1.8) will be presented in Sec. 8.

4 Algebraic equations

4.1 Preliminary remarks

We now treat some applications of the above formalism. In this paper we shall treat algebraic equations, i.e. series solutions for roots of polynomials. Consider the general algebraic equation of degree $n$ with $x\in\mathbb{C}$ and complex coefficients $a_{0},\dots,a_{n}$

0=a_{0}+a_{1}x+\cdots+a_{n}x^{n}\,.

(4.1)

We require $a_{0}\neq 0$ else we factor out a root $x=0$ . We also require $a_{n}\neq 0$ . We begin with an obvious, but necessary, caveat. It is possible that some or all of the remaining $a_{j}$ could vanish. To avoid cluttering the presentation, it is to be understood that in all of the multinomial sums below, the sums extend only over the nonzero $a_{j}$ . We now note two elementary transformations of eq. (4.1), which do not affect the fundamental properties of the roots. First, we can multiply all the coefficients by a constant $\lambda\neq 0$ . This does not change the roots of eq. (4.1). Next, we can replace $x$ by $\mu y$ , where $\mu\neq 0$ . The roots for $x$ are simply those for $y$ , multiplied by $\mu$ . The resulting equation is $\sum_{j=0}^{n}a_{j}\lambda\mu^{j}y^{j}=0$ . Define $b_{j}=a_{j}\lambda\mu^{j}$ . We can select two integers $p$ and $q$ such that $0\leq p<q\leq n$ and find values for $\lambda$ and $\mu$ such that $b_{p}=b_{q}=1$ , yielding

0=b_{0}+b_{1}y+\cdots+y^{p}+\cdots+y^{q}+\cdots+b_{n}y^{n}\,.

(4.2)

Clearly both $a_{p}$ and $a_{q}$ must be nonzero to do this. It is easily derived that $\mu=(a_{p}/a_{q})^{1/(q-p)}$ and

b_{j}=\frac{a_{j}}{a_{p}^{(q-j)/(q-p)}a_{q}^{(j-p)/(q-p)}}\,.

(4.3)
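The scaling that leads to eq. (4.2) can be checked numerically. The sketch below is only an illustration: the function name `scale_coefficients` is hypothetical, the principal branch is assumed for the radical, and $b_{j}$ is computed as $a_{j}\lambda\mu^{j}$ rather than from the closed form in eq. (4.3), so that the same branch is used throughout.

```python
def scale_coefficients(a, p, q):
    """Return (mu, b) with b[j] = a[j] * lam * mu**j and b[p] = b[q] = 1.

    a[j] is the coefficient of x**j in eq. (4.1); requires a[p], a[q] nonzero, p < q.
    """
    d = q - p
    mu = (complex(a[p]) / a[q]) ** (1.0 / d)  # principal branch (an assumption)
    lam = 1.0 / (a[p] * mu ** p)
    b = [aj * lam * mu ** j for j, aj in enumerate(a)]
    return mu, b
```

For the cubic $(x-1)(x-2)(x-3)$, i.e. `a = [-6.0, 11.0, -6.0, 1.0]`, each root $x$ of eq. (4.1) corresponds to a root $y=x/\mu$ of the scaled equation.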

Technically, $b_{j}$ depends on $p$ and $q$ also, but we consider this to be understood below. Clearly a branch cut is required to derive the above expressions. There are actually $q-p$ solutions for $\mu$ and $b_{j}$ , indexed by the $q-p$ roots of unity $1^{1/(q-p)}$ (actually $(-1)^{1/(q-p)}$ , we shall see this below). For brevity we define the set $\mathscr{N}_{npq}=\{0,\dots,n\}\setminus\{p,q\}$ . We divide eq. (4.2) through by $y^{p}$ and rearrange terms to obtain the equation

-y^{q-p}=1+\sum_{j\in\mathscr{N}_{npq}}b_{j}y^{j-p}\,.

(4.4)

Note that if we set all the $b_{j}$ to zero, the equation reduces to $y^{q-p}=-1$ . The solution is any of the radicals $y=(-1)^{1/(q-p)}$ . By the implicit function theorem, an absolutely convergent solution for $y$ exists for sufficiently small amplitudes $|b_{j}|$ . Hence the domain of absolute convergence of the series solution of eq. (4.4) is nonempty, expressing $y$ in a power series in the $b_{j}$ . (We already know this from Sec. 3.) Now set $\zeta=-y^{q-p}$ , so $y=(-1)^{1/(q-p)}\zeta^{1/(q-p)}$ . We append a subscript $\ell$ on $x$ , $y$ and $\zeta$ to index the $q-p$ choices of radicals $(-1)^{1/(q-p)}$ . Employing a branch cut along the positive real axis, they are $e^{i\pi(2\ell+1)/(q-p)}$ , where $\ell=0,\dots,q-p-1$ . Set $\mu_{j}=(j-p)/(q-p)$ , then $b_{j}=a_{j}/(a_{p}^{1-\mu_{j}}a_{q}^{\mu_{j}})$ and $\zeta_{\ell}$ satisfies

\zeta_{\ell}=1+\sum_{j\in\mathscr{N}_{npq}}e^{i\pi(2\ell+1)\mu_{j}}\frac{a_{j}}{a_{p}^{1-\mu_{j}}a_{q}^{\mu_{j}}}\,\zeta_{\ell}^{\mu_{j}}\,.

(4.5)

This has the form of eq. (2.14), with $k=\textrm{Card}(\mathscr{N}_{npq})$ parameters (the $b_{j}$ ). The expressions for $t$ and $\bm{t}\cdot\bm{\mu}$ are, in this case,

t=\sum_{j\in\mathscr{N}_{npq}}t_{j}\,,\qquad\bm{t}\cdot\bm{\mu}=\sum_{j\in\mathscr{N}_{npq}}t_{j}\mu_{j}\,.

(4.6)

The solution for $\zeta_{\ell}$ is given by eq. (2.13) and $x_{\ell}$ is obtained from $\zeta_{\ell}$ via

x_{\ell}=e^{i\pi\frac{2\ell+1}{q-p}}\Bigl{(}\frac{a_{p}}{a_{q}}\Bigr{)}^{1/(q-p)}\zeta_{\ell}^{1/(q-p)}\,.

(4.7)

It is conventional to solve for the $r^{th}$ powers of the roots. From Theorem 2.4(a) and (b) and eq. (4.7), we obtain

\begin{split}x_{\ell}^{r}&=e^{i\pi\frac{(2\ell+1)r}{q-p}}\Bigl{(}\frac{a_{p}}{a_{q}}\Bigr{)}^{r/(q-p)}\,\sum_{\bm{t}\in\mathbb{N}^{n-1}}\mathscr{A}_{\bm{t}}\Bigl{(}\bm{\mu},\frac{r}{q-p}\Bigr{)}\,e^{i\pi(2\ell+1)\bm{t}\cdot\bm{\mu}}\Bigl{(}\prod_{j\in\mathscr{N}_{npq}}b_{j}^{t_{j}}\Bigr{)}\\ &=e^{i\pi\frac{(2\ell+1)r}{q-p}}\Bigl{(}\frac{a_{p}}{a_{q}}\Bigr{)}^{r/(q-p)}\,\sum_{\bm{t}\in\mathbb{N}^{n-1}}\mathscr{A}_{\bm{t}}\Bigl{(}\bm{\mu},\frac{r}{q-p}\Bigr{)}\,\frac{e^{i\pi(2\ell+1)\bm{t}\cdot\bm{\mu}}}{a_{p}^{t-\bm{t}\cdot\bm{\mu}}a_{q}^{\bm{t}\cdot\bm{\mu}}}\Bigl{(}\prod_{j\in\mathscr{N}_{npq}}a_{j}^{t_{j}}\Bigr{)}\,.\end{split}

(4.8)

In the first line of eq. (4.8), $x_{\ell}^{r}$ is a sum over products of positive integral powers of the $b_{j}$ , i.e. a power series. In the second line, $a_{p}$ and $a_{q}$ appear with fractional (and possibly negative) exponents, whereas the other $a_{j}$ appear with positive integral powers. Hence in general the series solution for the roots is a Laurent–Puiseux series in the coefficients of the polynomial, i.e. eq. (4.1). This is of course a known fact, not connected with Fuss–Catalan numbers.

Note that eq. (4.1) has $n$ roots, counting multiplicities, but eq. (4.8) yields $q-p$ roots. If $p=0$ and $q=n$ , so $q-p=n$ , then eq. (4.8) yields expressions for all the $n$ roots of eq. (4.1). If $q-p<n$ then we require multiple series to obtain all the $n$ roots of eq. (4.1). It is simplest to explain with an example. Choose $p=0$ and $q=1$ , this yields only one root. Next choose $p=1$ and $q=n$ , this yields $n-1$ roots. One must try different selections for $p$ and $q$ to verify that expressions for all the roots of eq. (4.1) have been found. We shall see this in connection with the trinomial below.

It is also possible to choose $q<p$ . Doing so yields the same set of roots of eq. (4.1) obtained by interchanging $p$ and $q$ . This can be seen with some elementary transformations and relabelling of indices. The details are left to the reader. Hence without loss of generality we may assume $p<q$ .

For all the nonvanishing $a_{j}$ , from eq. (3.3), for absolute convergence we require (necessary, not sufficient)

|b_{j}|=\frac{|a_{j}|}{|a_{p}|^{1-\mu_{j}}|a_{q}|^{\mu_{j}}}\leq\frac{1}{|\mu_{j}|^{\mu_{j}}|1-\mu_{j}|^{1-\mu_{j}}}\,.

(4.9)

Let the lowest and highest indices of the nonzero $a_{j}$ in $\mathscr{N}_{npq}$ be $j_{\min}$ and $j_{\max}$ , respectively. From eq. (3.5), the sufficient (but not necessary) criterion for absolute convergence is

\begin{split}\sum_{j\in\mathscr{N}_{npq}}|b_{j}|=\sum_{j\in\mathscr{N}_{npq}}\frac{|a_{j}|}{|a_{p}|^{1-\mu_{j}}|a_{q}|^{\mu_{j}}}&\leq\min\biggl{(}\frac{1}{|\mu_{j_{\min}}|^{\mu_{j_{\min}}}|1-\mu_{j_{\min}}|^{1-\mu_{j_{\min}}}}\,,\\ &\qquad\qquad\qquad\frac{1}{|\mu_{j_{\max}}|^{\mu_{j_{\max}}}|1-\mu_{j_{\max}}|^{1-\mu_{j_{\max}}}}\biggr{)}\,.\end{split}

(4.10)

As noted in Sec. 3, the domain of absolute convergence depends only on the amplitudes $|a_{j}|$ and hence the domain is the same for all the choices of radicals for $(-1)^{1/(q-p)}$ , i.e. all the values of $\ell$ .
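The construction of this subsection can be tried numerically. The following Python sketch evaluates the first line of eq. (4.8) with $r=1$, truncating the multinomial sum at a total degree `order`. It rests on two assumptions that should be checked against Sec. 2: the multiparameter coefficient is taken to be $r\prod_{u=1}^{t-1}(\bm{t}\cdot\bm{\mu}+r-u)/\prod_{j}t_{j}!$ for $t\geq 1$ (the form consistent with Mellin's series in eq. (4.13)), and the principal branch is used for $(a_{p}/a_{q})^{1/(q-p)}$. The function names are hypothetical.

```python
import cmath
import itertools
import math


def fc_coeff(t_vec, mus, r):
    """Assumed coefficient: r * prod_{u=1}^{t-1}(t.mu + r - u) / prod_j t_j!."""
    t = sum(t_vec)
    if t == 0:
        return 1.0
    s = sum(tj * mj for tj, mj in zip(t_vec, mus)) + r
    val = r
    for u in range(1, t):
        val *= s - u
    for tj in t_vec:
        val /= math.factorial(tj)
    return val


def series_roots(a, p, q, order=40):
    """Approximate the q-p roots of sum_j a[j] x**j = 0 given by eq. (4.8), r = 1."""
    n, d = len(a) - 1, q - p
    scale = (complex(a[p]) / a[q]) ** (1.0 / d)
    idx = [j for j in range(n + 1) if j not in (p, q) and a[j] != 0]
    mus = [(j - p) / d for j in idx]
    b = [a[j] * scale ** (j - p) / a[p] for j in idx]  # b_j on the same branch
    roots = []
    for ell in range(d):
        phase = math.pi * (2 * ell + 1)
        zs = [cmath.exp(1j * phase * mj) * bj for mj, bj in zip(mus, b)]
        total = 0j
        for t_vec in itertools.product(range(order + 1), repeat=len(idx)):
            if sum(t_vec) > order:
                continue
            term = fc_coeff(t_vec, mus, 1.0 / d)
            for tj, zj in zip(t_vec, zs):
                term *= zj ** tj
            total += term
        roots.append(cmath.exp(1j * phase / d) * scale * total)
    return roots
```

For instance, with `a = [2.0, 0.3, 0.2, 1.0]` (the cubic $x^{3}+0.2x^{2}+0.3x+2$), $p=0$ and $q=3$, the sum $\sum|b_{j}|\simeq 0.35$ is well inside the sufficient bound in eq. (4.10), and the truncated series returns three values whose residuals in the cubic are at the level of rounding error.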

4.2 Comment on McClintock’s series

McClintock in 1895 published a paper [32] deriving series expressions for all the roots of a polynomial of arbitrary degree. He began with the illustrative example $x^{6}=-1-x$ . McClintock obtained [32, eq. (1)]

x=\omega-\omega^{2}a-\frac{3}{2}\omega^{3}a^{2}-\frac{8}{3}\omega^{4}a^{3}-\cdots

(4.11)

where $a=-\frac{1}{6}$ and “ $\omega$ is any one of sixth-roots of $-1$ .” This is essentially exactly the procedure I employed above: in eq. (4.7) I took an $n^{th}$ root of the highest power $x^{n}$ and solved for $\zeta_{\ell}$ in eq. (4.5). Note that $e^{i\pi(2\ell+1)/n}$ is an $n^{th}$ root of $-1$ . McClintock examined many other polynomials, on a case by case basis; the analysis in the previous subsection gives a general expression for the roots of an arbitrary polynomial. McClintock’s solutions are multiparameter Fuss–Catalan series. McClintock noted that his series had finite radii of convergence but did not derive a general expression for the radius of convergence. McClintock also noted the use of the Lagrange inversion theorem in his derivations.
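The agreement with McClintock's coefficients can be confirmed in exact arithmetic. For $x^{6}+x+1=0$ with $p=0$ and $q=6$, eq. (4.8) gives $x=\omega\,\mathcal{B}(\frac{1}{6};\frac{1}{6};\omega)$, so the coefficient of $\omega^{t+1}$ should be the single-parameter coefficient, here assumed to have the form $(r/t!)\prod_{u=1}^{t-1}(t\mu+r-u)$ that appears in eq. (6.16). The sketch below compares the first four terms; the function names are hypothetical.

```python
from fractions import Fraction
from math import factorial


def fc_coeff(t, mu, r):
    """Assumed single-parameter coefficient (r / t!) * prod_{u=1}^{t-1} (t*mu + r - u)."""
    if t == 0:
        return Fraction(1)
    val = Fraction(r)
    for u in range(1, t):
        val *= t * mu + r - u
    return val / factorial(t)


def mcclintock_coeffs():
    """Coefficients of omega, omega^2, omega^3, omega^4 in eq. (4.11), with a = -1/6."""
    a = Fraction(-1, 6)
    return [Fraction(1), -a, -Fraction(3, 2) * a ** 2, -Fraction(8, 3) * a ** 3]
```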

4.3 Comment on Mellin’s solution

Mellin derived a series solution for the following algebraic equation [33]

z^{n}+x_{1}z^{n_{1}}+x_{2}z^{n_{2}}+\cdots+x_{p}z^{n_{p}}-1=0\,.

(4.12)

Here $n>n_{s}\geq 1$ , $s=1,\dots,p$ (see [33]). Hence all the coefficients $x_{s}$ in eq. (4.12) are nonzero by definition. Mellin derived a series solution for the ‘Hauptlösung’ or principal root, which is the unique branch which equals 1 for $x_{1}=\cdots=x_{p}=0$ , and where $\alpha$ is a positive number [33]

z^{\alpha}=1+\alpha\sum_{k=1}^{\infty}\frac{(-1)^{k}}{n^{k}}\sum_{\nu_{1}+\cdots+\nu_{p}=k}\frac{\prod_{\mu=1}^{k-1}(\alpha+n_{1}\nu_{1}+\cdots+n_{p}\nu_{p}-n\mu)}{\Gamma(\nu_{1}+1)\Gamma(\nu_{2}+1)\cdots\Gamma(\nu_{p}+1)}\,x_{1}^{\nu_{1}}\cdots x_{p}^{\nu_{p}}\,.

(4.13)

It is easily verified that this equals the Fuss–Catalan series with $k=p$ , $r=\alpha/n$ , $\mu_{j}=n_{j}/n$ , $t_{j}=\nu_{j}$ and $z_{j}=-x_{j}$ , for $j=1,\dots,p$ , so

z^{\alpha}=\mathcal{B}\Bigl{(}\Bigl{(}\frac{n_{1}}{n},\dots,\frac{n_{p}}{n}\Bigr{)};\frac{\alpha}{n};(-x_{1},\dots,-x_{p})\Bigr{)}\,.

(4.14)

In fact $\alpha$ is not constrained to be positive. Mellin also specified the following bound for the domain of convergence of his series; it is clearly sufficient but not always necessary [33]

|x_{1}|,\dots,|x_{p}|<\frac{1}{p}\min\biggl{(}\frac{1}{|\mu_{1}|^{\mu_{1}}|1-\mu_{1}|^{1-\mu_{1}}}\,,\frac{1}{|\mu_{p}|^{\mu_{p}}|1-\mu_{p}|^{1-\mu_{p}}}\biggr{)}\,.

(4.15)

For ease of comparison with my work, I have written $\mu_{1}$ and $\mu_{p}$ on the right hand side. This is a more conservative bound and is superseded by eq. (3.5) or eq. (4.10).

4.4 Comment on Birkeland’s series

Birkeland published papers on the solutions of algebraic equations using hypergeometric series [6, 7, 8], culminating in his 1927 paper [9]. We study the latter paper (which largely subsumes his earlier work). Birkeland treated the general algebraic equation with complex coefficients [9, eq. (1)]

a_{0}x^{n}+a_{1}x^{n-1}+\cdots+a_{n-1}x+a_{n}=0\,.

(4.16)

Hence his indexing is the opposite of that in eq. (4.1). Birkeland also selected two integers, with $p>q$ , such that $0\leq q<p\leq n$ . He then obtained the scaled equation [9, eq. ( $1^{\prime}$ )] (see his paper for the definitions of $z$ , $l_{i}$ and $m_{i}$ )

z^{p}=z^{q}+l_{1}z^{m_{1}}+l_{2}z^{m_{2}}+\cdots+l_{s}z^{m_{s}}\,.

(4.17)

This is very similar to eq. (4.4). Birkeland then derived a series solution of eq. (4.17). First define $\varepsilon$ as a primitive root of unity satisfying the equation $x^{p-q}=1$ . Then Birkeland obtained for the root $z_{j}$ (raised to the power $\gamma$ ), where $j=1,2,\dots,p-q$ [9, eq. (5)]

z_{j}^{\gamma}=\varepsilon^{j\gamma}\biggl{[}\,1+\frac{\gamma}{p-q}\sum_{\alpha_{1},\dots,\alpha_{s}=0}^{\infty}\varepsilon^{jv}\,\frac{(\tau,r-1)}{\alpha_{1}!\alpha_{2}!\dots\alpha_{s}!}\,l_{1}^{\alpha_{1}}l_{2}^{\alpha_{2}}\dots l_{s}^{\alpha_{s}}\,\biggr{]}\,.

(4.18)

(N.B.: I changed $i$ to $j$ in Birkeland’s equation to avoid confusion with $i=\sqrt{-1}$ .) With elementary changes of notation, eq. (4.18) is equivalent to the first line of eq. (4.8). See in particular [9, eq. (4)] for his definitions of $\tau$ and $v$ . Birkeland did not recognize his series coefficients as Fuss–Catalan numbers. Birkeland then expressed the series solution in eq. (4.18) in terms of sums of hypergeometric series. It would take us too far afield to discuss hypergeometric series in this paper. Birkeland derived the same convergence criteria as in Sec. 3. He derived the necessary (but not sufficient) bound [9, unnumbered, §2]

|\zeta_{1}|<1\,,\quad|\zeta_{2}|<1\,,\quad\dots,\quad|\zeta_{s}|<1\,.

(4.19)

This matches eq. (4.9), after working through the details of his notation: his $\zeta_{j}$ equals my $b_{j}|\mu_{j}|^{\mu_{j}}|1-\mu_{j}|^{1-\mu_{j}}$ , with obvious allowances for differences in his indexing. Birkeland did not recognize that one can write ‘ $\leq$ ’ instead of strict inequalities ‘ $<$ ’ in the bound. As for the sufficient (but not necessary) bound, Birkeland obtained [9, eq. (12)]

|l_{1}|+|l_{2}|+\cdots+|l_{s}|<\frac{p-q}{m+p-q}\frac{1}{\displaystyle\Bigl{(}1+\frac{p-q}{m}\Bigr{)}^{\frac{m}{p-q}}}\,.

(4.20)

This is clearly similar to eq. (4.10). From eq. (4.17), Birkeland’s $l_{j}$ are my $b_{j}$ , and on the right hand side of eq. (4.20), he wrote out the bound explicitly in terms of integers $m$ , $p$ and $q$ . Contrary to Mellin [33] (see eq. (4.15)) and myself (see eq. (4.10)), Birkeland did not write a ‘min’ of two possible choices for the best bound. Here Birkeland made an error of algebra: Birkeland defined $m$ in eq. (4.20) as the value of $m_{\nu}$ which maximizes the value of $|m_{\nu}-p|$ . Quoting from [9], “Wir wollen mit $m$ die größte der Zahlen $|m_{\nu}-p|$ …” However, as was seen in Sec. 3 for the parameter $\mu_{*}$ , we must choose $\mu_{*}$ to be the value of $\mu_{j}$ which maximizes the value of $|\mu_{j}-\frac{1}{2}|$ . Working through Birkeland’s notation, we must choose $m$ to maximize the value of $|m_{\nu}-(p+q)/2|$ . Birkeland then applied his formalism to derive the solution of the trinomial, which he had also treated in an earlier paper [6]. The trinomial is sufficiently important that it will be studied in a section of its own in Sec. 6.

4.5 Comment on Lewis’s series

In 1935, Lewis [31] published a paper on the solution of algebraic equations by infinite series. He treated the trinomial, then the quadrinomial and finally general multinomial equations. We discuss only the general case here. Lewis treated the general algebraic equation with complex coefficients [31, eq. (39)]

a_{n}z^{n}-a_{k}z^{k}-a_{g}z^{g}-\cdots-a_{b}z^{b}-a_{0}=0\,.

(4.21)

(The above corrects a misprint in [31].) The notation suggests that all the coefficients are nonzero. Lewis treated only the case we denoted above by $p=0$ and $q=n$ . He wrote [31, eq. (40)]

z^{n}=a_{0}/a_{n}+(1/a_{n})(a_{k}z^{k}+a_{g}z^{g}+\cdots+a_{b}z^{b})\,.

(4.22)

Lewis employed Lagrange inversion to derive his solution. Following Lewis, we write $a_{0}/a_{n}=re^{i\theta}$ and the $n$ roots of $a_{n}z^{n}-a_{0}=0$ are denoted by $\alpha_{h}=r^{1/n}e^{i(2h\pi+\theta)/n}$ , where $h=1,\dots,n$ . The solution for the root $z_{h}$ is given as [31, eq. (41)]

z_{h}=\sum_{p,q,\dots,v=0}^{\infty}\frac{a_{k}^{p}a_{g}^{q}\dots a_{b}^{v}}{p!q!\dots v!(a_{0}n)^{p+q+\cdots+v}}\binom{1+pk+\cdots+vb-n}{p+q+\cdots+v-1}\,\alpha_{h}^{1+pk+\cdots+vb}\qquad(h=1,\dots,n)\,.

(4.23)

Lewis did not derive a series for powers of the roots. With some effort, eq. (4.23) can be equated to the solution in eq. (4.8). Lewis derived the following sufficient condition for absolute convergence [31, eq. (42)]

\biggl{\{}\frac{|a_{k}\alpha_{h}^{k}|+\cdots+|a_{b}\alpha_{h}^{b}|}{|a_{0}|}\biggr{\}}^{n}\leq\frac{n^{n}}{k^{k}(n-k)^{n-k}}\,.

(4.24)

Unlike Mellin [33] and Birkeland [9], Lewis [31] recognized that equality ‘ $\leq$ ’ is permitted in the bound. However, like Birkeland, Lewis failed to recognize that the bound on the right hand side is given by the minimum of multiple possibilities, and the expression he derived is not always the correct choice.

4.6 Comment on Raney’s series

Raney [37] employed his formalism to demonstrate the use of Lagrange inversion for a power series $\bar{z}=\sum_{n=0}^{\infty}a_{n}\bar{x}^{n}$ [37, eqs. (5.4), (5.7) and (5.8)]. Then in [37, Sec. 6], Raney applied his formalism to derive series solutions for algebraic equations. As mentioned earlier, [37, eqs. (6.1) and (6.2)] yield eq. (2.14). Raney took the $\mu_{j}$ to be integers; this yields an algebraic equation. Raney displayed the example of the trinomial $\bar{w}=1+\bar{x}\bar{w}^{n}$ [37, eq. (6.3)], with the series solution [37, eq. (6.4)]

\bar{w}=\sum_{k=0}^{\infty}\frac{1}{1+(n-1)k}\binom{nk}{k}\bar{x}^{k}\,.

(4.25)

This matches the series coefficients in eq. (1.2), replacing $m$ by $n$ and $t$ by $k$ . Raney did not discuss questions of convergence. Unlike the other authors cited earlier in this section, Raney took the coefficients in his equations to be elements in a commutative ring, not just complex numbers.
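Because the coefficients in eq. (4.25) are integers, the statement that the series solves $\bar{w}=1+\bar{x}\bar{w}^{n}$ can be verified exactly, order by order, as a formal power series. The sketch below does this with integer polynomial arithmetic; the helper names are hypothetical.

```python
from math import comb


def fuss_catalan(n, k):
    """Coefficient of x^k in eq. (4.25): binom(n k, k) / (1 + (n - 1) k)."""
    return comb(n * k, k) // ((n - 1) * k + 1)


def poly_mul(a, b, order):
    """Product of two coefficient lists, truncated after degree `order`."""
    out = [0] * (order + 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j <= order:
                out[i + j] += ai * bj
    return out


def satisfies_trinomial(n, order):
    """Check w = 1 + x w^n coefficient by coefficient up to degree `order`."""
    w = [fuss_catalan(n, k) for k in range(order + 1)]
    wn = [1] + [0] * order
    for _ in range(n):
        wn = poly_mul(wn, w, order)
    return w[0] == 1 and all(w[k + 1] == wn[k] for k in range(order))
```

For $n=2$ the coefficients reduce to the Catalan numbers of eq. (1.1).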

5 Quintic

The quintic is sufficiently important that it is placed in a separate section. It is known that by means of a Tschirnhaus transformation, a general quintic may be brought to the Bring-Jerrard normal form

x^{5}-x+\gamma=0\,.

(5.1)

This algebraic equation (of degree $n=5$ ) lends itself naturally to a solution using Fuss–Catalan series. Using the formalism in Sec. 4, we set $p=0$ and $q=5$ . From eq. (4.7), $x_{\ell}=e^{i\pi(2\ell+1)/5}\gamma^{1/5}\zeta_{\ell}^{1/5}$ with $\ell=0,\dots,4$ . Then $\zeta_{\ell}$ satisfies the equation

\zeta_{\ell}=1-\frac{e^{i\pi(2\ell+1)/5}}{\gamma^{4/5}}\,\zeta_{\ell}^{1/5}\,.

(5.2)

There is only one summand, so $k=1$ and $\mu=1/5$ . The roots $x_{\ell}$ are given by

\begin{split}x_{\ell}=e^{i\pi(2\ell+1)/5}\gamma^{1/5}\mathcal{B}\Bigl{(}\frac{1}{5};\,\frac{1}{5};\,-\frac{e^{i\pi(2\ell+1)/5}}{\gamma^{4/5}}\Bigr{)}\,.\end{split}

(5.3)

From Corollary 3.4, the condition for convergence is necessary and sufficient:

\frac{1}{|\gamma|^{4/5}}\leq\frac{1}{(\frac{1}{5})^{1/5}(\frac{4}{5})^{4/5}}=\frac{5}{4^{4/5}}\,,\qquad|\gamma|\geq\frac{4}{5^{5/4}}\simeq 0.534992\,.

(5.4)

We can say more. What if the value of $\gamma$ does not satisfy the above bound? There are alternative series we can derive. Rewrite eq. (5.1) as $x=\gamma+x^{5}$ and set $\zeta=x/\gamma$ . This corresponds to $p=0$ and $q=1$ . Then $\zeta$ satisfies $\zeta=1+\gamma^{4}\zeta^{5}$ . Once again $k=1$ and now $z_{1}=\gamma^{4}$ and $\mu=5$ . There is only one root and it is

x=\gamma\zeta=\gamma\mathcal{B}(5;1;\gamma^{4})\,.

(5.5)

Comment: eq. (5.5) is equivalent to the solution published by Eisenstein [17] (see also [42], in English). Eisenstein expressed the Bring-Jerrard normal form differently and solved the equation $x^{5}+x+\gamma=0$ . The necessary and sufficient condition for convergence is

|\gamma|^{4}\leq\frac{1}{5^{5}4^{-4}}=\frac{4^{4}}{5^{5}}\,,\qquad|\gamma|\leq\frac{4}{5^{5/4}}\,.

(5.6)

This is the reverse of the inequality in eq. (5.4). However, we have found only one root. We obtain the other four roots as follows. We divide eq. (5.1) through by $x$ to obtain $x^{4}=1-\gamma/x$ . This corresponds to $p=1$ and $q=5$ . Set $\zeta=x^{4}$ or $x=e^{i2\pi\ell/4}\zeta^{1/4}$ , for $\ell=0,\dots,3$ , so $\zeta_{\ell}$ satisfies

\zeta_{\ell}=1-\gamma e^{-i\pi\ell/2}\zeta_{\ell}^{-1/4}\,.

(5.7)

Once again $k=1$ and now $z_{1}=-\gamma e^{-i\pi\ell/2}$ and $\mu=-\frac{1}{4}$ . There are four roots, indexed by $\ell=0,\dots,3$

\begin{split}x_{\ell}=e^{i\pi\ell/2}\mathcal{B}\Bigl{(}-\frac{1}{4};\,\frac{1}{4};\,-e^{-i\pi\ell/2}\gamma\Bigr{)}\,.\end{split}

(5.8)

The necessary and sufficient condition for convergence is

|\gamma|\leq\frac{1}{(\frac{1}{4})^{-1/4}(\frac{5}{4})^{5/4}}=\frac{4}{5^{5/4}}\,.

(5.9)

This is the same as eq. (5.6). As noted earlier for the case $k=1$ , all the series converge on their respective circles of convergence.

The series in eqs. (5.5) and (5.8) do not yield the same root. If we set $\gamma=0$ , eq. (5.1) reduces to $x(x^{4}-1)=0$ . One root is zero and the others are the fourth roots of unity. The roots in the series in eqs. (5.5) and (5.8) lie on the branches which respectively approach zero and the fourth roots of unity. Hence the two series, taken together, yield all five roots of eq. (5.1).

There are multiple ways to find the roots of a polynomial using Fuss–Catalan series. The series in eq. (5.3) converges for $|\gamma|\geq 4/5^{5/4}$ whereas those in eqs. (5.5) and (5.8) converge for $|\gamma|\leq 4/5^{5/4}$ . In all cases one obtains convergent solutions for all the five roots of eq. (5.1), thence the general quintic.
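The three series of this section can be combined into a small numerical routine. The Python sketch below is an illustration under stated assumptions, not a production root finder: the single-parameter coefficient is assumed to be $A_{t}(\mu,r)=(r/t!)\prod_{u=1}^{t-1}(t\mu+r-u)$ (the form in eq. (6.16)), the series are truncated at `order` terms, the principal branch is used for $\gamma^{1/5}$ with $\gamma^{4/5}$ taken as its fourth power, and the function names are hypothetical. Close to the boundary $|\gamma|=4/5^{5/4}$ the partial sums converge slowly, so the truncation would need to be much longer there.

```python
import cmath
import math


def fc_series(mu, r, z, order=150):
    """Partial sum of B(mu; r; z) = sum_t A_t(mu, r) z^t (assumed coefficient form)."""
    total = 1.0
    for t in range(1, order + 1):
        coeff = r / t
        for u in range(1, t):
            coeff *= (t * mu + r - u) / u
        total += coeff * z ** t
    return total


def quintic_roots(gamma, order=150):
    """Five roots of x^5 - x + gamma = 0 from eq. (5.3), or from eqs. (5.5) and (5.8)."""
    gamma = complex(gamma)
    if abs(gamma) >= 4 / 5 ** 1.25:
        g15 = gamma ** 0.2  # principal fifth root
        roots = []
        for ell in range(5):
            w = cmath.exp(1j * math.pi * (2 * ell + 1) / 5)
            roots.append(w * g15 * fc_series(0.2, 0.2, -w / g15 ** 4, order))
        return roots
    roots = [gamma * fc_series(5, 1, gamma ** 4, order)]  # eq. (5.5)
    for ell in range(4):
        w = cmath.exp(1j * math.pi * ell / 2)
        roots.append(w * fc_series(-0.25, 0.25, -gamma / w, order))  # eq. (5.8)
    return roots
```

For $\gamma=0.3$ the routine uses eqs. (5.5) and (5.8), and for $\gamma=2$ it uses eq. (5.3); in both cases the five returned values can be substituted back into eq. (5.1) as a check.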

The Bring-Jerrard normal form has been solved using hypergeometric functions, e.g. see [35]. Klein’s solution of the quintic [26] also employed hypergeometric series. The series have finite radii of convergence (actually the same radius for all the series). Analytic continuation is required to treat all coefficients of the general quintic. Using Fuss–Catalan series, the series in eqs. (5.5) and (5.8) are the explicit analytic continuations of the series in eq. (5.3) across the ‘boundary’ $|\gamma|=4/5^{5/4}$ . Together they cover the whole parameter space, i.e. all values of $\gamma$ in eq. (5.1). The solution of the Bring-Jerrard normal form using Fuss–Catalan series is arguably ‘cleaner’ than that using hypergeometric series.

6 Trinomial

6.1 General solution

The trinomial is also sufficiently important that it is placed in a separate section. The Bring-Jerrard normal form of the quintic and Lambert’s trinomial, to be discussed in Sec. 6.2, are particular examples of the general trinomial equation

x^{m+n}+ax^{n}+b=0\,.

(6.1)

The general solution of the trinomial, for arbitrary values of the coefficients, was derived by Birkeland (1920,1927) [6, 9], Lewis (1935) [31] and Eagle (1939) [15] (who employed McClintock’s [32] formalism). The general solution of the trinomial was also derived in 1908 by P. A. Lambert [30], not to be confused with Johann Lambert. P. A. Lambert in fact also presented a series solution for the general algebraic equation, but his analysis contained some technical errors and was not discussed in Sec. 4. The derivation below follows Eagle (1939) [15], and eq. (6.1) is taken from his paper. We have already noted that the solutions are Fuss–Catalan series. All four authors cited above derived correct expressions for the radii of convergence of their series. The derivation below may be considered as an independent validation of their results.

As was seen in Sec. 5, there are three series. To systematize the derivation and give a broader overview of the results, we proceed as follows. Here $\ell$ takes values as appropriate to index the roots of unity.

Set $p=0$ and $q=m+n$ and $x_{\ell}=e^{i\pi(2\ell+1)/(m+n)}b^{1/(m+n)}\zeta_{\ell}^{1/(m+n)}$ then $\zeta_{\ell}$ satisfies

\zeta_{\ell}=1+e^{i\pi(2\ell+1)n/(m+n)}\frac{a}{b^{m/(m+n)}}\zeta_{\ell}^{n/(m+n)}\,.

(6.2)

Then $\mu=n/(m+n)$ . This series yields $m+n$ roots.

Set $p=0$ and $q=n$ and $x_{\ell}=e^{i\pi(2\ell+1)/n}(b\zeta_{\ell}/a)^{1/n}$ then $\zeta_{\ell}$ satisfies

\zeta_{\ell}=1+e^{i\pi(2\ell+1)(m+n)/n}\frac{b^{m/n}}{a^{(m+n)/n}}\zeta_{\ell}^{(m+n)/n}\,.

(6.3)

Then $\mu=(m+n)/n$ . This series yields $n$ roots.

Set $p=n$ and $q=m+n$ and divide eq. (6.1) through by $x^{n}$ . Set $x_{\ell}=e^{i\pi(2\ell+1)/m}(a\zeta_{\ell})^{1/m}$ then $\zeta_{\ell}$ satisfies

\zeta_{\ell}=1+e^{-i\pi(2\ell+1)n/m}\frac{b}{a^{(m+n)/m}}\zeta_{\ell}^{-n/m}\,.

(6.4)

Then $\mu=-n/m$ . This series yields $m$ roots.

The respective series solutions are


$\displaystyle x_{\ell}$	$\displaystyle=e^{i\pi(2\ell+1)/(m+n)}b^{1/(m+n)}\mathcal{B}\Bigl{(}\frac{n}{m+n};\,\frac{1}{m+n};\,e^{i\pi\frac{(2\ell+1)n}{m+n}}\frac{a}{b^{m/(m+n)}}\Bigr{)}$	$\displaystyle(0\leq\ell\leq m+n-1)\,,$	(6.5a)
$\displaystyle x_{\ell}$	$\displaystyle=e^{i\pi(2\ell+1)/n}\frac{b^{1/n}}{a^{1/n}}\mathcal{B}\Bigl{(}\frac{m+n}{n};\,\frac{1}{n};\,e^{i\pi(2\ell+1)(m+n)/n}\frac{b^{m/n}}{a^{(m+n)/n}}\Bigr{)}$	$\displaystyle(0\leq\ell\leq n-1)\,,$	(6.5b)
$\displaystyle x_{\ell}$	$\displaystyle=e^{i\pi(2\ell+1)/m}a^{1/m}\mathcal{B}\Bigl{(}-\frac{n}{m};\,\frac{1}{m};\,e^{-i\pi(2\ell+1)n/m}\frac{b}{a^{(m+n)/m}}\Bigr{)}$	$\displaystyle(0\leq\ell\leq m-1)\,.$	(6.5c)

The respective domains of convergence are as follows.


$\displaystyle\frac{|a|}{|b|^{m/(m+n)}}$	$\displaystyle\leq\frac{(m+n)}{n^{n/(m+n)}m^{m/(m+n)}}\,,$	$\displaystyle\qquad\frac{|b|^{m}}{|a|^{m+n}}$	$\displaystyle\geq\frac{m^{m}n^{n}}{(m+n)^{m+n}}\,,$	(6.6a)
$\displaystyle\frac{|b|^{m/n}}{|a|^{(m+n)/n}}$	$\displaystyle\leq\frac{n}{m^{-m/n}(m+n)^{(m+n)/n}}\,,$	$\displaystyle\qquad\frac{|b|^{m}}{|a|^{m+n}}$	$\displaystyle\leq\frac{m^{m}n^{n}}{(m+n)^{m+n}}\,,$	(6.6b)
$\displaystyle\frac{|b|}{|a|^{(m+n)/m}}$	$\displaystyle\leq\frac{m}{n^{-n/m}(m+n)^{(m+n)/m}}\,,$	$\displaystyle\qquad\frac{|b|^{m}}{|a|^{m+n}}$	$\displaystyle\leq\frac{m^{m}n^{n}}{(m+n)^{m+n}}\,.$	(6.6c)

If we set $b\to 0$ then $x^{n}(x^{m}+a)\to 0$ . Hence $n$ roots approach zero and $m$ roots approach the respective $m^{th}$ roots of $-a$ . The roots of the second and third series respectively lie on the branches which approach zero and the $m^{th}$ roots of $-a$ as $b\to 0$ . The second and third series have the same domain of convergence and hence together yield all the $m+n$ roots of eq. (6.1). The above set of three series are those found by Eagle [15] and together yield all the roots of the general trinomial for arbitrary values of the coefficients. They are equivalent to the solutions derived by P. A. Lambert [30], Birkeland [6, 9] and Lewis [31].
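That the three domains are governed by the single threshold $m^{m}n^{n}/(m+n)^{m+n}$ follows from evaluating $1/(|\mu|^{\mu}|1-\mu|^{1-\mu})$ at $\mu=n/(m+n)$, $(m+n)/n$ and $-n/m$ and raising the result to the appropriate power. A short numerical check, with hypothetical function names, is sketched below.

```python
import math


def radius(mu):
    """1 / (|mu|^mu |1-mu|^(1-mu)) for mu not equal to 0 or 1."""
    return 1.0 / (abs(mu) ** mu * abs(1 - mu) ** (1 - mu))


def trinomial_thresholds(m, n):
    """Return the common threshold and the three values implied by the three series."""
    s = m + n
    target = m ** m * n ** n / s ** s
    first = radius(n / s) ** (-s)   # first series: |b|^m / |a|^(m+n) >= first
    second = radius(s / n) ** n     # second series: |b|^m / |a|^(m+n) <= second
    third = radius(-n / m) ** m     # third series: |b|^m / |a|^(m+n) <= third
    return target, first, second, third
```

For the Bring-Jerrard case $m=4$, $n=1$, all four returned values equal $4^{4}/5^{5}$, in agreement with eq. (5.6).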

Consider also the following. Divide eq. (6.1) through by $x^{n}$ as above, but now set $x_{\ell}=e^{i\pi(2\ell+1)/n}(a\zeta_{\ell}/b)^{-1/n}$ . This corresponds to setting $p=n$ and $q=0$ , i.e. $q<p$ . Then $\zeta_{\ell}$ satisfies

\zeta_{\ell}=1+e^{i\pi(2\ell+1)m/n}\frac{b^{m/n}}{a^{(m+n)/n}}\zeta_{\ell}^{-m/n}\,.

(6.7)

Compare this to eq. (6.4). Now $\mu=-m/n$ . This series yields $n$ roots. It converges if and only if

\frac{|b|^{m/n}}{|a|^{(m+n)/n}}\leq\frac{n}{m^{-m/n}(m+n)^{(m+n)/n}}\,,\qquad\qquad\frac{|b|^{m}}{|a|^{m+n}}\leq\frac{m^{m}n^{n}}{(m+n)^{m+n}}\,.

(6.8)

The solution is

x_{\ell}=e^{i\pi(2\ell+1)/n}\frac{b^{1/n}}{a^{1/n}}\mathcal{B}\Bigl{(}-\frac{m}{n};\,-\frac{1}{n};\,e^{i\pi(2\ell+1)m/n}\frac{b^{m/n}}{a^{(m+n)/n}}\Bigr{)}\qquad(0\leq\ell\leq n-1)\,.

(6.9)

Hence this series yields the same $n$ roots as the second series above, for which $\mu=(m+n)/n$ , with the same domain of convergence. It is therefore the same series as in eq. (6.5b). Even the permutations of the roots are identical, because the first terms of both series are $e^{i\pi(2\ell+1)/n}(b/a)^{1/n}$ . It was remarked in Sec. 4 that choosing $q<p$ yields the same solutions as the series obtained by interchanging $p$ and $q$ .
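The equality of the two series can also be seen numerically. Substituting $\zeta\to 1/\zeta$ in $\zeta=1+z\zeta^{\mu}$ suggests the identity $\mathcal{B}(\mu;r;z)=\mathcal{B}(1-\mu;-r;-z)$, and the arguments of the two series above differ exactly by the factor $e^{i\pi(2\ell+1)}=-1$. The sketch below compares eq. (6.5b) with eq. (6.9) for positive real $a$ and $b$; the coefficient form $(r/t!)\prod_{u=1}^{t-1}(t\mu+r-u)$ is assumed, as in eq. (6.16), and the function names are hypothetical.

```python
import cmath
import math


def fc_series(mu, r, z, order=150):
    """Partial sum of B(mu; r; z) with the assumed coefficient form."""
    total = 1.0
    for t in range(1, order + 1):
        coeff = r / t
        for u in range(1, t):
            coeff *= (t * mu + r - u) / u
        total += coeff * z ** t
    return total


def small_roots(m, n, a, b, order=150):
    """Pairs (x from eq. (6.5b), x from eq. (6.9)) for positive real a and b."""
    amp = b ** (m / n) / a ** ((m + n) / n)
    pairs = []
    for ell in range(n):
        phase = math.pi * (2 * ell + 1)
        lead = cmath.exp(1j * phase / n) * (b / a) ** (1 / n)
        z_b = cmath.exp(1j * phase * (m + n) / n) * amp
        z_9 = cmath.exp(1j * phase * m / n) * amp
        x_b = lead * fc_series((m + n) / n, 1 / n, z_b, order)
        x_9 = lead * fc_series(-m / n, -1 / n, z_9, order)
        pairs.append((x_b, x_9))
    return pairs
```

With $m=2$, $n=3$, $a=1$ and $b=0.1$ (inside the domain in eq. (6.8)), each pair agrees to rounding error and satisfies $x^{5}+x^{3}+0.1=0$.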

Johann Lambert [28] solved the equation $x^{m}+px=q$ in 1758. Corless et al. [12] stated “In 1758, Lambert solved the trinomial equation $x=q+x^{m}$ by giving a series development for $x$ in powers of $q$ .” I found that the equation $x=q+x^{m}$ appears in Lambert’s 1770 paper [29, §8], where Lambert stated “auquel on peut toujours donner la forme plus simple” (“to which we can always give the simpler form”) and where Lambert derived a series for the $n^{th}$ power $x^{n}$ . Lambert’s solutions are Fuss–Catalan series, for the branch which approaches zero when $q\to 0$ . Lambert’s series solution for $x^{n}$ , where $x=q+x^{m}$ , may be written as [29, §8]

x^{n}=q^{n}\mathcal{B}(m;n;q^{m-1})\,.

(6.10)

Lambert did not specify the radius of convergence of his series. It converges if and only if

|q|\leq\frac{m-1}{m^{m/(m-1)}}\,.

(6.11)
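A quick numerical check of Lambert's branch is sketched below. Writing $x=q\zeta$ turns $x=q+x^{m}$ into $\zeta=1+q^{m-1}\zeta^{m}$, so that $x^{n}=q^{n}\zeta^{n}$ with $\zeta^{n}$ given by a Fuss–Catalan series in $q^{m-1}$. The coefficient form $(r/t!)\prod_{u=1}^{t-1}(t\mu+r-u)$ is assumed, as in eq. (6.16), and the function names are hypothetical.

```python
def fc_series(mu, r, z, order=150):
    """Partial sum of B(mu; r; z) with the assumed coefficient form."""
    total = 1.0
    for t in range(1, order + 1):
        coeff = r / t
        for u in range(1, t):
            coeff *= (t * mu + r - u) / u
        total += coeff * z ** t
    return total


def lambert_power(m, n, q, order=150):
    """x^n on the branch of x = q + x^m that vanishes as q -> 0."""
    return q ** n * fc_series(m, n, q ** (m - 1), order)


def lambert_bound(m):
    """Right hand side of eq. (6.11)."""
    return (m - 1) / m ** (m / (m - 1))
```

For $m=3$ the bound is $2/3^{3/2}\simeq 0.385$, so $q=0.3$ lies inside the domain of convergence.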

Ramanujan also solved the trinomial via a series. The equation he treated was [5, first quarterly report, 1.6 (iv), eq. (1.15)]

aqx^{p}+x^{q}=1\,.

(6.12)

Ramanujan derived the following solution for any power $n$ [5, first quarterly report, 1.6 (iv), eq. (1.16)]

x^{n}=\frac{n}{q}\sum_{k=0}^{\infty}\frac{\Gamma(\{n+pk\}/q)(-qa)^{k}}{\Gamma(\{n+pk\}/q-k+1)k!}\,.

(6.13)

This is the branch which approaches unity for $a\to 0$ . The above expression is stated in [5] to be valid for all real numbers $n$ , $p$ , $q$ and for complex $a$ satisfying

|a|\leq|p|^{-p/q}|p-q|^{(p-q)/q}\,.

(6.14)

Let us verify Ramanujan’s solution. The expression in eq. (6.13) is tricky if $n=0$ . The first term in the sum is actually unity

\begin{split}x^{n}&=\frac{(n/q)\Gamma(n/q)}{\Gamma(n/q+1)}+\frac{n}{q}\sum_{k=1}^{\infty}\frac{(-qa)^{k}}{k!}\prod_{u=1}^{k-1}(kp/q+n/q-u)\\ &=1+\frac{n}{q}\sum_{k=1}^{\infty}\frac{(-qa)^{k}}{k!}\prod_{u=1}^{k-1}(kp/q+n/q-u)\,.\end{split}

(6.15)

We need to perform the cancellations before setting $n=0$ on the right hand side. To solve eq. (6.12) using a Fuss–Catalan series (for the branch treated by Ramanujan), put $\zeta=x^{q}$ , so $x=\zeta^{1/q}$ , then $\zeta=1-aq\zeta^{p/q}$ . Hence $\mu=p/q$ and $z=-qa$ in eq. (2.3). The solution is (using $k$ as a summation variable)

\begin{split}x^{n}=\zeta^{n/q}&=\sum_{k=0}^{\infty}A_{k}(p/q,n/q)(-qa)^{k}\\ &=1+\frac{n}{q}\sum_{k=1}^{\infty}\frac{(-qa)^{k}}{k!}\prod_{u=1}^{k-1}(kp/q+n/q-u)\,.\end{split}

(6.16)

This equals the expression in eq. (6.13). The series converges if and only if

\begin{split}|a|&\leq\frac{1}{|q|}\,\frac{1}{|p/q|^{p/q}|1-p/q|^{1-p/q}}\,,\\ &=|p|^{-p/q}|p-q|^{(p-q)/q}\,.\end{split}

(6.17)

This confirms the bound in eq. (6.14).
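The verification above can be repeated numerically. The sketch below compares partial sums of the Gamma-function form in eq. (6.13) with the product form in eq. (6.16), and then substitutes the converged value into eq. (6.12). It assumes parameter values for which every Gamma-function argument is positive (here $n=1$, $p=3$, $q=2$), and the function names are hypothetical.

```python
import math


def fc_series(mu, r, z, order=150):
    """Partial sum of eq. (6.16) written with mu = p/q, r = n/q, z = -q a."""
    total = 1.0
    for t in range(1, order + 1):
        coeff = r / t
        for u in range(1, t):
            coeff *= (t * mu + r - u) / u
        total += coeff * z ** t
    return total


def ramanujan_gamma_sum(n, p, q, a, kmax):
    """Partial sum of eq. (6.13); assumes all Gamma arguments are positive."""
    total = 0.0
    for k in range(kmax + 1):
        s = (n + p * k) / q
        total += (n / q) * math.gamma(s) * (-q * a) ** k / (
            math.gamma(s - k + 1) * math.factorial(k))
    return total


def ramanujan_bound(p, q):
    """Right hand side of eq. (6.14)."""
    return abs(p) ** (-p / q) * abs(p - q) ** ((p - q) / q)
```

With $a=0.1$ the bound in eq. (6.14) is $3^{-3/2}\simeq 0.192$, so the series converges, and the value `fc_series(1.5, 0.5, -0.2)` can be inserted into $aqx^{p}+x^{q}=1$ as a check.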

6.2 Lambert and Euler trinomial equations

As stated above, in 1758 Lambert [28] gave a series solution for the trinomial equation $x^{m}+px=q$ and later in 1770, Lambert [29, §8] revisited the equation in the form

x=q+x^{m}\,.

(6.18)

The treatment below follows Corless et al. [12]. In 1779 Euler [16] derived the following equation from Lambert’s trinomial (I have changed Euler’s ‘ $x$ ’ to ‘ $z$ ’ to avoid confusion as to which equation $x$ satisfies)

z^{\alpha}-z^{\beta}=(\alpha-\beta)vz^{\alpha+\beta}\,.

(6.19)

This is obtained from eq. (6.18) via the substitutions $x=z^{-\beta}$ , $m=\alpha/\beta$ (this corrects a misprint in [12], which stated $m=\alpha\beta$ ) and $q=(\alpha-\beta)v$ . Euler’s solution of eq. (6.19), for $z^{n}$ , was [16]

\begin{split}z^{n}=1+nv&+\frac{1}{2!}\,n(n+\alpha+\beta)\,v^{2}\\ &+\frac{1}{3!}\,n(n+\alpha+2\beta)(n+2\alpha+\beta)\,v^{3}\\ &+\frac{1}{4!}\,n(n+\alpha+3\beta)(n+2\alpha+2\beta)(n+3\alpha+\beta)\,v^{4}+\cdots\end{split}

(6.20)

Clearly, eq. (6.18) can be solved using Fuss–Catalan series. There are $m$ roots, of which one approaches 0 and $m-1$ approach the roots of unity as $q\to 0$ . Lambert’s solution is the unique branch which vanishes for $q=0$ and was displayed as a Fuss–Catalan series in eq. (6.10). Euler’s solution in eq. (6.20) does not vanish for $\alpha=\beta$ , i.e. $q=0$ , and is the unique solution of eq. (6.18) which is real (if $q$ is real) and approaches 1 as $q\to 0$ . We can derive it as follows. Divide eq. (6.18) through by $x^{m}$ and set $x=\zeta^{-1/(m-1)}$ , then $\zeta=1+q\zeta^{m/(m-1)}$ . Hence $\mu=m/(m-1)$ and the solution is

x=\mathcal{B}\Bigl{(}\frac{m}{m-1};\,-\frac{1}{m-1};\,q\Bigr{)}\,.

(6.21)

Put $x=z^{-\beta}$ , $m=\alpha/\beta$ and $q=(\alpha-\beta)v$ , then $z^{n}=x^{-n/\beta}$

z^{n}=\mathcal{B}\Bigl{(}\frac{\alpha}{\alpha-\beta};\,\frac{n}{\alpha-\beta};\,(\alpha-\beta)v\Bigr{)}\,.

(6.22)

Let us verify from eq. (6.20) that this equals Euler’s solution:

\begin{split}z^{n}&=1+n\sum_{t=1}^{\infty}\frac{v^{t}}{t!}\,\prod_{j=1}^{t-1}(t\alpha+n-j(\alpha-\beta))\\ &=1+\frac{n}{\alpha-\beta}\sum_{t=1}^{\infty}\frac{(\alpha-\beta)^{t}v^{t}}{t!}\,\prod_{j=1}^{t-1}\Bigl{(}\frac{t\alpha}{\alpha-\beta}+\frac{n}{\alpha-\beta}-j\Bigr{)}\\ &=\sum_{t=0}^{\infty}A_{t}\Bigl{(}\frac{\alpha}{\alpha-\beta},\,\frac{n}{\alpha-\beta}\Bigr{)}\,(\alpha-\beta)^{t}v^{t}\\ &=\mathcal{B}\Bigl{(}\frac{\alpha}{\alpha-\beta};\,\frac{n}{\alpha-\beta};\,(\alpha-\beta)v\Bigr{)}\,.\end{split}

(6.23)

This agrees with eq. (6.22). The paper by Corless et al. [12] discusses various applications of the Lambert $W$ function, which is the real solution (for real $x\geq-1/e$ ) of the equation $W(x)e^{W(x)}=x$ , but that is beyond the scope of this paper.
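As a numerical sanity check, the following minimal sketch sums Euler's series in the product form of the first line of eq. (6.23) and substitutes the result into eq. (6.19). The function names `euler_series` and `residual_6_19` are illustrative and do not appear in the original works; the truncation at 40 terms and the sample values of $\alpha$, $\beta$ and $v$ are assumptions chosen to lie well inside the radius of convergence.

```python
import math

def euler_series(n, alpha, beta, v, terms=40):
    """Partial sum of Euler's series for z^n, in the product form
    1 + n * sum_{t>=1} v^t/t! * prod_{j=1}^{t-1} (t*alpha + n - j*(alpha - beta))."""
    total = 1.0
    for t in range(1, terms + 1):
        # The product is empty (equal to 1) for t = 1.
        prod = math.prod(t * alpha + n - j * (alpha - beta) for j in range(1, t))
        total += n * v**t / math.factorial(t) * prod
    return total

def residual_6_19(alpha, beta, v, terms=40):
    """Left side minus right side of z^alpha - z^beta = (alpha - beta) v z^(alpha + beta),
    with z taken from the series for n = 1."""
    z = euler_series(1, alpha, beta, v, terms)
    return z**alpha - z**beta - (alpha - beta) * v * z**(alpha + beta)

# Sample values (assumed): alpha = 3, beta = 1, v = 0.05.
print(residual_6_19(3, 1, 0.05))
```

For $\alpha=2$, $\beta=1$ and $n=1$ the series reduces to the generating function of the Catalan numbers, which provides an independent check of the coefficients.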

7 Algebraic equations: convergence of series I

7.1 General remarks

A necessary and sufficient bound for the domain of absolute convergence is available for the important special case of the solutions of algebraic equations by infinite series. We begin with some known theorems from the theory of power series in several complex variables.

Definition 7.1 (multicircular or Reinhardt domain).

A multi-circular or Reinhardt domain in $\mathbb{C}^{k}$ has the property that for $k$ complex variables $\bm{z}=(z_{1},\dots,z_{k})$ , if a point $\bm{z}_{*}$ lies in the domain, then so does every point $\bm{z}$ such that $|z_{j}|=|z_{*j}|$ for $j=1,\dots,k$ . A multi-circular domain with the property that if a point $\bm{z}_{*}$ lies in the domain, then so does the polydisc given by $\{\bm{z}:|z_{j}|\leq|z_{*j}|,\;j=1,\dots,k\}$ is known as a complete Reinhardt domain. A polydisc is a Cartesian product of discs, in general with different radii.

The convergence domain of a power series in multiple variables is a union of polydiscs centered at the origin and is a complete Reinhardt domain. The following is also known. Using a vector notation, with coefficients $\bm{c}_{\bm{\alpha}}$ indexed by a $k$-tuple $\bm{\alpha}$, if both $\sum_{\bm{\alpha}}|\bm{c}_{\bm{\alpha}}\bm{z}^{\bm{\alpha}}|$ and $\sum_{\bm{\alpha}}|\bm{c}_{\bm{\alpha}}\bm{w}^{\bm{\alpha}}|$ converge, then so does $\sum_{\bm{\alpha}}|\bm{c}_{\bm{\alpha}}||\bm{z}^{\bm{\alpha}}|^{t}|\bm{w}^{\bm{\alpha}}|^{1-t}$ for $0\leq t\leq 1$. This property of a Reinhardt domain is called logarithmic convexity. Define a map $\textrm{Log}:(\mathbb{C}\setminus\{0\})^{k}\to\mathbb{R}^{k}$ where $z_{j}\mapsto\ln|z_{j}|$ for $j=1,\dots,k$. Let the image of the domain of convergence $\mathcal{D}$ be $\textrm{Log}(\mathcal{D})\subset\mathbb{R}^{k}$. If $\textrm{Log}(z),\textrm{Log}(w)\in\textrm{Log}(\mathcal{D})$, then also $t\textrm{Log}(z)+(1-t)\textrm{Log}(w)\in\textrm{Log}(\mathcal{D})$ for $0\leq t\leq 1$, i.e.

(t\ln|z_{1}|+(1-t)\ln|w_{1}|,\dots,t\ln|z_{k}|+(1-t)\ln|w_{k}|)\in\textrm{Log}(\mathcal{D})\,.

(7.1)

A complete Reinhardt domain in $\mathbb{C}^{k}$ is the domain of absolute convergence of a power series if and only if the domain is logarithmically convex. The power series converges uniformly in every compact subset of the domain $\mathcal{D}$. Note that logarithmic convexity does not imply convexity. For $k=1$, the domain of convergence of a univariate power series is a disc in $\mathbb{C}$ centered on the origin, and is convex. For $k\geq 2$ variables, however, a complete Reinhardt domain is not in general convex. Nevertheless, from the foregoing remarks about polydiscs, the following is true. If a point $\bm{z}_{*}$ lies in the domain of convergence, then so does every point on the ray joining the origin to $\bm{z}_{*}$, i.e. $\bm{z}=\lambda\bm{z}_{*}$ for $0\leq\lambda\leq 1$.
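To illustrate that logarithmic convexity does not imply convexity, here is a small sketch using an example that is not taken from the paper: the geometric series $\sum_{k}(z_{1}z_{2})^{k}$ in $k=2$ variables, which converges absolutely exactly when $|z_{1}||z_{2}|<1$. The function names and the two sample points are assumptions made for illustration.

```python
def in_domain(z1, z2):
    """Membership in D = {|z1| |z2| < 1}, the domain of absolute convergence
    of the illustrative series sum_k (z1 z2)^k."""
    return abs(z1) * abs(z2) < 1.0

def log_combination(z, w, t):
    """Point with moduli |z_j|^t |w_j|^(1-t), i.e. the convex combination
    t Log(z) + (1-t) Log(w) mapped back to amplitudes, as in eq. (7.1)."""
    return tuple(abs(zj) ** t * abs(wj) ** (1 - t) for zj, wj in zip(z, w))

z, w = (2.0, 0.4), (0.4, 2.0)                                  # both lie in D
euclid_mid = tuple((zj + wj) / 2 for zj, wj in zip(z, w))      # ordinary midpoint
log_mid = log_combination(z, w, 0.5)                           # midpoint in Log coordinates

print(in_domain(*z), in_domain(*w), in_domain(*euclid_mid), in_domain(*log_mid))
```

The ordinary midpoint $(1.2,1.2)$ falls outside the domain, whereas the midpoint taken in Log coordinates stays inside it.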

The above theory is general. In this section, we are concerned with the domain of absolute convergence of the series in eq. (4.8), which is the solution of eq. (4.1). Passare and Tsikh [36] claimed to offer a necessary and sufficient bound for absolute convergence in this case. We summarize their work below. We also display some counterexamples to their bound, and offer a more detailed analysis. For ease of contact with the formalism in [36], we write

a_{0}+a_{1}x+\cdots+x^{p}+\cdots+x^{q}+\cdots+a_{n}x^{n}=0\,.

(7.2)

This is effectively eq. (4.1) (or eq. (4.2)) where we have set $a_{p}=a_{q}=1$ . This is the equation treated in [36]. The series solution is given by eq. (4.8), with obvious changes of notation. Passare and Tsikh employed the notation $[p]$ to denote that the index $p$ is excluded from a list of the form $(\alpha_{0},\alpha_{1},\dots,[p],\dots,\alpha_{n})$ . The solution of eq. (7.2) is a power series in the $n-1$ variables $(a_{0},a_{1},\dots,[p],\dots,[q],\dots,a_{n})$ . Passare and Tsikh studied the discriminant $\Delta_{pq}(a_{0},a_{1},\dots,[p],\dots,[q],\dots,a_{n})$ which is the discriminant of the polynomial in eq. (7.2). Then Passare and Tsikh [36] claimed that the domain of absolute convergence $\mathcal{D}_{pq}$ of the series solution of eq. (7.2) is a complete Reinhardt domain whose boundary is (a segment of) the zero locus $\Delta_{pq}(a_{0},a_{1},\dots,[p],\dots,[q],\dots,a_{n})=0$ . Specifically, for absolute convergence, they derived equations of the form $\Delta_{pq}(\pm|a_{0}|,\dots,[p],\dots,[q],\dots,\pm|a_{n}|)=0$ . See [36, Thm. 3] for a precise statement of their result.

7.2 Application to cubics

Passare and Tsikh employed their formalism to display the domains of convergence for the series solutions of a cubic [36, Sec. 5.2]. The general cubic equation with complex coefficients $(a_{0},a_{1},a_{2},a_{3})$ is

a_{0}+a_{1}x+a_{2}x^{2}+a_{3}x^{3}=0\,.

(7.3)

There are six choices for $p$ and $q$ , and the respective domains of convergence $\mathcal{D}_{pq}$ were given as follows [36, unnumbered before eq. (17)]


$\displaystyle\mathcal{D}_{01}$	$\displaystyle=\{\Delta_{01}(|a_{2}|,-|a_{3}|)<0\}\,,$	(7.4a)
$\displaystyle\mathcal{D}_{02}^{*}$	$\displaystyle=\{\Delta_{02}(|a_{1}|,|a_{3}|)>0\}\cap\{\Delta_{02}(|a_{1}|,-|a_{3}|)<0\}\,,$	(7.4b)
$\displaystyle\mathcal{D}_{03}$	$\displaystyle=\{\Delta_{03}(-|a_{1}|,-|a_{2}|)>0\}\,,$	(7.4c)
$\displaystyle\mathcal{D}_{12}$	$\displaystyle=\{\Delta_{12}(|a_{0}|,-|a_{3}|)<0\}\cap\{\Delta_{12}(-|a_{0}|,|a_{3}|)<0\}\,,$	(7.4d)
$\displaystyle\mathcal{D}_{13}^{*}$	$\displaystyle=\{\Delta_{13}(|a_{0}|,|a_{2}|)>0\}\cap\{\Delta_{13}(|a_{0}|,-|a_{2}|)<0\}\,,$	(7.4e)
$\displaystyle\mathcal{D}_{23}^{*}$	$\displaystyle=\{\Delta_{23}(|a_{0}|,-|a_{1}|)<0\}\,.$	(7.4f)

Three of the above six cases, marked with asterisks, are wrong. The cases $\mathcal{D}_{02}$ and $\mathcal{D}_{13}$ contain fundamental errors, while $\mathcal{D}_{23}$ can be explained as a misprint. I have attempted to resolve these issues privately with Tsikh, but regrettably have not received a reply of scientific substance. (Passare is deceased.) We begin with the case $p=0$ and $q=2$ . The relevant cubic equation is

1+a_{1}x+x^{2}+a_{3}x^{3}=0\,.

(7.5)

Setting $a_{3}=0$ in eq. (7.4b) yields the self-contradictory conditions

\Delta_{02}(|a_{1}|,0)>0\qquad\textrm{and}\qquad\Delta_{02}(|a_{1}|,0)<0\,.

(7.6)

These conditions imply that for $a_{3}=0$ , the series does not converge for any $a_{1}$ , and in particular the origin $(a_{1},a_{3})=(0,0)$ is not in the domain of convergence, which is false. For $a_{3}=0$ , eq. (7.5) reduces to the quadratic $1+a_{1}x+x^{2}=0$ and the series solution converges for $4-|a_{1}|^{2}>0$ or $|a_{1}|<2$ . We now show that the error in eq. (7.4b) is fundamental and cannot be explained as a misprint in [36].

First, the expressions for the discriminants are, for all four $\pm$ sign assignments $(\pm|a_{1}|,\pm|a_{3}|)$ ,


$\displaystyle\Delta_{02}(|a_{1}|,|a_{3}|)$	$\displaystyle=27|a_{3}|^{2}+4|a_{1}|^{3}|a_{3}|+4-18|a_{1}||a_{3}|-|a_{1}|^{2}\,,$	(7.7a)
$\displaystyle\Delta_{02}(|a_{1}|,-|a_{3}|)$	$\displaystyle=27|a_{3}|^{2}-4|a_{1}|^{3}|a_{3}|+4+18|a_{1}||a_{3}|-|a_{1}|^{2}\,,$	(7.7b)
$\displaystyle\Delta_{02}(-|a_{1}|,|a_{3}|)$	$\displaystyle=\Delta_{02}(|a_{1}|,-|a_{3}|)\,,$	(7.7c)
$\displaystyle\Delta_{02}(-|a_{1}|,-|a_{3}|)$	$\displaystyle=\Delta_{02}(|a_{1}|,|a_{3}|)\,.$	(7.7d)

Hence there are only two independent expressions, viz. $\Delta_{02}(|a_{1}|,|a_{3}|)$ and $\Delta_{02}(|a_{1}|,-|a_{3}|)$ . Hence the problem with eq. (7.4b) cannot be explained as a misprint in the assignment of $\pm$ signs for $\pm|a_{1}|$ and/or $\pm|a_{3}|$ .

Putting $a_{1}=a_{3}=0$ yields $\Delta_{02}(0,0)=4$ , i.e. a positive number. Let us therefore tentatively reverse the second inequality in eq. (7.4b) as follows

\mathcal{D}_{02}\ =?\ \{\Delta_{02}(|a_{1}|,|a_{3}|)>0\}\cap\{\Delta_{02}(|a_{1}|,-|a_{3}|)>0\}\,.

(7.8)

Putting $a_{1}=0$ yields $\Delta_{02}(0,|a_{3}|)=\Delta_{02}(0,-|a_{3}|)=27|a_{3}|^{2}+4$, i.e. both expressions are equal and positive definite, so eq. (7.8) is satisfied for all $a_{3}$. Hence, if eq. (7.8) is taken seriously, it implies that for $a_{1}=0$, the series solution of eq. (7.5) converges absolutely for all $a_{3}$. We know this is false. If $a_{1}=0$, then eq. (7.5) reduces to the trinomial equation $1+x^{2}+a_{3}x^{3}=0$. We proved in Sec. 6 that for such a situation the series solution converges absolutely only for $|a_{3}|^{2}\leq 4/27$.

Hence there is no assignment of $\pm$ signs for $\pm|a_{1}|$ and/or $\pm|a_{3}|$ , nor any reversal of the inequalities in eq. (7.4b), which leads to a correct formula for the domain of convergence $\mathcal{D}_{02}$ . The error in eq. (7.4b) cannot be explained as a misprint in [36]. A more careful treatment is therefore required, and will be presented in the next section. We shall also deal with the other cases $\mathcal{D}_{13}$ and $\mathcal{D}_{23}$ in Sec. 8.
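The two counterexamples above are easy to reproduce numerically. The sketch below transcribes eq. (7.7a) and evaluates the printed condition (7.4b) and the tentatively reversed condition (7.8); the function names are illustrative assumptions, and the sign convention of the discriminant is the one inferred from eq. (7.7).

```python
def delta02(a1, a3):
    """Discriminant of 1 + a1*x + x^2 + a3*x^3 with signed real arguments,
    in the sign convention of eq. (7.7)."""
    return 27 * a3**2 + 4 * a1**3 * a3 + 4 - 18 * a1 * a3 - a1**2

def in_D02_as_printed(a1, a3):
    """Condition (7.4b) exactly as quoted from [36]."""
    A1, A3 = abs(a1), abs(a3)
    return delta02(A1, A3) > 0 and delta02(A1, -A3) < 0

def in_D02_reversed(a1, a3):
    """Condition (7.8): the second inequality of (7.4b) reversed."""
    A1, A3 = abs(a1), abs(a3)
    return delta02(A1, A3) > 0 and delta02(A1, -A3) > 0

# (7.4b) excludes the origin; (7.8) accepts a1 = 0 with |a3| far above sqrt(4/27).
print(in_D02_as_printed(0.0, 0.0), in_D02_reversed(0.0, 10.0))
```

Neither outcome is compatible with the known convergence bounds $|a_{1}|<2$ for $a_{3}=0$ and $|a_{3}|^{2}\leq 4/27$ for $a_{1}=0$.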

8 Algebraic equations: convergence of series II

8.1 Revised formalism

We present a more careful analysis of the problem of the domain of absolute convergence of the series solution of an algebraic equation below. To make the exposition self-contained, we begin from scratch, although we shall attempt to minimize repetition of material already presented earlier in this paper. The original polynomial is

\mathscr{P}(x)=a_{0}+a_{1}x+\cdots+a_{n}x^{n}\,.

(8.1)

For the purposes of determining domains of convergence, we assume all the coefficients are nonzero in general. The specialization to cases such as a trinomial is obvious. We fix two integers $p$ and $q$ such that $0\leq p<q\leq n$ and derive a transformed polynomial, whose roots are proportional to those of $\mathscr{P}(x)$ . Employing (nonzero) constants $\lambda$ and $\mu$ , where $x=\mu y$ , we obtain

\mathscr{P}_{pq}(y)\equiv\lambda\mathscr{P}(\mu y)=b_{0}+b_{1}y+\cdots+y^{p}+\cdots+y^{q}+\cdots+b_{n}y^{n}\,.

(8.2)

Here $b_{j}=a_{j}/(a_{p}^{1-\mu_{j}}a_{q}^{\mu_{j}})$ , where $\mu_{j}=(j-p)/(q-p)$ . Then $b_{p}=b_{q}=1$ , by construction. For brevity below, we define the tuples $\bm{a}=(a_{0},\dots,a_{n})$ and $\bm{b}=(b_{0},\dots,[p],\dots,[q],\dots,b_{n})$ . Then $\bm{a}$ and $\bm{b}$ contain respectively $n+1$ and $n-1$ components. We solve for a root $y_{m}$ where $\mathscr{P}_{pq}(y_{m})=0$ . Note that $y_{m}$ also depends on $p$ and $q$ , but we omit this for brevity. We express $y_{m}$ as a multivariate power series in the $n-1$ scaled coefficients $b_{j}$ , where $j\in\mathscr{N}_{npq}$ . Recall $\mathscr{N}_{npq}=\{0,1,\dots,n\}\setminus\{p,q\}$ . We saw previously that this procedure yields $q-p$ roots, so $m=0,\dots,q-p-1$ . We know the domain of absolute convergence of the resulting power series for $y_{m}$ includes a nonempty open neighborhood of the origin $\bm{0}_{pq}$ , where $b_{j}=0$ for all $j\in\mathscr{N}_{npq}$ . We also know the domain of absolute convergence is a complete Reinhardt domain and depends only on the amplitudes $|b_{j}|$ , i.e. the domain is the same for all the $q-p$ roots $y_{m}$ .
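The scaling in eq. (8.2) can be sketched as follows. The explicit values $\mu=(a_{p}/a_{q})^{1/(q-p)}$ and $\lambda=1/(a_{p}\mu^{p})$ are not stated in the text; they are derived here from the requirement $b_{p}=b_{q}=1$. To avoid any ambiguity in the choice of branch for the fractional powers, this sketch assumes that $a_{p}$ and $a_{q}$ are positive real numbers. The function name `scale_coefficients` is hypothetical.

```python
def scale_coefficients(a, p, q):
    """Return (lam, mu, b) such that lam * P(mu * y) = sum_j b[j] * y**j
    with b[p] = b[q] = 1.  Assumes a[p] > 0 and a[q] > 0 (real)."""
    mu = (a[p] / a[q]) ** (1.0 / (q - p))     # x = mu * y
    lam = 1.0 / (a[p] * mu**p)                # overall normalization
    b = []
    for j, aj in enumerate(a):
        mu_j = (j - p) / (q - p)              # exponent mu_j = (j - p)/(q - p)
        b.append(aj / (a[p] ** (1 - mu_j) * a[q] ** mu_j))
    return lam, mu, b

# Sample cubic (assumed values) with p = 0 and q = 2.
a = [2.0, -3.0, 5.0, 0.5]
lam, mu, b = scale_coefficients(a, 0, 2)
print(b)
```

The entries `b[p]` and `b[q]` equal unity up to rounding, and the remaining entries are the $n-1$ scaled coefficients in which the series is expanded.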

Next note that just as the original polynomial $\mathscr{P}(x)$ can always be transformed to $\mathscr{P}_{pq}(y)$ , the same procedure also transforms the discriminant $\Delta(\bm{a})$ of $\mathscr{P}(x)$ to the scaled discriminant $\Delta_{pq}(\bm{b})$ of $\mathscr{P}_{pq}(y)$ . Then $\Delta_{pq}(\bm{b})=\lambda^{2(n-1)}\mu^{n(n-1)}\Delta(\bm{a})$ . Thus far, the argument is correct. It is, however, false to conclude that the boundary of the domain of absolute convergence is determined by the scaled discriminants given by $\Delta_{pq}(\pm|b_{0}|,\dots,[p],\dots,[q],\dots,\pm|b_{n}|)$ , specifically, by solving for the hypersurfaces given by $\Delta_{pq}(\pm|b_{0}|,\dots,[p],\dots,[q],\dots,\pm|b_{n}|)=0$ . For example, we saw above that this led to erroneous results for the domains of convergence for the series solutions of a cubic.

The weak point is that there are additional discriminants, which are also required to determine the boundary of the domain of absolute convergence. To see this, let us review the key steps. We employ fresh notation to avoid confusion with the above symbols. For brevity below, define an $(n-1)$ -tuple of $\pm$ signs

\bm{\sigma}=(\sigma_{0},\dots,[p],\dots,[q],\dots,\sigma_{n})\,.

(8.3)

Here $\sigma_{j}=\pm 1$ for $j\in\mathscr{N}_{npq}$ . The dependence of $\bm{\sigma}$ and $\sigma_{j}$ on $p$ and $q$ is taken as understood. We also define the set $\Sigma_{pq}$ of all the distinct tuples $\bm{\sigma}$ . Then $\Sigma_{pq}$ has cardinality $2^{n-1}$ . The power series solutions for the roots of the following $2^{n+1}$ algebraic equations all have the same domain of absolute convergence

\sigma_{0}|b_{0}|+\sigma_{1}|b_{1}|y+\cdots\pm y^{p}+\cdots\pm y^{q}+\cdots+\sigma_{n}|b_{n}|y^{n}=0\,.

(8.4)

The coefficient of $y^{j}$ is permitted to be $\pm|b_{j}|$ only. We can always divide through by $-1$ , if necessary, so that the coefficient of $y^{p}$ is unity. This yields $2^{n}$ distinct equations. The discriminant of the associated polynomial is $\Delta(\sigma_{0}|b_{0}|,\cdots,1,\cdots,\pm 1,\cdots,\sigma_{n}|b_{n}|)$ . Let us now define the following two families (or sets) of discriminants. We employ the symbol $\Psi$ to avoid confusion with $\Delta_{pq}$ above. Then, with $1$ in the $p^{th}$ slot and $\pm 1$ in the $q^{th}$ slot, we define


$\displaystyle\Psi^{+}_{pq}(\bm{b},\bm{\sigma})$	$\displaystyle=\Delta(\sigma_{0}|b_{0}|,\cdots,1,\cdots,\phantom{-}1,\cdots,\sigma_{n}|b_{n}|)\,,$	(8.5a)
$\displaystyle\Psi^{-}_{pq}(\bm{b},\bm{\sigma})$	$\displaystyle=\Delta(\sigma_{0}|b_{0}|,\cdots,1,\cdots,-1,\cdots,\sigma_{n}|b_{n}|)\,,$	(8.5b)
$\displaystyle\Psi^{+}_{pq}(\bm{b})$	$\displaystyle=\{\Psi^{+}_{pq}(\bm{b},\bm{\sigma})\,|\,\bm{\sigma}\in\Sigma_{pq}\}\,,$	(8.5c)
$\displaystyle\Psi^{-}_{pq}(\bm{b})$	$\displaystyle=\{\Psi^{-}_{pq}(\bm{b},\bm{\sigma})\,|\,\bm{\sigma}\in\Sigma_{pq}\}\,.$	(8.5d)

Each set has at most $2^{n-1}$ distinct elements. The following lemma shows that the family $\Psi_{pq}^{-}(\bm{b})$ is nontrivial.

Lemma 8.1.

If $q-p$ is odd, the sets $\Psi^{+}_{pq}(\bm{b})$ and $\Psi^{-}_{pq}(\bm{b})$ are identical. If $q-p$ is even, the sets $\Psi^{+}_{pq}(\bm{b})$ and $\Psi^{-}_{pq}(\bm{b})$ are disjoint.

Proof.

Introduce two additional tuples $\bm{\sigma}^{\prime}$ where $\sigma_{j}^{\prime}=(-1)^{j}\sigma_{j}$ and $-\bm{\sigma}^{\prime}$ where obviously the components are $-(-1)^{j}\sigma_{j}$ . Then consider the polynomials

P_{\pm}(y,\bm{\sigma})=\sigma_{0}|b_{0}|+\sigma_{1}|b_{1}|y+\cdots+y^{p}+\cdots\pm y^{q}+\cdots+\sigma_{n}|b_{n}|y^{n}\,.

(8.6)

The only permitted transformations of $P_{+}$ and $P_{-}$ are to reverse the sign of $y$ and/or to multiply $P_{\pm}$ by $-1$ , because the coefficient of $y^{j}$ must be $\pm|b_{j}|$ only. By construction, the discriminant of $P_{+}(y,\bm{\sigma})$ is an element of $\Psi^{+}_{pq}(\bm{b})$ and that of $P_{-}(y,\bm{\sigma})$ is an element of $\Psi^{-}_{pq}(\bm{b})$ . First suppose $q-p$ is odd. If $p$ is even and $q$ is odd, then

\begin{split}P_{\pm}(-y,\bm{\sigma})&=\sigma_{0}|b_{0}|-\sigma_{1}|b_{1}|y+\cdots+y^{p}+\cdots\mp y^{q}+\cdots+(-1)^{n}\sigma_{n}|b_{n}|y^{n}\\ &=P_{\mp}(y,\bm{\sigma}^{\prime})\,.\end{split}

(8.7)

If $p$ is odd and $q$ is even, then

\begin{split}-P_{\pm}(-y,\bm{\sigma})&=-\sigma_{0}|b_{0}|+\sigma_{1}|b_{1}|y+\cdots+y^{p}+\cdots\mp y^{q}+\cdots+(-1)^{n+1}\sigma_{n}|b_{n}|y^{n}\\ &=P_{\mp}(y,-\bm{\sigma}^{\prime})\,.\end{split}

(8.8)

Cycling through all values of $\bm{\sigma}$ shows that the sets $\Psi^{+}_{pq}(\bm{b})$ and $\Psi^{-}_{pq}(\bm{b})$ are identical, if $q-p$ is odd. Now suppose $q-p$ is even. Suppose $p$ and $q$ are both even. Then

\begin{split}P_{\pm}(-y,\bm{\sigma})&=\sigma_{0}|b_{0}|-\sigma_{1}|b_{1}|y+\cdots+y^{p}+\cdots\pm y^{q}+\cdots+(-1)^{n}\sigma_{n}|b_{n}|y^{n}\\ &=P_{\pm}(y,\bm{\sigma}^{\prime})\,.\end{split}

(8.9)

The discriminant of $P_{+}(-y,\bm{\sigma})$ is an element of $\Psi^{+}_{pq}(\bm{b})$ . The other transformations $-P_{+}(y,\bm{\sigma})$ and $-P_{+}(-y,\bm{\sigma})$ also fail to yield a polynomial with a discriminant which is an element of $\Psi^{-}_{pq}(\bm{b})$ . Similarly the discriminant of $\pm P_{-}(\pm y,\bm{\sigma})$ is always an element of $\Psi^{-}_{pq}(\bm{b})$ . Next suppose $p$ and $q$ are both odd. Then

\begin{split}-P_{\pm}(-y,\bm{\sigma})&=-\sigma_{0}|b_{0}|+\sigma_{1}|b_{1}|y+\cdots+y^{p}+\cdots\pm y^{q}+\cdots+(-1)^{n+1}\sigma_{n}|b_{n}|y^{n}\\ &=P_{\pm}(y,-\bm{\sigma}^{\prime})\,.\end{split}

(8.10)

As was the case when $p$ and $q$ were both even, the discriminant of $\pm P_{+}(\pm y,\bm{\sigma})$ is always an element of $\Psi^{+}_{pq}(\bm{b})$ and the discriminant of $\pm P_{-}(\pm y,\bm{\sigma})$ is always an element of $\Psi^{-}_{pq}(\bm{b})$ . Hence if $q-p$ is even, the sets $\Psi^{+}_{pq}(\bm{b})$ and $\Psi^{-}_{pq}(\bm{b})$ are disjoint. ∎
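Lemma 8.1 can be spot-checked for a cubic ($n=3$) by brute force over all sign tuples. The sketch below is an illustration, not a proof: it uses one arbitrarily chosen pair of rational amplitudes, and the helper names `cubic_disc` and `psi_sets` are hypothetical. The sign convention of the cubic discriminant is the one that reproduces eq. (7.7), which is an assumption about the normalization of $\Delta$.

```python
from fractions import Fraction
from itertools import product

def cubic_disc(a0, a1, a2, a3):
    """Discriminant of a0 + a1*y + a2*y^2 + a3*y^3, normalized so that
    cubic_disc(1, a1, 1, a3) reproduces eq. (7.7a)."""
    return (27 * a3**2 * a0**2 + 4 * a3 * a1**3 + 4 * a2**3 * a0
            - 18 * a3 * a2 * a1 * a0 - a2**2 * a1**2)

def psi_sets(p, q, amplitudes):
    """Return the sets (Psi^+, Psi^-) of eqs. (8.5c)-(8.5d) for a cubic.
    `amplitudes` lists |b_j| for the two indices j not in {p, q}."""
    free = [j for j in range(4) if j not in (p, q)]
    plus, minus = set(), set()
    for signs in product((1, -1), repeat=len(free)):
        coeffs = [0, 0, 0, 0]
        for j, s, amp in zip(free, signs, amplitudes):
            coeffs[j] = s * amp
        coeffs[p] = 1                          # coefficient of y^p fixed to +1
        for target, sign_q in ((plus, 1), (minus, -1)):
            coeffs[q] = sign_q                 # coefficient of y^q is +1 or -1
            target.add(cubic_disc(*coeffs))
    return plus, minus

amps = (Fraction(1, 3), Fraction(1, 5))        # sample amplitudes (assumed)
for p, q in [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]:
    plus, minus = psi_sets(p, q, amps)
    print(p, q, plus == minus, plus.isdisjoint(minus))
```

Exact rational arithmetic is used so that the set comparisons are not affected by rounding.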

Armed with this additional information, we return to eq. (7.5) and the case $p=0$ and $q=2$ . There are four distinct discriminants which can contribute to the boundary of the domain of convergence, viz.


$\displaystyle\Psi^{+}_{02}(|a_{1}|,|a_{3}|)$	$\displaystyle=27|a_{3}|^{2}+4|a_{1}|^{3}|a_{3}|+4-18|a_{1}||a_{3}|-|a_{1}|^{2}\,,$	(8.11a)
$\displaystyle\Psi^{+}_{02}(|a_{1}|,-|a_{3}|)$	$\displaystyle=27|a_{3}|^{2}-4|a_{1}|^{3}|a_{3}|+4+18|a_{1}||a_{3}|-|a_{1}|^{2}\,,$	(8.11b)
$\displaystyle\Psi^{-}_{02}(|a_{1}|,|a_{3}|)$	$\displaystyle=27|a_{3}|^{2}+4|a_{1}|^{3}|a_{3}|-4+18|a_{1}||a_{3}|-|a_{1}|^{2}\,,$	(8.11c)
$\displaystyle\Psi^{-}_{02}(|a_{1}|,-|a_{3}|)$	$\displaystyle=27|a_{3}|^{2}-4|a_{1}|^{3}|a_{3}|-4-18|a_{1}||a_{3}|-|a_{1}|^{2}\,.$	(8.11d)

The expressions for $\Psi^{+}_{02}(|a_{1}|,\pm|a_{3}|)$ are the same as for $\Delta_{02}(|a_{1}|,\pm|a_{3}|)$ in eq. (7.7). Observe that $\Psi^{+}_{02}(0,0)=4$ and $\Psi^{-}_{02}(0,0)=-4$. If we set $|a_{1}|=0$ then $\Psi^{-}_{02}(0,\pm|a_{3}|)=27|a_{3}|^{2}-4$, which yields the correct upper bound for $|a_{3}|$. If we set $|a_{3}|=0$ then $\Psi^{+}_{02}(|a_{1}|,0)=4-|a_{1}|^{2}$, which yields the correct upper bound for $|a_{1}|$. The correct answer requires both $\Psi^{+}_{02}$ and $\Psi^{-}_{02}$. Some further algebra yields the correct expression for the domain of convergence to be $\mathcal{D}_{02}=\{\Psi^{+}_{02}(|a_{1}|,|a_{3}|)\geq 0\}\cap\{\Psi^{-}_{02}(|a_{1}|,|a_{3}|)\leq 0\}$. Note that the series converges on the boundary of its domain of convergence.

The corrected expressions for the domains of absolute convergence for the series solutions for the roots of a cubic are as follows. For clarity, we distinguish between the coefficients $a_{0},\dots,a_{3}$ of the original cubic in eq. (7.3) and the coefficients $b_{j}$ in the scaled polynomial $\mathscr{P}_{pq}(y)$. Recall that '$b_{j}$' depends also on $p$ and $q$ but this is considered to be understood. The domains of convergence $\mathcal{D}_{pq}$ are given by


$\displaystyle\mathcal{D}_{01}$	$\displaystyle=\{\Psi^{+}_{01}(|b_{2}|,-|b_{3}|)\leq 0\}\,,$	(8.12a)
$\displaystyle\mathcal{D}_{02}$	$\displaystyle=\{\Psi^{+}_{02}(|b_{1}|,|b_{3}|)\geq 0\}\cap\{\Psi^{-}_{02}(|b_{1}|,|b_{3}|)\leq 0\}\,,$	(8.12b)
$\displaystyle\mathcal{D}_{03}$	$\displaystyle=\{\Psi^{+}_{03}(-|b_{1}|,-|b_{2}|)\geq 0\}\,,$	(8.12c)
$\displaystyle\mathcal{D}_{12}$	$\displaystyle=\{\Psi^{+}_{12}(|b_{0}|,-|b_{3}|)\leq 0\}\cap\{\Psi^{+}_{12}(-|b_{0}|,|b_{3}|)\leq 0\}\,,$	(8.12d)
$\displaystyle\mathcal{D}_{13}$	$\displaystyle=\{\Psi^{+}_{13}(|b_{0}|,|b_{2}|)\geq 0\}\cap\{\Psi^{-}_{13}(|b_{0}|,|b_{2}|)\leq 0\}\,,$	(8.12e)
$\displaystyle\mathcal{D}_{23}$	$\displaystyle=\{\Psi^{+}_{23}(-|b_{0}|,|b_{1}|)\leq 0\}\,.$	(8.12f)

The series converge on the boundaries of their respective domains of convergence. For the case $\mathcal{D}_{23}$, note that $q-p=1$ is odd, hence $\Psi^{+}_{pq}(\bm{b})$ and $\Psi^{-}_{pq}(\bm{b})$ are identical, so the solution is expressed using purely $\Psi^{+}_{23}(\bm{b})$. We can therefore consider the expression in eq. (7.4f) to be a misprint. For the cases $\mathcal{D}_{02}$ and $\mathcal{D}_{13}$, where $q-p=2$ is even, the need for $\Psi^{-}_{pq}(\bm{b})$ is essential.
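As a usage sketch, the membership test for $\mathcal{D}_{02}$ in eq. (8.12b) can be transcribed literally as below. The function names are illustrative; `psi02` combines eqs. (8.11a) and (8.11c) through a sign argument `s`, which is a convenience introduced here and not notation from the text.

```python
def psi02(B1, B3, s):
    """Discriminant of 1 + B1*y + s*y^2 + B3*y^3 in the convention of eq. (8.11);
    s = +1 gives Psi^+_02 (8.11a) and s = -1 gives Psi^-_02 (8.11c)."""
    return 27 * B3**2 + 4 * B1**3 * B3 + 4 * s - 18 * s * B1 * B3 - B1**2

def in_D02(b1, b3):
    """Literal transcription of eq. (8.12b); only the amplitudes |b1|, |b3| enter."""
    B1, B3 = abs(b1), abs(b3)
    return psi02(B1, B3, +1) >= 0 and psi02(B1, B3, -1) <= 0

# On the axes this reproduces the known bounds |b1| <= 2 and |b3|^2 <= 4/27.
print(in_D02(2, 0), in_D02(2.1, 0), in_D02(0, 0.38), in_D02(0, 0.39))
```

Because the test depends only on amplitudes, complex coefficients with the same moduli give the same answer, in accordance with the Reinhardt property.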

There is an additional caveat, which is that the formula for the domain of convergence cannot always be expressed using purely inequalities. Consider the quartic equation with no term in $x^{3}$

a_{0}+a_{1}x+x^{2}+x^{4}=0\,.

(8.13)

We choose $p=2$ and $q=4$ and we have set $a_{2}=a_{4}=1$ so that the scaled coefficients are simply $b_{j}=a_{j}$ . The boundary of the domain of convergence in this case is determined solely by $\Psi^{-}_{24}(-|a_{0}|,|a_{1}|)$ . The derivation is omitted. Then

\Psi_{24}^{-}(-|a_{0}|,|a_{1}|)=16|a_{0}|(1-4|a_{0}|)^{2}+4(1-36|a_{0}|)|a_{1}|^{2}-27|a_{1}|^{4}\,.

(8.14)

1.

The discriminant vanishes at the origin: $\Psi_{24}^{-}(0,0)=0$ . The significance of this will be discussed below. For now we seek nonzero solutions of the equation $\Psi_{24}^{-}(-|a_{0}|,|a_{1}|)=0$ .
2.

Put $a_{1}=0$ , then $\Psi^{-}_{24}(-|a_{0}|,0)=16|a_{0}|(1-4|a_{0}|)^{2}$ . This is (proportional to) a perfect square, which equals zero at $|a_{0}|=\frac{1}{4}$ . The necessary upper bound on $|a_{0}|$ for this problem is known to be $|a_{0}|\leq\frac{1}{4}$ .
3.

Next put $a_{0}=0$ , then $\Psi^{-}_{24}(0,|a_{1}|)=(4-27|a_{1}|^{2})|a_{1}|^{2}$ . For nonzero $a_{1}$ , this vanishes at $|a_{1}|=\sqrt{4/27}$ . The necessary upper bound on $|a_{1}|$ for this problem is known to be $|a_{1}|\leq\sqrt{4/27}$ .
4.

Next let us put $|a_{0}|=a$ and $|a_{1}|=\frac{1}{2}a$, where $a\in\mathbb{R}_{+}$. The graph of $\Psi^{-}_{24}(-a,\frac{1}{2}a)$ is plotted against $a$ in Fig. 1. As $a$ increases from zero, the value starts at $\Psi^{-}_{24}(0,0)=0$, becomes positive, reaches a maximum, changes sign and becomes negative, reaches a minimum, and then becomes positive again and continues to increase over the range of interest. (Explicitly, $\Psi^{-}_{24}(-a,\frac{1}{2}a)=16a-127a^{2}+220a^{3}-\frac{27}{16}a^{4}$, so the quartic term eventually drives the value to $-\infty$, but only for $a$ of order $10^{2}$, far beyond the bound $|a_{0}|\leq\frac{1}{4}$.) The value of $\Psi^{-}_{24}(-a,\frac{1}{2}a)$ is thus not monotonic in $a$.
5.

All of the above facts demonstrate that an unconditional inequality $\Psi^{-}_{24}(-|a_{0}|,|a_{1}|)\geq 0$ is insufficient to determine the domain of convergence. First, the discriminant vanishes at the origin. We need to exclude the origin as a solution, because we know the domain of convergence has positive measure. Even after doing so, we require an additional stipulation “the domain of convergence includes only the component which satisfies the inequality and is connected to the origin.”
6.

It is implicit in [36, Thm. 3] that the domain of convergence includes only the component connected to the origin. What is not clear is that the formula for the domain of convergence cannot always be expressed using only unconditional inequalities on the values of the discriminants. The stipulation “the component connected to the origin” is necessary.
7.

We remark in passing that for this problem, the domain of convergence is determined solely by a discriminant of the form $\Psi^{-}_{pq}$ . A discriminant of the form $\Psi^{+}_{pq}$ , i.e. $\Delta_{pq}$ in the formalism in [36], does not appear.
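The sign changes described in the list above can be reproduced with a few evaluations of eq. (8.14). The sketch below is only a numerical illustration along the ray $|a_{1}|=\frac{1}{2}|a_{0}|$; the function names and the sample values of $a$ are assumptions.

```python
import math

def psi24_minus(A0, A1):
    """Right-hand side of eq. (8.14), with A0 = |a0| and A1 = |a1|."""
    return 16 * A0 * (1 - 4 * A0) ** 2 + 4 * (1 - 36 * A0) * A1**2 - 27 * A1**4

def sign(x):
    """Return -1, 0 or +1 according to the sign of x."""
    return (x > 0) - (x < 0)

# Sample points on the ray |a0| = a, |a1| = a/2.
samples = [0.0, 0.1, 0.25, 0.5]
signs = [sign(psi24_minus(a, a / 2)) for a in samples]
print(signs)
```

The discriminant vanishes at the origin, is positive at $a=0.1$, negative at $a=0.25$ and positive again at $a=0.5$, so an unconditional inequality on its value selects more than one region along this ray.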

One source of the difficulty is that if $a_{1}=0$ in eq. (8.13), it becomes an algebraic equation in $x^{2}$ , viz. $a_{0}+x^{2}+x^{4}=0$ . For a polynomial of degree $n$ such as in eq. (8.1), the discriminant of $\mathscr{P}(x^{m})$ , for a positive integer $m$ , is given by

\Delta(\mathscr{P}(x^{m}))=(-1)^{nm(m-1)/2}m^{mn}(a_{0}a_{n})^{m-1}\bigl{(}\Delta(\mathscr{P}(x))\bigr{)}^{m}\,.

(8.15)

Hence if $m$ is even, the discriminant $\Delta(\mathscr{P}(x^{m}))$ will not change sign as the (absolute values of the) coefficients are varied. Hence, in general, an unconditional inequality on the value of the discriminant(s) is insufficient to determine the domain of convergence. This feature will occur generically (or at least, cannot be ruled out) for a quartic and algebraic equations of all higher degrees, for example if the coefficients of all the odd powers of $x$ are set to zero.

8.2 General formula

We have seen that the formalism in [36] must be augmented by the inclusion of an extra set of discriminants. Although this yields the correct result for a cubic, as in eq. (8.12), the procedure in [36] becomes tedious for polynomials of high degree, and we have seen that it is prone to error. We seek a procedure that yields a single ‘general formula’ valid for arbitrary $n$ , which is simpler to state and to compute, for practical work. This can be accomplished via the use of hyperplanes and foliations, as will be explained below. (N.B. the word ‘single’ was employed informally above; we shall require at least two formulas.)

Still speaking informally, given an algebraic equation of degree $n$ with a coefficient tuple $\bm{a}$ and a choice for $p$ and $q$ , hence a scaled tuple $\bm{b}$ , the equations $\Psi^{+}_{pq}(\bm{b},\bm{\sigma})=0$ and $\Psi^{-}_{pq}(\bm{b},\bm{\sigma})=0$ , taken over all $\bm{\sigma}\in\Sigma_{pq}$ , specify a set of hyperplanes in the amplitudes $|b_{j}|$ . The domain of convergence in $\mathbb{R}_{+}^{n-1}$ is given by the set of hyperplanes closest to the origin and which together bound a region which is connected to the origin $\bm{0}_{pq}$ . The domain of convergence for $\bm{b}\in\mathbb{C}^{n-1}$ is the inverse image of the above domain in $\mathbb{R}_{+}^{n-1}$ . The domain of absolute convergence is clearly unique. If there were two or more sets of such hyperplanes, the full domain of absolute convergence would simply be the union of the individual domains. However, one reason the above discussion is informal is that we saw that the discriminant can vanish at the origin. Hence to write an equation such as ‘ $\Psi^{\pm}_{pq}(\bm{b},\bm{\sigma})=0$ ’ is not precise enough for our needs.

We now sharpen the above ideas. Clearly, the domain of absolute convergence is determined solely by the amplitudes $|b_{j}|$, $j\in\mathscr{N}_{npq}$. We previously denoted the domain of absolute convergence by $\mathcal{D}$ and introduced its image $\textrm{Log}(\mathcal{D})$. Here we define a second image via an 'amplitude map' $\mathbb{C}^{k}\to\mathbb{R}_{+}^{k}$ where $z_{j}\mapsto|z_{j}|$ for $j=1,\dots,k$:

\mathscr{D}=\{(|z_{1}|,\dots,|z_{k}|)\,|\,\bm{z}\in\mathcal{D}\}\,.

(8.16)

From the previous discussion of polydiscs and a complete Reinhardt domain, $\bm{z}\in\mathcal{D}$ if and only if $(|z_{1}|,\dots,|z_{k}|)\in\mathscr{D}$ . Clearly also $\textrm{Log}(\mathscr{D})=\textrm{Log}(\mathcal{D})$ . Our interest is the case of an algebraic equation of degree $n$ , so $k=n-1$ and $\bm{z}=\bm{b}$ . We shall derive a formula to determine the domain $\mathscr{D}$ in this case. Obviously $\bm{0}_{pq}\in\mathscr{D}$ . Recall that for an algebraic equation of degree $n$ and fixed $p,q$ , then $\mu_{j}=(j-p)/(q-p)$ . From eq. (3.4), let us define, for algebraic equations,

\hat{b}_{j}=\frac{1}{|\mu_{j}|^{\mu_{j}}|1-\mu_{j}|^{1-\mu_{j}}}\,.

(8.17)

Recall one must have $|b_{j}|\leq\hat{b}_{j}$ for $j\in\mathscr{N}_{npq}$ . It follows that $\mathscr{D}\subset\mathscr{\hat{D}}$ where the ‘hypercuboid’ is

\mathscr{\hat{D}}=\biggl{\{}(|b_{0}|,\dots,[p],\dots,[q],\dots,|b_{n}|)\;\biggl{|}\;|b_{j}|\leq\hat{b}_{j},\,j\in\mathscr{N}_{npq}\biggr{\}}\,.

(8.18)
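The per-coefficient bounds $\hat{b}_{j}$ of eq. (8.17) are straightforward to tabulate. The following sketch (the function name `b_hat` is an assumption) evaluates them for the two examples treated above, namely the cubic with $p=0$, $q=2$ and the quartic with $p=2$, $q=4$.

```python
import math

def b_hat(j, p, q):
    """Bound of eq. (8.17) for an index j different from p and q."""
    mu_j = (j - p) / (q - p)
    return 1.0 / (abs(mu_j) ** mu_j * abs(1.0 - mu_j) ** (1.0 - mu_j))

# Cubic 1 + b1*y + y^2 + b3*y^3 (p = 0, q = 2): bounds on |b1| and |b3|.
print(b_hat(1, 0, 2), b_hat(3, 0, 2))
# Quartic a0 + a1*x + x^2 + x^4 (p = 2, q = 4): bounds on |a0| and |a1|.
print(b_hat(0, 2, 4), b_hat(1, 2, 4))
```

These values agree with the bounds quoted earlier, i.e. $2$ and $\sqrt{4/27}$ for the cubic, and $\frac{1}{4}$ and $\sqrt{4/27}$ for the quartic.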

The following $n-1$ vertices of the hypercuboid lie in the domain of convergence, viz. $(\hat{b}_{0},0,\dots,0)$ , $(0,\hat{b}_{1},0,\dots,0)$ , …, $(0,\dots,\hat{b}_{n})$ . We also know that $\mathscr{D}$ has positive measure and $\mathscr{D}\supset\mathscr{\check{D}}$ , where

\mathscr{\check{D}}=\biggl{\{}(|b_{0}|,\dots,[p],\dots,[q],\dots,|b_{n}|)\,\biggl{|}\,\sum_{j\in\mathscr{N}_{npq}}|b_{j}|\leq\frac{1}{|\mu_{*}|^{\mu_{*}}|1-\mu_{*}|^{1-\mu_{*}}}\biggr{\}}\,.

(8.19)

Recall eq. (3.8) and the definition of $\mu_{*}$ . We require the following lemma.

Lemma 8.2.

At the origin, exactly one of the two following mutually exclusive possibilities is true: (i) $\Psi_{pq}^{+}(\bm{0}_{pq})=\Psi_{pq}^{-}(\bm{0}_{pq})=0$ , or (ii) $\Psi_{pq}^{+}(\bm{0}_{pq})=\pm\Psi_{pq}^{-}(\bm{0}_{pq})\neq 0$ . (Explicit mention of $\bm{\sigma}$ has been omitted since it is irrelevant at the origin.)

Proof.

The values of $\Psi_{pq}^{+}(\bm{0}_{pq})$ and $\Psi_{pq}^{-}(\bm{0}_{pq})$ can be nonzero if and only if the unscaled discriminant $\Delta(a_{0},\dots,a_{n})$ contains a term of the form $c_{pq}a_{p}^{\alpha}a_{q}^{\beta}$ for some coefficient $c_{pq}$ and exponents $\alpha$ and $\beta$. From the homogeneity properties of the discriminant, we must have $\alpha+\beta=2n-2$ and $p\alpha+q\beta=n(n-1)$. Hence, given $p$ and $q$, then $\alpha$ and $\beta$ are uniquely determined, so there is at most one monomial of this form in the discriminant. We say 'at most one' because $\alpha=(n-1)(2q-n)/(q-p)$ and $\beta=(n-1)(n-2p)/(q-p)$ and these values may not be integers. Even if they are integers, the relevant monomial may not appear in the discriminant. After scaling, this term (if it exists) maps to $c_{pq}b_{p}^{\alpha}b_{q}^{\beta}$. Then at the origin $\bm{0}_{pq}$ we obtain $\Psi_{pq}^{+}(\bm{0}_{pq})=c_{pq}$ and $\Psi_{pq}^{-}(\bm{0}_{pq})=(-1)^{\beta}c_{pq}$. Hence either (i) holds, if $c_{pq}=0$, or else (ii) holds, with $\Psi_{pq}^{+}(\bm{0}_{pq})=c_{pq}=\pm\Psi_{pq}^{-}(\bm{0}_{pq})$. ∎
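The exponents in the proof are easily computed for any $n$, $p$ and $q$. In the sketch below the function names are hypothetical, and the requirement that the exponents be non-negative (in addition to being integers) is an extra condition added here, since a monomial in a polynomial cannot carry a negative power.

```python
from fractions import Fraction

def corner_exponents(n, p, q):
    """Exponents (alpha, beta) of the candidate monomial a_p^alpha a_q^beta,
    solving alpha + beta = 2n - 2 and p*alpha + q*beta = n(n - 1)."""
    alpha = Fraction((n - 1) * (2 * q - n), q - p)
    beta = Fraction((n - 1) * (n - 2 * p), q - p)
    return alpha, beta

def monomial_possible(n, p, q):
    """True if both exponents are non-negative integers.  This is necessary,
    but not sufficient, for the monomial to appear in the discriminant."""
    return all(e.denominator == 1 and e >= 0 for e in corner_exponents(n, p, q))

# Cubic with p = 0, q = 2: the candidate monomial is a_0 * a_2^3.
print(corner_exponents(3, 0, 2), monomial_possible(3, 0, 2))
# Quintic with p = 0, q = 3: the exponents are not integers.
print(corner_exponents(5, 0, 3), monomial_possible(5, 0, 3))
```

For the cubic with $p=0$ and $q=2$ the monomial $a_{0}a_{2}^{3}$ does appear, which is consistent with $\Psi^{\pm}_{02}(0,0)=\pm 4$ in eq. (8.11).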

The two cases (i) and (ii) in Lemma 8.2 require separate treatments. In practice, it is convenient to introduce the notion of a 'reduced' discriminant. If $\Psi_{pq}^{+}(\bm{b},\bm{\sigma})$ contains a common factor, we divide out that common factor. A common factor in a discriminant clearly cannot contribute to the determination of the domain boundary in an equation such as $\Psi_{pq}^{+}(\bm{b},\bm{\sigma})=0$. We denote the reduced discriminant by $\tilde{\Psi}_{pq}^{+}(\bm{b},\bm{\sigma})$. By definition, it does not vanish if any single component $b_{j}$ in $\bm{b}$ is set to zero. Next $\Psi_{pq}^{-}(\bm{b},\bm{\sigma})$ clearly contains the same common factor as $\Psi_{pq}^{+}(\bm{b},\bm{\sigma})$ because flipping $\pm$ signs in the coefficients of the polynomial does not affect common factors in the discriminant. Hence by an obvious analogy we define the reduced discriminant $\tilde{\Psi}_{pq}^{-}(\bm{b},\bm{\sigma})$. We work with $\tilde{\Psi}_{pq}^{+}(\bm{b},\bm{\sigma})$ and $\tilde{\Psi}_{pq}^{-}(\bm{b},\bm{\sigma})$ below. Note that Lemma 8.2 holds true also for the reduced discriminants.

The next key idea is that of foliation. For fixed tuples $\bm{b}$ and $\bm{\sigma}$, the level sets of $\tilde{\Psi}_{pq}^{+}(\bm{b},\bm{\sigma})$ foliate the parameter space $\mathbb{R}_{+}^{n-1}$. The level sets of $\tilde{\Psi}_{pq}^{-}(\bm{b},\bm{\sigma})$ also foliate $\mathbb{R}_{+}^{n-1}$. Hence both families of level sets foliate the domain $\mathscr{\hat{D}}$. For our purposes, the foliation is a mapping $\mathbb{R}_{+}^{n-1}\to\mathbb{R}$, because both $\tilde{\Psi}_{pq}^{+}(\bm{b},\bm{\sigma})$ and $\tilde{\Psi}_{pq}^{-}(\bm{b},\bm{\sigma})$ are real valued.

8.3 $\tilde{\Psi}_{pq}^{\pm}(\bm{b},\bm{\sigma})$ nonzero at origin

We begin with the simpler case (ii) in Lemma 8.2, where $\tilde{\Psi}_{pq}^{+}(\bm{b},\sigma)$ and $\tilde{\Psi}_{pq}^{-}(\bm{b},\sigma)$ are nonzero at the origin. First fix the values of $p$ and $q$ . Then for any tuple $\bm{\sigma}$ , for any $\bm{b}^{\prime}$ whose image is in $\mathscr{\check{D}}$ , both $\tilde{\Psi}_{pq}^{+}(\bm{0}_{pq},\bm{\sigma})\tilde{\Psi}_{pq}^{+}(\bm{b}^{% \prime},\bm{\sigma})>0$ and $\tilde{\Psi}_{pq}^{-}(\bm{0}_{pq},\bm{\sigma})\tilde{\Psi}_{pq}^{-}(\bm{b}^{% \prime},\bm{\sigma})>0$ . To determine the boundary of the domain of convergence, we solve for $\bm{b}_{*}$ where $\tilde{\Psi}_{pq}^{+}(\bm{b}_{*},\bm{\sigma})=0$ or $\tilde{\Psi}_{pq}^{-}(\bm{b}_{*},\bm{\sigma})=0$ for any $\sigma\in\Sigma_{pq}$ . This can be encapsulated in a single formula

\prod_{\bm{\sigma}\in\Sigma_{pq}}\tilde{\Psi}_{pq}^{+}(\bm{b}_{*},\bm{\sigma})\tilde{\Psi}_{pq}^{-}(\bm{b}_{*},\bm{\sigma})=0\,.

(8.20)

The domain of convergence $\mathscr{D}$ is the set connected to the origin, bounded by the hyperplanes which satisfy eq. (8.20). Although technically there are $2^{n}$ discriminants in the product in eq. (8.20), in practice many of them are identical and the number of distinct discriminants is much fewer. However, I do not have a definitive estimate of the number of distinct discriminants. If $q-p$ is odd, we need consider only $\tilde{\Psi}_{pq}^{+}$ and we can simplify eq. (8.20) to

\prod_{\bm{\sigma}\in\Sigma_{pq}}\tilde{\Psi}_{pq}^{+}(\bm{b}_{*},\bm{\sigma})=0\,.

(8.21)

As an illustrative example, consider the quartic equation $a_{0}+x+x^{2}+a_{4}x^{4}=0$ . We choose $p=1$ and $q=2$ and we have set $a_{1}=a_{2}=1$ so that the scaled coefficients are simply $b_{j}=a_{j}$ . Because $q-p=1$ is odd, we require only $\Psi_{pq}^{+}$ . The discriminant has a common factor of $|a_{4}|$ :

\Psi_{12}^{+}(|a_{0}|,|a_{4}|)=|a_{4}|\Bigl(256|a_{0}|^{3}|a_{4}|^{2}-128|a_{0}|^{2}|a_{4}|+144|a_{0}||a_{4}|+16|a_{0}|-27|a_{4}|-4\Bigr)\,.

(8.22)

We divide out the common factor $|a_{4}|$ and obtain the reduced discriminants


$\displaystyle\tilde{\Psi}_{12}^{+}(|a_{0}|,|a_{4}|)$	$\displaystyle=\phantom{-}256|a_{0}|^{3}|a_{4}|^{2}-128|a_{0}|^{2}|a_{4}|+144|a_{0}||a_{4}|+16|a_{0}|-27|a_{4}|-4\,,$	(8.23a)
$\displaystyle\tilde{\Psi}_{12}^{+}(|a_{0}|,-|a_{4}|)$	$\displaystyle=\phantom{-}256|a_{0}|^{3}|a_{4}|^{2}+128|a_{0}|^{2}|a_{4}|-144|a_{0}||a_{4}|+16|a_{0}|+27|a_{4}|-4\,,$	(8.23b)
$\displaystyle\tilde{\Psi}_{12}^{+}(-|a_{0}|,|a_{4}|)$	$\displaystyle=-256|a_{0}|^{3}|a_{4}|^{2}-128|a_{0}|^{2}|a_{4}|-144|a_{0}||a_{4}|-16|a_{0}|-27|a_{4}|-4\,,$	(8.23c)
$\displaystyle\tilde{\Psi}_{12}^{+}(-|a_{0}|,-|a_{4}|)$	$\displaystyle=-256|a_{0}|^{3}|a_{4}|^{2}+128|a_{0}|^{2}|a_{4}|+144|a_{0}||a_{4}|-16|a_{0}|+27|a_{4}|-4\,.$	(8.23d)

The reduced discriminants all equal $-4$ at the origin. The necessary bounds for convergence yield $|a_{0}|\leq\frac{1}{4}$ and $|a_{4}|\leq 4/27$ . Note that the above expressions are quadratics in $|a_{4}|$ . Thus to solve for $\Psi^{+}_{12}(\pm|a_{0}|,\pm|a_{4}|)=0$ , we fix a value of $|a_{0}|$ and solve the resulting quadratic in $|a_{4}|$ . Note that this procedure will not always yield a real solution for $|a_{4}|$ ; the discriminants which fail to do so do not contribute to the boundary of the domain of convergence. The discriminant $\tilde{\Psi}_{12}^{+}(-|a_{0}|,|a_{4}|)$ is such a case. Setting the other discriminants to zero yields valid hyperplanes. The resulting curves in the $(|a_{0}|,|a_{4}|)$ parameter space are displayed in Fig. 2, for $\Psi^{+}_{12}(|a_{0}|,|a_{4}|)=0$ (dashed), $\Psi^{+}_{12}(|a_{0}|,-|a_{4}|)=0$ (dotdash) and $\Psi^{+}_{12}(-|a_{0}|,-|a_{4}|)=0$ (solid). The shaded area indicates the domain $\mathscr{D}_{12}$ , which is determined by the two hyperplanes given by the level sets $\Psi^{+}_{12}(|a_{0}|,|a_{4}|)=0$ and $\Psi^{+}_{12}(-|a_{0}|,-|a_{4}|)=0$ . The level set $\Psi^{+}_{12}(|a_{0}|,-|a_{4}|)=0$ does not contribute. The domain of convergence is therefore

\mathcal{D}_{12}=\{\Psi^{+}_{12}(|a_{0}|,|a_{4}|)\leq 0\}\cap\{\Psi^{+}_{12}(-|a_{0}|,-|a_{4}|)\leq 0\}\,.

(8.24)

Recall that technically, the domain $\mathcal{D}_{12}$ is the component which satisfies the above conditions and is connected to the origin. Observe from the curvature of the upper boundary in Fig. 2, i.e. the level set $\Psi^{+}_{12}(-|a_{0}|,-|a_{4}|)=0$ , that the domain of convergence is not convex. A complete Reinhardt domain is logarithmically convex, but is not necessarily convex.
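The procedure described above (fix $|a_{0}|$ , then solve a quadratic in $|a_{4}|$ ) can be sketched numerically. The following is only an illustrative sketch, not part of the original derivation: the function names `psi12`, `positive_roots_in_a4` and `in_D12` are hypothetical, and `in_D12` tests only the inequalities of eq. (8.24), so the requirement of connectedness to the origin must still be checked separately.

```python
import math

def psi12(a0, a4):
    """Reduced discriminant of a0 + x + x^2 + a4*x^4, eq. (8.23a), with signed arguments."""
    return (256 * a0**3 * a4**2 - 128 * a0**2 * a4 + 144 * a0 * a4
            + 16 * a0 - 27 * a4 - 4)

def positive_roots_in_a4(x, s0, s4):
    """Positive roots y = |a4| of psi12(s0*x, s4*y) = 0 for fixed x = |a0| >= 0."""
    # Collect psi12(s0*x, s4*y) as A*y^2 + B*y + C (note s4^2 = 1).
    A = 256 * (s0 * x) ** 3
    B = s4 * (-128 * x**2 + 144 * s0 * x - 27)
    C = 16 * s0 * x - 4
    if A == 0:
        roots = [-C / B] if B != 0 else []
    else:
        disc = B * B - 4 * A * C
        if disc < 0:
            return []
        r = math.sqrt(disc)
        roots = [(-B - r) / (2 * A), (-B + r) / (2 * A)]
    return sorted(y for y in roots if y > 0)

def in_D12(x, y):
    """Inequalities of eq. (8.24) at (|a0|, |a4|) = (x, y); connectedness is not tested."""
    return psi12(x, y) <= 0 and psi12(-x, -y) <= 0

if __name__ == "__main__":
    for x in (0.0, 0.1, 0.2):
        print(x, positive_roots_in_a4(x, -1, -1), positive_roots_in_a4(x, -1, +1))
```

In this sketch the sign choice $(-|a_{0}|,|a_{4}|)$ never returns a positive root, in line with the remark that eq. (8.23c) does not contribute to the boundary.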

8.4 $\tilde{\Psi}_{pq}^{\pm}(\bm{b},\sigma)$ vanishes at origin

The case (i) in Lemma 8.2 is more difficult. Now $\tilde{\Psi}_{pq}^{+}(\bm{b},\sigma)$ and $\tilde{\Psi}_{pq}^{-}(\bm{b},\sigma)$ vanish at the origin, hence solving for $\tilde{\Psi}_{pq}^{+}(\bm{b},\sigma)=0$ or $\tilde{\Psi}_{pq}^{-}(\bm{b},\sigma)=0$ yields the origin as an unwanted solution. Recall the example of the quartic eq. (8.13). Hence we must proceed more carefully. As always, we first fix the values of $p$ and $q$ . Next, fix a tuple $\bm{\sigma}$ . Then the discriminants will exhibit one of three mutually exclusive properties: either $\tilde{\Psi}_{pq}^{+}(\bm{b},\sigma)$ has a local maximum, or a local minimum, or a saddle point at the origin. The same is true for $\tilde{\Psi}_{pq}^{-}(\bm{b},\sigma)$ . We can also say ‘a local extremum or a saddle point’ at the origin. The concept of ‘local extremum’ must be understood carefully, because it is really a constrained extremization. It is simplest to illustrate with an example. Consider a cubic equation $1+x+b_{2}x^{2}+b_{3}x^{3}=0$ with $p=0$ and $q=1$ . Since $q-p=1$ is odd, it suffices to treat $\tilde{\Psi}_{pq}^{+}(\bm{b},\sigma)$ only. The expressions for the discriminants in this case are (note that they all vanish at the origin)


$\displaystyle\tilde{\Psi}_{01}^{+}(|b_{2}|,|b_{3}|)$	$\displaystyle=27|b_{3}|^{2}+4|b_{3}|+4|b_{2}|^{3}-18|b_{2}||b_{3}|-|b_{2}|^{2}\,,$	(8.25a)
$\displaystyle\tilde{\Psi}_{01}^{+}(|b_{2}|,-|b_{3}|)$	$\displaystyle=27|b_{3}|^{2}-4|b_{3}|+4|b_{2}|^{3}+18|b_{2}||b_{3}|-|b_{2}|^{2}\,,$	(8.25b)
$\displaystyle\tilde{\Psi}_{01}^{+}(-|b_{2}|,|b_{3}|)$	$\displaystyle=27|b_{3}|^{2}+4|b_{3}|-4|b_{2}|^{3}+18|b_{2}||b_{3}|-|b_{2}|^{2}\,,$	(8.25c)
$\displaystyle\tilde{\Psi}_{01}^{+}(-|b_{2}|,-|b_{3}|)$	$\displaystyle=27|b_{3}|^{2}-4|b_{3}|-4|b_{2}|^{3}-18|b_{2}||b_{3}|-|b_{2}|^{2}\,.$	(8.25d)

1.

Then $\tilde{\Psi}_{01}^{+}(|b_{2}|,-|b_{3}|)$ has a local maximum at the origin. Put $|b_{2}|=0$ , then $\tilde{\Psi}_{01}^{+}(0,-|b_{3}|)\simeq-4|b_{3}|$ for sufficiently small $|b_{3}|$ . This is strictly negative because $|b_{3}|>0$ whenever $b_{3}\neq 0$ . However, the partial derivative $\partial\tilde{\Psi}_{01}^{+}/\partial|b_{3}|$ does not vanish at $|b_{3}|=0$ , so the origin is not a stationary point in the usual sense. Next put $|b_{3}|=0$ , then $\tilde{\Psi}_{01}^{+}(|b_{2}|,0)\simeq-|b_{2}|^{2}$ for sufficiently small $|b_{2}|$ . This is also strictly negative for $b_{2}\neq 0$ . One can show that $\tilde{\Psi}_{01}^{+}(|b_{2}|,-|b_{3}|)<0$ for all sufficiently small $|b_{2}|>0$ and $|b_{3}|>0$ . Hence for our purposes, a ‘local maximum’ is a constrained local maximum. Similarly, the concept of ‘local minimum’ is a constrained local minimum.
2.

Similarly $\tilde{\Psi}_{01}^{+}(-|b_{2}|,-|b_{3}|)$ also has a local maximum at the origin.
3.

However $\tilde{\Psi}_{01}^{+}(|b_{2}|,|b_{3}|)$ and $\tilde{\Psi}_{01}^{+}(-|b_{2}|,|b_{3}|)$ both have saddle points at the origin. Put $|b_{2}|=0$ , then $\tilde{\Psi}_{01}^{+}(0,|b_{3}|)\simeq 4|b_{3}|$ for sufficiently small $|b_{3}|$ , and is positive for $|b_{3}|>0$ . Next put $|b_{3}|=0$ , then $\tilde{\Psi}_{01}^{+}(|b_{2}|,0)\simeq-|b_{2}|^{2}$ and $\tilde{\Psi}_{01}^{+}(-|b_{2}|,0)\simeq-|b_{2}|^{2}$ for sufficiently small $|b_{2}|$ , and are both negative for $|b_{2}|>0$ . This establishes that both $\tilde{\Psi}_{01}^{+}(|b_{2}|,|b_{3}|)$ and $\tilde{\Psi}_{01}^{+}(-|b_{2}|,|b_{3}|)$ are of indefinite sign in the vicinity of the origin, i.e. they have saddle points at the origin.
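The classification in the three items above can be spot-checked by sampling the discriminants of eq. (8.25) on a small grid in the quadrant $|b_{2}|\geq 0$ , $|b_{3}|\geq 0$ near the origin. The sketch below is a heuristic check only, not a proof; the names `psi01` and `classify_at_origin`, and the grid parameters `eps` and `steps`, are assumptions made for illustration.

```python
from itertools import product

def psi01(b2, b3):
    """Eq. (8.25a) for the cubic 1 + x + b2*x^2 + b3*x^3, with signed arguments."""
    return 27 * b3**2 + 4 * b3 + 4 * b2**3 - 18 * b2 * b3 - b2**2

def classify_at_origin(s2, s3, eps=1e-3, steps=20):
    """Sample psi01(s2*|b2|, s3*|b3|) for 0 <= |b2|, |b3| <= eps, origin excluded."""
    grid = [eps * k / steps for k in range(steps + 1)]
    signs = set()
    for u, v in product(grid, repeat=2):
        if u == 0 and v == 0:
            continue  # the discriminant vanishes at the origin itself
        signs.add(psi01(s2 * u, s3 * v) > 0)
    if signs == {False}:
        return "constrained local maximum"
    if signs == {True}:
        return "constrained local minimum"
    return "saddle point"

if __name__ == "__main__":
    for s2, s3 in product((+1, -1), repeat=2):
        print(s2, s3, classify_at_origin(s2, s3))
```

Only the sign assignments classified as constrained extrema would be retained when forming the product over $\bm{\sigma}$ .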

Returning to the general theory, a discriminant $\tilde{\Psi}_{pq}^{+}(\bm{b},\sigma)$ or $\tilde{\Psi}_{pq}^{-}(\bm{b},\sigma)$ which has a saddle point at the origin does not contribute to the determination of the boundary of the domain of convergence. Such a discriminant has a nontrivial level set $\tilde{\Psi}_{pq}^{+}(\bm{b},\sigma)=0$ or $\tilde{\Psi}_{pq}^{-}(\bm{b},\sigma)=0$ which includes the origin, and as such, the hyperplane cannot form part of a set which encloses any open neighborhood of the origin in $\mathbb{R}_{+}^{n-1}$ .

We must therefore define a set of $\pm$ sign assignments $\Sigma^{+}_{pq}$ (resp. $\Sigma^{-}_{pq}$ ) consisting only of those $\bm{\sigma}$ such that the discriminants $\tilde{\Psi}_{pq}^{+}(\bm{b},\sigma)$ (resp. $\tilde{\Psi}_{pq}^{-}(\bm{b},\sigma)$ ) have a (constrained) local extremum at the origin. (It is possible that the sets $\Sigma^{+}_{pq}$ and $\Sigma^{-}_{pq}$ are identical, but I do not have a proof of this.) For these values of $\bm{\sigma}$ , the origin is a one-element level set of the equations $\tilde{\Psi}_{pq}^{+}(\bm{b},\sigma)=0$ or $\tilde{\Psi}_{pq}^{-}(\bm{b},\sigma)=0$ . We exclude the origin as an unwanted solution. Recall the example of the quartic eq. (8.13) and the discriminant $\tilde{\Psi}^{-}_{24}(-|a_{0}|,|a_{1}|)$ . We then proceed as in the previous section. To determine the boundary of the domain of convergence, we solve for $\bm{b}_{**}\neq\bm{0}_{pq}$ where

\biggl(\prod_{\bm{\sigma}\in\Sigma^{+}_{pq}}\tilde{\Psi}_{pq}^{+}(\bm{b}_{**},\bm{\sigma})\biggr)\biggl(\prod_{\bm{\sigma}\in\Sigma^{-}_{pq}}\tilde{\Psi}_{pq}^{-}(\bm{b}_{**},\bm{\sigma})\biggr)=0\,.

(8.26)

The domain of convergence $\mathscr{D}$ is the set connected to the origin, bounded by the hyperplanes which satisfy eq. (8.26). Hence we require a preliminary calculation to exclude those values of $\bm{\sigma}$ for which the discriminants have saddle points at the origin. As before, if $q-p$ is odd, we can restrict attention only to $\tilde{\Psi}_{pq}^{+}$ and write the simpler formula

\prod_{\bm{\sigma}\in\Sigma^{+}_{pq}}\tilde{\Psi}_{pq}^{+}(\bm{b}_{**},\bm{\sigma})=0\,.

(8.27)

As an illustrative example, consider the quartic eq. (8.13). Recall we choose $p=2$ and $q=4$ and we have set $a_{2}=a_{4}=1$ so that the scaled coefficients are simply $b_{0}=a_{0}$ and $b_{1}=a_{1}$ . Then


$\displaystyle\Psi_{24}^{+}(|a_{0}|,\pm|a_{1}|)$	$\displaystyle=-27|a_{1}|^{4}-4(1-36|a_{0}|)|a_{1}|^{2}+16|a_{0}|(1-4|a_{0}|)^{2}\,,$	(8.28a)
$\displaystyle\Psi_{24}^{+}(-|a_{0}|,\pm|a_{1}|)$	$\displaystyle=-27|a_{1}|^{4}-4(1+36|a_{0}|)|a_{1}|^{2}-16|a_{0}|(1+4|a_{0}|)^{2}\,,$	(8.28b)
$\displaystyle\Psi_{24}^{-}(|a_{0}|,\pm|a_{1}|)$	$\displaystyle=-27|a_{1}|^{4}+4(1+36|a_{0}|)|a_{1}|^{2}-16|a_{0}|(1+4|a_{0}|)^{2}\,,$	(8.28c)
$\displaystyle\Psi_{24}^{-}(-|a_{0}|,\pm|a_{1}|)$	$\displaystyle=-27|a_{1}|^{4}+4(1-36|a_{0}|)|a_{1}|^{2}+16|a_{0}|(1-4|a_{0}|)^{2}\,.$	(8.28d)

Hence there are four distinct discriminants, viz. $\Psi_{24}^{+}(|a_{0}|,|a_{1}|)$ , $\Psi_{24}^{+}(-|a_{0}|,|a_{1}|)$ , $\Psi_{24}^{-}(|a_{0}|,|a_{1}|)$ and $\Psi_{24}^{-}(-|a_{0}|,|a_{1}|)$ . The necessary bounds for convergence yield $|a_{0}|\leq\frac{1}{4}$ and $|a_{1}|\leq\sqrt{4/27}$ . The above expressions are all quadratics in $|a_{1}|^{2}$ . Thus to solve for $\Psi^{\pm}_{24}(\pm|a_{0}|,\pm|a_{1}|)=0$ , we fix a value of $|a_{0}|$ and solve the resulting quadratic in $|a_{1}|^{2}$ . This procedure will not always yield a real positive solution for $|a_{1}|$ ; the discriminants which fail to do so do not contribute to the boundary of the domain of convergence. The discriminant $\Psi_{24}^{+}(-|a_{0}|,|a_{1}|)$ is such an example. Setting the other discriminants in eq. (8.28) to zero yields valid hyperplanes. The resulting curves in the $(|a_{0}|,|a_{1}|)$ parameter space are displayed in Fig. 3, for $\Psi^{+}_{24}(|a_{0}|,|a_{1}|)=0$ (dashed), $\Psi^{-}_{24}(|a_{0}|,|a_{1}|)=0$ (dotdash) and $\Psi^{-}_{24}(-|a_{0}|,|a_{1}|)=0$ (solid). The first two level sets pass through the origin, because the discriminants have saddle points at the origin, and they do not contribute to the boundary of the domain of convergence. The domain of convergence is determined solely by the level set $\Psi^{-}_{24}(-|a_{0}|,|a_{1}|)=0$ . However, that level set (the solid curve) bounds two domains in Fig. 3. The domain $\mathscr{D}_{24}$ is given by the shaded area only, because that is the region connected to the origin. The cross-hatched region is not connected to the origin and is not part of the domain of convergence. Because $\Psi^{-}_{24}(-|a_{0}|,|a_{1}|)$ has a local minimum at the origin, the domain of convergence is the component connected to the origin such that

\mathcal{D}_{24}=\{\Psi^{-}_{24}(-|a_{0}|,|a_{1}|)\geq 0\}\,.

(8.29)

The caveat about ‘connectedness to the origin’ is essential in this case. The domain $\mathcal{D}_{24}$ is also not convex. This is demonstrated in Fig. 4. The domain $\mathscr{D}$ is the shaded area and is bounded by the solid curve. The dashed line is the straight line which joins the vertices $(\hat{b}_{0},0)=(\frac{1}{4},0)$ and $(0,\hat{b}_{1})=(0,\sqrt{4/27})$ to form a right-angled triangle with the origin. Observe that the shaded area $\mathscr{D}$ is not convex, hence the full domain of convergence $\mathcal{D}_{24}$ is not convex.
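The boundary curve and the non-convexity can be checked numerically from eq. (8.28d), which is a quadratic in $|a_{1}|^{2}$ for fixed $|a_{0}|$ . The following sketch is illustrative only; the names `psi24_minus`, `boundary_a1` and `chord_a1` are hypothetical, and the comparison is made at the single sample point $|a_{0}|=\frac{1}{8}$ .

```python
import math

def psi24_minus(a0, a1):
    """Eq. (8.28d): Psi^-_{24}(-|a0|, |a1|), called with a0 = |a0| and a1 = |a1|."""
    return -27 * a1**4 + 4 * (1 - 36 * a0) * a1**2 + 16 * a0 * (1 - 4 * a0) ** 2

def boundary_a1(a0):
    """Nonnegative |a1| solving psi24_minus(a0, |a1|) = 0, for 0 <= a0 <= 1/4."""
    B = 4 * (1 - 36 * a0)
    C = 16 * a0 * (1 - 4 * a0) ** 2
    # Quadratic -27*y^2 + B*y + C = 0 in y = |a1|^2; C >= 0 gives one root y >= 0.
    y = (B + math.sqrt(B * B + 108 * C)) / 54
    return math.sqrt(max(y, 0.0))

def chord_a1(a0):
    """Straight line joining the vertices (1/4, 0) and (0, sqrt(4/27))."""
    return math.sqrt(4 / 27) * (1 - 4 * a0)

if __name__ == "__main__":
    x = 0.125
    print(boundary_a1(x), chord_a1(x))  # the boundary lies below the chord here
```

At $|a_{0}|=\frac{1}{8}$ the boundary value of $|a_{1}|$ is smaller than the value on the chord, so a point of the chord between two points of the domain lies outside the domain.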

8.5 Normalization of discriminant

The literature in fact contains multiple normalization conventions for discriminants. For example for the quadratic $ax^{2}+bx+c$ , some authors define the discriminant to be $\Delta=4ac-b^{2}$ and others prefer $\Delta=b^{2}-4ac$ . Hence the directions of the inequalities in expressions such as eq. (8.12) could be reversed, depending on the normalization convention employed for the discriminant. The reader should beware of this important detail. However, the formulas in both eqs. (8.20) and (8.26) are independent of the normalization of the discriminant, which is an advantage of the above formalism.

9 Applications: principal and Brioschi quintics

The principal and Brioschi forms of the quintic are tetranomials, and furnish nontrivial applications of the more sophisticated formalism of Sec. 8, to determine the domains of convergence of their solutions by infinite series. We treat them in turn. The principal quintic form is

a_{0}+a_{1}x+a_{2}x^{2}+x^{5}=0\,.

(9.1)

The discriminant is

\Delta_{\rm prin}=a_{5}^{2}\Bigl(3125a_{0}^{4}a_{5}^{2}+2250a_{0}^{2}a_{1}a_{2}^{2}a_{5}-1600a_{0}a_{1}^{3}a_{2}a_{5}+108a_{0}a_{2}^{5}+256a_{1}^{5}a_{5}-27a_{1}^{2}a_{2}^{4}\Bigr)\,.

(9.2)

We factor out $a_{5}^{2}$ and employ reduced discriminants for the formulas for the domains of convergence, expressed in terms of the scaled coefficients $(b_{0},b_{1},b_{2},b_{5})$ for each choice of $p$ and $q$ :


$\displaystyle\mathcal{D}_{01}$	$\displaystyle=\{\tilde{\Psi}^{+}_{01}(|b_{2}|,-|b_{5}|)\leq 0\}\,,$	(9.3a)
$\displaystyle\mathcal{D}_{02}$	$\displaystyle=\{\tilde{\Psi}^{+}_{02}(|b_{1}|,|b_{5}|)\geq 0\}\cap\{\tilde{\Psi}^{-}_{02}(|b_{1}|,|b_{5}|)\leq 0\}\,,$	(9.3b)
$\displaystyle\mathcal{D}_{05}$	$\displaystyle=\{\tilde{\Psi}^{+}_{05}(-|b_{1}|,-|b_{2}|)\geq 0\}\,,$	(9.3c)
$\displaystyle\mathcal{D}_{12}$	$\displaystyle=\{\tilde{\Psi}^{+}_{12}(|b_{0}|,-|b_{5}|)\leq 0\}\cap\{\tilde{\Psi}^{+}_{12}(-|b_{0}|,|b_{5}|)\leq 0\}\,,$	(9.3d)
$\displaystyle\mathcal{D}_{15}$	$\displaystyle=\{\tilde{\Psi}^{+}_{15}(|b_{0}|,|b_{2}|)\geq 0\}\cap\{\tilde{\Psi}^{-}_{15}(|b_{0}|,|b_{2}|)\leq 0\}\,,$	(9.3e)
$\displaystyle\mathcal{D}_{25}$	$\displaystyle=\{\tilde{\Psi}^{+}_{25}(-|b_{0}|,|b_{1}|)\leq 0\}\,.$	(9.3f)

As always, the domains of convergence consist only of the components which are connected to the origin. Next, the Brioschi normal form of the quintic is [10]

x^{5}-10Cx^{3}+45C^{2}x-C^{2}=0\,.

(9.4)

The coefficients are all functions of a single parameter $C$ and are hence not independent. There is a real root for all real $C$ . If $C=0$ there is a repeated root of multiplicity five at $x=0$ . We write eq. (9.4) as $a_{0}+a_{1}x+a_{3}x^{3}+a_{5}x^{5}=0$ where $a_{0}=-C^{2}$ , $a_{1}=45C^{2}$ , $a_{3}=-10C$ and $a_{5}=1$ . The discriminant is

\Delta_{\rm Br}=a_{5}\Bigl(3125a_{0}^{4}a_{5}^{3}+2000a_{0}^{2}a_{1}^{2}a_{3}a_{5}^{2}-900a_{0}^{2}a_{1}a_{3}^{3}a_{5}+108a_{0}^{2}a_{3}^{5}+256a_{1}^{5}a_{5}^{2}-128a_{1}^{4}a_{3}^{2}a_{5}+16a_{1}^{3}a_{3}^{4}\Bigr)\,.

(9.5)

We factor out $a_{5}$ and employ reduced discriminants for the formulas for the domains of convergence, now expressed in terms of the scaled coefficients $(b_{0},b_{1},b_{3},b_{5})$ .

Setting $p=0$ and $q=1$ yields one root. Put $x=(a_{0}/a_{1})z=-z/45$ , then

0=1+z-\frac{10}{45^{3}C}\,z^{3}+\frac{1}{45^{5}C^{2}}\,z^{5}\,.

(9.6)
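The rescaling that leads to eq. (9.6) can be verified numerically: substituting $x=-z/45$ into eq. (9.4) should reproduce $-C^{2}$ times the right-hand side of eq. (9.6) for every $z$ . The sketch below assumes nothing beyond those two equations; the function names and the sample values of $z$ and $C$ are arbitrary choices for illustration.

```python
def brioschi(x, C):
    """Left-hand side of the Brioschi quintic, eq. (9.4)."""
    return x**5 - 10 * C * x**3 + 45 * C**2 * x - C**2

def scaled_01(z, C):
    """Right-hand side of eq. (9.6), i.e. the scaled polynomial for p = 0, q = 1."""
    return 1 + z - 10 * z**3 / (45**3 * C) + z**5 / (45**5 * C**2)

def residual(z, C):
    """brioschi(-z/45, C) + C^2 * scaled_01(z, C); zero if the rescaling is consistent."""
    return brioschi(-z / 45, C) + C**2 * scaled_01(z, C)

if __name__ == "__main__":
    # Arbitrary complex sample points; any nonzero C would do.
    for z, C in ((0.3 + 0.7j, 2.0), (-1.5 + 0.2j, -0.5 + 0.2j)):
        print(abs(residual(z, C)))
```

The same pattern (substitute, divide by the leading scaled coefficient, compare) applies to the other choices of $p$ and $q$ , with some care over the branch of the fractional powers of $C$ .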

The domain of convergence is given by

\mathcal{D}_{01}=\{\tilde{\Psi}^{+}_{01}(-|b_{3}|,-|b_{5}|)\geq 0\}\,.

(9.7)

This yields the condition

1+29376|C|-36578304|C|^{2}\leq 0\,.

(9.8)

This is satisfied for

|C|\geq\frac{17+13\sqrt{2}}{32\cdot 27\cdot 49}\simeq 8.358\cdot 10^{-4}\,.

(9.9)

Setting $p=1$ and $q=5$ yields four roots. Put $x=(a_{1}/a_{5})^{1/4}z=(45C^{2})^{1/4}z$ , then

0=-\frac{1}{45^{5/4}C^{1/2}}+z-\frac{10}{45^{1/2}}z^{3}+z^{5}\,.

(9.10)

The domain of convergence is given by

\mathcal{D}_{15}=\{\tilde{\Psi}^{+}_{15}(|b_{0}|,-|b_{3}|)\geq 0\}\cap\{\tilde{\Psi}^{-}_{15}(|b_{0}|,-|b_{3}|)\leq 0\}\,.

(9.11)

This also yields the condition eq. (9.8) and hence is also satisfied for the bound in eq. (9.9).

Setting $p=0$ and $q=5$ yields five roots. Put $x=(a_{0}/a_{5})^{1/5}z=-C^{2/5}z$ , then

0=1+45C^{2/5}z-10C^{1/5}z^{3}+z^{5}\,.

(9.12)

The domain of convergence is given by

\mathcal{D}_{05}=\{\tilde{\Psi}^{+}_{05}(-|b_{1}|,-|b_{3}|)\geq 0\}\,.

(9.13)

This yields the condition

1-29376|C|-36578304|C|^{2}\geq 0\,.

(9.14)

This is satisfied for

|C|\leq\frac{-17+13\sqrt{2}}{32\cdot 27\cdot 49}\simeq 0.327\cdot 10^{-4}\,.

(9.15)
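The closed forms in eqs. (9.9) and (9.15) follow from the quadratics in eqs. (9.8) and (9.14); with $u=864|C|$ these reduce to $1\pm 34u-49u^{2}$ , since $29376=34\cdot 864$ and $36578304=49\cdot 864^{2}$ . A short numerical sketch of this check is given below; the helper `positive_root` and the variable names are illustrative only.

```python
import math

def positive_root(a, b, c):
    """Positive root of a*u^2 + b*u + c = 0, assuming a*c < 0 (exactly one exists)."""
    r = math.sqrt(b * b - 4 * a * c)
    return max((-b - r) / (2 * a), (-b + r) / (2 * a))

# Boundary of eq. (9.8): 1 + 29376*u - 36578304*u^2 = 0, with u = |C|.
lower_bound = positive_root(-36578304, 29376, 1)
# Boundary of eq. (9.14): 1 - 29376*u - 36578304*u^2 = 0.
upper_bound = positive_root(-36578304, -29376, 1)

# Closed forms quoted in eqs. (9.9) and (9.15).
denominator = 32 * 27 * 49
closed_form_9_9 = (17 + 13 * math.sqrt(2)) / denominator
closed_form_9_15 = (-17 + 13 * math.sqrt(2)) / denominator

if __name__ == "__main__":
    print(lower_bound, closed_form_9_9)
    print(upper_bound, closed_form_9_15)
```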

Next set $p=0$ and $q=3$ . Put $x=(a_{0}/a_{3})^{1/3}z=(C/10)^{1/3}z$ then

0=1-\frac{45C^{1/3}}{10^{1/3}}z+z^{3}-\frac{1}{10^{5/3}C^{1/3}}z^{5}\,.

(9.16)

The domain of convergence is given by

\mathcal{D}_{03}=\{\tilde{\Psi}^{+}_{03}(-|b_{1}|,-|b_{5}|)\geq 0\}\,.

(9.17)

This yields the condition

-(1-1728|C|)^{2}\geq 0\,.

(9.18)

This is only satisfied by the single value $|C|=1/1728$ , but $|C|=1/1728$ lies in a domain not connected to the origin. Hence this scenario yields no roots. We see that the stipulation ‘connected to the origin’ is essential.

Next set $p=1$ and $q=3$ . Put $x=(a_{1}/a_{3})^{1/2}z=i(45C/10)^{1/2}z$ , then

0=\frac{i10^{1/2}}{45^{3/2}C^{1/2}}+z+z^{3}+\frac{45}{100}z^{5}\,.

(9.19)

The necessary upper bound for convergence is $|b_{5}|\leq\frac{1}{4}$ . However, $b_{5}=0.45$ , which exceeds the above bound. Hence the series solution does not converge for any value of $C$ , for this scenario.

Next set $p=3$ and $q=5$ . Put $x=(a_{3}/a_{5})^{1/2}z=i(10C)^{1/2}z$ , then

0=\frac{i}{10^{5/2}C^{1/2}}+\frac{45}{100}z+z^{3}+z^{5}\,.

(9.20)

The necessary upper bound for convergence is $|b_{1}|\leq\frac{1}{4}$ . However, $b_{1}=0.45$ , which exceeds the above bound. Hence the series solution does not converge for any value of $C$ for this scenario.

The Brioschi quintic normal form yields some instructive insights. First, for three of the six possible choices of $p$ and $q$ , the series solutions do not converge for any value of $C$ . Second, observe that the choices $p=0$ , $q=1$ and $p=1$ , $q=5$ together yield five roots, but only if $|C|\gtrsim 8.358\cdot 10^{-4}$ (see eq. (9.9)), whereas the choice $p=0$ , $q=5$ also yields five roots, but only if $|C|\lesssim 0.327\cdot 10^{-4}$ (see eq. (9.15)). Hence there is a gap of values for which there is no convergent series solution of the Brioschi quintic, for any choice of $p$ and $q$ , given by

\frac{-17+13\sqrt{2}}{32\cdot 27\cdot 49}\leq|C|\leq\frac{17+13\sqrt{2}}{32\cdot 27\cdot 49}\,.

(9.21)

The Brioschi quintic demonstrates that the formulas in Sec. 8 do not imply that the domains of convergence for an algebraic equation of degree $n$ span all values of the coefficients, i.e. all of $\mathbb{C}^{n+1}$ . The criterion for convergence may not be satisfied for any of the choices of $p$ and $q$ . Nevertheless, it was shown in Sec. 5 that convergent series solutions do exist for all values of the coefficients of a general quintic, via the use of the Bring-Jerrard normal form. Hence one must examine every algebraic equation on its merits; some transformations may work better than others.

10 Conclusion

This author was led to the main ideas of this paper because they are required to prove results in probability and statistics (not reported here). Papers by numerous authors were cited and the various notations, definitions, identities and nomenclature were collected in a common setting. Note that although most of the derivations in the literature treat only integer valued parameters, Theorem 2.4 is applicable for arbitrary complex coefficients and real (or even complex) exponents. The early works by Lambert [28, 29] and Euler [16] were shown to be Fuss–Catalan series. An important application of the formalism was the solution of algebraic equations by infinite series. This is a heavily studied problem and contact was made with the works of numerous authors [5, 9, 15, 31, 32, 33]. An example was to present convergent Fuss–Catalan series solutions for all the roots of the Bring-Jerrard normal form, thence the roots of a general quintic, for arbitrary values of the quintic coefficients. Two bounds for the absolute convergence of general Fuss–Catalan series were derived (necessary but not sufficient and sufficient but not necessary). For the important special case of the solutions of algebraic equations by infinite series, a new necessary and sufficient bound for absolute convergence was presented in Sec. 8, correcting and extending earlier work in the field [36].

Acknowledgements

I am deeply indebted to numerous individuals who helped me generously with their time and enthusiasm. However, special mention must go to Professors R. E. Borcherds and S. J. Dilworth, and to Dr. P. M. Strickland, without whose unflagging support this work simply would not have seen fruition. It is my enormous pleasure to thank them, and also, in alphabetical order, Professors I. M. Gessel, H. W. Gould, R. L. Graham, O. Patashnik, T. J. Ransford, R. P. Stanley and D. Zeilberger.
Addendum: I thank Drs. F. de Sousa Coelho and D. Rubine for pointing out misprints in previous versions of this note.

Appendix A Miscellaneous items

This Appendix lists various items which were not used in the main body of the paper, essentially for completeness of the exposition. According to the information in Appendix B of Stanley’s text [41] (the Appendix was written by Pak), the name ‘Catalan numbers’ only came into prominent use after the publication of Riordan’s monograph [38] (in 1968, first edition). Hence it is understandable if authors such as Gould [20, 21] and Raney [37], also earlier authors such as Mellin [33] and Schläfli [39], did not mention Fuss or Catalan. Belardinelli’s memoir [4] contains an overview of the solutions of algebraic equations using hypergeometric series. His extensive bibliography lists several papers on functions of several complex variables, but not papers on combinatorics such as by Raney [37]. There is evidently a diversity of notations and terminology, and duplication of proofs.

The ‘diversity of notations’ leads to an immediate caveat: different authors employ the same symbols, such as $n$ , $p$ or $q$ , to mean different things and it is impractical to disambiguate all the notations in the equations below. The reader is warned to consult the original literature for the precise meanings of all symbols displayed below.

Turning to technical matters, Mohanty [34] derived additional convolution identities not mentioned in the main text above. I list only one, viz. [34, eq. (11)], because it subsumes the others as special cases. In the notation of this paper, [34, eq. (11)] is

\sum_{\bm{j}\in\mathbb{N}^{k}}(p+\bm{q}\cdot\bm{j})\mathscr{A}_{\bm{j}}(\bm{b},a)\mathscr{A}_{\bm{n}-\bm{j}}(\bm{b},c)=\frac{p(a+c)+a\bm{q}\cdot\bm{n}}{a+c}\mathscr{A}_{\bm{n}}(\bm{b},a+c)\,.

(1.1)

All of $a$ , $c$ , $p$ , $\bm{b}$ and $\bm{q}$ are complex valued and $\bm{b}$ , $\bm{j}$ , $\bm{n}$ and $\bm{q}$ are $k$ -tuples. Put $\bm{q}=0$ then eq. (1.1) yields [34, eq. (9)], which is eq. (2.17) above. Put $p=c+\bm{b}\cdot\bm{n}$ and $\bm{q}=-\bm{b}$ then eq. (1.1) yields [34, eq. (10)]. Mohanty [34] actually cited Gould [21] for the single-parameter convolution identities; Mohanty generalized them to multiparameter versions. Gould [20, 21] proved several convolution identities and suggested that they are all special cases of a single general formula. The exposition below follows Raney’s [37] summary of Gould’s work. Define the numbers [37, eq. (7.7)]

\begin{split}G(\alpha,0;\beta,\gamma)&=1\,,\\ G(\alpha,n;\beta,\gamma)&=\frac{\alpha}{n!}\prod_{m=1}^{n-1}(\alpha+\beta n-\gamma m)\,.\end{split}

(1.2)

See also the polynomial $P_{k}(p)$ by Gould [21, Sec. 5] and associated comments therein about the work of Schläfli [39]. Then [37, eq. (7.8)]

G(\alpha_{1}+\alpha_{2},n;\beta,\gamma)=\sum_{n_{1}+n_{2}=n}G(\alpha_{1},n_{1};\beta,\gamma)G(\alpha_{2},n_{2};\beta,\gamma)\,.

(1.3)

Here $\alpha_{1},\alpha_{2},\beta,\gamma\in\mathbb{A}$ where $\mathbb{A}$ is a commutative ring and $n,n_{1},n_{2}\in\mathbb{N}$ . Gould [21, Sec. 5] also proved that the convolution identity derived by Schläfli [39], in the latter’s 1847 paper on Lambert series, was equivalent to [20, eq. (10)]. See also Riordan [38] for additional combinatorial identities and Strehl [43] for an overview of numerous multiparameter identities. If $\gamma=0$ then

G(\alpha,n;\beta,0)=\frac{\alpha}{\alpha+\beta n}\frac{(\alpha+\beta n)^{n}}{n!}\,.

(1.4)

If $\gamma\neq 0$ then $G(\alpha,n;\beta,\gamma)$ is proportional to a Fuss–Catalan number

\begin{split}G(\alpha,n;\beta,\gamma)&=\frac{(\alpha/\gamma)\gamma^{n}}{n!}\prod_{m=1}^{n-1}\Bigl(\frac{\alpha+\beta n}{\gamma}-m\Bigr)\\ &=\gamma^{n}A_{n}(\beta/\gamma,\alpha/\gamma)\,.\end{split}

(1.5)

Note that this relation works for $n=0$ also. A multiparameter generalization might be as follows

\begin{split}G(\alpha,\bm{0};\bm{\beta},\gamma)&=1\,,\\ G(\alpha,\bm{n};\bm{\beta},\gamma)&=\frac{\alpha}{n_{1}!\cdots n_{k}!}\prod_{m=1}^{|\bm{n}|-1}(\alpha+\bm{\beta}\cdot\bm{n}-\gamma m)\,.\end{split}

(1.6)

Gould also published a later paper [22] with additional formulas, but its contents are beyond the scope of this paper. The work of Gould may lead to a more general set of multiparameter identities and generating functions. The matter will be left to future work.
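The single-parameter relations in eqs. (1.2), (1.3) and (1.5) of this Appendix can be spot-checked in exact rational arithmetic. The sketch below assumes that the single-parameter Fuss–Catalan number has the product form $A_{n}(\mu,r)=(r/n!)\prod_{m=1}^{n-1}(n\mu+r-m)$ , which is what eq. (1.5) implies; the function names and the sample parameter values are illustrative.

```python
from fractions import Fraction
from math import factorial

def gould(alpha, n, beta, gamma):
    """G(alpha, n; beta, gamma) of eq. (1.2), in exact rational arithmetic."""
    if n == 0:
        return Fraction(1)
    value = Fraction(alpha) / factorial(n)
    for m in range(1, n):
        value *= alpha + beta * n - gamma * m
    return value

def fuss_catalan(n, mu, r):
    """Assumed form A_n(mu, r) = (r/n!) * prod_{m=1}^{n-1} (n*mu + r - m); 1 if n = 0."""
    if n == 0:
        return Fraction(1)
    value = Fraction(r) / factorial(n)
    for m in range(1, n):
        value *= n * mu + r - m
    return value

def convolution_rhs(alpha1, alpha2, n, beta, gamma):
    """Right-hand side of eq. (1.3): sum over n1 + n2 = n."""
    return sum(gould(alpha1, n1, beta, gamma) * gould(alpha2, n - n1, beta, gamma)
               for n1 in range(n + 1))

if __name__ == "__main__":
    a1, a2 = Fraction(2, 3), Fraction(5, 7)
    beta, gamma = Fraction(3, 2), Fraction(4, 5)
    for n in range(5):
        print(n, gould(a1 + a2, n, beta, gamma) == convolution_rhs(a1, a2, n, beta, gamma))
```

Exact fractions are used so that equality can be tested without rounding error.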

Kahkeshani [24] has defined so-called ‘generalized Catalan numbers’ via

C(m,n)=\frac{1}{n(m-1)+1}\binom{2n(m-1)}{\underbrace{n,\dots,n}_{m-1},n(m-1)}\,.

(1.7)

Let us process this as follows. Set $k=m-1$ and $r=1$ in eq. (2.1). Also set $t_{1}=\cdots=t_{k}=n$ and $\mu_{1}=\cdots=\mu_{k}=2$ , so $|\bm{t}|=n(m-1)$ and $\bm{t}\cdot\bm{\mu}=2n(m-1)$ . Then

\begin{split}C(m,n)&=\frac{1}{n(m-1)+1}\frac{1}{(n!)^{m-1}}\prod_{j=0}^{n(m-1)-1}(2n(m-1)-j)\\ &=\frac{1}{(n!)^{m-1}}\prod_{j=1}^{n(m-1)-1}(2n(m-1)+1-j)\\ &=\mathscr{A}_{(n,\dots,n)}((2,\dots,2),1)\,.\end{split}

(1.8)

Hence Kahkeshani’s definition is a special case of the multiparameter Fuss–Catalan numbers defined in this paper. Note that Chu’s [11] and Kahkeshani’s [24] nomenclature ‘generalized Catalan numbers’ should not be confused with each other.
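The identification in eq. (1.8) can be checked for small $m$ and $n$ . The sketch below assumes the multiparameter product form $\mathscr{A}_{\bm{t}}(\bm{\mu},r)=\frac{r}{t_{1}!\cdots t_{k}!}\prod_{j=1}^{|\bm{t}|-1}(\bm{t}\cdot\bm{\mu}+r-j)$ , consistent with the second line of eq. (1.8); the function names are hypothetical.

```python
from fractions import Fraction
from math import factorial

def kahkeshani(m, n):
    """C(m, n) of eq. (1.7): a multinomial coefficient divided by n(m-1) + 1."""
    N = n * (m - 1)
    multinomial = factorial(2 * N) // (factorial(n) ** (m - 1) * factorial(N))
    return Fraction(multinomial, N + 1)

def multi_fuss_catalan(t, mu, r):
    """Assumed form r / (t_1! ... t_k!) * prod_{j=1}^{|t|-1} (t.mu + r - j); 1 if t = 0."""
    total = sum(t)
    if total == 0:
        return Fraction(1)
    dot = sum(ti * mi for ti, mi in zip(t, mu))
    value = Fraction(r)
    for ti in t:
        value /= factorial(ti)
    for j in range(1, total):
        value *= dot + r - j
    return value

if __name__ == "__main__":
    for m in (2, 3, 4):
        k = m - 1
        print(m, [kahkeshani(m, n) == multi_fuss_catalan((n,) * k, (2,) * k, 1)
                  for n in range(5)])
```

For $m=2$ the numbers reduce to the ordinary Catalan numbers.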

We close with a comment on the paper by Aval [2], who defined so-called ‘multivariate Fuss–Catalan numbers’ via [2, remark 3.2]

B_{p}(n,k_{1},k_{2},\dots,k_{p-1})=\biggl(\prod_{i=1}^{p-1}\binom{n+k_{i}-1}{k_{i}}\biggr)\,\frac{n-\sum_{i=1}^{p-1}k_{i}}{n}\,.

(1.9)

Clearly $B_{p}(\cdot)=1$ for $p=0$ and $p=1$ . For $p\geq 2$ we have

\begin{split}B_{p}(n,k_{1},k_{2},\dots,k_{p-1})&=\frac{n-\sum_{i=1}^{p-1}k_{i}}{n}\prod_{i=1}^{p-1}\biggl(\frac{1}{k_{i}!}\prod_{j=0}^{k_{i}-1}(n+k_{i}-1-j)\biggr)\\ &=n^{p-2}(n-|\bm{k}|)\prod_{i=1}^{p-1}\biggl(\frac{1}{k_{i}!}\prod_{j=1}^{k_{i}-1}(n+k_{i}-j)\biggr)\\ &=\frac{n-|\bm{k}|}{n}\prod_{i=1}^{p-1}A_{k_{i}}(1,n)\,.\end{split}

(1.10)

Hence for $p\geq 2$ , Aval’s definition equals a product of $p-1$ single-parameter Fuss–Catalan numbers, with a prefactor. This is different from the multiparameter Fuss–Catalan numbers defined in this paper.
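A small exact check of the relation between Aval's numbers and single-parameter Fuss–Catalan numbers is sketched below. It assumes the product form $A_{k}(\mu,r)=(r/k!)\prod_{j=1}^{k-1}(k\mu+r-j)$ , under which each binomial factor in eq. (1.9) equals $A_{k_{i}}(1,n)$ , so that $B_{p}$ is $(n-|\bm{k}|)/n$ times the product of these numbers. The function names are illustrative.

```python
from fractions import Fraction
from itertools import product
from math import comb, factorial

def fuss_catalan(k, mu, r):
    """Assumed form A_k(mu, r) = (r/k!) * prod_{j=1}^{k-1} (k*mu + r - j); 1 if k = 0."""
    if k == 0:
        return Fraction(1)
    value = Fraction(r) / factorial(k)
    for j in range(1, k):
        value *= k * mu + r - j
    return value

def aval(n, ks):
    """B_p(n, k_1, ..., k_{p-1}) of eq. (1.9), with p - 1 = len(ks)."""
    value = Fraction(n - sum(ks), n)
    for k in ks:
        value *= comb(n + k - 1, k)
    return value

def aval_via_fuss_catalan(n, ks):
    """(n - |k|)/n times the product of A_{k_i}(1, n)."""
    value = Fraction(n - sum(ks), n)
    for k in ks:
        value *= fuss_catalan(k, 1, n)
    return value

if __name__ == "__main__":
    checks = [aval(n, ks) == aval_via_fuss_catalan(n, ks)
              for n in range(1, 6) for ks in product(range(4), repeat=2)]
    print(all(checks))
```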

Appendix B $\mathscr{A}$ -hypergeometric series

Sturmfels [44] published an elegant analysis employing so-called $\mathscr{A}$ -hypergeometric series to solve for the roots of the general algebraic equation of degree $n$ . A brief comparison with his work is presented here. His first example is for the quintic. Let us write the quintic in the form

x=-\frac{a_{0}}{a_{1}}-\frac{a_{2}}{a_{1}}x^{2}-\frac{a_{3}}{a_{1}}x^{3}-\frac{a_{4}}{a_{1}}x^{4}-\frac{a_{5}}{a_{1}}x^{5}\,.

(2.1)

This corresponds to $p=0$ and $q=1$ in my terminology, so $q-p=1$ and the Fuss–Catalan series yields one root, which is

\begin{split}x_{\rm root}&=-\frac{a_{0}}{a_{1}}\biggl[\sum_{\bm{t}\in\mathbb{N}^{4}}A_{\bm{t}}(\bm{\mu},1)\,\frac{e^{-i\pi\bm{t}\cdot\bm{\mu}}}{a_{0}^{t-\bm{t}\cdot\bm{\mu}}a_{1}^{\bm{t}\cdot\bm{\mu}}}\Bigl(\prod_{j\in\mathscr{N}_{npq}}a_{j}^{t_{j}}\Bigr)\biggr]\\ &=-\frac{a_{0}}{a_{1}}\biggl[\,1+\frac{a_{0}a_{2}}{a_{1}^{2}}-\frac{a_{0}^{2}a_{3}}{a_{1}^{3}}+\frac{a_{0}^{3}a_{4}}{a_{1}^{4}}-\frac{a_{0}^{4}a_{5}}{a_{1}^{5}}+\frac{2a_{0}^{2}a_{2}^{2}}{a_{1}^{4}}-\frac{5a_{0}^{3}a_{2}a_{3}}{a_{1}^{5}}+\cdots\biggr]\,.\end{split}

(2.2)

This equals the root $X_{1,-1}$ of Sturmfels [44]. Next let us select $p=0$ and $q=5$ and write

x^{5}=-\frac{a_{0}}{a_{5}}-\frac{a_{1}}{a_{5}}x-\frac{a_{2}}{a_{5}}x^{2}-\frac{a_{3}}{a_{5}}x^{3}-\frac{a_{4}}{a_{5}}x^{4}\,.

(2.3)

The series yields five roots. Following Sturmfels, we define $\xi=e^{i\pi(2\ell+1)/5}$ , $\ell=0,\dots,4$ , as a fifth root of $-1$ . Then the roots of the quintic are given by

\begin{split}x_{\xi}&=\xi\,\frac{a_{0}^{1/5}}{a_{5}^{1/5}}\,\sum_{\bm{t}\in\mathbb{N}^{4}}A_{\bm{t}}\Bigl(\bm{\mu},\frac{1}{5}\Bigr)\,\frac{\xi^{5\bm{t}\cdot\bm{\mu}}}{a_{0}^{t-\bm{t}\cdot\bm{\mu}}a_{5}^{\bm{t}\cdot\bm{\mu}}}\Bigl(\prod_{j\in\mathscr{N}_{npq}}a_{j}^{t_{j}}\Bigr)\\ &=\xi\,\frac{a_{0}^{1/5}}{a_{5}^{1/5}}+\frac{1}{5}\biggl(\frac{\xi^{2}a_{1}}{a_{0}^{3/5}a_{5}^{2/5}}+\frac{\xi^{3}a_{2}}{a_{0}^{2/5}a_{5}^{3/5}}+\frac{\xi^{4}a_{3}}{a_{0}^{1/5}a_{5}^{4/5}}-\frac{a_{4}}{a_{5}}\biggr)+\cdots\end{split}

(2.4)

These are the leading terms of the $\mathscr{A}$ -hypergeometric series for the roots $X_{5,\xi}$ of Sturmfels (see [44] for details of his notation)

X_{5,\xi}=\xi\,\biggl[\frac{a_{0}^{1/5}}{a_{5}^{1/5}}\biggr]+\frac{1}{5}\biggl(\xi^{2}\,\biggl[\frac{a_{1}}{a_{0}^{3/5}a_{5}^{2/5}}\biggr]+\xi^{3}\,\biggl[\frac{a_{2}}{a_{0}^{2/5}a_{5}^{3/5}}\biggr]+\xi^{4}\,\biggl[\frac{a_{3}}{a_{0}^{1/5}a_{5}^{4/5}}\biggr]-\biggl[\frac{a_{4}}{a_{5}}\biggr]\biggr)\,.

(2.5)

1.

This illustrates a difference between the use of $\mathscr{A}$ -hypergeometric series and Fuss–Catalan series. In general, for a polynomial of degree $n$ , a total of $n$ $\mathscr{A}$ -hypergeometric series are required to derive solutions for all the roots. In contrast, a Fuss–Catalan series encapsulates all the roots in one series, cycling through the roots of unity. There is a single formula for all the terms in a Fuss–Catalan series.
2.

A similar remark applies to the work of Birkeland [9]. In general, a total of $|q-p|^{n-1}$ hypergeometric series are required to express the solutions for all the roots of an algebraic equation of degree $n$ .
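To illustrate the remark that a single formula generates all the terms of a Fuss–Catalan series, the $p=0$ , $q=1$ series of eq. (2.2) can be summed numerically and the result substituted back into the quintic. This is only a sketch: it assumes the multiparameter product form $\mathscr{A}_{\bm{t}}(\bm{\mu},r)=\frac{r}{t_{1}!\cdots t_{k}!}\prod_{j=1}^{|\bm{t}|-1}(\bm{t}\cdot\bm{\mu}+r-j)$ with $\bm{\mu}=(2,3,4,5)$ , it reads the exponent of $a_{0}$ as $|\bm{t}|-\bm{t}\cdot\bm{\mu}$ , and the truncation order and sample coefficients are arbitrary choices for which the series converges rapidly.

```python
from fractions import Fraction
from itertools import product
from math import factorial

def multi_fuss_catalan(t, mu, r=1):
    """Assumed form r / (t_1! ... t_k!) * prod_{j=1}^{|t|-1} (t.mu + r - j); 1 if t = 0."""
    total = sum(t)
    if total == 0:
        return Fraction(1)
    dot = sum(ti * mi for ti, mi in zip(t, mu))
    value = Fraction(r)
    for ti in t:
        value /= factorial(ti)
    for j in range(1, total):
        value *= dot + r - j
    return value

def series_root(a, max_total=12):
    """Partial sum (|t| <= max_total) of the series in eq. (2.2) for a = (a0, ..., a5)."""
    a0, a1 = a[0], a[1]
    mu = (2, 3, 4, 5)  # exponents attached to a2, a3, a4, a5
    total = 0.0
    for t in product(range(max_total + 1), repeat=4):
        size = sum(t)
        if size > max_total:
            continue
        dot = sum(ti * mi for ti, mi in zip(t, mu))
        # e^{-i*pi*t.mu} = (-1)^{t.mu}; the a0 exponent is t.mu - |t| in the numerator.
        term = float(multi_fuss_catalan(t, mu)) * (-1) ** dot * a0 ** (dot - size) / a1 ** dot
        for tj, aj in zip(t, a[2:]):
            term *= aj ** tj
        total += term
    return -(a0 / a1) * total

def quintic(x, a):
    """Evaluate a0 + a1*x + ... + a5*x^5."""
    return sum(aj * x**j for j, aj in enumerate(a))

if __name__ == "__main__":
    a = (0.1, 1.0, 0.2, -0.1, 0.05, 0.1)  # hypothetical sample coefficients
    x = series_root(a)
    print(x, quintic(x, a))
```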

For the ‘triangulation of unit length’ of the quintic, Sturmfels obtained expressions for the five roots $X_{j,-1}$ , $j=1,\dots,5$ (see [44] for details). The example $X_{1,-1}$ was displayed above. If all of the coefficients of the quintic are real then all of the series for the roots $X_{j,-1}$ are real. However, a quintic with all real coefficients may not have all real roots. As Sturmfels noted, the $\mathscr{A}$ -hypergeometric series have finite radii of convergence. Sturmfels offered a convergence criterion for the $\mathscr{A}$ -hypergeometric series in his Theorem 3.2, reproduced here for ease of reference. (Consult [44] for definitions and notation).

Theorem ([44] Theorem 3.2).

The $n$ series $X_{j,\xi}$ are roots of the general equation of order $n$ ; that is, $f(X_{j,\xi})=0$ . There exists a constant $M$ such that all $n$ series $X_{j,\xi}$ converge whenever

|a_{i_{j-1}}|^{i_{j}-k}|a_{i_{j}}|^{k-i_{j-1}}\geq M|a_{k}|^{d_{j}}\qquad\textrm{for all $1\leq j\leq r$ and $k\not\in\{i_{j-1},i_{j}\}$}\,.

(2.6)

The above corrects a misprint in the direction of the inequality in [44, Thm. 3.2]. (I thank Sturmfels [45] for confirming the correct direction of the inequality.)

Sturmfels also stated (last paragraph in [44, Section 3]) “First, no good bound for $M$ seems to be currently known, and, second, for many concrete instances the inequalities (3.2) [this is reproduced as eq. (2.6) above] will not hold for any triangulation. In this case one has to carry out analytic continuation: …” From eq. (3.3), we can supply a value for $M$ above. First define

M_{k,i_{j},i_{j-1}}=\frac{|k-i_{j-1}|^{k-i_{j-1}}|i_{j}-k|^{i_{j}-k}}{|i_{j}-i_{j-1}|^{i_{j}-i_{j-1}}}\,.

(2.7)

Then $M=\min(M_{k,i_{j},i_{j-1}})/(n-1)$ . Sturmfels was however correct to note that a convergent series solution might not exist for any triangulation. We saw some examples earlier in this paper.
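As a worked illustration of eq. (2.7), the following minimal sketch evaluates $M_{k,i_{j},i_{j-1}}$ exactly with rational arithmetic. The function name `m_coeff` and the two sample cases (a quadratic, and a quintic with the single interval $i_{j-1}=0$, $i_{j}=5$) are hypothetical choices for illustration. The sketch evaluates only the individual coefficients, not the combined constant $M$.

```python
from fractions import Fraction

def m_coeff(k, i_j, i_prev):
    """Evaluate M_{k, i_j, i_{j-1}} of eq. (2.7) as an exact fraction.

    Requires k not in {i_prev, i_j}; exponents may be negative when k lies
    outside the interval, which Fraction handles exactly.
    """
    if k in (i_prev, i_j) or i_j == i_prev:
        raise ValueError("need k outside {i_prev, i_j} and i_j != i_prev")
    numerator = (
        Fraction(abs(k - i_prev)) ** (k - i_prev)
        * Fraction(abs(i_j - k)) ** (i_j - k)
    )
    denominator = Fraction(abs(i_j - i_prev)) ** (i_j - i_prev)
    return numerator / denominator

# Quadratic: i_{j-1} = 0, i_j = 2, k = 1.
quadratic = m_coeff(1, 2, 0)
# Quintic with a single interval: i_{j-1} = 0, i_j = 5, k = 1, ..., 4.
quintic = {k: m_coeff(k, 5, 0) for k in range(1, 5)}
print(quadratic)
print(quintic)
```

For the quadratic the value is $1/4$, which matches the familiar condition $|a_{0}||a_{2}|=\frac{1}{4}|a_{1}|^{2}$ at which the discriminant $a_{1}^{2}-4a_{0}a_{2}$ can vanish.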

References

[1] P. Appell and J. Kampé de Fériet, Fonctions hypergéométriques et hypersphériques: polynômes d’Hermite, Gauthier-Villars, Paris (1926).
[2] J.-C. Aval, “Multivariate Fuss–Catalan numbers,” Discrete Mathematics, 308 (2008) 4660–4669.
[3] C. Banderier and M. Drmota, “Formulae and Asymptotics for Coefficients of Algebraic Functions,” Combinatorics, Probability and Computing, 24 (2015) 1–53.
[4] G. Belardinelli, Fonctions hypergéométriques de plusieurs variables et résolution analytique des équations algébriques générales, Gauthier-Villars, Paris (1960).
[5] B. C. Berndt, Ramanujan’s Notebooks, Pt. 1, Springer-Verlag, New York, USA, (1985).
[6] R. Birkeland, “Résolution de l’équation algébrique trinôme par des fonctions hypergéométriques supérieures,” Comptes Rendus Acad. Sci., 171 (1920) 778–781.
[7] R. Birkeland, “Résolution de l’équation générale du cinquième degré,” Comptes Rendus Acad. Sci., 171 (1920) 1047–1049.
[8] R. Birkeland, “Résolution de l’équation algébrique générale par des fonctions hypergéométriques de plusieurs variables,” Comptes Rendus Acad. Sci., 171 (1920) 1370–1372 and 172 (1921) 309–311.
[9] R. Birkeland, “Über die Auflösung algebraischer Gleichungen durch hypergeometrische Funktionen,” Mathematische Zeitschrift, 26 (1927) 1047–1049.
[10] F. Brioschi, “Sulla risoluzione delle equazioni di quinto grado,” Ann. Mat. Pura Appl. 1 (1858) 256–259.
[11] W. Chu, “A new combinatorial interpretation for generalized Catalan number,” Discrete Mathematics, 65 (1987) 91–94.
[12] R. M. Corless, G. H. Gonnet, D. E. G. Hare, D. J. Jeffrey and D. E. Knuth, “On the Lambert $W$ function,” Advances in Computational Mathematics, 5 (1996) 329–359.
[13] S. J. Dilworth and S. R. Mane, “Success Run Waiting Times and Fuss–Catalan Numbers,” Journal of Probability and Statistics, 2015 482462 (2015).
[14] S. J. Dilworth and S. R. Mane, “Applications of Fuss–Catalan Numbers to Success Runs of Bernoulli Trials,” Journal of Probability and Statistics, 2016 2071582 (2016).
[15] A. Eagle, “Series for all the Roots of a Trinomial Equation,” American Mathematical Monthly, 46 (1939) 422–425.
[16] L. Euler, “De serie Lambertina plurimisque eius insignibus proprietatibus,” Acta Academiae Scientiarum Imperialis Petropolitanae, 2 (1783) (original date 1779) 29–51. Reprinted in L. Euler, Opera Omnia, Series Prima in “Commentationes Algebraicae,” Teubner, Leipzig, Germany, 6 (1921) 350–369.
[17] F. G. M. Eisenstein, “Allgemeine Auflösung der Gleichungen von den ersten vier Graden,” J. Reine Angew. Math. 27 (1844) 81–83. Mathematische Werke I, 7–9. (in German).
[18] W. Feller, An Introduction to Probability Theory and its Applications, Wiley, New York, USA, 3rd Edition, (1968).
[19] I. J. Good, “Generalizations to several variables of Lagrange’s expansion, with applications to stochastic processes,” Mathematical Proceedings of the Cambridge Philosophical Society, 56 (1960) 367–380.
[20] H. W. Gould, “Some Generalizations of Vandermonde’s Convolution,” American Mathematical Monthly, 63 (1956) 84–91.
[21] H. W. Gould, “Final Analysis of Vandermonde’s Convolution,” American Mathematical Monthly, 64 (1957) 409–415.
[22] H. W. Gould, “Coefficient Identities for Powers of Taylor and Dirichlet Series,” American Mathematical Monthly, 81 (1974) 3–14.
[23] R. L. Graham, D. E. Knuth and O. Patashnik, Concrete Mathematics, Addison-Wesley, New York, USA, 2nd Edition, (1994).
[24] R. Kahkeshani, “A Generalization of the Catalan Numbers,” Journal of Integer Sequences, 16 (2013) 13.6.8.
[25] F. Kamber, “Formules exprimant les valeurs des coefficients des séries de puissances inverses,” Acta Mathematica, 78 (1946) 193–204.
[26] F. Klein, “Vorlesungen über das Ikosaeder und die Auflösung der Gleichungen vom fünften Grade,” Teubner, Leipzig, Germany (1884).
[27] M. J. A. Serret, Oeuvres de Lagrange, Gauthier-Villars, Paris (1869).
[28] J. Lambert, “Observationes variae in mathesin puram,” Acta Helvetica, 3 (1758) 128–168.
[29] J. Lambert, “Observations analytiques,” Nouveaux Mémoires de l’Académie Royale des Sciences et Belles-Lettres, Berlin, (1770) pp. 225–244.
[30] P. A. Lambert, “On the Solution of Algebraic Equations in Infinite Series,” Bull. Amer. Math. Soc., 14 (1908) 467–477.
[31] A. J. Lewis, “The Solution of Algebraic Equations by Infinite Series,” National Mathematics Magazine, 10 (1935) 80–95.
[32] E. McClintock, “A Method for Calculating Simultaneously all the Roots of an Equation,” American Journal of Mathematics, 17 (1895) 89–110.
[33] H. Mellin, “Ein allgemeiner Satz über algebraische Gleichungen,” Ann. Soc. Fennicae, (A) 7 (1915) 44S.
[34] S. G. Mohanty, “Some convolutions with multinomial coefficients and related probability distributions,” SIAM Review, 8 (1966) 501–509.
[35] O. Nash, “On Klein’s icosahedral solution of the quintic,” Expositiones Mathematicae, 32 (2014) 99–120.
[36] M. Passare and A. Tsikh, “Algebraic Equations and Hypergeometric Series” in The Legacy of Niels Henrik Abel, Springer-Verlag, Berlin, (2002) 653–672.
[37] G. N. Raney, “Functional Composition Patterns and Power Series Reversion,” Transactions of the American Mathematical Society, 94 (1960) 441–451.
[38] J. Riordan, Combinatorial Identities, Wiley, New York, USA, 2nd Edition, (1979).
[39] L. Schläfli, “Gesammelte Mathematische Abhandlungen,” vol. 1, (reprinted) Springer, Basel, Switzerland, (1950).
[40] A. Schuetz and G. Whieldon, “Polygonal Dissections and Reversions of Series,” Involve, 9 (2016) 223–236.
[41] R. P. Stanley, Catalan Numbers, Cambridge University Press, New York, USA, (2015).
[42] J. Stillwell, “Eisenstein’s Footnote,” The Mathematical Intelligencer, 17, (1995) 58–62.
[43] V. Strehl, “Identities of Rothe-Abel-Schläfli-Hurwitz-type,” Discrete Mathematics, 99 (1992) 321–340.
[44] B. Sturmfels, “Solving algebraic equations in terms of $\mathscr{A}$ -hypergeometric series,” Discrete Mathematics, 210 (2000) 171–181.
[45] B. Sturmfels, private communication (2016).

Figure 1: Graph of the discriminant $\Psi^{-}_{24}(-|a_{0}|,|a_{1}|)$ for $(|a_{0}|,|a_{1}|)=(a,\frac{1}{2}a)$ , plotted as a function of $a$ .


\Delta_{02}(|a_{1}|,|a_{3}|)=27|a_{3}|^{2}+4|a_{1}|^{3}|a_{3}|+4-18|a_{1}||a_{3}|-|a_{1}|^{2}\,,

(7.7a)

\Delta_{02}(|a_{1}|,-|a_{3}|)=27|a_{3}|^{2}-4|a_{1}|^{3}|a_{3}|+4+18|a_{1}||a_{3}|-|a_{1}|^{2}\,,

(7.7b)

\Delta_{02}(-|a_{1}|,|a_{3}|)=\Delta_{02}(|a_{1}|,-|a_{3}|)\,,

(7.7c)

\Delta_{02}(-|a_{1}|,-|a_{3}|)=\Delta_{02}(|a_{1}|,|a_{3}|)\,.

(7.7d)


\Psi^{+}_{02}(|a_{1}|,|a_{3}|)=27|a_{3}|^{2}+4|a_{1}|^{3}|a_{3}|+4-18|a_{1}||a_{3}|-|a_{1}|^{2}\,,

(8.11a)

\Psi^{+}_{02}(|a_{1}|,-|a_{3}|)=27|a_{3}|^{2}-4|a_{1}|^{3}|a_{3}|+4+18|a_{1}||a_{3}|-|a_{1}|^{2}\,,

(8.11b)

\Psi^{-}_{02}(|a_{1}|,|a_{3}|)=27|a_{3}|^{2}+4|a_{1}|^{3}|a_{3}|-4+18|a_{1}||a_{3}|-|a_{1}|^{2}\,,

(8.11c)

\Psi^{-}_{02}(|a_{1}|,-|a_{3}|)=27|a_{3}|^{2}-4|a_{1}|^{3}|a_{3}|-4-18|a_{1}||a_{3}|-|a_{1}|^{2}\,.

(8.11d)


\tilde{\Psi}_{12}^{+}(|a_{0}|,|a_{4}|)=256|a_{0}|^{3}|a_{4}|^{2}-128|a_{0}|^{2}|a_{4}|+144|a_{0}||a_{4}|+16|a_{0}|-27|a_{4}|-4\,,

(8.23a)

\tilde{\Psi}_{12}^{+}(|a_{0}|,-|a_{4}|)=256|a_{0}|^{3}|a_{4}|^{2}+128|a_{0}|^{2}|a_{4}|-144|a_{0}||a_{4}|+16|a_{0}|+27|a_{4}|-4\,,

(8.23b)

\tilde{\Psi}_{12}^{+}(-|a_{0}|,|a_{4}|)=-256|a_{0}|^{3}|a_{4}|^{2}-128|a_{0}|^{2}|a_{4}|-144|a_{0}||a_{4}|-16|a_{0}|-27|a_{4}|-4\,,

(8.23c)

\tilde{\Psi}_{12}^{+}(-|a_{0}|,-|a_{4}|)=-256|a_{0}|^{3}|a_{4}|^{2}+128|a_{0}|^{2}|a_{4}|+144|a_{0}||a_{4}|-16|a_{0}|+27|a_{4}|-4\,.

(8.23d)


\tilde{\Psi}_{01}^{+}(|b_{2}|,|b_{3}|)=27|b_{3}|^{2}+4|b_{3}|+4|b_{2}|^{3}-18|b_{2}||b_{3}|-|b_{2}|^{2}\,,

(8.25a)

\tilde{\Psi}_{01}^{+}(|b_{2}|,-|b_{3}|)=27|b_{3}|^{2}-4|b_{3}|+4|b_{2}|^{3}+18|b_{2}||b_{3}|-|b_{2}|^{2}\,,

(8.25b)

\tilde{\Psi}_{01}^{+}(-|b_{2}|,|b_{3}|)=27|b_{3}|^{2}+4|b_{3}|-4|b_{2}|^{3}+18|b_{2}||b_{3}|-|b_{2}|^{2}\,,

(8.25c)

\tilde{\Psi}_{01}^{+}(-|b_{2}|,-|b_{3}|)=27|b_{3}|^{2}-4|b_{3}|-4|b_{2}|^{3}-18|b_{2}||b_{3}|-|b_{2}|^{2}\,.

(8.25d)


\Psi_{24}^{+}(|a_{0}|,\pm|a_{1}|)=-27|a_{1}|^{4}-4(1-36|a_{0}|)|a_{1}|^{2}+16|a_{0}|(1-4|a_{0}|)^{2}\,,

(8.28a)

\Psi_{24}^{+}(-|a_{0}|,\pm|a_{1}|)=-27|a_{1}|^{4}-4(1+36|a_{0}|)|a_{1}|^{2}-16|a_{0}|(1+4|a_{0}|)^{2}\,,

(8.28b)

\Psi_{24}^{-}(|a_{0}|,\pm|a_{1}|)=-27|a_{1}|^{4}+4(1+36|a_{0}|)|a_{1}|^{2}-16|a_{0}|(1+4|a_{0}|)^{2}\,,

(8.28c)

\Psi_{24}^{-}(-|a_{0}|,\pm|a_{1}|)=-27|a_{1}|^{4}+4(1-36|a_{0}|)|a_{1}|^{2}+16|a_{0}|(1-4|a_{0}|)^{2}\,.

(8.28d)
