Pièces de viole des Cinq Livres and their statistical signatures: the musical work of Marin Marais and Jordi Savall

Igor Lugo1, Martha G. Alatriste-Contreras2 Corresponding author: [email protected]
(1Centro Regional de Investigaciones Multidisciplinarias, Universidad Nacional Autónoma de México, Cuernavaca, Morelos, México
2Facultad de Economía, Universidad Nacional Autónoma de México, Ciudad de México, México
April 30, 2024)
Abstract

This study analyzes the spectrum of the audio signals of “Pièces de viole des Cinq Livres,” the collaborative work of Marin Marais and Jordi Savall, to uncover its underlying musical information. In particular, we explore whether statistical signatures can be identified for this musical work. Following a complex systems approach, we compute the spectrum of the audio signals, identify their best-fit statistical distributions, and plot their relative frequencies using scientific pitch notation. Findings suggest that the collection of frequency components in the spectrum of each book of this audio work follows a highly skewed statistical distribution. The exponential is the most frequent best-fit distribution for these audio data and may therefore be associated with a singular statistical signature.

Keywords: bass viola da gamba, Marin Marais, Jordi Savall, complex systems, statistical distributions

Introduction

Marin Marais is one of the most outstanding composers and performers of the bass viola da gamba in music history. Not only in his era but also today, his music continues to delight musicians from novice to expert. Jordi Savall, a contemporary conductor, composer, historian, and viol player, has brought most of Marais’ work to a global audience. The musical contribution of both musicians has shown one of the highest levels of musical expression over time. There is a deep musical connection between this pair of musicians (referred to in the following as Marais-Savall) that has transcended time and frontiers. However, a scientific analysis of the Marais-Savall audio signals (waveforms) is still missing in the musical and scientific communities. In particular, one of the most important works of Marais, “Pièces de viole des Cinq Livres,” has not been explored in terms of the statistical properties that underlie its musical information. Therefore, our study aims to analyze the audio signals of these five volumes to identify the statistical distributions that best describe their spectrum, that is, the frequency components related to musical notes. After establishing the presence of such distributions, we can reinterpret the music of the bass viol and identify the signature of viol players.

This statistical approach to waveforms and their spectra has been applied by Lugo and Alatriste-Contreras [1]. They suggested that the concept of virtuosity in music is possibly related to entropy values and to the best-fit distributions of the spectrum of audio signals; that is, the waveform and its spectrum contain information that can identify levels of virtuosity. Moreover, Downey [2] covered different topics of signal processing in music, presenting techniques and applications with a programming-based approach for understanding real audio signals. Another relevant work on audio signals is Müller and Klapuri [3], who presented an overview of the principles and applications of music signal processing that underlie music analysis problems. Therefore, the analysis of audio signals based on interdisciplinary approaches and current digital technology may generate a deep understanding of the underlying information in music. The intuition by which musicians identify a particular composer after playing or hearing only a few notes of a repertoire may be supported by the statistics of the spectrum of the signals.

In the case of Marais-Savall, the identification of their unique statistical signatures might be related to different aspects of their lives. Regarding Marais, several studies of his musical abilities have highlighted factors associated with his personal experience and social relationships [4, 5]. In particular, Milliot and de la Gorce [6] gave an almost complete view of the context in which Marais developed his musical creativity and skills, for example, his relationship with two of the most respected musicians of that time, Jean de Sainte-Colombe and Jean-Baptiste Lully. This reference is complemented by the audio work of Savall et al. [7], which offers material that can be read as well as listened to, including information about the collection of audio tracks. For example, the booklet describes the discussion, common among musicians of that time, about the balance between melody and harmony. Finally, an unexpected but interesting study is that of Matloubieh et al. [8]. It comes from another discipline, and its authors suggested that Marais combined his musical reputation with a medical procedure, lithotomy. Thus, the influence of Marais covered not only common musical issues but also other relevant themes of his time.

In the case of Savall, different information sources display his work across several areas, such as concert performer, teacher, and researcher [9]. For example, the websites of AliaVox and the Fundació Centre Internacional de Música Antiga illustrate his outstanding work in rescuing and preserving early music. Fort i Marrugat [10] analyzed Savall’s projects related to this type of music to propose a framework for music and art. Therefore, the Marais-Savall relationship shows singular musical and personal characteristics that are possibly imprinted in most of their musical work.

Our main questions are the following: Does the collection of the five books show similar statistical distributions? And what are those statistical distributions? We believe that the underlying attributes of waveforms and their audio spectrum are related to statistical distributions, which can be interpreted as the signature of a given repertoire. When different musicians play the same repertoire, we expect the signature to vary only slightly from its real value. Therefore, a large set of audio signals of the bass viol played by the same musician provides the event-based condition for accurately identifying the signature of a particular musical work.

The document is divided into four sections. The Materials section describes the audio resources from which the data were collected. The Methods section explains how the complex systems approach is applied to an exploratory data analysis based on identifying statistical distributions. The Results section presents our findings. Finally, the Discussion section points out some items to be considered in the analysis and gives the conclusions.

Materials

The data consisted of the audio material of Savall et al. [7]. This material, named Pièces de viole des Cinq Livres, is a collection of five audio CDs containing a total of 84 tracks. We selected it because it is one of the best audio recordings of Marin Marais to date. It combines audio recordings with historical documents that describe Marais’ musical contribution more accurately. For the listener, the execution of each track reflects Savall’s expertise in playing the instrument and his knowledge of recording music. Because this audio material is under copyright, we suggest obtaining the CDs and following our method to replicate the results. We converted each track of the album from m4a to WAV files in 16-bit PCM. During this process, we converted stereo to mono using Audacity.
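For readers who prefer a scripted workflow, the stereo-to-mono step can also be approximated in Python. The following is only a minimal sketch, not the Audacity procedure used in this study: it assumes that a stereo track is stored as an array of shape (samples, channels) and that mono is obtained by averaging the channels. The function names `to_mono_int16` and `load_mono` are hypothetical.

```python
import numpy as np
from scipy.io import wavfile


def to_mono_int16(samples: np.ndarray) -> np.ndarray:
    """Average the channels of a 16-bit PCM signal and return mono int16 samples."""
    if samples.ndim == 1:
        # The signal is already mono.
        return samples.astype(np.int16)
    # Average in floating point to avoid int16 overflow, then round back.
    mixed = samples.astype(np.float64).mean(axis=1)
    return np.round(mixed).astype(np.int16)


def load_mono(path: str):
    """Read a WAV file and return (sample_rate, mono_samples)."""
    sample_rate, samples = wavfile.read(path)
    return sample_rate, to_mono_int16(samples)


# Tiny synthetic stereo signal (two frames, two channels) for illustration.
stereo = np.array([[100, 200], [-50, 50]], dtype=np.int16)
mono = to_mono_int16(stereo)
```

Averaging the channels is one common choice; other tools may weight or sum channels differently, so results can differ slightly from those obtained with Audacity.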

We used several Python libraries to retrieve, analyze, and plot the data and results, in particular NumPy, Matplotlib, SciPy, and pandas. We also used part of the code provided by Downey [2]. The database and the code are available in our Open Science Framework (OSF) project, Complex systems and early music, for the reproduction of our findings.

Methods

High quality audio recordings of viola da gamba are rare because it is not common to play such an instrument nowadays. In this case, Savall has provided a unique collection of recordings of Marais that we can analyze based on their audio signals. Therefore, in this section, we present our procedure for identifying the statistical signatures of the collaborative work between Marais-Savall.

The first step in this procedure is spectral decomposition. This procedure simplifies audio data using the Discrete Fourier Transform (DFT), computed with the Fast Fourier Transform (FFT) algorithm [11, 12, 13]. The result of this decomposition is called the “spectrum,” which shows approximately the frequency components related to pitches, that is, the dominant pitch and its harmonics. The importance of this spectrum lies in identifying the frequency components of sections related to immediate musical composition or improvisation. The analysis of these sections is not trivial because they are commonly related to different musical interpretations in which the musician communicates emotions by performing. For example, contemporary guitarists improvise solos that lead audiences or listeners to experience different emotions [14]. The spectrum is thus one of the keys to unfolding the rich musical imagination of celebrated composer-performers [15]. Therefore, we used the spectral decomposition associated with the FFT as follows:

$$y[k] = \sum_{n=0}^{N-1} e^{-2\pi j \frac{kn}{N}} x[n] \qquad (1)$$

in which $y[k]$ is the $k$-th frequency component of the signal sequence $x[n]$, for $n = 0, \dots, N-1$. In our case, the identification of these components provides the inputs for understanding the statistical signature associated with our audio material. We therefore define the statistical signature of a collection of audio materials as a clearly identified statistical distribution that shows particular properties.
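As an illustration of Eq. (1), the sketch below computes the spectrum of a synthetic tone with NumPy. It is a minimal example under stated assumptions, not the code used in this study: the test tone, the sample rate, and the function name `spectrum` are hypothetical, and the sketch returns amplitudes for the non-negative frequencies only.

```python
import numpy as np


def spectrum(signal: np.ndarray, sample_rate: int):
    """Return (frequencies in Hz, amplitudes) of a real-valued mono signal."""
    # rfft evaluates the DFT of Eq. (1) for the non-negative frequencies.
    y = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    return freqs, np.abs(y)


# Hypothetical test tone: one second of A4 (440 Hz) sampled at 44.1 kHz.
sample_rate = 44100
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 440.0 * t)

freqs, amps = spectrum(tone, sample_rate)
peak_hz = freqs[np.argmax(amps)]  # frequency with the largest amplitude
```

For a real track, `tone` would be replaced by the mono samples read from the WAV file.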

The second step is to analyze those collections of frequency components to identify the statistical distributions that best describe them. A statistical distribution, or probability distribution, is a mathematical function that describes the occurrence of possible events. It approximates the generating process of particular data. A major advantage of this function is that it allows us to infer the properties underlying the data [16, 17]. In particular, continuous distributions are commonly related to three types of parameters: location, scale, and shape [18]. The location parameter, which is associated with the first moment or mean, indicates where the distribution is centered along the x-axis of a frequency plot. The scale parameter, which is associated with the second moment or variance, indicates how spread out the data are with respect to the location along the x-axis. The shape parameter, which is associated with the higher moments such as skewness and kurtosis, refers to the shape or geometry of the data. It is important to mention that the location and scale can be related to different measures depending on whether the data are skewed; for skewed data, it is commonly suggested to compute the median and the entropy [19, 20, 21]. Based on this collection of statistical measures, we can describe and infer the behavior of any distribution. In our case, after obtaining the frequency components, we can identify the statistical distribution that accurately represents the unique audio signal of the work of Marais-Savall.

To identify the statistical distributions that best describe the audio work of Marais-Savall, we used the Kolmogorov-Smirnov (KS) test, which assesses whether or not our audio data come from a given distribution [22]. In our case, this test compares our frequency components (empirical data) with a set of given distributions (theoretical distributions). We used the normal, log-normal, exponential, Pareto, Gibrat, power law, and exponentiated Weibull as our theoretical distributions. These distributions form an important set of continuous distributions in the literature that can represent non-skewed and skewed data, as well as different relationships between them [23, 24]. In particular, these distributions are connected with other distributions through properties such as linear combination, convolution, and products, to mention a few [18]. Therefore, the KS test with those statistical distributions provides a systematic process for computing and identifying the distributions that best describe the audio work of Marais-Savall.
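The fitting and testing step can be sketched with `scipy.stats` as follows. This is a hedged illustration rather than the project code: the synthetic sample, the reduced candidate list, and the function name `ks_rank` are assumptions made for the example, and the full set of distributions listed above could be passed in the same way.

```python
import numpy as np
from scipy import stats


def ks_rank(data, candidates):
    """Fit each candidate by maximum likelihood and rank by the KS statistic."""
    results = []
    for dist in candidates:
        params = dist.fit(data)  # shape parameters (if any), then loc and scale
        res = stats.kstest(data, dist.cdf, args=params)
        results.append((dist.name, params, res.statistic, res.pvalue))
    # The smallest KS statistic indicates the closest theoretical CDF.
    return sorted(results, key=lambda r: r[2])


# Synthetic, exponentially distributed data standing in for frequency components.
rng = np.random.default_rng(0)
sample = rng.exponential(scale=100.0, size=2000)

ranking = ks_rank(sample, [stats.norm, stats.expon, stats.lognorm])
```

Because the parameters are estimated from the same data that are tested, the reported p-values tend to be optimistic, so the ranking by the statistic is the more informative output of this sketch.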

Finally, once the best-fit distributions were identified, we displayed the relative frequencies of the frequency components using scientific pitch notation (note names and octave numbers) as bins (Figure 1). This figure aims to show the statistical attributes of audio signals based on how frequently particular notes and octaves are used. It is the key to understanding the possible statistical signature of the work of Marais-Savall. In the next section, we use this type of figure to present the results for each book’s audio data.

Figure 1: Relative frequencies of the spectrum. Relative frequencies are related to Hz, and Frequency components are associated with the output of the FFT (Eq. 1). We used the data provided by the Physics Department, Michigan Technological University [25]. To avoid confusion in the scientific pitch notation, notes are translated by multiplying or dividing the frequency by 2. Then, in this Figure, vertical lines are approximations of octave locations used for reference purposes only.
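The mapping from frequencies to scientific pitch notation used for the bins can be sketched as follows. This is an illustrative approximation only: it assumes twelve-tone equal temperament with A4 = 440 Hz and C0 ≈ 16.35 Hz, as in the note table of [25], and the function names `to_pitch` and `octave_relative_frequencies` are hypothetical.

```python
import numpy as np

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def to_pitch(freq_hz: float, a4: float = 440.0) -> str:
    """Return the nearest equal-tempered note name with its octave number."""
    semitones = int(round(12 * np.log2(freq_hz / a4)))  # distance from A4
    midi = 69 + semitones  # A4 is MIDI note 69
    return f"{NOTE_NAMES[midi % 12]}{midi // 12 - 1}"


def octave_relative_frequencies(freqs_hz, c0: float = 16.3516, n_octaves: int = 9):
    """Share of values falling in each octave from C0 up to C9."""
    edges = c0 * 2.0 ** np.arange(n_octaves + 1)  # C0, C1, ..., C9
    counts, _ = np.histogram(freqs_hz, bins=edges)
    total = counts.sum()
    return counts / total if total else counts.astype(float)


rel = octave_relative_frequencies([20.0, 30.0, 440.0])
```

Here each octave boundary is obtained by doubling the previous one, which corresponds to the vertical reference lines described in the caption of Figure 1.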

In essence, we compute the FFT of each track in a book. The resulting collection of frequency components for each book is then analyzed to identify its best-fit statistical distribution. To show the distribution that best describes the frequency components, we use our proposed plot of relative frequencies of the spectrum (Figure 1). Finally, the code for each step can be found in our project on the Open Science Framework platform.

Results

The main goal of this study is to identify possible statistical signatures related to the spectrum of audio signals of the work of “Pièces de viole des Cinq Livres” based on the work of Marais-Savall. Each book, which is a collection of audio tracks, of this work was analyzed by following our proposed method. Therefore, for simplicity and ease of interpretation, we are going to show our results for the five books by using only our proposed plot related to the relative frequencies (Figure 1).

Figure 2 shows the statistical signature of each of the books. As can be seen, the curves for each collection of frequency components are highly skewed. This lack of symmetry suggests that lower and higher octaves are played more frequently. In particular, lower octaves represent the majority of notes, played between C0 and C3, whereas higher octaves represent the notes played between C6 and C8. Moreover, between octaves C3 and C6 we can see important differences in the notes played. Books 4 and 5 show the greatest difference, while Books 1, 2, and 3 lie in between. This result suggests that the main differences between the frequency components of the books are around the standard tuning C4.

Figure 2: Relative frequencies of the audio spectrums and best-fit distributions. See Table 2 for statistical results of best-fit, parameters, and KS test results.

In addition to these results, we present the estimated parameters related to the location, scale, and shape per book. As we can see in Table 1, there are two values related to the location: the median and the mean. In this particular case, we are interested in the median because the resulting distributions are highly skewed. Using the mean can be misleading because it is commonly suited to non-skewed data, such as the normal distribution. The median values of the books associated with the exponential distribution show a stable location between the notes A1 and E2, whereas Book 1 and Book 3 show much lower locations, near or below C0. Next, there are two scale parameters related to the data dispersion: the variance and the entropy. As mentioned above, we use the entropy because of its suitability for skewed data. The entropy values show similar dispersion except for Book 1. Finally, the estimated shape parameters show that there is more weight on the right tail of the distributions (positive skewness) and that they exhibit peaked shapes (high kurtosis).

Table 1: Estimated parameters of frequency components per book
Name    Median   Mean      Variance    Entropy  Skew    Kurtosis
Book 1  11.9226  15.9130   165.1470    2.2294   8.6444  165.0793
Book 2  79.3857  114.5291  13116.7683  5.7408   2.0     6.0
Book 3  19.4983  58.5914   15103.0876  4.7947   7.5989  126.6237
Book 4  77.2398  111.4334  12417.3580  5.7134   2.0     6.0
Book 5  54.4538  78.5601   6171.6320   5.3638   2.0     6.0
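The Book 2 row of Table 1 appears consistent with the moments of the exponential distribution fitted in Table 2. The sketch below checks this with `scipy.stats.expon`; the assumption that Table 1 was derived from the fitted distribution, rather than directly from the raw data, is ours, and the variable names are illustrative.

```python
from scipy import stats

# (loc, scale) of the selected exponential fit for Book 2, taken from Table 2.
loc, scale = 0.0007007909083517199, 114.52846083233646
fitted = stats.expon(loc=loc, scale=scale)

# Mean, variance, skewness, and excess kurtosis of the fitted distribution.
mean, variance, skew, kurtosis = (float(v) for v in fitted.stats(moments="mvsk"))

summary = {
    "median": float(fitted.median()),
    "mean": mean,
    "variance": variance,
    "entropy": float(fitted.entropy()),  # differential entropy in nats
    "skew": skew,
    "kurtosis": kurtosis,
}
```

For any exponential distribution the skewness is 2 and the excess kurtosis is 6, which matches the values reported for Books 2, 4, and 5.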

Summing up, the results of our data analysis identify reliable parameters that point to the statistical signature of the work of Marais-Savall. The frequency components of the audio spectra of each book were associated with highly skewed distributions. These distributions may well be related to the exponential distribution because it is the most frequent best-fit distribution in our results.

Discussion and conclusion

Our study of the musical collaboration between Marais-Savall in the audio work “Pièces de viole des Cinq Livres” has shown that it is possible to uncover its underlying musical information. In particular, we could identify statistical attributes that distinguish the most frequent best-fit distributions of each book’s audio spectrum. Our results indicate that the frequency components of these spectra are related to highly skewed distributions, particularly the exponential distribution.

The significance of these findings lies in the recognition of the musical work shared between musicians. Even though musicians are separated by time and place, their original and unique musical contributions can be recognized not only by timbre but also by the information contained in the audio wave. In our case, the collaboration between Marin Marais and Jordi Savall has shown one of the highest levels of musical expression over time, and it can be recognized by the highly skewed distributions of the frequency components of its audio spectra. Such statistical distributions are related to the exponential distribution. This type of distribution is commonly used to describe system reliability and the times between events [23]. One of its main characteristics is a constant failure rate, that is, no memory of age when considering events. In our case, the memoryless or Markovian [26] property indicates that if the most frequent octave notes are played for $s$ units of time, the probability that higher octave notes will be played in additional time units is independent of $s$. In other words, the probability of playing lower or higher octave notes in an audio track is independent. Therefore, these findings suggest that the transition from one note to another in an audio track follows a random process based on the exponential distribution.
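The memoryless property mentioned above states that $P(X > s + t \mid X > s) = P(X > t)$ for an exponentially distributed $X$ with zero location. The following sketch verifies this identity numerically; the scale value and the function name `conditional_survival` are hypothetical choices for illustration.

```python
from scipy import stats

# Hypothetical scale; any positive value leads to the same conclusion.
dist = stats.expon(scale=100.0)


def conditional_survival(s: float, t: float) -> float:
    """P(X > s + t | X > s) for the exponential distribution defined above."""
    return float(dist.sf(s + t) / dist.sf(s))


t = 50.0
unconditional = float(dist.sf(t))  # P(X > t)
conditionals = [conditional_survival(s, t) for s in (0.0, 10.0, 250.0)]
```

Every entry of `conditionals` agrees with `unconditional` up to floating-point error, regardless of the elapsed value `s`.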

Translating this result into musical terms, we can say that there may be a link between improvisation and the selection of notes in a particular musical passage. In the case of Marais-Savall, we know that the composition and performance abilities of Marais were frequently associated with improvisation [7, 27]. Consequently, it is expected that Savall’s interpretation and performance reproduce these habits of Marais. These findings may help us to understand that the free performance of a musician may follow different random processes that most of the time are related to skewed statistical distributions.

The implications of these findings for the teaching and learning of music relate to composing and playing activities that cover not only the bass viola da gamba but also any type of string instrument. For example, in a musical composition, musicians may use the information about the type of statistical distribution to explore alternatives or extensions when conceiving a piece of music. Depending on the type of skewed distribution, musicians may use higher and lower octaves more frequently to build most of their musical material. Therefore, before starting the process of composition, it is recommended to analyze previous personal works and the work of other musicians to obtain unique and original material. In the case of performing, prior information about statistical distributions can provide the keys for improvising music in different styles and contexts. For example, if we know that the work of Marais-Savall is best described by an exponential distribution, we can play the patterns suggested by such a distribution: higher and lower octaves more frequently, with notes around the standard tuning used to connect and transition between those octaves. Following this information, we can replicate the work of those musicians or generate our own material.

Future studies on the current topic are therefore recommended. In particular, a natural extension of our findings is to explore the following questions: Is it possible to find similar best-fit distributions when a particular musical passage is played by different musicians? How different or similar are the statistical properties of each performance? In addition, a future study with more focus on a computational approach to composing music based on our findings is suggested. To answer these questions in future work, we suggest following the line of complex systems and music.

Therefore, the greatest contribution of this study is to uncover statistical properties of early music composed and played by two outstanding musicians. Our method can be used to analyze not only the bass viola da gamba but also other string instruments and musicians. The audio work “Pièces de viole des Cinq Livres” showed highly skewed distributions possibly related to the exponential distribution. This type of statistical distribution may contain the keys to understanding the elements of musical composition and performance of the bass viola da gamba.

References

  • [1] Lugo I., Alatriste-Contreras M., 2024, arXiv:2404.16259.
  • [2] Downey A., Think DSP: Digital Signal Processing in Python, Shroff Publishers & Distributors Pvt Ltd, Sebastopol, CA, 2016.
  • [3] Müller M., Klapuri A., In: Academic Press Library in Signal Processing: Volume 4, Trussell J., Srivastava A., Roy-Chowdhury A., Srivastava A., Naylor P., Chellappa R., Theodoridis S. (Eds.), Elsevier, 2014, 713–756, DOI: 10.1016/B978-0-12-396501-1.00027-3.
  • [4] Cyr M., The Musical Times, 2016, 157, 49–61, https://www.jstor.org/stable/44862536.
  • [5] Bane M., Journal of the Viola da Gamba Society of America, 2018, 50, 24–48, https://www.vdgsa.org/_files/ugd/d493e7_4f60de21c70f422c9e5ab0d2c5aaced0.pdf.
  • [6] Milliot S., de la Gorce J., Marin Marais, Fayard, France, 1991.
  • [7] Savall J., Koopman T., Smith H., Coin C., Marin Marais, Pièces de viole des Cinq Livres, CD, 2010, URL https://www.alia-vox.com/en/catalogue/marin-marais-pieces-de-viole-des-cinq-livres/.
  • [8] Matloubieh J., Eghbali M., Rabinowitz R., Urology, 2020, 141, 60–63, DOI:10.1016/j.urology.2020.03.044.
  • [9] Fernández T., Tamaro E., Biografía de Jordi Savall, 2022, URL https://www.biografiasyvidas.com/biografia/s/savall.htm.
  • [10] Fort i Marrugat O., AusArt, 2019, 7, DOI:10.1387/ausart.20372.
  • [11] Cooley J., Tukey J., Math. Comput., 1965, 19, 297–301.
  • [12] Press W., Teukolsky S., Vetterling W., Flannery B., Numerical Recipes: The Art of Scientific Computing, ch. 12-13, Cambridge Univ. Press, 2007.
  • [13] SciPy, SciPy tutorial: Fourier transforms (scipy.fft), retrieved June 14, 2021, URL https://docs.scipy.org/doc/scipy/reference/tutorial/fft.html#ct65.
  • [14] Swarbrick D., Bosnyak D., Livingstone S., Bansal J., Marsh-Rollo S., Woolhouse M., Trainor L., Frontiers in Psychology, 2019, 9, doi:10.3389/fpsyg.2018.02682.
  • [15] The Editors of Encyclopaedia Britannica, “Improvisation,” Encyclopedia Britannica, accessed July 24, 2022, URL https://www.britannica.com/art/improvisation-music.
  • [16] Ross S., A First Course in Probability, Vol. 9th edition, Pearson, 2012.
  • [17] Ross S., Introduction to Probability and Statistics for Engineers and Scientists, Vol. 6th edition, Elsevier Inc., 2020, doi:10.1016/C2018-0-02166-0.
  • [18] Leemis L., McQueston J., The American Statistician, 2008, 62, No. 1, 45–53, doi:10.1198/000313008X270448.
  • [19] Cover T., Thomas J., Elements of Information Theory, Vol. 2nd edition, Wiley, 2006.
  • [20] Smaldino P., Ecological Modelling, 2013, 254, 50–53, doi:10.1016/j.ecolmodel.2013.01.015.
  • [21] Mohr D., Wilson W., Freund R., In: Statistical Methods, Mohr D., Wilson W., Freund R. (Eds.), Academic Press, 2014, 169–199, DOI: 10.1016/B978-0-12-823043-5.00004-7.
  • [22] Massey F., Journal of the American Statistical Association, 1951, 46, No. 253, 68–78.
  • [23] Lehoczky J., In: International Encyclopedia of the Social & Behavioral Sciences (Second Edition), Wright J. (Ed.), Elsevier, 2015, 575–579, https://doi.org/10.1016/B978-0-08-097086-8.42115-X.
  • [24] The College of William & Mary, Univariate distribution relationship chart, accessed July 24, 2022, URL http://www.math.wm.edu/~leemis/chart/UDR/UDR.html.
  • [25] Suits B., Physics of music - notes, Physics Department, Michigan Technological University, accessed July 24, 2022, URL https://pages.mtu.edu/~suits/notefreqs.html.
  • [26] Billard L., In: International Encyclopedia of the Social & Behavioral Sciences (Second Edition), Wright J. (Ed.), Elsevier, 2015, 576–583, https://doi.org/10.1016/B978-0-08-097086-8.42144-6.
  • [27] The Editors of Encyclopaedia Britannica, “Musical expression,” Encyclopedia Britannica, accessed July 25, 2022, URL https://www.britannica.com/art/musical-expression.
  • [28] Lugo I., Alatriste-Contreras M., PLoS ONE, 2019, 14, No. 7, e0218593, doi:10.1371/journal.pone.0218593.
  • [29] Lugo I., Martínez-Mekler G., J. Shipp. Trd., 2022, 7, No. 16, doi:10.1186/s41072-022-00117-6.
  • [30] Lugo I., Alatriste-Contreras M., Sci Rep, 2022, 12, No. 13481, doi:10.1038/s41598-022-17665-3.

Supplementary material

See Tables 2 and 3. The criteria for selecting the final result in Table 2 were as follows:

  1. Errors in the estimation method. Following the SciPy reference, we used Maximum Likelihood Estimation (MLE). If there were no errors, we selected the best fit; if there were errors, we selected the second best fit.

  2. Visualizing the Cumulative Distribution Function (CDF). If the estimated values of the KS test ($d$, p-value) of the first and second best fits were the same, we plotted the empirical and theoretical CDFs. This helps to identify the statistical distribution that best fits the data.

Consequently, we found that fitting the Pareto distribution raised RuntimeErrors. We therefore had to visualize the CDFs to select the final results in Table 2. For greater precision in the estimated KS test parameters and for plotting the CDFs, see and run the code in Complex systems and early music. For the same criteria applied in other scientific studies, see [28], [29], and [30].
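The two selection criteria can be sketched in Python as follows. This is a simplified illustration, not the project code: it assumes that a problematic fit surfaces as a `RuntimeError` or `RuntimeWarning`, it simply skips such candidates so that the next one in the ranking is selected, and the function names are hypothetical.

```python
import warnings

import numpy as np
from scipy import stats


def fit_or_none(dist, data):
    """Return MLE parameters, or None if the fit raises a runtime error or warning."""
    try:
        with warnings.catch_warnings():
            warnings.simplefilter("error", RuntimeWarning)
            return dist.fit(data)
    except (RuntimeError, RuntimeWarning):
        return None


def empirical_cdf(data):
    """Return sorted values and their empirical cumulative probabilities."""
    x = np.sort(np.asarray(data, dtype=float))
    return x, np.arange(1, x.size + 1) / x.size


def select_fit(data, candidates):
    """Rank the candidates whose fit succeeded by their KS statistic."""
    scored = []
    for dist in candidates:
        params = fit_or_none(dist, data)
        if params is None:
            continue  # criterion 1: discard fits that produced errors
        res = stats.kstest(data, dist.cdf, args=params)
        scored.append((res.statistic, dist.name, params))
    scored.sort(key=lambda item: item[0])
    return scored


rng = np.random.default_rng(1)
sample = rng.exponential(scale=100.0, size=1000)
scored = select_fit(sample, [stats.expon, stats.norm])
x, ecdf = empirical_cdf(sample)
```

For criterion 2, the pair `(x, ecdf)` can be drawn together with the theoretical CDF of each tied candidate, evaluated at `x` with its fitted parameters, to compare the curves visually.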

Table 2: Statistical attributes of the “Pièces de viole des Cinq Livres”.
Name    Best and second best fit  Parameters (a, b, loc, scale)                                                          KS test (d, p-value)
Book 1  Pareto                    (0.5000772790696841, -2.5313693644133393, 2.5320219554920884)                          (8.810404621462098e-05, 0.7663125997309213)
Book 1  exponentiated Weibull *   (1.8078554913188745, 0.40464609484912933, 10.52751466016218)                           (8.810404621462098e-05, 0.7663125997309213)
Book 2  exponential *             (0.0007007909083517199, 114.52846083233646)                                            (8.531262287569952e-05, 0.785822040345603)
Book 2  Pareto                    (0.45694976715045765, -2.3484869507168717, 2.349187741621745)                          (8.531262287581054e-05, 0.7858220403442739)
Book 3  exponentiated Weibull *   (2.388182477333179, 0.40846064327782183, 0.0006176602639155107, 8.883106480102926)     (7.587900003558357e-05, 0.7629043858598745)
Book 3  exponential               (0.0006176602639155108, 92.6733864816965)                                              (7.587900003563908e-05, 0.76290438585909)
Book 4  exponential *             (0.00024278816105826503, 111.43319977150223)                                           (9.143090021945799e-05, 0.7129222750411135)
Book 4  Pareto                    (0.4717388601895959, -2.384176737481692, 2.3844195256376968)                           (9.143090021945799e-05, 0.7129222750411135)
Book 5  exponential *             (0.00036681097553665327, 78.55973558262349)                                            (7.877884394258405e-05, 0.8129868745072603)
Book 5  Pareto                    (0.4239939171603715, -1.6836235103637094, 1.683990321337741)                           (7.877884394258405e-05, 0.8129868745072603)

Name of the books, the best and second best fit statistical distributions, the estimated parameters of the KS goodness-of-fit test, and the KS test two-sided statistic. * Statistics of the selected best fit test. See Table 3 for the name of statistical distributions and their probability density function (PDF).

Table 3: Statistical distributions and their probability density functions (PDF).
Name                   PDF
exponential            $f(x) = \exp(-x)$, for $x \ge 0$
Pareto                 $f(x, b) = \frac{b}{x^{b+1}}$, for $x \ge 1$, $b > 0$
exponentiated Weibull  $f(x, \alpha, c) = \alpha c \left[1 - \exp(-x^{c})\right]^{\alpha - 1} \exp(-x^{c})\, x^{c-1}$, for $x > 0$, $\alpha > 0$, $c > 0$

Name of the statistical distributions and their PDF.