-
LibriTTS-P: A Corpus with Speaking Style and Speaker Identity Prompts for Text-to-Speech and Style Captioning
Authors:
Masaya Kawamura,
Ryuichi Yamamoto,
Yuma Shirahata,
Takuya Hasumi,
Kentaro Tachibana
Abstract:
We introduce LibriTTS-P, a new corpus based on LibriTTS-R that includes utterance-level descriptions (i.e., prompts) of speaking style and speaker-level prompts of speaker characteristics. We employ a hybrid approach to construct prompt annotations: (1) manual annotations that capture human perceptions of speaker characteristics and (2) synthetic annotations on speaking style. Compared to existing English prompt datasets, our corpus provides more diverse prompt annotations for all speakers of LibriTTS-R. Experimental results for prompt-based controllable TTS demonstrate that the TTS model trained with LibriTTS-P achieves higher naturalness than the model using the conventional dataset. Furthermore, the results for style captioning tasks show that the model utilizing LibriTTS-P generates 2.5 times more accurate words than the model using a conventional dataset. Our corpus, LibriTTS-P, is available at https://github.com/line/LibriTTS-P.
Submitted 12 June, 2024;
originally announced June 2024.
-
Multichannel Audio Source Separation with Independent Deeply Learned Matrix Analysis Using Product of Source Models
Authors:
Takuya Hasumi,
Tomohiko Nakamura,
Norihiro Takamune,
Hiroshi Saruwatari,
Daichi Kitamura,
Yu Takahashi,
Kazunobu Kondo
Abstract:
Independent deeply learned matrix analysis (IDLMA) is one of the state-of-the-art multichannel audio source separation methods using the source power estimation based on deep neural networks (DNNs). The DNN-based power estimation works well for sounds having timbres similar to the DNN training data. However, the sounds to which IDLMA is applied do not always have such timbres, and the timbral mismatch causes the performance degradation of IDLMA. To tackle this problem, we focus on a blind source separation counterpart of IDLMA, independent low-rank matrix analysis. It uses nonnegative matrix factorization (NMF) as the source model, which can capture source spectral components that only appear in the target mixture, using the low-rank structure of the source spectrogram as a clue. We thus extend the DNN-based source model to encompass the NMF-based source model on the basis of the product-of-expert concept, which we call the product of source models (PoSM). For the proposed PoSM-based IDLMA, we derive a computationally efficient parameter estimation algorithm based on an optimization principle called the majorization-minimization algorithm. Experimental evaluations show the effectiveness of the proposed method.
Submitted 2 September, 2021;
originally announced September 2021.
-
Empirical Bayesian Independent Deeply Learned Matrix Analysis For Multichannel Audio Source Separation
Authors:
Takuya Hasumi,
Tomohiko Nakamura,
Norihiro Takamune,
Hiroshi Saruwatari,
Daichi Kitamura,
Yu Takahashi,
Kazunobu Kondo
Abstract:
Independent deeply learned matrix analysis (IDLMA) is one of the state-of-the-art supervised multichannel audio source separation methods. It blindly estimates the demixing filters on the basis of source independence, using the source model estimated by the deep neural network (DNN). However, since the ratios of the source to interferer signals vary widely among time-frequency (TF) slots, it is difficult to obtain reliable estimated power spectrograms of sources at all TF slots. In this paper, we propose an IDLMA extension, empirical Bayesian IDLMA (EB-IDLMA), by introducing a prior distribution of source power spectrograms and treating the source power spectrograms as latent random variables. This treatment allows us to implicitly consider the reliability of the estimated source power spectrograms for the estimation of demixing filters through the hyperparameters of the prior distribution estimated by the DNN. Experimental evaluations show the effectiveness of EB-IDLMA and the importance of introducing the reliability of the estimated source power spectrograms.
Submitted 7 June, 2021;
originally announced June 2021.
-
Characterization of intermittency in renewal processes: Application to earthquakes
Authors:
Takuma Akimoto,
Tomohiro Hasumi,
Yoji Aizawa
Abstract:
We construct a one-dimensional piecewise linear intermittent map from the interevent time distribution for a given renewal process. Then, we characterize intermittency by the asymptotic behavior near the indifferent fixed point in the piecewise linear intermittent map. Thus, we provide a new framework to understand a unified characterization of intermittency, and also present the Lyapunov exponent of renewal processes. This method is applied to the occurrence of earthquakes using the Japan Meteorological Agency (JMA) catalog. We demonstrate that interevent times are not independent and identically distributed random variables by analyzing the return map of interevent times, but that there is a systematic change in conditional probability distribution functions of interevent times.
Submitted 1 July, 2009;
originally announced July 2009.
-
The Weibull - log Weibull transition of interoccurrence times for synthetic and natural earthquakes
Authors:
Tomohiro Hasumi,
Chien-chih Chen,
Takuma Akimoto,
Yoji Aizawa
Abstract:
We have studied interoccurrence time distributions by analyzing a synthetic catalog and three natural catalogs, those of the Japan Meteorological Agency (JMA), the Southern California Earthquake Data Center (SCEDC), and the Taiwan Central Weather Bureau (TCWB), and revealed a universal feature of the interoccurrence time statistics, the Weibull - log Weibull transition. This transition reinforces the view that the interoccurrence time statistics possess Weibull statistics and log-Weibull statistics. In this paper, we show that the crossover magnitude from the superposition regime to the Weibull regime, $m_c^2$, is proportional to the plate velocity. In addition, we have found the region-independent relation, $m_c^2/m_{max} = 0.54 \pm 0.004$.
Submitted 20 August, 2008;
originally announced August 2008.
-
The Weibull - log Weibull Transition of the Inter-occurrence time statistics in the two-dimensional Burridge-Knopoff Earthquake model
Authors:
Tomohiro Hasumi,
Takuma Akimoto,
Yoji Aizawa
Abstract:
In analyzing synthetic earthquake catalogs created by a two-dimensional Burridge-Knopoff model, we have found that a probability distribution of the interoccurrence times, the time intervals between successive events, can be described clearly by the superposition of the Weibull distribution and the log-Weibull distribution. In addition, the interoccurrence time statistics depend on frictional properties and stiffness of a fault and exhibit the Weibull - log Weibull transition, which states that the distribution function changes from the log-Weibull regime to the Weibull regime when the threshold of magnitude is increased. We reinforce a new insight into this model; the model can be recognized as a mechanical model providing a framework of the Weibull - log Weibull transition.
Submitted 12 December, 2008; v1 submitted 5 August, 2008;
originally announced August 2008.
-
Hypocenter interval statistics between successive earthquakes in the two-dimensional Burridge-Knopoff model
Authors:
Tomohiro Hasumi
Abstract:
We study statistical properties of spatial distances between successive earthquakes, the so-called hypocenter intervals, produced by a two-dimensional (2D) Burridge-Knopoff model involving stick-slip behavior. It is found that cumulative distributions of hypocenter intervals can be described by the $q$-exponential distributions with $q<1$, which is also observed in nature. The statistics depend on the friction and stiffness parameters characterizing the model and on the threshold of magnitude. The conjecture which states that $q_t+q_r \sim 2$, where $q_t$ and $q_r$ are the entropy indices of time intervals and spatial intervals, respectively, can be reproduced semi-quantitatively. We conclude that these results provide a new perspective on the Burridge-Knopoff model: the model can be recognized as a realistic one in view of its reproduction of the spatio-temporal interval statistics of earthquakes on the basis of nonextensive statistical mechanics.
Submitted 12 December, 2008; v1 submitted 24 July, 2008;
originally announced July 2008.
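As an illustration of the $q$-exponential form mentioned in the abstract above, the following is a minimal sketch that assumes the cumulative (survival) distribution is written as $P(>r) = e_q(-r/r_0)$ with $e_q(x) = [1 + (1-q)x]^{1/(1-q)}$. The abstract does not give the functional form or any fitted values, so the function name `q_exp_survival`, the scale parameter `r0`, and the numbers used are all hypothetical.

```python
import math


def q_exp_survival(r: float, q: float, r0: float) -> float:
    """Survival function P(>r) = e_q(-r / r0) of a q-exponential distribution.

    Assumed form (not taken from the abstract):
        e_q(x) = [1 + (1 - q) * x] ** (1 / (1 - q)), with e_q(x) = 0 when the bracket is <= 0.
    For q -> 1 this reduces to the ordinary exponential exp(-r / r0).
    """
    if r < 0.0 or r0 <= 0.0:
        raise ValueError("r must be >= 0 and r0 must be > 0")
    if abs(q - 1.0) < 1e-12:
        return math.exp(-r / r0)
    base = 1.0 - (1.0 - q) * r / r0
    if base <= 0.0:
        # Only reachable for q < 1: the distribution has a finite cut-off at r = r0 / (1 - q).
        return 0.0
    return base ** (1.0 / (1.0 - q))


# Hypothetical parameters, chosen only to show the finite cut-off for q < 1.
for r in (0.0, 1.0, 2.0, 3.0):
    print(r, q_exp_survival(r, q=0.5, r0=1.0))
```

With $q<1$ the survival function reaches zero at a finite distance $r_0/(1-q)$, which is the qualitative difference from the ordinary exponential case.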
-
The Weibull - Log Weibull Distribution for Interoccurrence Times of Earthquakes
Authors:
Tomohiro Hasumi,
Takuma Akimoto,
Yoji Aizawa
Abstract:
By analyzing the Japan Meteorological Agency (JMA) seismic catalog for different tectonic settings, we have found that the probability distributions of time intervals between successive earthquakes --interoccurrence times-- can be described by the superposition of the Weibull distribution and the log-Weibull distribution. In particular, the distribution of large earthquakes obeys the Weibull distribution with the exponent $α_1 <1$, indicating that the sequence of large earthquakes is not a Poisson process. It is found that the ratio of the Weibull distribution to the probability distribution of the interoccurrence time gradually increases with increase in the threshold of magnitude. Our results imply that Weibull statistics and log-Weibull statistics coexist in the interoccurrence time statistics, and that the change of the distribution can be regarded as a change of the dominant distribution. In this case, the dominant distribution changes from the log-Weibull distribution to the Weibull distribution, allowing us to reinforce the view that the interoccurrence time exhibits the Weibull - log Weibull transition.
Submitted 12 December, 2008; v1 submitted 3 July, 2008;
originally announced July 2008.
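The superposition described in the abstract above can be sketched as a two-component mixture of cumulative distributions. The abstract does not state the functional forms, so this sketch assumes a common parameterization: a Weibull CDF $1 - \exp[-(τ/β_1)^{α_1}]$, a log-Weibull CDF $1 - \exp[-(\ln(τ/k)/β_2)^{α_2}]$ for $τ > k$, and a mixing ratio $p$ for the Weibull component. All function names and parameter values are hypothetical.

```python
import math


def weibull_cdf(tau: float, alpha1: float, beta1: float) -> float:
    """Weibull CDF, assumed form 1 - exp(-(tau / beta1) ** alpha1)."""
    if tau <= 0.0:
        return 0.0
    return 1.0 - math.exp(-((tau / beta1) ** alpha1))


def log_weibull_cdf(tau: float, alpha2: float, beta2: float, k: float) -> float:
    """Log-Weibull CDF, assumed form 1 - exp(-(ln(tau / k) / beta2) ** alpha2) for tau > k."""
    if tau <= k:
        return 0.0
    return 1.0 - math.exp(-((math.log(tau / k) / beta2) ** alpha2))


def superposed_cdf(tau: float, p: float,
                   alpha1: float, beta1: float,
                   alpha2: float, beta2: float, k: float) -> float:
    """Mixture p * Weibull + (1 - p) * log-Weibull, with p the Weibull ratio in [0, 1]."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    return (p * weibull_cdf(tau, alpha1, beta1)
            + (1.0 - p) * log_weibull_cdf(tau, alpha2, beta2, k))


# Hypothetical parameters: raising p mimics the Weibull component becoming dominant.
params = dict(alpha1=0.8, beta1=10.0, alpha2=2.0, beta2=3.0, k=0.5)
for p in (0.2, 0.6, 1.0):
    print(p, [round(superposed_cdf(t, p, **params), 3) for t in (1.0, 10.0, 100.0)])
```

In this picture, the change of the dominant distribution corresponds to the mixing ratio `p` moving toward 1 as the magnitude threshold is raised; the actual dependence of `p` on the threshold is not given in the abstract.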
-
Statistical Properties of the Inter-occurrence Times in the Two-dimensional Stick-slip Model of Earthquakes
Authors:
Tomohiro Hasumi,
Yoji Aizawa
Abstract:
We study earthquake interval time statistics, paying special attention to inter-occurrence times in the two-dimensional (2D) stick-slip (block-slider) model. Inter-occurrence times are the time intervals between successive earthquakes on all faults in a region. We select stiffness and friction parameters as tunable parameters because these physical quantities are considered essential factors in describing fault dynamics. It is found that inter-occurrence time statistics depend on the parameters. Varying stiffness and friction parameters systematically, we optimize these parameters so as to reproduce the inter-occurrence time statistics in natural seismicity. For an optimal case, earthquakes produced by the model obey the Gutenberg-Richter law, which states that the magnitude-frequency distribution exhibits a power law with an exponent of approximately unity.
Submitted 3 January, 2008;
originally announced January 2008.
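The Gutenberg-Richter law mentioned in the abstract above is commonly written as $\log_{10} N(\geq M) = a - bM$, with $b$ close to 1. As a minimal sketch of how such an exponent can be checked on a catalog, the code below draws synthetic magnitudes from the corresponding exponential distribution and recovers $b$ with the standard maximum-likelihood estimator for continuous magnitudes, $\hat{b} = \log_{10} e / (\bar{M} - M_{\min})$. This is not the procedure used in the paper, which the abstract does not describe; the function names, sample size, and threshold are hypothetical.

```python
import math
import random
from typing import List, Sequence


def sample_gr_magnitudes(n: int, b: float, m_min: float, seed: int = 0) -> List[float]:
    """Draw n magnitudes >= m_min whose excess over m_min is exponential with rate b * ln(10)."""
    rng = random.Random(seed)
    rate = b * math.log(10.0)
    return [m_min + rng.expovariate(rate) for _ in range(n)]


def estimate_b_value(magnitudes: Sequence[float], m_min: float) -> float:
    """Maximum-likelihood b-value for continuous magnitudes: log10(e) / (mean(M) - m_min)."""
    if not magnitudes:
        raise ValueError("need at least one magnitude")
    mean_excess = sum(m - m_min for m in magnitudes) / len(magnitudes)
    if mean_excess <= 0.0:
        raise ValueError("mean magnitude must exceed m_min")
    return math.log10(math.e) / mean_excess


# Synthetic catalog with a true b-value of 1 (hypothetical settings).
mags = sample_gr_magnitudes(n=20000, b=1.0, m_min=2.0)
print(estimate_b_value(mags, m_min=2.0))
```

For binned magnitudes, as in real catalogs, the estimator is usually corrected for the bin width; that correction is omitted here for brevity.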
-
Interoccurrence time statistics in the two-dimensional Burridge-Knopoff earthquake model
Authors:
Tomohiro Hasumi
Abstract:
We have numerically investigated statistical properties of the so-called interoccurrence time or the waiting time, i.e., the time interval between successive earthquakes, based on the two-dimensional (2-D) spring-block (Burridge-Knopoff) model, selecting the velocity-weakening property as the constitutive friction law. The statistical properties of frequency distribution and the cumulative distribution of the interoccurrence time are discussed by tuning the dynamical parameters, namely, a stiffness and frictional property of a fault. We optimize these model parameters to reproduce the interoccurrence time statistics in nature; the frequency and cumulative distribution can be described by the power law and Zipf-Mandelbrot type power law, respectively. In an optimal case, the b-value of the Gutenberg-Richter law and the ratio of wave propagation velocity are in agreement with those derived from real earthquakes. As the threshold of magnitude is increased, the interoccurrence time distribution tends to follow an exponential distribution. Hence it is suggested that a temporal sequence of earthquakes, aside from small-magnitude events, is a Poisson process, which is observed in nature. We found that the interoccurrence time statistics derived from the 2-D BK (original) model can efficiently reproduce that of real earthquakes, so that the model can be recognized as a realistic one in view of interoccurrence time statistics.
Submitted 30 August, 2007; v1 submitted 2 August, 2007;
originally announced August 2007.
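The interoccurrence time defined in the abstract above, the interval between successive earthquakes above a magnitude threshold, can be computed from any event list as follows. This is a minimal sketch: the function name `interoccurrence_times` and the toy catalog are hypothetical, and the Burridge-Knopoff simulation that would produce such a catalog is not shown.

```python
from typing import List, Sequence


def interoccurrence_times(times: Sequence[float],
                          magnitudes: Sequence[float],
                          m_threshold: float) -> List[float]:
    """Intervals between successive events whose magnitude is >= m_threshold."""
    if len(times) != len(magnitudes):
        raise ValueError("times and magnitudes must have the same length")
    # Keep only events at or above the threshold, in chronological order.
    selected = sorted(t for t, m in zip(times, magnitudes) if m >= m_threshold)
    return [later - earlier for earlier, later in zip(selected, selected[1:])]


# Toy catalog (hypothetical values): raising the threshold removes small events
# and therefore lengthens the intervals between the remaining ones.
catalog_times = [0.0, 1.5, 2.0, 4.5, 7.0]
catalog_mags = [2.1, 3.4, 2.8, 4.0, 3.1]
print(interoccurrence_times(catalog_times, catalog_mags, m_threshold=2.0))
print(interoccurrence_times(catalog_times, catalog_mags, m_threshold=3.0))
```

Repeating this for increasing thresholds gives the family of interval samples whose distributions are compared in studies of this kind.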