-
On the Statistical Analysis of the Multipath Propagation Model Parameters for Power Line Communications
Authors:
Alberto Pittolo,
Irene Povedano,
José A. Cortés,
Francisco J. Cañete,
Andrea M. Tonello
Abstract:
This paper proposes a fitting procedure that aims to identify the statistical properties of the parameters that describe the most widely known multipath propagation model (MPM) used in power line communication (PLC). Firstly, the MPM parameters are computed by fitting the theoretical model to a large database of single-input-single-output (SISO) experimental measurements, carried out in typical home premises. Secondly, the determined parameters are substituted back into the MPM formulation with the aim of proving their faithfulness, thus validating the proposed computation procedure. Then, the properties of the MPM parameters are evaluated. In particular, the statistical behavior is established by identifying the best fitting distribution, comparing the most common distributions through the use of the likelihood function. Moreover, the relationship among the different paths is highlighted in terms of statistical correlation. The identified statistical behavior for the MPM parameters confirms the assumptions of the previous works that, however, were mostly established in a heuristic way.
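The distribution-selection step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure: the three candidate distributions, the synthetic sample, and the function names are all assumptions made for illustration. Each candidate is fitted by closed-form maximum likelihood and the one with the highest log-likelihood is kept.

```python
import math
import random


def loglik_normal(xs):
    """Maximised log-likelihood of a normal fit (closed-form MLE)."""
    n = len(xs)
    mu = sum(xs) / n
    var = sum((x - mu) ** 2 for x in xs) / n  # MLE (biased) variance
    return -0.5 * n * (math.log(2 * math.pi * var) + 1)


def loglik_lognormal(xs):
    """Maximised log-likelihood of a log-normal fit (requires all xs > 0)."""
    logs = [math.log(x) for x in xs]
    # Change of variables: density of x = normal density of log(x), divided by x.
    return loglik_normal(logs) - sum(logs)


def loglik_exponential(xs):
    """Maximised log-likelihood of an exponential fit (rate = 1 / mean)."""
    n = len(xs)
    return -n * (math.log(sum(xs) / n) + 1)


def best_fit(xs):
    """Return the name of the candidate with the highest log-likelihood."""
    scores = {
        "normal": loglik_normal(xs),
        "lognormal": loglik_lognormal(xs),
        "exponential": loglik_exponential(xs),
    }
    return max(scores, key=scores.get), scores


# Synthetic stand-in for one fitted parameter (hypothetical data, not measurements).
rng = random.Random(0)
sample = [rng.lognormvariate(0.0, 0.5) for _ in range(2000)]
name, scores = best_fit(sample)
print(name, {k: round(v, 1) for k, v in scores.items()})
```

Raw likelihood favors candidates with more free parameters, so a comparison across families of different size would normally add a penalty such as AIC; the abstract does not say whether the authors do so.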
Submitted 2 May, 2024; v1 submitted 26 March, 2024;
originally announced March 2024.
-
Spanish Pre-trained BERT Model and Evaluation Data
Authors:
José Cañete,
Gabriel Chaperon,
Rodrigo Fuentes,
Jou-Hui Ho,
Hojin Kang,
Jorge Pérez
Abstract:
The Spanish language is one of the top 5 spoken languages in the world. Nevertheless, finding resources to train or evaluate Spanish language models is not an easy task. In this paper we help bridge this gap by presenting a BERT-based language model pre-trained exclusively on Spanish data. As a second contribution, we also compiled several tasks specifically for the Spanish language in a single repository much in the spirit of the GLUE benchmark. By fine-tuning our pre-trained Spanish model, we obtain better results compared to other BERT-based models pre-trained on multilingual corpora for most of the tasks, even achieving a new state-of-the-art on some of them. We have publicly released our model, the pre-training data, and the compilation of the Spanish benchmarks.
Submitted 5 August, 2023;
originally announced August 2023.
-
ALBETO and DistilBETO: Lightweight Spanish Language Models
Authors:
José Cañete,
Sebastián Donoso,
Felipe Bravo-Marquez,
Andrés Carvallo,
Vladimir Araujo
Abstract:
In recent years there have been considerable advances in pre-trained language models, where non-English language versions have also been made available. Due to their increasing use, many lightweight versions of these models (with reduced parameters) have also been released to speed up training and inference times. However, versions of these lighter models (e.g., ALBERT, DistilBERT) for languages other than English are still scarce. In this paper we present ALBETO and DistilBETO, which are versions of ALBERT and DistilBERT pre-trained exclusively on Spanish corpora. We train several versions of ALBETO ranging from 5M to 223M parameters and one of DistilBETO with 67M parameters. We evaluate our models in the GLUES benchmark that includes various natural language understanding tasks in Spanish. The results show that our lightweight models achieve competitive results to those of BETO (Spanish-BERT) despite having fewer parameters. More specifically, our larger ALBETO model outperforms all other models on the MLDoc, PAWS-X, XNLI, MLQA, SQAC and XQuAD datasets. However, BETO remains unbeaten for POS and NER. As a further contribution, all models are publicly available to the community for future research.
Submitted 25 January, 2023; v1 submitted 19 April, 2022;
originally announced April 2022.
-
Evaluation Benchmarks for Spanish Sentence Representations
Authors:
Vladimir Araujo,
Andrés Carvallo,
Souvik Kundu,
José Cañete,
Marcelo Mendoza,
Robert E. Mercer,
Felipe Bravo-Marquez,
Marie-Francine Moens,
Alvaro Soto
Abstract:
Due to the success of pre-trained language models, versions of languages other than English have been released in recent years. This fact implies the need for resources to evaluate these models. In the case of Spanish, there are few ways to systematically assess the models' quality. In this paper, we narrow the gap by building two evaluation benchmarks. Inspired by previous work (Conneau and Kiela, 2018; Chen et al., 2019), we introduce Spanish SentEval and Spanish DiscoEval, aiming to assess the capabilities of stand-alone and discourse-aware sentence representations, respectively. Our benchmarks include considerable pre-existing and newly constructed datasets that address different tasks from various domains. In addition, we evaluate and analyze the most recent pre-trained Spanish language models to exhibit their capabilities and limitations. As an example, we discover that for the case of discourse evaluation tasks, mBERT, a language model trained on multiple languages, usually provides a richer latent representation than models trained only with documents in Spanish. We hope our contribution will motivate a fairer, more comparable, and less cumbersome way to evaluate future Spanish language models.
Submitted 15 April, 2022;
originally announced April 2022.
-
Some Formulae of Genocchi Polynomials of Higher Order
Authors:
Cristina B. Corcino,
Roberto B. Corcino,
Joy Ann A. Canete
Abstract:
In this paper, some formulae for Genocchi polynomials of higher order are derived using the fact that sets of Bernoulli and Euler polynomials of higher order form bases for the polynomial space.
Submitted 7 September, 2020;
originally announced September 2020.
-
A $q$-Analogue of $r$-Whitney Numbers of the Second Kind and Its Hankel Transform
Authors:
Roberto B. Corcino,
Jay M. Ontolan,
Jennifer Cañete,
Mary Joy R. Latayada
Abstract:
A $q$-analogue of $r$-Whitney numbers of the second kind, denoted by $W_{m,r}[n,k]_q$, is defined by means of a triangular recurrence relation. In this paper, several fundamental properties for the $q$-analogue are established including other forms of recurrence relations, explicit formulas and generating functions. Moreover, a kind of Hankel transform for $W_{m,r}[n,k]_q$ is obtained.
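For orientation, the classical ($q = 1$) $r$-Whitney numbers of the second kind satisfy the triangular recurrence $W_{m,r}(n,k) = W_{m,r}(n-1,k-1) + (mk+r)\,W_{m,r}(n-1,k)$ with $W_{m,r}(0,0) = 1$. The sketch below tabulates only that classical case; the abstract does not state the paper's $q$-recurrence, so it is not reproduced here, and the function name is a hypothetical choice.

```python
def r_whitney_second_kind(n_max, m, r):
    """Table W[n][k] of the classical (q = 1) r-Whitney numbers of the second kind.

    Uses W(n, k) = W(n-1, k-1) + (m*k + r) * W(n-1, k), with W(0, 0) = 1.
    """
    W = [[0] * (n_max + 1) for _ in range(n_max + 1)]
    W[0][0] = 1
    for n in range(1, n_max + 1):
        for k in range(n + 1):
            left = W[n - 1][k - 1] if k > 0 else 0
            W[n][k] = left + (m * k + r) * W[n - 1][k]
    return W


# m = 1, r = 0 reduces to the Stirling numbers of the second kind.
stirling = r_whitney_second_kind(5, 1, 0)
print(stirling[4][2], stirling[5][3])

# m = 2, r = 1: the row sums are the type-B analogues of the Bell numbers.
type_b = r_whitney_second_kind(3, 2, 1)
print([sum(row) for row in type_b])
```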
Submitted 22 July, 2019; v1 submitted 6 July, 2019;
originally announced July 2019.
-
Unveiling the Hyper-Rayleigh Regime of the Fluctuating Two-Ray Fading Model
Authors:
Celia Garcia-Corrales,
Unai Fernandez-Plazaola,
Francisco J. Cañete,
José F. Paris,
F. Javier Lopez-Martinez
Abstract:
The recently proposed Fluctuating Two-Ray (FTR) model is gaining momentum as a reference fading model in scenarios where two dominant specular waves are present. Despite the numerous research works devoted to the performance analysis under FTR fading, little attention has been paid to effectively understanding the interplay between the fading model parameters and the fading severity. According to a new scale defined in this work, which measures the hyper-Rayleigh character of a fading channel in terms of the Amount of Fading, the outage probability and the average capacity, we see that the FTR fading model exhibits a full hyper-Rayleigh behavior. However, the Two-Wave with Diffuse Power fading model from which the former is derived has only strong hyper-Rayleigh behavior, which constitutes an interesting new insight. We also identify that the random fluctuations in the dominant specular waves are ultimately responsible for the full hyper-Rayleigh behavior of this class of fading channels.
Submitted 30 April, 2019;
originally announced May 2019.
-
A Generalized Spectral Shaping Method for OFDM Signals
Authors:
Luis Díez,
José A. Cortés,
Francisco J. Cañete,
Eduardo Martos,
Salvador Iranzo
Abstract:
Orthogonal frequency division multiplexing (OFDM) signals with rectangularly windowed pulses exhibit low spectral confinement. Two approaches usually referred to as pulse-shaping and active interference cancellation (AIC) are classically employed to reduce the out-of-band emission (OOBE) without affecting the receiver. This paper proposes a spectral shaping method that generalizes and unifies these two strategies. To this end, the OFDM carriers are shaped with novel pulses, referred to as generalized pulses, that consist of the ones used in conventional OFDM systems plus a series of cancellation terms aimed at reducing the OOBE of the former. Hence, each generalized pulse embeds all the terms required to reduce its spectrum in the desired bands. This leads to a data-independent optimization problem that notably simplifies the implementation complexity and allows the analytical calculation of the resulting power spectral density (PSD), which in most methods found in the literature can only be estimated by means of simulations. As an example of its performance, the proposed technique allows complying with the stringent PSD mask imposed by the EN 50561-1 with a data carrier loss lower than 4%. By contrast, 28% of the data carriers have to be nulled when pulse-shaping is employed in this scenario.
Submitted 25 July, 2018;
originally announced July 2018.