-
BEIR-PL: Zero Shot Information Retrieval Benchmark for the Polish Language
Authors:
Konrad Wojtasik,
Vadim Shishkin,
Kacper Wołowiec,
Arkadiusz Janz,
Maciej Piasecki
Abstract:
The BEIR dataset is a large, heterogeneous benchmark for Information Retrieval (IR) in zero-shot settings, garnering considerable attention within the research community. However, BEIR and analogous datasets are predominantly restricted to the English language. Our objective is to establish extensive large-scale resources for IR in the Polish language, thereby advancing the research in this NLP area. In this work, inspired by the mMARCO and Mr. TyDi datasets, we translated all accessible open IR datasets into Polish, and we introduced the BEIR-PL benchmark -- a new benchmark which comprises 13 datasets, facilitating further development, training and evaluation of modern Polish language models for IR tasks. We executed an evaluation and comparison of numerous IR models on the newly introduced BEIR-PL benchmark. Furthermore, we publish pre-trained open IR models for the Polish language, marking a pioneering development in this field. Additionally, the evaluation revealed that BM25 achieved significantly lower scores for Polish than for English, which can be attributed to the high inflection and intricate morphological structure of the Polish language. Finally, we trained various re-ranking models to enhance the BM25 retrieval, and we compared their performance to identify their unique characteristic features. To ensure accurate model comparisons, it is necessary to scrutinise individual results rather than to average across the entire benchmark. Thus, we thoroughly analysed the outcomes of IR models in relation to each individual data subset encompassed by the BEIR benchmark. The benchmark data is available at https://huggingface.co/clarin-knext.
Submitted 16 May, 2024; v1 submitted 31 May, 2023;
originally announced May 2023.
-
ChatGPT: Jack of all trades, master of none
Authors:
Jan Kocoń,
Igor Cichecki,
Oliwier Kaszyca,
Mateusz Kochanek,
Dominika Szydło,
Joanna Baran,
Julita Bielaniewicz,
Marcin Gruza,
Arkadiusz Janz,
Kamil Kanclerz,
Anna Kocoń,
Bartłomiej Koptyra,
Wiktoria Mieleszczenko-Kowszewicz,
Piotr Miłkowski,
Marcin Oleksy,
Maciej Piasecki,
Łukasz Radliński,
Konrad Wojtasik,
Stanisław Woźniak,
Przemysław Kazienko
Abstract:
OpenAI has released the Chat Generative Pre-trained Transformer (ChatGPT) and revolutionized the approach in artificial intelligence to human-model interaction. Several publications on ChatGPT evaluation test its effectiveness on well-known natural language processing (NLP) tasks. However, the existing studies are mostly non-automated and tested on a very limited scale. In this work, we examined ChatGPT's capabilities on 25 diverse analytical NLP tasks, most of them subjective even to humans, such as sentiment analysis, emotion recognition, offensiveness, and stance detection. In contrast, the other tasks require more objective reasoning, like word sense disambiguation, linguistic acceptability, and question answering. We also evaluated the GPT-4 model on five selected subsets of NLP tasks. We automated the ChatGPT and GPT-4 prompting process and analyzed more than 49k responses. Our comparison of its results with available State-of-the-Art (SOTA) solutions showed that the average loss in quality of the ChatGPT model was about 25% for zero-shot and few-shot evaluation. For the GPT-4 model, the loss for semantic tasks is significantly lower than for ChatGPT. We showed that the more difficult the task (lower SOTA performance), the higher the ChatGPT loss. This applies especially to pragmatic NLP problems like emotion recognition. We also tested the ability to personalize ChatGPT responses for selected subjective tasks via Random Contextual Few-Shot Personalization, and we obtained significantly better user-based predictions. Additional qualitative analysis revealed a ChatGPT bias, most likely due to the rules imposed on human trainers by OpenAI. Our results provide the basis for a fundamental discussion of whether the high quality of recent predictive NLP models can indicate a tool's usefulness to society and how the learning and validation procedures for such systems should be established.
Submitted 9 June, 2023; v1 submitted 21 February, 2023;
originally announced February 2023.
-
This is the way: designing and compiling LEPISZCZE, a comprehensive NLP benchmark for Polish
Authors:
Łukasz Augustyniak,
Kamil Tagowski,
Albert Sawczyn,
Denis Janiak,
Roman Bartusiak,
Adrian Szymczak,
Marcin Wątroba,
Arkadiusz Janz,
Piotr Szymański,
Mikołaj Morzy,
Tomasz Kajdanowicz,
Maciej Piasecki
Abstract:
The availability of compute and data to train larger and larger language models increases the demand for robust methods of benchmarking the true progress of LM training. Recent years witnessed significant progress in standardized benchmarking for English. Benchmarks such as GLUE, SuperGLUE, or KILT have become de facto standard tools to compare large language models. Following the trend to replicate GLUE for other languages, the KLEJ benchmark has been released for Polish. In this paper, we evaluate the progress in benchmarking for low-resourced languages. We note that only a handful of languages have such comprehensive benchmarks. We also note the gap in the number of tasks being evaluated by benchmarks for resource-rich English/Chinese and the rest of the world. In this paper, we introduce LEPISZCZE (the Polish word for glew, the Middle English predecessor of glue), a new, comprehensive benchmark for Polish NLP with a large variety of tasks and high-quality operationalization of the benchmark. We design LEPISZCZE with flexibility in mind. Including new models, datasets, and tasks is as simple as possible while still offering data versioning and model tracking. In the first run of the benchmark, we test 13 experiments (task and dataset pairs) based on the five most recent LMs for Polish. We use five datasets from the Polish benchmark and add eight novel datasets. As the paper's main contribution, apart from LEPISZCZE, we provide insights and experiences learned while creating the benchmark for Polish as the blueprint to design similar benchmarks for other low-resourced languages.
Submitted 23 November, 2022;
originally announced November 2022.
-
Photoluminescence features and nonlinear-optical properties of the Ag0.05Ga0.05Ge0.95S2-Er2S3 glasses
Authors:
V. V. Halyan,
V. O. Yukhymchuk,
Ye. G. Gule,
K. Ozga,
K. J. Jedryka,
I. A. Ivashchenko,
M. A. Skoryk,
A. H. Kevshyn,
I. D. Olekseyuk,
P. V. Tishchenko,
M. V. Shevchuk,
M. Piasecki
Abstract:
Preparation technology, structure determination, multiband luminescence and nonlinear optical properties of the chalcogenide glasses are the subject of the present work. Glass samples with two different Er contents were prepared by the classical two-stage melt-quenching method. The glass state and morphology were confirmed by X-ray and EDS techniques. The influence of erbium doping on the luminescence and NLO properties was investigated.
Submitted 8 January, 2020;
originally announced January 2020.
-
On some difficulties in the addition of trapezoidal ordered fuzzy numbers
Authors:
Anna Łyczkowska-Hanćkowiak,
Krzysztof Piasecki
Abstract:
First, we revise the Kosinski definition of the sum of ordered fuzzy numbers. The associativity of the revised sum is investigated here. In addition, we show that the multiple revised sum of a finite sequence of trapezoidal ordered fuzzy numbers depends on the ordering of its summands.
Submitted 10 October, 2017;
originally announced October 2017.
-
WordNet2Vec: Corpora Agnostic Word Vectorization Method
Authors:
Roman Bartusiak,
Łukasz Augustyniak,
Tomasz Kajdanowicz,
Przemysław Kazienko,
Maciej Piasecki
Abstract:
The complex nature of big data resources demands new methods for structuring, especially for textual content. WordNet is a good knowledge source for comprehensive abstraction of natural language, as good implementations of it exist for many languages. Since WordNet embeds natural language in the form of a complex network, a transformation mechanism, WordNet2Vec, is proposed in the paper. It creates vectors for each word from WordNet. These vectors encapsulate the general position, or role, of a given word with respect to all other words in the natural language. Any list or set of such vectors contains knowledge about the context of its components within the whole language. Such a word representation can be easily applied to many analytic tasks like classification or clustering. The usefulness of the WordNet2Vec method was demonstrated in sentiment analysis, i.e. classification with transfer learning for the real Amazon opinion textual dataset.
Submitted 10 June, 2016;
originally announced June 2016.
-
Black-Litterman model with intuitionistic fuzzy posterior return
Authors:
Krzysztof Echaust,
Krzysztof Piasecki
Abstract:
The main objective is to present a variant of the Black-Litterman model. We consider the canonical case in which the a priori return is determined by the excess return from the CAPM market portfolio, derived using the reverse optimization method. The a priori return is then under risk, that is, quantified uncertainty. On the other hand, intensive discussion shows that the experts' views are under Knightian uncertainty. For this reason, we propose a variant of the Black-Litterman model in which the experts' views are described as an intuitionistic fuzzy number. The existence of the posterior return is proved for this case. We show that the posterior return is then an intuitionistic fuzzy probabilistic set.
Submitted 3 January, 2016;
originally announced January 2016.
-
Design and Performance of the ARIANNA Hexagonal Radio Array Systems
Authors:
S. W. Barwick,
E. C. Berg,
D. Z. Besson,
E. Cheim,
T. Duffin,
J. C. Hanson,
S. R. Klein,
S. A. Kleinfelder,
T. Prakash,
M. Piasecki,
K. Ratzlaff,
C. Reed,
M. Roumi,
A. Samanta,
T. Stezelberger,
J. Tatar,
J. Walker,
R. Young,
L. Zou
Abstract:
We report on the development, installation and operation of the first three of seven stations deployed at the ARIANNA site's pilot Hexagonal Radio Array in Antarctica. The primary goal of the ARIANNA project is to observe ultra-high energy (>100 PeV) cosmogenic neutrino signatures using a large array of autonomous stations each dispersed 1 km apart on the surface of the Ross Ice Shelf. Sensing radio emissions of 100 MHz to 1 GHz, each station in the array contains RF antennas, amplifiers, 1.92 G-sample/s, 850 MHz bandwidth signal acquisition circuitry, pattern-matching trigger capabilities, an embedded CPU, 32 GB of solid-state data storage, and long-distance wireless and satellite communications. Power is provided by the sun and LiFePO4 storage batteries, and the stations consume an average of 7W of power. Operation on solar power has resulted in >=58% per calendar-year live-time. The station's pattern-trigger capabilities reduce the trigger rates to a few milli-Hertz with 4-sigma thresholds while retaining good stability and high efficiency for neutrino signals. The timing resolution of the station has been found to be 0.049 ps, RMS, and the angular precision of event reconstructions of signals bounced off of the sea-ice interface of the Ross Ice Shelf ranged from 0.14 to 0.17 degrees. A new fully-synchronous 2+ G-sample/s, 1.5 GHz bandwidth 4-channel signal acquisition chip with deeper memory and flexible >600 MHz, <1 mV RMS sensitivity triggering has been designed and incorporated into a single-board data acquisition and control system that uses an average of only 1.7W of power. Along with updated amplifiers, these new systems are expected to be deployed during the 2014-2015 Austral summer to complete the Hexagonal Radio Array.
Submitted 27 October, 2014;
originally announced October 2014.
-
Time Domain Response of the ARIANNA Detector
Authors:
S. W. Barwick,
E. C. Berg,
D. Z. Besson,
J. C. Hanson,
S. R. Klein,
S. A. Kleinfelder,
M. Piasecki,
K. Ratzlaff,
C. Reed,
M. Roumi,
T. Stezelberger,
J. Tatar,
J. Walker,
R. Young,
L. Zou
Abstract:
The Antarctic Ross Ice Shelf Antenna Neutrino Array (ARIANNA) is a high-energy neutrino detector designed to record the Askaryan electric field signature of cosmogenic neutrino interactions in ice. To understand the inherent radio-frequency (RF) neutrino signature, the time-domain response of the ARIANNA RF receiver must be measured. ARIANNA uses Create CLP5130-2N log-periodic dipole arrays (LPDAs). The associated effective height operator converts incident electric fields to voltage waveforms at the LPDA terminals. The effective height versus time and incident angle was measured, along with the associated response of the ARIANNA RF amplifier. The results are verified by correlating to field measurements in air and ice, using oscilloscopes. Finally, theoretical models for the Askaryan electric field are combined with the detector response to predict the neutrino signature.
Submitted 20 October, 2014; v1 submitted 3 June, 2014;
originally announced June 2014.
-
Behavioural present value
Authors:
Krzysztof Piasecki
Abstract:
The impact of selected behavioural factors on the imprecision of present value is discussed here. A formal model of behavioural present value is offered as a result of this discussion. Behavioural present value is described here by a fuzzy set. These considerations are illustrated by an extensive numerical case study. Finally, it is shown that in the proposed model the return rate is given as a fuzzy probabilistic set.
Submitted 3 February, 2013;
originally announced February 2013.
-
On return rate implied by behavioural present value
Authors:
Krzysztof Piasecki
Abstract:
The future value of a security is described as a random variable. The distribution of this random variable is the formal image of risk uncertainty. On the other hand, any present value is defined as a value equivalent to the given future value. This equivalence relationship is subjective. It follows that present value is described as a fuzzy number, which depends on the investor's susceptibility to behavioural factors. All the above reasons imply that the return rate is given as a fuzzy probabilistic set. The basic properties of such an image of the return rate are studied. Finally, the set of effective securities is distinguished as a fuzzy set.
Submitted 3 February, 2013;
originally announced February 2013.
-
Basis of financial arithmetic from the viewpoint of the utility theory
Authors:
Krzysztof Piasecki
Abstract:
The main goal of this paper is to present a modern axiomatic approach to financial arithmetic. The axiomatic theory of financial arithmetic was first proposed by Peccati, who introduced the axiomatic definition of the future value. This theory has been extensively developed in past years. The proposed approach to financial arithmetic is based on the concept of financial flow utility. This utility function is defined as a linear extension of the multicriteria comparison determined by the time preference and the capital preference. The present value is then equal to the financial flow utility. Therefore, the law of diminishing marginal wealth utility is considered as an additional feature of the present value. The future value is defined as the inverse of the utility function. This definition is a generalization of Peccati's. The net present value is given as the unique additive extension of financial flow utility. Moreover, the synergy effect and the diversification effect are discussed. Finally, the axiomatic present value definition is specified in three ways.
Submitted 3 February, 2013;
originally announced February 2013.
-
Black hole masses from power density spectra: determinations and consequences
Authors:
B. Czerny,
M. Nikolajuk,
M. Piasecki,
J. Kuraszkiewicz
Abstract:
We analyze the scaling of the X-ray power density spectra with the mass of the black hole on the example of Cyg X-1 and Seyfert 1 galaxy NGC 5548. We show that the high frequency tail of the power density spectrum can be successfully used for determination of the black hole mass. We determine the masses of the black holes in 6 Broad Line Seyfert 1 galaxies, 5 Narrow Line Seyfert 1 galaxies and two QSOs using available power density spectra. The proposed scaling is clearly appropriate for other Seyfert galaxies and QSOs. In all but 1 normal Seyferts the resulting luminosity to the Eddington luminosity ratio is smaller than 0.15, with a source MCG -6-15-30 being an exception. The applicability of the same scaling to Narrow Line Seyfert 1 is less clear and there may be a systematic shift between the power spectra of NLS1 and S1 galaxies of the same mass, leading to underestimation of the black hole mass. However, both the method based on variability and the method based on spectral fitting show that those galaxies have relatively low masses and high luminosity to the Eddington luminosity ratio, supporting the view of those objects as analogs of galactic sources in their high/soft or very high state based on the overall spectral shape. Bulge masses of their host galaxies are similar to normal Seyfert galaxies so they do not follow the black hole mass-bulge mass relation for Seyfert galaxies, being evolutionary less advanced, as suggested by Mathur (2000). The bulge mass-black hole mass relation in our sample is consistent with being linear, with black hole to bulge ratio $\sim$ 0.03 %, similar to Wandel (1999) and Laor (1998, 2001) for low mass objects but significantly shifted from the relation of Magorrian et al. (1998) and McLure & Dunlop (2000).
Submitted 19 March, 2001; v1 submitted 13 September, 2000;
originally announced September 2000.