-
From Artificial Needles to Real Haystacks: Improving Retrieval Capabilities in LLMs by Finetuning on Synthetic Data
Authors:
Zheyang Xiong,
Vasilis Papageorgiou,
Kangwook Lee,
Dimitris Papailiopoulos
Abstract:
Recent studies have shown that Large Language Models (LLMs) struggle to accurately retrieve information and maintain reasoning capabilities when processing long-context inputs. To address these limitations, we propose a finetuning approach utilizing a carefully designed synthetic dataset comprising numerical key-value retrieval tasks. Our experiments on models like GPT-3.5 Turbo and Mistral 7B demonstrate that finetuning LLMs on this dataset significantly improves LLMs' information retrieval and reasoning capabilities in longer-context settings. We present an analysis of the finetuned models, illustrating the transfer of skills from synthetic to real task evaluations (e.g., $10.5\%$ improvement on $20$ documents MDQA at position $10$ for GPT-3.5 Turbo). We also find that finetuned LLMs' performance on general benchmarks remains almost constant while LLMs finetuned on other baseline long-context augmentation data can encourage hallucination (e.g., on TriviaQA, Mistral 7B finetuned on our synthetic data causes no performance drop while other baseline data can cause a drop that ranges from $2.33\%$ to $6.19\%$). Our study highlights the potential of finetuning on synthetic data for improving the performance of LLMs on longer-context tasks.
Submitted 27 June, 2024;
originally announced June 2024.
-
A semi-periodic initial-value problem for the Kadomtsev-Petviashvili II equation
Authors:
P. Kalamvokas,
V. G. Papageorgiou,
A. S. Fokas,
L. -Y. Sung
Abstract:
We investigate the Cauchy problem on the cylinder, namely the semi-periodic problem where there is periodicity in the $x$-direction and decay in the $y$-direction, for the Kadomtsev-Petviashvili II equation by the inverse spectral transform method. For initial data with small $L^1$ and $L^2$ norms, assuming the zero mass constraint, this initial-value problem is reduced to a Riemann-Hilbert problem on the boundary of certain infinite strips with shift.
Submitted 19 March, 2023;
originally announced March 2023.
-
A convolutional neural network of low complexity for tumor anomaly detection
Authors:
Vasileios E. Papageorgiou,
Pantelis Dogoulis,
Dimitrios-Panagiotis Papageorgiou
Abstract:
The automated detection of cancerous tumors has attracted interest mainly during the last decade, due to the necessity of early and efficient diagnosis that will lead to the most effective possible treatment of the impending risk. Several machine learning and artificial intelligence methodologies have been employed aiming to provide trustworthy helping tools that will contribute efficiently to this attempt. In this article, we present a low-complexity convolutional neural network architecture for tumor classification enhanced by a robust image augmentation methodology. The effectiveness of the presented deep learning model has been investigated based on 3 datasets containing brain, kidney and lung images, showing remarkable diagnostic efficiency with classification accuracies of 99.33%, 100% and 99.7% for the 3 datasets respectively. The impact of the augmentation preprocessing step has also been extensively examined using 4 evaluation measures. The proposed low-complexity scheme, in contrast to other models in the literature, renders our model quite robust to cases of overfitting that typically accompany small datasets frequently encountered in medical classification challenges. Finally, the model can be easily re-trained in case additional volume images are included, as its simplistic architecture does not impose a significant computational burden.
Submitted 12 October, 2023; v1 submitted 24 January, 2023;
originally announced January 2023.
-
Seeking the Truth Beyond the Data. An Unsupervised Machine Learning Approach
Authors:
Dimitrios Saligkaras,
Vasileios E. Papageorgiou
Abstract:
Clustering is an unsupervised machine learning methodology where unlabeled elements/objects are grouped together with the aim of constructing well-established clusters whose elements are classified according to their similarity. The goal of this process is to provide a useful aid to the researcher that will help her/him to identify patterns among the data. Dealing with large databases, such patterns may not be easily detectable without the contribution of a clustering algorithm. This article provides a deep description of the most widely used clustering methodologies accompanied by useful presentations concerning suitable parameter selection and initializations. Simultaneously, this article not only represents a review highlighting the major elements of examined clustering techniques but emphasizes the comparison of these algorithms' clustering efficiency based on 3 datasets, revealing their existing weaknesses and capabilities through accuracy and complexity, during the confrontation of discrete and continuous observations. The produced results help us extract valuable conclusions about the appropriateness of the examined clustering techniques in accordance with the dataset's size.
Submitted 19 October, 2023; v1 submitted 14 July, 2022;
originally announced July 2022.
-
An improved Epidemiological-Unscented Kalman Filter (Hybrid SEIHCRDV-UKF) model for the prediction of COVID-19. Application on real-time data
Authors:
Vasileios E. Papageorgiou,
George Tsaklidis
Abstract:
The prevalence of COVID-19 has been the most serious health challenge of the 21st century to date, concerning national health systems on a daily basis, since December 2019 when it appeared in Wuhan City. Nevertheless, most of the proposed mathematical methodologies aiming to describe the dynamics of an epidemic, rely on deterministic models that are not able to reflect the true nature of its spread. In this paper, we propose a SEIHCRDV model - an extension/improvement of the classic SIR compartmental model - which also takes into consideration the populations of exposed, hospitalized, admitted in intensive care units (ICU), deceased and vaccinated cases, in combination with an unscented Kalman filter (UKF), providing a dynamic estimation of the time dependent system's parameters. The stochastic approach is considered necessary, as both observations and system equations are characterized by uncertainties. Apparently, this new consideration is useful for examining various pandemics more effectively. The reliability of the model is examined on the daily recordings of COVID-19 in France, over a long period of 265 days. Two major waves of infection are observed, starting in January 2021, which signified the start of vaccinations in Europe providing quite encouraging predictive performance, based on the produced NRMSE values. Special emphasis is placed on proving the non-negativity of SEIHCRDV model, achieving a representative basic reproductive number R0 and demonstrating the existence and stability of disease equilibria according to the formula produced to estimate R0. The model outperforms in predictive ability not only deterministic approaches but also state-of-the-art stochastic models that employ Kalman filters.
Submitted 29 November, 2022; v1 submitted 3 July, 2022;
originally announced July 2022.
-
Analysis of Digitalized ECG Signals Based on Artificial Intelligence and Spectral Analysis Methods Specialized in ARVC
Authors:
Vasileios E. Papageorgiou,
Thomas Zegkos,
Georgios Efthimiadis,
George Tsaklidis
Abstract:
Arrhythmogenic right ventricular cardiomyopathy (ARVC) is an inherited heart muscle disease that appears between the second and fourth decade of a patient's life, being responsible for 20% of sudden cardiac deaths before the age of 35. The effective and punctual diagnosis of this disease based on Electrocardiograms (ECGs) could have a vital role in reducing premature cardiovascular mortality. In our analysis, we firstly outline the digitalization process of paper-based ECG signals enhanced by a spatial filter aiming to eliminate dark regions in the dataset's images that do not correspond to ECG waveform, producing undesirable noise. Next, we propose the utilization of a low-complexity convolutional neural network for the detection of an arrhythmogenic heart disease, that has not been studied through the usage of deep learning methodology to date, achieving high classification accuracy, namely 99.98% training and 98.6% testing accuracy, on a disease the major identification criterion of which is infinitesimal millivolt variations in the ECG's morphology, in contrast with other arrhythmogenic abnormalities. Finally, by performing spectral analysis we investigate significant differentiations in the field of frequencies between normal ECGs and ECGs corresponding to patients suffering from ARVC. In 16 out of the 18 frequencies where we encounter statistically significant differentiations, the normal ECGs are characterized by greater normalized amplitudes compared to the abnormal ones. The overall research carried out in this article highlights the importance of integrating mathematical methods into the examination and effective diagnosis of various diseases, aiming at a substantial contribution to their successful treatment.
Submitted 3 September, 2022; v1 submitted 28 February, 2022;
originally announced March 2022.
-
Tetrahedron maps and symmetries of three dimensional integrable discrete equations
Authors:
Pavlos Kassotakis,
Maciej Nieszporski,
Vassilios Papageorgiou,
Anastasios Tongas
Abstract:
A relationship between the tetrahedron equation for maps and the consistency property of integrable discrete equations on $\mathbb{Z}^3$ is investigated. Our approach is a generalization of a method developed in the context of Yang-Baxter maps, based on the invariants of symmetry groups of the lattice equations. The method is demonstrated by a case-by-case analysis of the octahedron type lattice equations classified recently, leading to some new examples of tetrahedron maps and integrable coupled lattice equations.
Submitted 8 August, 2019;
originally announced August 2019.
-
Integrable multi-component difference systems of equations
Authors:
Pavlos Kassotakis,
Maciej Nieszporski,
Vassilios Papageorgiou,
Anastasios Tongas
Abstract:
We present two lists of multi-component systems of integrable difference equations defined on the edges of a $\mathbb{Z}^2$ graph. The integrability of these systems is manifested by their Lax formulation which is a consequence of the multi-dimensional compatibility of these systems. Imposing constraints consistent with the systems of difference equations, we recover known integrable quad-equations including the discrete version of the Krichever-Novikov equation. The systems of difference equations allow for a straightforward reformulation as Yang-Baxter maps. Certain two-component systems of equations defined on the vertices of a $\mathbb{Z}^2$ lattice, their non-potential form and integrable equations defined on 5-point stencils, are also obtained.
Submitted 6 August, 2019;
originally announced August 2019.
-
3D compatible ternary systems and Yang-Baxter maps
Authors:
Theodoros E. Kouloukas,
Vassilios G. Papageorgiou
Abstract:
According to Shibukawa, ternary systems defined on quasigroups and satisfying certain conditions provide a way of constructing dynamical Yang-Baxter maps. After noticing that these conditions can be interpreted as 3-dimensional compatibility of equations on quad-graphs, we investigate when the associated dynamical Yang-Baxter maps are in fact parametric Yang-Baxter maps. In some cases these maps can be obtained as reductions of higher dimensional maps through compatible constraints. Conversely, parametric YB maps on quasigroups with an invariance condition give rise to 3-dimensional compatible systems. The application of this method on spaces with certain quasigroup structures provides new examples of multi-parametric YB maps and 3-dimensional compatible systems.
Submitted 26 February, 2013; v1 submitted 9 March, 2012;
originally announced March 2012.
-
Poisson Yang-Baxter maps with binomial Lax matrices
Authors:
Theodoros E. Kouloukas,
Vassilios G. Papageorgiou
Abstract:
A construction of multidimensional parametric Yang-Baxter maps is presented. The corresponding Lax matrices are the symplectic leaves of first degree matrix polynomials equipped with the Sklyanin bracket. These maps are symplectic with respect to the reduced symplectic structure on these leaves and provide examples of integrable mappings. An interesting family of quadrirational symplectic YB maps on $\mathbb{C}^4 \times \mathbb{C}^4$ with $3\times 3$ Lax matrices is also presented.
Submitted 4 June, 2011; v1 submitted 1 June, 2011;
originally announced June 2011.
-
Entwining Yang-Baxter maps and integrable lattices
Authors:
Theodoros E. Kouloukas,
Vassilios G. Papageorgiou
Abstract:
Yang-Baxter (YB) map systems (or set-theoretic analoga of entwining YB structures) are presented. They admit zero curvature representations with spectral parameter dependent Lax triples L1, L2, L3 derived from symplectic leaves of 2 x 2 binomial matrices equipped with the Sklyanin bracket. A unique factorization condition of the Lax triple implies a 3-dimensional compatibility property of these maps. In case L1 = L2 = L3 this property yields the set-theoretic quantum Yang-Baxter equation, i.e. the YB map property. By considering periodic 'staircase' initial value problems on quadrilateral lattices, these maps give rise to multidimensional integrable mappings which preserve the spectrum of the corresponding monodromy matrix.
Submitted 10 June, 2010;
originally announced June 2010.
-
On Quadrirational Yang-Baxter Maps
Authors:
V. G. Papageorgiou,
Yu. B. Suris,
A. G. Tongas,
A. P. Veselov
Abstract:
We use the classification of the quadrirational maps given by Adler, Bobenko and Suris to describe when such maps satisfy the Yang-Baxter relation. We show that the corresponding maps can be characterized by certain singularity invariance condition. This leads to some new families of Yang-Baxter maps corresponding to the geometric symmetries of pencils of quadrics.
Submitted 16 April, 2010; v1 submitted 15 November, 2009;
originally announced November 2009.
-
Yang-Baxter maps associated to elliptic curves
Authors:
Vassilios G. Papageorgiou,
Anastasios G. Tongas
Abstract:
We present Yang-Baxter maps associated to elliptic curves. They are related to discrete versions of the Krichever-Novikov and the Landau-Lifshits equations. A lifting of scalar integrable quad-graph equations to two-field equations is also shown.
Submitted 17 June, 2009;
originally announced June 2009.
-
Symmetries and integrability of discrete equations defined on a black-white lattice
Authors:
P. D. Xenitidis,
V. G. Papageorgiou
Abstract:
We study the deformations of the H equations, presented recently by Adler, Bobenko and Suris, which are naturally defined on a black-white lattice. For each one of these equations, two different three-leg forms are constructed, leading to two different discrete Toda type equations. Their multidimensional consistency leads to Bäcklund transformations relating different members of this class, as well as to Lax pairs. Their symmetry analysis is presented yielding infinite hierarchies of generalized symmetries.
Submitted 18 March, 2009;
originally announced March 2009.
-
Yang Baxter maps with first degree polynomial 2 by 2 Lax matrices
Authors:
Theodoros E. Kouloukas,
Vassilios G. Papageorgiou
Abstract:
A family of nonparametric Yang Baxter (YB) maps is constructed by refactorization of the product of two 2 by 2 matrix polynomials of first degree. These maps are Poisson with respect to the Sklyanin bracket. For each Casimir function a parametric Poisson YB map is generated by reduction on the corresponding level set. By considering a complete set of Casimir functions symplectic multiparametric YB maps are derived. These maps are quadrirational with explicit formulae in terms of matrix operations. Their Lax matrices are, by construction, 2 by 2 first degree polynomial in the spectral parameter and are classified by Jordan normal form of the leading term. Nonquadrirational parametric YB maps constructed as limits of the quadrirational ones are connected to known integrable systems on quad graphs.
Submitted 10 March, 2009;
originally announced March 2009.
-
Yang-Baxter maps and multi-field integrable lattice equations
Authors:
V. G. Papageorgiou,
A. G. Tongas
Abstract:
A variety of Yang-Baxter maps are obtained from integrable multi-field equations on quad-graphs. A systematic framework for investigating this connection relies on the symmetry groups of the equations. The method is applied to lattice equations introduced by Adler and Yamilov and which are related to the nonlinear superposition formulae for the Bäcklund transformations of the nonlinear Schrödinger system and specific ferromagnetic models.
Submitted 5 October, 2007; v1 submitted 20 February, 2007;
originally announced February 2007.
-
The last integrable case of Kozlov-Treshchev Birkhoff integrable potentials
Authors:
Pantelis A. Damianou,
Vassilis Papageorgiou
Abstract:
We establish the integrability of the last open case in the Kozlov-Treshchev classification of Birkhoff integrable Hamiltonian systems. The technique used is a modification of the so called quadratic Lax pair for $D_n$ Toda lattice combined with a method used by M. Ranada in proving the integrability of the Sklyanin case.
Submitted 10 January, 2007;
originally announced January 2007.
-
Yang-Baxter maps and symmetries of integrable equations on quad-graphs
Authors:
Vassilios G. Papageorgiou,
Anastasios G. Tongas,
Alexander P. Veselov
Abstract:
A connection between the Yang-Baxter relation for maps and the multi-dimensional consistency property of integrable equations on quad-graphs is investigated. The approach is based on the symmetry analysis of the corresponding equations. It is shown that the Yang-Baxter variables can be chosen as invariants of the multi-parameter symmetry groups of the equations. We use the classification results by Adler, Bobenko and Suris to demonstrate this method. Some new examples of Yang-Baxter maps are derived in this way from multi-field integrable equations.
Submitted 8 May, 2006;
originally announced May 2006.
-
Linearization And Solutions Of The Discrete Painlevé-III Equation
Authors:
B. Grammaticos,
F. W. Nijhoff,
V. Papageorgiou,
A. Ramani
Abstract:
We present particular solutions of the discrete Painlevé III (d-P$\rm_{III}$) equation of rational and special function (Bessel) type. These solutions allow us to establish a close parallel between this discrete equation and its continuous counterpart. Moreover, we propose an alternate form for d-P$\rm_{III}$ and confirm its integrability by explicitly deriving its Lax pair.
Submitted 3 November, 1993;
originally announced October 1993.