-
A stochastic approach to handle knapsack problems in the creation of ensembles
Authors:
Andras Hajdu,
Gyorgy Terdik,
Attila Tiba,
Henrietta Toman
Abstract:
Ensemble-based methods are highly popular approaches that increase the accuracy of a decision by aggregating the opinions of individual voters. The common point is to maximize accuracy; however, a natural limitation occurs if incremental costs are also assigned to the individual voters. Consequently, we investigate creating ensembles under an additional constraint on the total cost of the members. This task can be formulated as a knapsack problem, where the energy is the ensemble accuracy formed by some aggregation rules. However, the generally applied aggregation rules lead to a nonseparable energy function, which takes the common solution tools, such as dynamic programming, out of action. We introduce a novel stochastic approach that considers the energy as the joint probability function of the member accuracies. This type of knowledge can be efficiently incorporated in a stochastic search process as a stopping rule, since we have the information on the expected accuracy or, alternatively, the probability of finding more accurate ensembles. Experimental analyses of the created ensembles of pattern classifiers and object detectors confirm the efficiency of our approach. Moreover, we propose a novel stochastic search strategy that better fits the energy, compared with general approaches such as simulated annealing.
Submitted 17 April, 2020;
originally announced April 2020.
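To make the knapsack formulation in the abstract above concrete, here is a minimal sketch of the objective. It assumes binary classification, statistically independent voters, and strict majority voting as the aggregation rule. The function names `majority_accuracy` and `best_ensemble` are hypothetical, and the exhaustive search is only a small-scale baseline, not the stochastic search proposed by the authors.

```python
from itertools import combinations


def majority_accuracy(accs):
    """Probability that a strict majority of independent voters is correct.

    Assumption: voters err independently, so the number of correct votes
    follows a Poisson binomial distribution with parameters `accs`.
    """
    # dist[k] = probability that exactly k of the voters seen so far are correct
    dist = [1.0]
    for p in accs:
        new = [0.0] * (len(dist) + 1)
        for k, q in enumerate(dist):
            new[k] += q * (1.0 - p)   # this voter is wrong
            new[k + 1] += q * p       # this voter is correct
        dist = new
    n = len(accs)
    return sum(q for k, q in enumerate(dist) if 2 * k > n)


def best_ensemble(accs, costs, budget):
    """Exhaustive baseline: best subset whose total cost stays within budget."""
    best_subset, best_acc = (), 0.0
    for r in range(1, len(accs) + 1):
        for subset in combinations(range(len(accs)), r):
            if sum(costs[i] for i in subset) > budget:
                continue  # violates the knapsack constraint
            acc = majority_accuracy([accs[i] for i in subset])
            if acc > best_acc:
                best_subset, best_acc = subset, acc
    return best_subset, best_acc
```

With accuracies `[0.9, 0.6, 0.6]`, unit costs, and a budget of 3, the single strongest voter beats the full ensemble. Adding a member can lower the objective, which illustrates why the energy is not a sum of per-item values and why standard dynamic programming does not apply directly.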
-
Optimizing Majority Voting Based Systems Under a Resource Constraint for Multiclass Problems
Authors:
Attila Tiba,
Andras Hajdu,
Gyorgy Terdik,
Henrietta Toman
Abstract:
Ensemble-based approaches are very effective in various fields in raising the accuracy of their individual members when some voting rule is applied for aggregating the individual decisions. In this paper, we investigate how to find and characterize the ensembles having the highest accuracy if the total cost of the ensemble members is bounded. This question leads to a knapsack problem with a non-linear and non-separable objective function in binary and multiclass classification if majority voting is chosen for the aggregation. As the conventional solving methods cannot be applied to this task, a novel stochastic approach was introduced in the binary case, where the energy function is treated as the joint probability function of the member accuracies. We show some theoretical results with respect to the expected ensemble accuracy and its variance in the multiclass classification problem, which can help us to solve the knapsack problem.
Submitted 8 April, 2019;
originally announced April 2019.
-
Harmonic analysis and distribution-free inference for spherical distributions
Authors:
S. Rao Jammalamadaka,
Gyorgy Terdik
Abstract:
Fourier analysis and the representation of circular distributions in terms of their Fourier coefficients are quite commonly discussed and used for model-free inference, such as testing uniformity and symmetry, in dealing with 2-dimensional directions. However, a similar discussion for spherical distributions, which are used to model 3-dimensional directional data, has not been fully developed in the literature in terms of their harmonics. This paper, in what we believe is the first such attempt, looks at the probability distributions on a unit sphere through the perspective of spherical harmonics, analogous to the Fourier analysis for distributions on a unit circle. Harmonic representations of many currently used spherical models are presented and discussed. A very general family of spherical distributions is then introduced, special cases of which yield many known spherical models. Through the prism of harmonic analysis, one can look at the mean direction, dispersion, and various forms of symmetry for these models in a generic setting. Aspects of distribution-free inference, such as estimation and large-sample tests for these symmetries, are provided. The paper concludes with a real-data example analyzing the longitudinal sunspot activity.
Submitted 24 February, 2018; v1 submitted 30 September, 2017;
originally announced October 2017.
-
Change-point analysis in frequency domain for chronological data
Authors:
Gyorgy H. Terdik,
Stergios B. Fotopoulos,
Venkata K. Jandhyala
Abstract:
The purpose of this study is to provide a new methodology of how one can consistently estimate a change-point in time series data. In contrast with previous studies, the suggested methodology employs only the empirical spectral density and its first moment. This is accomplished when both the means and variances before and after the unidentified time point are unknown. Then, the well-known Gauss-Newton algorithm is applied to estimate and provide asymptotic results for the parameters involved. Simulations carried out under different distributions, sizes and unknown time points confirm the validity and accuracy of the methodology. The real-world example considered in the paper illustrates the robustness of the methodology in the presence of even extreme outliers.
Submitted 19 November, 2016;
originally announced November 2016.
-
On the frequency variogram and on frequency domain methods for the analysis of spatio-temporal data
Authors:
T. Subba Rao,
Gy. Terdik
Abstract:
The covariance function and the variogram play very important roles in modelling and in prediction of spatial and spatio-temporal data. The assumption of second order stationarity, in space and time, is often made in the analysis of spatial and spatio-temporal data. The assumption of stationarity is frequently considered to be very restrictive, and therefore a weaker assumption, that the data are intrinsically stationary both in space and time, is often made and used, mainly by geostatisticians and other environmental scientists. In this paper we consider the data to be intrinsically stationary. Because of the inclusion of the time dimension, the estimation and derivation of the sampling properties of various estimators related to spatio-temporal data become complicated. Our objective is to present an alternative way of modelling the data, based on frequency domain methods. Here we take the Discrete Fourier Transforms (DFT) of the (intrinsically stationary) time series observed at several locations as our data, and then consider the estimation of the parameters of the spatio-temporal covariance function, estimation of the frequency variogram, tests of independence, etc. In deriving many of the results in this paper, we use the well-known property that the Discrete Fourier Transforms of a stationary time series evaluated at distinct Fourier frequencies are asymptotically independent and distributed as complex normal.
Submitted 19 October, 2016;
originally announced October 2016.
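The frequency domain approach in the abstract above treats the DFT values at the Fourier frequencies as the data. A minimal sketch of that transform for one location follows. The function name `dft_at_fourier_frequencies` is hypothetical, and the normalisation by 1/sqrt(2πn) is an assumption (a common convention in spectral analysis), not something stated in the listing.

```python
import cmath
import math


def dft_at_fourier_frequencies(series):
    """DFT of one time series at the Fourier frequencies 2*pi*k/n, k = 0..n-1.

    Assumption: the 1/sqrt(2*pi*n) scaling is one common convention;
    the paper may normalise differently.
    """
    n = len(series)
    scale = 1.0 / math.sqrt(2.0 * math.pi * n)
    coefficients = []
    for k in range(n):
        omega = 2.0 * math.pi * k / n  # k-th Fourier frequency
        total = sum(x * cmath.exp(-1j * omega * t) for t, x in enumerate(series))
        coefficients.append(scale * total)
    return coefficients
```

Applying the function to the series recorded at each location gives one complex vector per location. Under this scaling, the squared modulus of each coefficient is the periodogram ordinate at that frequency.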
-
A space-time covariance function for spatio-temporal random processes and spatio-temporal prediction (kriging)
Authors:
T. Subba Rao,
Gy. Terdik
Abstract:
We consider a stationary spatio-temporal random process and assume that we have a sample. By defining a sequence of discrete Fourier transforms at canonical frequencies at each location, and using these complex-valued random variables as the observed sample, we obtain expressions for the spatio-temporal covariance functions and the spectral density functions of the spatio-temporal random processes. These spectra correspond to a non-separable class of random processes. The spatio-temporal covariance functions obtained here are functions of the spatial distance and the temporal frequency and are similar to the Matern class. They are expressed in terms of modified Bessel functions of the second kind, and the parameters are in terms of the second order spectral density functions of the random process and the spatial distances. We consider the estimation of the parameters of the covariance function and also briefly mention their asymptotic properties. The estimation of the entire data at a known location, and also the estimation of a value given the above sample, are also considered. The predictors are obtained using the vectors of Discrete Fourier Transforms. We also describe a statistical test for the independence of the m spatial time series (testing for spatial independence) using the Finite Fourier Transforms; it is based on the likelihood ratio test for complex-valued random variables. The methods are illustrated with real data.
Keywords: Discrete Fourier Transforms, Covariance functions, Spectral density functions, Space-Time Processes, Prediction (kriging), Laplacian operators, Frequency Variogram, Tests for independence, Whittle likelihood.
Submitted 29 December, 2015; v1 submitted 8 November, 2013;
originally announced November 2013.
-
When the bispectrum is real-valued
Authors:
E. Iglói,
Gy. Terdik
Abstract:
Let {X(t)} be a stationary time series with a.e. positive spectrum. Two consequences of the bispectrum of {X(t)} being real-valued but nonzero are: 1) if {X(t)} is also linear, then it is reversible; 2) {X(t)} cannot be causal linear. A corollary of the first statement: if {X(t)} is linear, and the skewness of X(0) is nonzero, then third order reversibility implies reversibility. In this paper the notion of bispectrum is of a broader scope.
Submitted 17 July, 2013;
originally announced July 2013.
-
Trispectrum and higher order spectra for non-Gaussian homogenous and isotropic field on the 2D-plane
Authors:
György Terdik
Abstract:
In this paper we study the non-Gaussian homogenous and isotropic field on the plane in frequency domain. The trispectrum and higher order spectra of such a field are described in terms of Bessel functions. Poisson formulae are given for the spectrum and for the bispectrum. Some particular integrals of Bessel functions are considered as well.
Submitted 8 April, 2016; v1 submitted 17 July, 2013;
originally announced July 2013.
-
Bispectrum for non-Gaussian homogenous and isotropic field on the plane
Authors:
György Terdik
Abstract:
The object of this paper is to characterize the third order moments (cumulants) and bispectra of a homogeneous isotropic field defined on a plane. We establish a one to one correspondence between the third order cumulants and the bispectra of such a process in terms of Bessel functions.
Submitted 2 November, 2013; v1 submitted 12 April, 2013;
originally announced April 2013.
-
Angular Spectra for non-Gaussian Isotropic Fields
Authors:
Gyorgy Terdik
Abstract:
Cosmic Microwave Background (CMB) anisotropies are a subject of intensive research in several fields of science. In this paper we start a systematic development of basic statistical notions and theory motivated by the application to the CMB. The main result of this paper is the necessary and sufficient condition for isotropy of a non-Gaussian field in terms of spectra. Clear formulae for bi-, tri- and polyspectra and bi-, tri-, and higher order covariances are also given.
Keywords: Bispectrum, Trispectrum, Angular poly-Spectra, Cosmic microwave background radiation, Gaussianity, Spherical random fields
Submitted 27 December, 2013; v1 submitted 17 February, 2013;
originally announced February 2013.