-
Gaussian Measures Conditioned on Nonlinear Observations: Consistency, MAP Estimators, and Simulation
Authors:
Yifan Chen,
Bamdad Hosseini,
Houman Owhadi,
Andrew M Stuart
Abstract:
The article presents a systematic study of the problem of conditioning a Gaussian random variable $ξ$ on nonlinear observations of the form $F \circ φ(ξ)$ where $φ: \mathcal{X} \to \mathbb{R}^N$ is a bounded linear operator and $F$ is nonlinear. Such problems arise in the context of Bayesian inference and recent machine learning-inspired PDE solvers. We give a representer theorem for the conditioned random variable $ξ\mid F\circ φ(ξ)$, stating that it decomposes as the sum of an infinite-dimensional Gaussian (which is identified analytically) as well as a finite-dimensional non-Gaussian measure. We also introduce a novel notion of the mode of a conditional measure by taking the limit of the natural relaxation of the problem, to which we can apply the existing notion of maximum a posteriori estimators of posterior measures. Finally, we introduce a variant of the Laplace approximation for the efficient simulation of the aforementioned conditioned Gaussian random variables towards uncertainty quantification.
Submitted 21 May, 2024;
originally announced May 2024.
-
Diffeomorphic Measure Matching with Kernels for Generative Modeling
Authors:
Biraj Pandey,
Bamdad Hosseini,
Pau Batlle,
Houman Owhadi
Abstract:
This article presents a general framework for the transport of probability measures towards minimum divergence generative modeling and sampling using ordinary differential equations (ODEs) and Reproducing Kernel Hilbert Spaces (RKHSs), inspired by ideas from diffeomorphic matching and image registration. A theoretical analysis of the proposed method is presented, giving a priori error bounds in terms of the complexity of the model, the number of samples in the training set, and model misspecification. An extensive suite of numerical experiments further highlights the properties, strengths, and weaknesses of the method and extends its applicability to other tasks, such as conditional simulation and inference.
Submitted 12 February, 2024;
originally announced February 2024.
-
Nonlinear Filtering with Brenier Optimal Transport Maps
Authors:
Mohammad Al-Jarrah,
Niyizhen Jin,
Bamdad Hosseini,
Amirhossein Taghvaei
Abstract:
This paper is concerned with the problem of nonlinear filtering, i.e., computing the conditional distribution of the state of a stochastic dynamical system given a history of noisy partial observations. Conventional sequential importance resampling (SIR) particle filters suffer from fundamental limitations, in scenarios involving degenerate likelihoods or high-dimensional states, due to the weight degeneracy issue. In this paper, we explore an alternative method, which is based on estimating the Brenier optimal transport (OT) map from the current prior distribution of the state to the posterior distribution at the next time step. Unlike SIR particle filters, the OT formulation does not require the analytical form of the likelihood. Moreover, it allows us to harness the approximation power of neural networks to model complex and multi-modal distributions and employ stochastic optimization algorithms to enhance scalability. Extensive numerical experiments are presented that compare the OT method to the SIR particle filter and the ensemble Kalman filter, evaluating the performance in terms of sample efficiency, high-dimensional scalability, and the ability to capture complex and multi-modal distributions.
Submitted 2 February, 2024; v1 submitted 20 October, 2023;
originally announced October 2023.
-
Kernel Methods are Competitive for Operator Learning
Authors:
Pau Batlle,
Matthieu Darcy,
Bamdad Hosseini,
Houman Owhadi
Abstract:
We present a general kernel-based framework for learning operators between Banach spaces along with a priori error analysis and comprehensive numerical comparisons with popular neural net (NN) approaches such as Deep Operator Net (DeepONet) [Lu et al.] and Fourier Neural Operator (FNO) [Li et al.]. We consider the setting where the input/output spaces of target operator $\mathcal{G}^\dagger\,:\, \mathcal{U}\to \mathcal{V}$ are reproducing kernel Hilbert spaces (RKHS), the data comes in the form of partial observations $φ(u_i), \varphi(v_i)$ of input/output functions $v_i=\mathcal{G}^\dagger(u_i)$ ($i=1,\ldots,N$), and the measurement operators $φ\,:\, \mathcal{U}\to \mathbb{R}^n$ and $\varphi\,:\, \mathcal{V} \to \mathbb{R}^m$ are linear. Writing $ψ\,:\, \mathbb{R}^n \to \mathcal{U}$ and $χ\,:\, \mathbb{R}^m \to \mathcal{V}$ for the optimal recovery maps associated with $φ$ and $\varphi$, we approximate $\mathcal{G}^\dagger$ with $\bar{\mathcal{G}}=χ\circ \bar{f} \circ φ$ where $\bar{f}$ is an optimal recovery approximation of $f^\dagger:=\varphi \circ \mathcal{G}^\dagger \circ ψ\,:\,\mathbb{R}^n \to \mathbb{R}^m$. We show that, even when using vanilla kernels (e.g., linear or Matérn), our approach is competitive in terms of cost-accuracy trade-off and either matches or beats the performance of NN methods on a majority of benchmarks. Additionally, our framework offers several advantages inherited from kernel methods: simplicity, interpretability, convergence guarantees, a priori error estimates, and Bayesian uncertainty quantification. As such, it can serve as a natural benchmark for operator learning.
Submitted 8 October, 2023; v1 submitted 25 April, 2023;
originally announced April 2023.
-
A Kernel Approach for PDE Discovery and Operator Learning
Authors:
Da Long,
Nicole Mrvaljevic,
Shandian Zhe,
Bamdad Hosseini
Abstract:
This article presents a three-step framework for learning and solving partial differential equations (PDEs) using kernel methods. Given a training set consisting of pairs of noisy PDE solutions and source/boundary terms on a mesh, kernel smoothing is utilized to denoise the data and approximate derivatives of the solution. This information is then used in a kernel regression model to learn the algebraic form of the PDE. The learned PDE is then used within a kernel based solver to approximate the solution of the PDE with a new source/boundary term, thereby constituting an operator learning framework. Numerical experiments compare the method to state-of-the-art algorithms and demonstrate its competitive performance.
Submitted 30 March, 2023; v1 submitted 14 October, 2022;
originally announced October 2022.
-
Bending behavior of additively manufactured lattice structures: numerical characterization and experimental validation
Authors:
Nina Korshunova,
Gianluca Alaimo,
Seyyed Bahram Hosseini,
Massimo Carraturo,
Alessandro Reali,
Jarkko Niiranen,
Ferdinando Auricchio,
Ernst Rank,
Stefan Kollmannsberger
Abstract:
Selective Laser Melting (SLM) technology has undergone significant development in recent years, providing unique flexibility for the fabrication of complex metamaterials such as octet-truss lattices. However, the microstructure of the final parts can exhibit significant variations due to the high complexity of the manufacturing process. Consequently, the mechanical behavior of these lattices is strongly dependent on the process-induced defects, raising the importance of incorporating as-manufactured geometries into the computational structural analysis. This, in turn, challenges the traditional mesh-conforming methods, making the computational costs prohibitively large. In the present work, an immersed image-to-analysis framework is applied to efficiently evaluate the bending behavior of AM lattices. To this end, we employ the Finite Cell Method (FCM) to perform a three-dimensional numerical analysis of the three-point bending test of a lattice structure and compare the as-designed to as-manufactured effective properties. Furthermore, we undertake a comprehensive study on the applicability of dimensionally reduced beam models to the prediction of the bending behavior of lattice beams and validate classical and strain gradient beam theories applied in combination with the FCM. The numerical findings suggest that the SLM octet-truss lattices exhibit size effects, thus requiring a flexible framework to incorporate high-order continuum theories.
Submitted 22 January, 2021;
originally announced January 2021.
-
Image-based numerical characterization and experimental validation of tensile behavior of octet-truss lattice structures
Authors:
Nina Korshunova,
Gianluca Alaimo,
Seyyed Bahram Hosseini,
Massimo Carraturo,
Alessandro Reali,
Jarkko Niiranen,
Ferdinando Auricchio,
Ernst Rank,
Stefan Kollmannsberger
Abstract:
The production of lightweight metal lattice structures has received much attention due to the recent developments in additive manufacturing (AM). The design flexibility comes, however, with the complexity of the underlying physics. In fact, metal additive manufacturing introduces process-induced geometrical defects that mainly result in deviations of the effective geometry from the nominal one. This change in the final printed shape is the primary cause of the gap between the as-designed and as-manufactured mechanical behavior of AM products. Thus, the possibility to incorporate the precise manufactured geometries into the computational analysis is crucial for the quality and performance assessment of the final parts. Computed tomography (CT) is an accurate method for the acquisition of the manufactured shape. However, it is often not feasible to integrate the CT-based geometrical information into the traditional computational analysis due to the complexity of the meshing procedure for such high-resolution geometrical models and the prohibitive numerical costs. In this work, an embedded numerical framework is applied to efficiently simulate and compare the mechanical behavior of as-designed to as-manufactured octet-truss lattice structures. The parts are produced using laser powder bed fusion (LPBF). Employing an immersed boundary method, namely the Finite Cell Method (FCM), we perform direct numerical simulations (DNS) and first-order numerical homogenization analysis of a tensile test for a 3D printed octet-truss structure. Numerical results based on CT scan (as-manufactured geometry) show an excellent agreement with experimental measurements, whereas both DNS and first-order numerical homogenization performed directly on the 3D virtual model (as-designed geometry) of the structure show a significant deviation from experimental data.
Submitted 11 March, 2021; v1 submitted 14 December, 2020;
originally announced December 2020.
-
Posterior Consistency of Semi-Supervised Regression on Graphs
Authors:
Andrea L. Bertozzi,
Bamdad Hosseini,
Hao Li,
Kevin Miller,
Andrew M. Stuart
Abstract:
Graph-based semi-supervised regression (SSR) is the problem of estimating the value of a function on a weighted graph from its values (labels) on a small subset of the vertices. This paper is concerned with the consistency of SSR in the context of classification, in the setting where the labels have small noise and the underlying graph weighting is consistent with well-clustered nodes. We present a Bayesian formulation of SSR in which the weighted graph defines a Gaussian prior, using a graph Laplacian, and the labeled data defines a likelihood. We analyze the rate of contraction of the posterior measure around the ground truth in terms of parameters that quantify the small label error and inherent clustering in the graph. We obtain bounds on the rates of contraction and illustrate their sharpness through numerical experiments. The analysis also gives insight into the choice of hyperparameters that enter the definition of the prior.
Submitted 24 March, 2021; v1 submitted 24 July, 2020;
originally announced July 2020.
-
Conditional Sampling with Monotone GANs: from Generative Models to Likelihood-Free Inference
Authors:
Ricardo Baptista,
Bamdad Hosseini,
Nikola B. Kovachki,
Youssef Marzouk
Abstract:
We present a novel framework for conditional sampling of probability measures, using block triangular transport maps. We develop the theoretical foundations of block triangular transport in a Banach space setting, establishing general conditions under which conditional sampling can be achieved and drawing connections between monotone block triangular maps and optimal transport. Based on this theory, we then introduce a computational approach, called monotone generative adversarial networks (M-GANs), to learn suitable block triangular maps. Our algorithm uses only samples from the underlying joint probability measure and is hence likelihood-free. Numerical experiments with M-GAN demonstrate accurate sampling of conditional measures in synthetic examples, Bayesian inverse problems involving ordinary and partial differential equations, and probabilistic image in-painting.
Submitted 5 June, 2023; v1 submitted 11 June, 2020;
originally announced June 2020.
-
Model Reduction and Neural Networks for Parametric PDEs
Authors:
Kaushik Bhattacharya,
Bamdad Hosseini,
Nikola B. Kovachki,
Andrew M. Stuart
Abstract:
We develop a general framework for data-driven approximation of input-output maps between infinite-dimensional spaces. The proposed approach is motivated by the recent successes of neural networks and deep learning, in combination with ideas from model reduction. This combination results in a neural network approximation which, in principle, is defined on infinite-dimensional spaces and, in practice, is robust to the dimension of finite-dimensional approximations of these spaces required for computation. For a class of input-output maps, and suitably chosen probability measures on the inputs, we prove convergence of the proposed approximation methodology. We also include numerical experiments which demonstrate the effectiveness of the method, showing convergence and robustness of the approximation scheme with respect to the size of the discretization, and compare it with existing algorithms from the literature; our examples include the mapping from coefficient to solution in a divergence form elliptic partial differential equation (PDE) problem, and the solution operator for viscous Burgers' equation.
Submitted 17 June, 2021; v1 submitted 6 May, 2020;
originally announced May 2020.
-
Deep-Aligned Convolutional Neural Network for Skeleton-based Action Recognition and Segmentation
Authors:
Babak Hosseini,
Romain Montagne,
Barbara Hammer
Abstract:
Convolutional neural networks (CNNs) are deep learning frameworks which are well-known for their notable performance in classification tasks. Hence, many skeleton-based action recognition and segmentation (SBARS) algorithms benefit from them in their designs. However, a shortcoming of such applications is the general lack of spatial relationships between the input features in such data types. Besides, non-uniform temporal scaling is a common issue in skeleton-based data streams, which leads to different input sizes even within one specific action category. In this work, we propose a novel deep-aligned convolutional neural network (DACNN) to tackle the above challenges for the particular problem of SBARS. Our network is designed by introducing a new type of filter in the context of CNNs that is trained based on its alignment to the local subsequences in the inputs. These filters result in efficient predictions as well as learning interpretable patterns in the data. We empirically evaluate our framework on real-world benchmarks, showing that the proposed DACNN algorithm obtains competitive performance compared to the state-of-the-art while benefiting from a less complicated yet more interpretable model.
Submitted 12 November, 2019;
originally announced November 2019.
-
Interpretable Multiple-Kernel Prototype Learning for Discriminative Representation and Feature Selection
Authors:
Babak Hosseini,
Barbara Hammer
Abstract:
Prototype-based methods are of particular interest to domain specialists and practitioners as they summarize a dataset by a small set of representatives. Therefore, in a classification setting, interpretability of the prototypes is as significant as the prediction accuracy of the algorithm. Nevertheless, the state-of-the-art methods make inefficient trade-offs between these concerns by sacrificing one in favor of the other, especially if the given data has a kernel-based representation. In this paper, we propose a novel interpretable multiple-kernel prototype learning (IMKPL) to construct highly interpretable prototypes in the feature space, which are also efficient for the discriminative representation of the data. Our method focuses on the local discrimination of the classes in the feature space and shaping the prototypes based on condensed class-homogeneous neighborhoods of data. Besides, IMKPL learns a combined embedding in the feature space in which the above objectives are better fulfilled. When the base kernels coincide with the data dimensions, this embedding results in a discriminative feature selection. We evaluate IMKPL on several benchmarks from different domains which demonstrate its superiority to the related state-of-the-art methods regarding both interpretability and discriminative representation.
Submitted 10 November, 2019;
originally announced November 2019.
-
Interpretable Discriminative Dimensionality Reduction and Feature Selection on the Manifold
Authors:
Babak Hosseini,
Barbara Hammer
Abstract:
Dimensionality reduction (DR) on the manifold includes effective methods which project the data from an implicit relational space onto a vectorial space. Regardless of the achievements in this area, these algorithms suffer from the lack of interpretation of the projection dimensions. Therefore, it is often difficult to explain the physical meaning behind the embedding dimensions. In this research, we propose the interpretable kernel DR algorithm (I-KDR) as a new algorithm which maps the data from the feature space to a lower dimensional space where the classes are more condensed with less overlapping. Besides, the algorithm creates the dimensions upon local contributions of the data samples, which makes it easier to interpret them by class labels. Additionally, we efficiently fuse DR with the feature selection task to select the features of the original space most relevant to the discriminative objective. Based on the empirical evidence, I-KDR provides better interpretations for embedding dimensions as well as higher discriminative performance in the embedded space compared to the state-of-the-art and popular DR algorithms.
Submitted 19 September, 2019;
originally announced September 2019.
-
Consistency of semi-supervised learning algorithms on graphs: Probit and one-hot methods
Authors:
Franca Hoffmann,
Bamdad Hosseini,
Zhi Ren,
Andrew M. Stuart
Abstract:
Graph-based semi-supervised learning is the problem of propagating labels from a small number of labelled data points to a larger set of unlabelled data. This paper is concerned with the consistency of optimization-based techniques for such problems, in the limit where the labels have small noise and the underlying unlabelled data is well clustered. We study graph-based probit for binary classification, and a natural generalization of this method to multi-class classification using one-hot encoding. The resulting objective function to be optimized comprises the sum of a quadratic form defined through a rational function of the graph Laplacian, involving only the unlabelled data, and a fidelity term involving only the labelled data. The consistency analysis sheds light on the choice of the rational function defining the optimization.
Submitted 9 March, 2020; v1 submitted 18 June, 2019;
originally announced June 2019.
-
Non-Negative Local Sparse Coding for Subspace Clustering
Authors:
Babak Hosseini,
Barbara Hammer
Abstract:
Subspace sparse coding (SSC) algorithms have proven to be beneficial to clustering problems. They provide an alternative data representation in which the underlying structure of the clusters can be better captured. However, most of the research in this area is mainly focused on enhancing the sparse coding part of the problem. In contrast, we introduce a novel objective term in our proposed SSC framework which focuses on the separability of data points in the coding space. We also provide mathematical insights into how this local-separability term improves the clustering result of the SSC framework. Our proposed non-linear local SSC algorithm (NLSSC) also benefits from the efficient choice of its sparsity terms and constraints. The NLSSC algorithm is also formulated in the kernel-based framework (NLKSSC) which can represent the nonlinear structure of data. In addition, we address the possibility of having redundancies in sparse coding results and its negative effect on graph-based clustering problems. We introduce the link-restore post-processing step to improve the representation graph of non-negative SSC algorithms such as ours. Empirical evaluations on well-known clustering benchmarks show that our proposed NLSSC framework results in better clusterings compared to the state-of-the-art baselines and demonstrate the effectiveness of the link-restore post-processing in improving the clustering accuracy via correcting the broken links of the representation graph.
Submitted 12 March, 2019;
originally announced March 2019.
-
Confident Kernel Sparse Coding and Dictionary Learning
Authors:
Babak Hosseini,
Barbara Hammer
Abstract:
In recent years, kernel-based sparse coding (K-SRC) has received particular attention due to its efficient representation of nonlinear data structures in the feature space. Nevertheless, the existing K-SRC methods suffer from the lack of consistency between their training and test optimization frameworks. In this work, we propose a novel confident K-SRC and dictionary learning algorithm (CKSC) which focuses on the discriminative reconstruction of the data based on its representation in the kernel space. CKSC focuses on reconstructing each data sample via weighted contributions which are confident in its corresponding class of data. We employ novel discriminative terms to apply this scheme to both training and test frameworks in our algorithm. This specific design increases the consistency of these optimization frameworks and improves the discriminative performance in the recall phase. In addition, CKSC directly employs the supervised information in its dictionary learning framework to enhance the discriminative structure of the dictionary. For empirical evaluations, we implement our CKSC algorithm on multivariate time-series benchmarks such as DynTex++ and UTKinect. Our claims regarding the superior performance of the proposed algorithm are justified by comparing its classification results to the state-of-the-art K-SRC algorithms.
Submitted 12 March, 2019;
originally announced March 2019.
-
Non-Negative Kernel Sparse Coding for the Classification of Motion Data
Authors:
Babak Hosseini,
Felix Hülsmann,
Mario Botsch,
Barbara Hammer
Abstract:
We are interested in the decomposition of motion data into a sparse linear combination of base functions which enable efficient data processing. We combine two prominent frameworks: dynamic time warping (DTW), which offers particularly successful pairwise motion data comparison, and sparse coding (SC), which enables an automatic decomposition of vectorial data into a sparse linear combination of base vectors. We enhance SC as follows: an efficient kernelization which extends its application domain to general similarity data such as offered by DTW, and its restriction to non-negative linear representations of signals and base vectors in order to guarantee a meaningful dictionary. Empirical evaluations on motion capture benchmarks show the effectiveness of our framework regarding interpretation and discrimination concerns.
Submitted 12 March, 2019; v1 submitted 9 March, 2019;
originally announced March 2019.
-
Large-Margin Multiple Kernel Learning for Discriminative Features Selection and Representation Learning
Authors:
Babak Hosseini,
Barbara Hammer
Abstract:
Multiple kernel learning (MKL) algorithms combine different base kernels to obtain a more efficient representation in the feature space. Focusing on discriminative tasks, MKL has been used successfully for feature selection and finding the significant modalities of the data. In such applications, each base kernel represents one dimension of the data or is derived from one specific descriptor. Therefore, MKL finds an optimal weighting scheme for the given kernels to increase the classification accuracy. Nevertheless, the majority of the works in this area focus on only binary classification problems or aim for linear separation of the classes in the kernel space, which are not realistic assumptions for many real-world problems. In this paper, we propose a novel multi-class MKL framework which improves the state-of-the-art by enhancing the local separation of the classes in the feature space. Besides, by using a sparsity term, our large-margin multiple kernel algorithm (LMMK) performs discriminative feature selection by aiming to employ a small subset of the base kernels. Based on our empirical evaluations on different real-world datasets, LMMK provides a competitive classification accuracy compared with the state-of-the-art algorithms in MKL. Additionally, it learns a sparse set of non-zero kernel weights which leads to a more interpretable feature selection and representation learning.
Submitted 12 March, 2019; v1 submitted 8 March, 2019;
originally announced March 2019.
-
Multiple-Kernel Dictionary Learning for Reconstruction and Clustering of Unseen Multivariate Time-series
Authors:
Babak Hosseini,
Barbara Hammer
Abstract:
There exist many approaches for description and recognition of unseen classes in datasets. Nevertheless, it becomes a challenging problem when we deal with multivariate time-series (MTS) (e.g., motion data), where we cannot apply the vectorial algorithms directly to the inputs. In this work, we propose a novel multiple-kernel dictionary learning (MKD) framework which learns semantic attributes based on specific combinations of MTS dimensions in the feature space. Hence, MKD can fully or partially reconstruct the unseen classes based on the training data (seen classes). Furthermore, we obtain sparse encodings for unseen classes based on the learned MKD attributes, upon which we propose a simple but effective incremental clustering algorithm to categorize the unseen MTS classes in an unsupervised way. According to the empirical evaluation of our MKD framework on real benchmarks, it provides an interpretable reconstruction of unseen MTS data as well as high performance in their online clustering.
Submitted 12 March, 2019; v1 submitted 5 March, 2019;
originally announced March 2019.
-
Feasibility Based Large Margin Nearest Neighbor Metric Learning
Authors:
Babak Hosseini,
Barbara Hammer
Abstract:
Large margin nearest neighbor (LMNN) is a metric learner which optimizes the performance of the popular $k$NN classifier. However, its resulting metric relies on pre-selected target neighbors. In this paper, we address the feasibility of LMNN's optimization constraints regarding these target points, and introduce a mathematical measure to evaluate the size of the feasible region of the optimization problem. We enhance the optimization framework of LMNN by a weighting scheme which prefers data triplets which yield a larger feasible region. This increases the chances to obtain a good metric as the solution of LMNN's problem. We evaluate the performance of the resulting feasibility-based LMNN algorithm using synthetic and real datasets. The empirical results show an improved accuracy for different types of datasets in comparison to regular LMNN.
Submitted 2 May, 2018; v1 submitted 18 October, 2016;
originally announced October 2016.
-
Efficient Metric Learning for the Analysis of Motion Data
Authors:
Babak Hosseini,
Barbara Hammer
Abstract:
We investigate metric learning in the context of dynamic time warping (DTW), by far the most popular dissimilarity measure used for the comparison and analysis of motion capture data. While metric learning enables a problem-adapted representation of data, the majority of methods have been proposed for vectorial data only. In this contribution, we extend the popular principle offered by the large margin nearest neighbors learner (LMNN) to DTW by treating the resulting component-wise dissimilarity values as features. We demonstrate that this principle greatly enhances the classification accuracy in several benchmarks. Further, we show that recent auxiliary concepts such as metric regularization can be transferred from the vectorial case to component-wise DTW in a similar way. We illustrate that metric regularization constitutes a crucial prerequisite for the interpretation of the resulting relevance profiles.
Submitted 12 March, 2019; v1 submitted 17 October, 2016;
originally announced October 2016.