doi 10.1109/TPAMI.2023.3346212

Dimension Reduction with Prior Information for Knowledge Discovery

Abstract: This paper addresses the problem of mapping high-dimensional data to a low-dimensional space, in the presence of other known features. This problem is ubiquitous in science and engineering as there are often controllable/measurable features in most applications. To solve this problem, this paper proposes a broad class of methods, which is referred to as conditional multidimensional scaling (MDS). An algorithm for optimizing the objective function of conditional MDS is also developed. The convergence of this algorithm is proven under mild assumptions. Conditional MDS is illustrated with kinship terms, facial expressions, textile fabrics, car-brand perception, and cylinder machining examples. These examples demonstrate the advantages of conditional MDS over conventional dimension reduction in improving the estimation quality of the reduced-dimension space and simplifying visualization and knowledge discovery tasks. Computer codes for this work are available in the open-source cml R package.

Submitted 29 December, 2023; v1 submitted 26 November, 2021; originally announced November 2021.

Comments: Article accepted for publication in IEEE Transactions on Pattern Analysis and Machine Intelligence, 12 pages, 8 figures

arXiv:2012.06916 [pdf, other]

Concept Drift Monitoring and Diagnostics of Supervised Learning Models via Score Vectors

Authors: Kungang Zhang, Anh T. Bui, Daniel W. Apley

Abstract: Supervised learning models are one of the most fundamental classes of models. Viewing supervised learning from a probabilistic perspective, the set of training data to which the model is fitted is usually assumed to follow a stationary distribution. However, this stationarity assumption is often violated in a phenomenon called concept drift, which refers to changes over time in the predictive relationship between covariates $\mathbf{X}$ and a response variable $Y$ and can render trained models suboptimal or obsolete. We develop a comprehensive and computationally efficient framework for detecting, monitoring, and diagnosing concept drift. Specifically, we monitor the Fisher score vector, defined as the gradient of the log-likelihood for the fitted model, using a form of multivariate exponentially weighted moving average, which monitors for general changes in the mean of a random vector. In spite of the substantial performance advantages that we demonstrate over popular error-based methods, a score-based approach has not been previously considered for concept drift monitoring. Advantages of the proposed score-based framework include applicability to any parametric model, more powerful detection of changes as shown in theory and experiments, and inherent diagnostic capabilities for helping to identify the nature of the changes.

Submitted 12 September, 2022; v1 submitted 12 December, 2020; originally announced December 2020.
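
The score-based monitoring idea described in the abstract above can be illustrated with a small sketch. It is not the authors' implementation: it assumes a Gaussian linear regression model with known noise variance, uses the textbook MEWMA statistic with its asymptotic covariance rather than whatever variant the paper adopts, and all function names, the smoothing constant `lam`, and the simulated data are hypothetical choices made for illustration.

```python
import numpy as np

def gaussian_linear_scores(X, y, theta, sigma2=1.0):
    """Per-observation score vectors: gradient of the Gaussian
    log-likelihood with respect to theta (assumed model, known variance)."""
    resid = y - X @ theta
    return X * resid[:, None] / sigma2

def mewma_statistics(scores, mean, cov, lam=0.1):
    """Textbook MEWMA T^2 statistics for a stream of score vectors.
    Uses the asymptotic EWMA covariance lam / (2 - lam) * cov."""
    cov_z_inv = np.linalg.inv(lam / (2.0 - lam) * cov)
    z = np.zeros_like(mean)
    stats = []
    for s in scores:
        z = lam * (s - mean) + (1.0 - lam) * z  # EWMA of centered scores
        stats.append(float(z @ cov_z_inv @ z))
    return np.array(stats)

# Simulated training data (hypothetical) and least-squares fit.
rng = np.random.default_rng(0)
theta_true = np.array([1.0, -2.0, 0.5])
X_train = rng.normal(size=(2000, 3))
y_train = X_train @ theta_true + rng.normal(size=2000)
theta_hat, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)

# In-control mean and covariance of the score vector, from training data.
train_scores = gaussian_linear_scores(X_train, y_train, theta_hat)
mean = train_scores.mean(axis=0)
cov = np.cov(train_scores, rowvar=False)

# New stream: the first 300 points follow the training relationship,
# the last 300 have a shifted first coefficient (simulated concept drift).
X_new = rng.normal(size=(600, 3))
theta_drift = theta_true + np.array([1.0, 0.0, 0.0])
signal = np.concatenate([X_new[:300] @ theta_true, X_new[300:] @ theta_drift])
y_new = signal + rng.normal(size=600)

new_scores = gaussian_linear_scores(X_new, y_new, theta_hat)
stats = mewma_statistics(new_scores, mean, cov)
```

In this sketch the statistic stays small while the predictive relationship matches training and grows after the simulated drift; a control limit for signalling would still have to be chosen, for example from the in-control distribution of the statistic.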

arXiv:1702.02966 [pdf]

doi 10.1080/00401706.2017.1302362

A monitoring and diagnostic approach for stochastic textured surfaces

Authors: Anh Tuan Bui, Daniel W. Apley

Abstract: We develop a supervised-learning-based approach for monitoring and diagnosing texture-related defects in manufactured products characterized by stochastic textured surfaces that satisfy the locality and stationarity properties of Markov random fields. Examples of stochastic textured surface data include images of woven textiles; image or surface metrology data for machined, cast, or formed metal parts; microscopy images of material microstructure samples; etc. To characterize the complex spatial statistical dependencies of in-control samples of the stochastic textured surface, we use rather generic supervised learning methods, which provide an implicit characterization of the joint distribution of the surface texture. We propose two spatial moving statistics, which are computed from residual errors of the fitted supervised learning model, for monitoring and diagnosing local aberrations in the general spatial statistical behavior of newly manufactured stochastic textured surface samples in a statistical process control context. We illustrate the approach using images of textile fabric samples and simulated 2-D stochastic processes, for which the algorithm successfully detects local defects of various natures. Supplemental discussions, results, data and computer codes are available online.

Submitted 21 July, 2017; v1 submitted 9 February, 2017; originally announced February 2017.

Showing 1–3 of 3 results for author: Bui, A T