Showing 1–13 of 13 results for author: Moon, K R

  1. arXiv:2406.04421  [pdf, other]

    cs.LG stat.ML

    Enhancing Supervised Visualization through Autoencoder and Random Forest Proximities for Out-of-Sample Extension

    Authors: Shuang Ni, Adrien Aumon, Guy Wolf, Kevin R. Moon, Jake S. Rhodes

    Abstract: The value of supervised dimensionality reduction lies in its ability to uncover meaningful connections between data features and labels. Common dimensionality reduction methods embed a set of fixed, latent points, but are not capable of generalizing to an unseen test set. In this paper, we provide an out-of-sample extension method for the random forest-based supervised dimensionality reduction met…

    Submitted 6 June, 2024; originally announced June 2024.

    Comments: 7 pages, 3 figures

  2. arXiv:2406.03619  [pdf, other]

    cs.LG stat.ML

    Symmetry Discovery Beyond Affine Transformations

    Authors: Ben Shaw, Abram Magner, Kevin R. Moon

    Abstract: Symmetry detection has been shown to improve various machine learning tasks. In the context of continuous symmetry detection, current state-of-the-art experiments are limited to the detection of affine transformations. Under the manifold assumption, we outline a framework for discovering continuous symmetry in data beyond the affine transformation group. We also provide a similar framework for dis…

    Submitted 5 June, 2024; originally announced June 2024.

  3. arXiv:2406.03396  [pdf, other]

    cs.LG math.FA stat.ML

    Noisy Data Visualization using Functional Data Analysis

    Authors: Haozhe Chen, Andres Felipe Duque Correa, Guy Wolf, Kevin R. Moon

    Abstract: Data visualization via dimensionality reduction is an important tool in exploratory data analysis. However, when the data are noisy, many existing methods fail to capture the underlying structure of the data. The method called Empirical Intrinsic Geometry (EIG) was previously proposed for performing dimensionality reduction on high dimensional dynamical processes while theoretically eliminating al…

    Submitted 5 June, 2024; originally announced June 2024.

  4. arXiv:2210.12774  [pdf, other]

    stat.ML cs.LG

    Manifold Alignment with Label Information

    Authors: Andres F. Duque, Myriam Lizotte, Guy Wolf, Kevin R. Moon

    Abstract: Multi-domain data is becoming increasingly common and presents both challenges and opportunities in the data science community. The integration of distinct data-views can be used for exploratory data analysis, and benefit downstream analysis including machine learning related tasks. With this in mind, we present a novel manifold alignment method called MALI (Manifold alignment with label informati…

    Submitted 30 October, 2022; v1 submitted 23 October, 2022; originally announced October 2022.

  5. arXiv:2206.07305  [pdf, other]

    stat.ML cs.LG

    Diffusion Transport Alignment

    Authors: Andres F. Duque, Guy Wolf, Kevin R. Moon

    Abstract: The integration of multimodal data presents a challenge in cases when the study of a given phenomenon by different instruments or conditions generates distinct but related domains. Many existing data integration methods assume a known one-to-one correspondence between domains of the entire dataset, which may be unrealistic. Furthermore, existing manifold alignment methods are not suited for cases w…

    Submitted 15 June, 2022; originally announced June 2022.

  6. arXiv:2201.12682  [pdf, other]

    stat.ML cs.LG stat.AP stat.ME

    Geometry- and Accuracy-Preserving Random Forest Proximities

    Authors: Jake S. Rhodes, Adele Cutler, Kevin R. Moon

    Abstract: Random forests are considered one of the best out-of-the-box classification and regression algorithms due to their high level of predictive performance with relatively little tuning. Pairwise proximities can be computed from a trained random forest and measure the similarity between data points relative to the supervised task. Random forest proximities have been used in many applications including…

    Submitted 28 February, 2023; v1 submitted 29 January, 2022; originally announced January 2022.

  7. Extendable and invertible manifold learning with geometry regularized autoencoders

    Authors: Andrés F. Duque, Sacha Morin, Guy Wolf, Kevin R. Moon

    Abstract: A fundamental task in data exploration is to extract simplified low dimensional representations that capture intrinsic geometry in data, especially for faithfully visualizing data in two or three dimensions. Common approaches to this task use kernel methods for manifold learning. However, these methods typically only provide an embedding of fixed input data and cannot extend to new data points. Au…

    Submitted 22 November, 2020; v1 submitted 14 July, 2020; originally announced July 2020.

    Comments: 10 pages, 6 figures

    Journal ref: IEEE International Conference on Big Data, pp. 5027-5036, Dec. 2020

  8. arXiv:2006.08701  [pdf, other]

    stat.ML cs.HC cs.LG stat.AP

    Supervised Visualization for Data Exploration

    Authors: Jake S. Rhodes, Adele Cutler, Guy Wolf, Kevin R. Moon

    Abstract: Dimensionality reduction is often used as an initial step in data exploration, either as preprocessing for classification or regression or for visualization. Most dimensionality reduction techniques to date are unsupervised; they do not take class labels into account (e.g., PCA, MDS, t-SNE, Isomap). Such methods require large amounts of data and are often sensitive to noise that may obfuscate impo…

    Submitted 15 June, 2020; originally announced June 2020.

    Comments: 21 pages, 9 figures

  9. arXiv:1906.10725  [pdf, ps, other]

    stat.ML cs.LG eess.SP

    Visualizing High Dimensional Dynamical Processes

    Authors: Andrés F. Duque, Guy Wolf, Kevin R. Moon

    Abstract: Manifold learning techniques for dynamical systems and time series have shown their utility for a broad spectrum of applications in recent years. While these methods are effective at learning a low-dimensional representation, they are often insufficient for visualizing the global and local structure of the data. In this paper, we present DIG (Dynamical Information Geometry), a visualization method…

    Submitted 25 June, 2019; originally announced June 2019.

    Comments: 7 pages, 3 figures

    Journal ref: IEEE International Workshop on Machine Learning for Signal Processing, Oct. 2019

  10. arXiv:1702.05222  [pdf, other]

    cs.IT cs.AI stat.ML

    Direct Estimation of Information Divergence Using Nearest Neighbor Ratios

    Authors: Morteza Noshad, Kevin R. Moon, Salimeh Yasaei Sekeh, Alfred O. Hero III

    Abstract: We propose a direct estimation method for Rényi and f-divergence measures based on a new graph theoretical interpretation. Suppose that we are given two sample sets $X$ and $Y$, respectively with $N$ and $M$ samples, where $η:=M/N$ is a constant value. Considering the $k$-nearest neighbor ($k$-NN) graph of $Y$ in the joint data set $(X,Y)$, we show that the average powered ratio of the number of…

    Submitted 20 November, 2017; v1 submitted 16 February, 2017; originally announced February 2017.

    Comments: 2017 IEEE International Symposium on Information Theory (ISIT)

    Journal ref: In Information Theory (ISIT), 2017 IEEE International Symposium on (pp. 903-907). IEEE

  11. arXiv:1609.03912  [pdf, ps, other]

    cs.IT cs.LG stat.ML

    Information Theoretic Structure Learning with Confidence

    Authors: Kevin R. Moon, Morteza Noshad, Salimeh Yasaei Sekeh, Alfred O. Hero III

    Abstract: Information theoretic measures (e.g. the Kullback-Leibler divergence and Shannon mutual information) have been used for exploring possibly nonlinear multivariate dependencies in high dimension. If these dependencies are assumed to follow a Markov factor graph model, this exploration process is called structure discovery. For discrete-valued samples, estimates of the information divergence over the…

    Submitted 13 September, 2016; originally announced September 2016.

    Comments: 10 pages, 3 figures

    Journal ref: IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 6095-6099, Mar. 2017

  12. arXiv:1510.03507  [pdf, ps, other]

    q-bio.NC cs.LG stat.ML

    The intrinsic value of HFO features as a biomarker of epileptic activity

    Authors: Stephen V. Gliske, Kevin R. Moon, William C. Stacey, Alfred O. Hero III

    Abstract: High frequency oscillations (HFOs) are a promising biomarker of epileptic brain tissue and activity. HFOs additionally serve as a prototypical example of challenges in the analysis of discrete events in high-temporal resolution, intracranial EEG data. Two primary challenges are 1) dimensionality reduction, and 2) assessing feasibility of classification. Dimensionality reduction assumes that the da…

    Submitted 12 October, 2015; originally announced October 2015.

    Comments: 5 pages, 5 figures

    Journal ref: IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 6290-6294, Mar. 2016

  13. arXiv:1411.2045  [pdf, other]

    cs.IT stat.ML

    Multivariate f-Divergence Estimation With Confidence

    Authors: Kevin R. Moon, Alfred O. Hero III

    Abstract: The problem of f-divergence estimation is important in the fields of machine learning, information theory, and statistics. While several nonparametric divergence estimators exist, relatively few have known convergence properties. In particular, even for those estimators whose MSE convergence rates are known, the asymptotic distributions are unknown. We establish the asymptotic normality of a recen…

    Submitted 7 November, 2014; originally announced November 2014.

    Comments: 20 pages, 1 figure. Accepted to NIPS 2014 (supplementary material is included in the appendices)

    Journal ref: K.R. Moon and A.O. Hero III, "Multivariate f-Divergence Estimation With Confidence," In Advances in Neural Information Processing Systems, pp. 2420-2428, 2014