-
Enhanced Response Envelope via Envelope Regularization
Authors:
Oh-Ran Kwon,
Hui Zou
Abstract:
The response envelope model provides substantial efficiency gains over the standard multivariate linear regression by identifying the material part of the response to the model and by excluding the immaterial part. In this paper, we propose the enhanced response envelope by incorporating a novel envelope regularization term based on a nonconvex manifold formulation. It is shown that the enhanced response envelope can yield better prediction risk than the original envelope estimator. The enhanced response envelope naturally handles high-dimensional data for which the original response envelope is not serviceable without necessary remedies. In an asymptotic high-dimensional regime where the ratio of the number of predictors over the number of samples converges to a non-zero constant, we characterize the risk function and reveal an interesting double descent phenomenon for the envelope model. A simulation study confirms our main theoretical findings. Simulations and real data applications demonstrate that the enhanced response envelope does have significantly improved prediction performance over the original envelope method, especially when the number of predictors is close to or moderately larger than the number of samples. Proofs and additional simulation results are shown in the supplementary file to this paper.
Submitted 30 June, 2024; v1 submitted 11 January, 2023;
originally announced January 2023.
-
Supporting Analysis of Dimensionality Reduction Results with Contrastive Learning
Authors:
Takanori Fujiwara,
Oh-Hyun Kwon,
Kwan-Liu Ma
Abstract:
Dimensionality reduction (DR) is frequently used for analyzing and visualizing high-dimensional data as it provides a good first glance of the data. However, to interpret the DR result for gaining useful insights from the data, it would take additional analysis effort such as identifying clusters and understanding their characteristics. While there are many automatic methods (e.g., density-based clustering methods) to identify clusters, effective methods for understanding a cluster's characteristics are still lacking. A cluster can be mostly characterized by its distribution of feature values. Reviewing the original feature values is not a straightforward task when the number of features is large. To address this challenge, we present a visual analytics method that effectively highlights the essential features of a cluster in a DR result. To extract the essential features, we introduce an enhanced usage of contrastive principal component analysis (cPCA). Our method, called ccPCA (contrasting clusters in PCA), can calculate each feature's relative contribution to the contrast between one cluster and other clusters. With ccPCA, we have created an interactive system including a scalable visualization of clusters' feature contributions. We demonstrate the effectiveness of our method and system with case studies using several publicly available datasets.
Submitted 14 October, 2019; v1 submitted 9 May, 2019;
originally announced May 2019.
-
A Deep Generative Model for Graph Layout
Authors:
Oh-Hyun Kwon,
Kwan-Liu Ma
Abstract:
Different layouts can characterize different aspects of the same graph. Finding a "good" layout of a graph is thus an important task for graph visualization. In practice, users often visualize a graph in multiple layouts by using different methods and varying parameter settings until they find a layout that best suits the purpose of the visualization. However, this trial-and-error process is often haphazard and time-consuming. To provide users with an intuitive way to navigate the layout design space, we present a technique to systematically visualize a graph in diverse layouts using deep generative models. We design an encoder-decoder architecture to learn a model from a collection of example layouts, where the encoder represents training examples in a latent space and the decoder produces layouts from the latent space. In particular, we train the model to construct a two-dimensional latent space for users to easily explore and generate various layouts. We demonstrate our approach through quantitative and qualitative evaluations of the generated layouts. The results of our evaluations show that our model is capable of learning and generalizing abstract concepts of graph layouts, not just memorizing the training examples. In summary, this paper presents a fundamentally new approach to graph visualization where a machine learning model learns to visualize a graph from examples without manually-defined heuristics.
Submitted 15 October, 2019; v1 submitted 27 April, 2019;
originally announced April 2019.
-
What Would a Graph Look Like in This Layout? A Machine Learning Approach to Large Graph Visualization
Authors:
Oh-Hyun Kwon,
Tarik Crnovrsanin,
Kwan-Liu Ma
Abstract:
Using different methods for laying out a graph can lead to very different visual appearances, with which the viewer perceives different information. Selecting a "good" layout method is thus important for visualizing a graph. The selection can be highly subjective and dependent on the given task. A common approach to selecting a good layout is to use aesthetic criteria and visual inspection. However, fully calculating various layouts and their associated aesthetic metrics is computationally expensive. In this paper, we present a machine learning approach to large graph visualization based on computing the topological similarity of graphs using graph kernels. For a given graph, our approach can show what the graph would look like in different layouts and estimate their corresponding aesthetic metrics. An important contribution of our work is the development of a new framework to design graph kernels. Our experimental study shows that our estimation calculation is considerably faster than computing the actual layouts and their aesthetic metrics. Also, our graph kernels outperform the state-of-the-art ones in both time and accuracy. In addition, we conducted a user study to demonstrate that the topological similarity computed with our graph kernel matches perceptual similarity assessed by human users.
Submitted 11 October, 2017;
originally announced October 2017.