-
A Computational Design Pipeline to Fabricate Sensing Network Physicalizations
Authors:
S. Sandra Bae,
Takanori Fujiwara,
Anders Ynnerman,
Ellen Yi-Luen Do,
Michael L. Rivera,
Danielle Albers Szafir
Abstract:
Interaction is critical for data analysis and sensemaking. However, designing interactive physicalizations is challenging as it requires cross-disciplinary knowledge in visualization, fabrication, and electronics. Interactive physicalizations are typically produced in an unstructured manner, resulting in unique solutions for a specific dataset, problem, or interaction that cannot be easily extended or adapted to new scenarios or future physicalizations. To mitigate these challenges, we introduce a computational design pipeline to 3D print network physicalizations with integrated sensing capabilities. Networks are ubiquitous, yet their complex geometry also requires significant engineering considerations to provide intuitive, effective interactions for exploration. Using our pipeline, designers can readily produce network physicalizations supporting selection -- the most critical atomic operation for interaction -- by touch through capacitive sensing and computational inference. Our computational design pipeline introduces a new design paradigm by concurrently considering the form and interactivity of a physicalization into one cohesive fabrication workflow. We evaluate our approach using (i) computational evaluations, (ii) three usage scenarios focusing on general visualization tasks, and (iii) expert interviews. The design paradigm introduced by our pipeline can lower barriers to physicalization research, creation, and adoption.
Submitted 12 August, 2023; v1 submitted 9 August, 2023;
originally announced August 2023.
-
Visual Analytics of Multivariate Networks with Representation Learning and Composite Variable Construction
Authors:
Hsiao-Ying Lu,
Takanori Fujiwara,
Ming-Yi Chang,
Yang-chih Fu,
Anders Ynnerman,
Kwan-Liu Ma
Abstract:
Multivariate networks are commonly found in real-world data-driven applications. Uncovering and understanding the relations of interest in multivariate networks is not a trivial task. This paper presents a visual analytics workflow for studying multivariate networks to extract associations between different structural and semantic characteristics of the networks (e.g., what are the combinations of attributes largely relating to the density of a social network?). The workflow consists of a neural-network-based learning phase to classify the data based on the chosen input and output attributes, a dimensionality reduction and optimization phase to produce a simplified set of results for examination, and finally an interpreting phase conducted by the user through an interactive visualization interface. A key part of our design is a composite variable construction step that remodels nonlinear features obtained by neural networks into linear features that are intuitive to interpret. We demonstrate the capabilities of this workflow with multiple case studies on networks derived from social media usage and also evaluate the workflow with qualitative feedback from experts.
Submitted 2 July, 2024; v1 submitted 16 March, 2023;
originally announced March 2023.
-
Visual Analytics of Neuron Vulnerability to Adversarial Attacks on Convolutional Neural Networks
Authors:
Yiran Li,
Junpeng Wang,
Takanori Fujiwara,
Kwan-Liu Ma
Abstract:
Adversarial attacks on a convolutional neural network (CNN) -- injecting human-imperceptible perturbations into an input image -- could fool a high-performance CNN into making incorrect predictions. The success of adversarial attacks raises serious concerns about the robustness of CNNs, and prevents them from being used in safety-critical applications, such as medical diagnosis and autonomous driving. Our work introduces a visual analytics approach to understanding adversarial attacks by answering two questions: (1) which neurons are more vulnerable to attacks and (2) which image features do these vulnerable neurons capture during the prediction? For the first question, we introduce multiple perturbation-based measures to break down the attacking magnitude into individual CNN neurons and rank the neurons by their vulnerability levels. For the second, we identify image features (e.g., cat ears) that highly stimulate a user-selected neuron to augment and validate the neuron's responsibility. Furthermore, we support an interactive exploration of a large number of neurons by aiding with hierarchical clustering based on the neurons' roles in the prediction. To this end, a visual analytics system is designed to incorporate visual reasoning for interpreting adversarial attacks. We validate the effectiveness of our system through multiple case studies as well as feedback from domain experts.
Submitted 5 March, 2023;
originally announced March 2023.
-
Feature Learning for Nonlinear Dimensionality Reduction toward Maximal Extraction of Hidden Patterns
Authors:
Takanori Fujiwara,
Yun-Hsin Kuo,
Anders Ynnerman,
Kwan-Liu Ma
Abstract:
Dimensionality reduction (DR) plays a vital role in the visual analysis of high-dimensional data. One main aim of DR is to reveal hidden patterns that lie on intrinsic low-dimensional manifolds. However, DR often overlooks important patterns when the manifolds are distorted or masked by certain influential data attributes. This paper presents a feature learning framework, FEALM, designed to generate a set of optimized data projections for nonlinear DR in order to capture important patterns in the hidden manifolds. These projections produce maximally different nearest-neighbor graphs so that resultant DR outcomes are significantly different. To achieve such a capability, we design an optimization algorithm as well as introduce a new graph dissimilarity measure, named neighbor-shape dissimilarity. Additionally, we develop interactive visualizations to assist comparison of obtained DR results and interpretation of each DR result. We demonstrate FEALM's effectiveness through experiments and case studies using synthetic and real-world datasets.
Submitted 24 February, 2023; v1 submitted 28 June, 2022;
originally announced June 2022.
-
A Machine-Learning-Aided Visual Analysis Workflow for Investigating Air Pollution Data
Authors:
Yun-Hsin Kuo,
Takanori Fujiwara,
Charles C. -K. Chou,
Chun-houh Chen,
Kwan-Liu Ma
Abstract:
Analyzing air pollution data is challenging as there are various analysis focuses from different aspects: feature (what), space (where), and time (when). As in most geospatial analysis problems, besides high-dimensional features, the temporal and spatial dependencies of air pollution induce the complexity of performing analysis. Machine learning methods, such as dimensionality reduction, can extract and summarize important information of the data to lift the burden of understanding such a complicated environment. In this paper, we present a methodology that utilizes multiple machine learning methods to uniformly explore these aspects. With this methodology, we develop a visual analytic system that supports a flexible analysis workflow, allowing domain experts to freely explore different aspects based on their analysis needs. We demonstrate the capability of our system and analysis workflow supporting a variety of analysis tasks with multiple use cases.
Submitted 10 February, 2022;
originally announced February 2022.
-
A Visual Analytics System for Water Distribution System Optimization
Authors:
Yiran Li,
Erin Musabandesu,
Takanori Fujiwara,
Frank J. Loge,
Kwan-Liu Ma
Abstract:
The optimization of water distribution systems (WDSs) is vital to minimize energy costs required for their operations. A principal approach taken by researchers is identifying an optimal scheme for water pump controls through examining computational simulations of WDSs. However, due to a large number of possible control combinations and the complexity of WDS simulations, it remains non-trivial to identify the best pump controls by reviewing the simulation results. To address this problem, we design a visual analytics system that helps understand relationships between simulation inputs and outputs towards better optimization. Our system incorporates interpretable machine learning as well as multiple linked visualizations to capture essential input-output relationships from complex WDS simulations. We demonstrate our system's effectiveness through a practical case study and evaluate its usability through expert reviews. Our results show that our system can lessen the burden of analysis and assist in determining optimal operating schemes.
Submitted 27 August, 2021;
originally announced August 2021.
-
Interactive Dimensionality Reduction for Comparative Analysis
Authors:
Takanori Fujiwara,
Xinhai Wei,
Jian Zhao,
Kwan-Liu Ma
Abstract:
Finding the similarities and differences between groups of datasets is a fundamental analysis task. For high-dimensional data, dimensionality reduction (DR) methods are often used to find the characteristics of each group. However, existing DR methods provide limited capability and flexibility for such comparative analysis as each method is designed only for a narrow analysis target, such as identifying factors that most differentiate groups. This paper presents an interactive DR framework where we integrate our new DR method, called ULCA (unified linear comparative analysis), with an interactive visual interface. ULCA unifies two DR schemes, discriminant analysis and contrastive learning, to support various comparative analysis tasks. To provide flexibility for comparative analysis, we develop an optimization algorithm that enables analysts to interactively refine ULCA results. Additionally, the interactive visualization interface facilitates interpretation and refinement of the ULCA results. We evaluate ULCA and the optimization algorithm to show their efficiency as well as present multiple case studies using real-world datasets to demonstrate the usefulness of this framework.
Submitted 27 October, 2021; v1 submitted 29 June, 2021;
originally announced June 2021.
-
A Visual Analytics Approach for Hardware System Monitoring with Streaming Functional Data Analysis
Authors:
Fnu Shilpika,
Takanori Fujiwara,
Naohisa Sakamoto,
Jorji Nonaka,
Kwan-Liu Ma
Abstract:
Many real-world applications involve analyzing time-dependent phenomena, which are intrinsically functional, consisting of curves varying over a continuum (e.g., time). When analyzing continuous data, functional data analysis (FDA) provides substantial benefits, such as the ability to study the derivatives and to restrict the ordering of data. However, continuous data inherently has infinite dimensions, and for a long time series, FDA methods often suffer from high computational costs. The analysis problem becomes even more challenging when updating the FDA results for continuously arriving data. In this paper, we present a visual analytics approach for monitoring and reviewing time series data streamed from a hardware system with a focus on identifying outliers by using FDA. To perform FDA while addressing the computational problem, we introduce new incremental and progressive algorithms that promptly generate the magnitude-shape (MS) plot, which conveys both the functional magnitude and shape outlyingness of time series data. In addition, by using an MS plot in conjunction with an FDA version of principal component analysis, we enhance the analyst's ability to investigate the visually-identified outliers. We illustrate the effectiveness of our approach with two use scenarios using real-world datasets. The resulting tool is evaluated by industry experts using real-world streaming datasets.
Submitted 21 February, 2022; v1 submitted 25 November, 2020;
originally announced November 2020.
-
A Predictive Visual Analytics System for Studying Neurodegenerative Disease based on DTI Fiber Tracts
Authors:
Chaoqing Xu,
Tyson Neuroth,
Takanori Fujiwara,
Ronghua Liang,
Kwan-Liu Ma
Abstract:
Diffusion tensor imaging (DTI) has been used to study the effects of neurodegenerative diseases on neural pathways, which may lead to more reliable and early diagnosis of these diseases as well as a better understanding of how they affect the brain. We introduce an intelligent visual analytics system for studying patient groups based on their labeled DTI fiber tract data and corresponding statistics. The system's AI-augmented interface guides the user through an organized and holistic analysis space, including the statistical feature space, the physical space, and the space of patients over different groups. We use a custom machine learning pipeline to help narrow down this large analysis space, and then explore it pragmatically through a range of linked visualizations. We conduct several case studies using real data from the research database of Parkinson's Progression Markers Initiative.
Submitted 13 December, 2021; v1 submitted 13 October, 2020;
originally announced October 2020.
-
A Visual Analytics Framework for Reviewing Multivariate Time-Series Data with Dimensionality Reduction
Authors:
Takanori Fujiwara,
Shilpika,
Naohisa Sakamoto,
Jorji Nonaka,
Keiji Yamamoto,
Kwan-Liu Ma
Abstract:
Data-driven problem solving in many real-world applications involves analysis of time-dependent multivariate data, for which dimensionality reduction (DR) methods are often used to uncover the intrinsic structure and features of the data. However, DR is usually applied to a subset of data that is either single-time-point multivariate or univariate time-series, resulting in the need to manually examine and correlate the DR results out of different data subsets. When the number of dimensions is large either in terms of the number of time points or attributes, this manual task becomes too tedious and infeasible. In this paper, we present MulTiDR, a new DR framework that enables processing of time-dependent multivariate data as a whole to provide a comprehensive overview of the data. With the framework, we employ DR in two steps. When treating the instances, time points, and attributes of the data as a 3D array, the first DR step reduces the three axes of the array to two, and the second DR step visualizes the data in a lower-dimensional space. In addition, by coupling with a contrastive learning method and interactive visualizations, our framework enhances analysts' ability to interpret DR results. We demonstrate the effectiveness of our framework with four case studies using real-world datasets.
Submitted 27 October, 2021; v1 submitted 2 August, 2020;
originally announced August 2020.
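The two-step reduction described in the MulTiDR abstract above can be illustrated with a minimal sketch. The abstract does not say which DR methods are used in each step, so plain PCA is assumed here for both; the names `pca` and `two_step_dr`, and the choice to compress the time axis first, are hypothetical and only meant to show the array handling.

```python
import numpy as np

def pca(X, n_components):
    """Plain PCA via SVD: project the centered rows of X onto the top components."""
    Xc = X - X.mean(axis=0, keepdims=True)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

def two_step_dr(data, n_final=2):
    """Sketch of a two-step DR on a 3D array of shape (time points, instances, attributes).

    Assumption: step 1 compresses the time axis with PCA; the abstract only
    states that three axes are reduced to two, not which axis or method.
    """
    T, N, D = data.shape
    # Step 1: each (instance, attribute) pair becomes a row described by its
    # T time points, and is compressed to a single representative value.
    flat = data.reshape(T, N * D).T              # shape (N*D, T)
    compressed = pca(flat, 1).reshape(N, D)      # shape (N, D)
    # Step 2: project the N instances into a low-dimensional space for plotting.
    return pca(compressed, n_final)              # shape (N, n_final)
```

Calling `two_step_dr` on an array of shape `(T, N, D)` returns one low-dimensional point per instance; swapping which axis is flattened in step 1 would instead give one point per time point or per attribute.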
-
A Visual Analytics Framework for Contrastive Network Analysis
Authors:
Takanori Fujiwara,
Jian Zhao,
Francine Chen,
Kwan-Liu Ma
Abstract:
A common network analysis task is comparison of two networks to identify unique characteristics in one network with respect to the other. For example, when comparing protein interaction networks derived from normal and cancer tissues, one essential task is to discover protein-protein interactions unique to cancer tissues. However, this task is challenging when the networks contain complex structural (and semantic) relations. To address this problem, we design ContraNA, a visual analytics framework leveraging both the power of machine learning for uncovering unique characteristics in networks and also the effectiveness of visualization for understanding such uniqueness. The basis of ContraNA is cNRL, which integrates two machine learning schemes, network representation learning (NRL) and contrastive learning (CL), to generate a low-dimensional embedding that reveals the uniqueness of one network when compared to another. ContraNA provides an interactive visualization interface to help analyze the uniqueness by relating embedding results and network structures as well as explaining the learned features by cNRL. We demonstrate the usefulness of ContraNA with two case studies using real-world datasets. We also evaluate through a controlled user study with 12 participants on network comparison tasks. The results show that participants were able to both effectively identify unique characteristics from complex networks and interpret the results obtained from cNRL.
Submitted 16 August, 2020; v1 submitted 31 July, 2020;
originally announced August 2020.
-
Contrastive Multiple Correspondence Analysis (cMCA): Using Contrastive Learning to Identify Latent Subgroups in Political Parties
Authors:
Takanori Fujiwara,
Tzu-Ping Liu
Abstract:
Scaling methods have long been utilized to simplify and cluster high-dimensional data. However, the general latent spaces across all predefined groups derived from these methods sometimes do not fall into researchers' interest regarding specific patterns within groups. To tackle this issue, we adopt an emerging analysis approach called contrastive learning. We contribute to this growing field by extending its ideas to multiple correspondence analysis (MCA) in order to enable an analysis of data often encountered by social scientists -- containing binary, ordinal, and nominal variables. We demonstrate the utility of contrastive MCA (cMCA) by analyzing two different surveys of voters in the U.S. and U.K. Our results suggest that, first, cMCA can identify substantively important dimensions and divisions among subgroups that are overlooked by traditional methods; second, for other cases, cMCA can derive latent traits that emphasize subgroups seen moderately in those derived by traditional methods.
Submitted 1 June, 2023; v1 submitted 8 July, 2020;
originally announced July 2020.
-
Network Comparison with Interpretable Contrastive Network Representation Learning
Authors:
Takanori Fujiwara,
Jian Zhao,
Francine Chen,
Yaoliang Yu,
Kwan-Liu Ma
Abstract:
Identifying unique characteristics in a network through comparison with another network is an essential network analysis task. For example, with networks of protein interactions obtained from normal and cancer tissues, we can discover unique types of interactions in cancer tissues. This analysis task could be greatly assisted by contrastive learning, which is an emerging analysis approach to discover salient patterns in one dataset relative to another. However, existing contrastive learning methods cannot be directly applied to networks as they are designed only for high-dimensional data analysis. To address this problem, we introduce a new analysis approach called contrastive network representation learning (cNRL). By integrating two machine learning schemes, network representation learning and contrastive learning, cNRL enables embedding of network nodes into a low-dimensional representation that reveals the uniqueness of one network compared to another. Within this approach, we also design a method, named i-cNRL, which offers interpretability in the learned results, allowing for understanding which specific patterns are only found in one network. We demonstrate the effectiveness of i-cNRL for network comparison with multiple network models and real-world datasets. Furthermore, we compare i-cNRL and other potential cNRL algorithm designs through quantitative and qualitative evaluations.
Submitted 15 February, 2022; v1 submitted 25 May, 2020;
originally announced May 2020.
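As a rough illustration of the contrastive learning idea mentioned in the abstract above (finding patterns salient in one dataset relative to another), the sketch below uses contrastive PCA on ordinary feature matrices. This is a generic stand-in, not the i-cNRL method itself; the function name `contrastive_pca` and the contrast weight `alpha` are assumptions made for the example.

```python
import numpy as np

def contrastive_pca(target, background, alpha=1.0, n_components=2):
    """Contrastive PCA sketch: find directions with high variance in `target`
    and low variance in `background`.

    Both inputs are (samples, features) arrays with the same feature count.
    `alpha` (an assumed tuning parameter) controls how strongly background
    variance is penalized.
    """
    cov_target = np.cov(target, rowvar=False)
    cov_background = np.cov(background, rowvar=False)
    # Eigen-decompose the contrast matrix; it is symmetric, so eigh applies.
    eigvals, eigvecs = np.linalg.eigh(cov_target - alpha * cov_background)
    # Keep the eigenvectors with the largest eigenvalues.
    order = np.argsort(eigvals)[::-1][:n_components]
    W = eigvecs[:, order]
    embedding = (target - target.mean(axis=0)) @ W
    return embedding, W
```

The columns of `W` show which features drive each contrastive direction, which is the kind of interpretability the abstract emphasizes; to apply this to networks, node features would first have to be derived from the network structure, a step this sketch does not cover.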
-
A Visual Analytics System for Multi-model Comparison on Clinical Data Predictions
Authors:
Yiran Li,
Takanori Fujiwara,
Yong K. Choi,
Katherine K. Kim,
Kwan-Liu Ma
Abstract:
There is a growing trend of applying machine learning methods to medical datasets in order to predict patients' future status. Although some of these methods achieve high performance, challenges still exist in comparing and evaluating different models through their interpretable information. Such analytics can help clinicians improve evidence-based medical decision making. In this work, we develop a visual analytics system that compares multiple models' prediction criteria and evaluates their consistency. With our system, users can generate knowledge on different models' inner criteria and how confidently we can rely on each model's prediction for a certain patient. Through a case study of a publicly available clinical dataset, we demonstrate the effectiveness of our visual analytics system to assist clinicians and researchers in comparing and quantitatively evaluating different machine learning methods.
Submitted 23 March, 2020; v1 submitted 18 February, 2020;
originally announced February 2020.
-
Comparative Visual Analytics for Assessing Medical Records with Sequence Embedding
Authors:
Rongchen Guo,
Takanori Fujiwara,
Yiran Li,
Kelly M. Lima,
Soman Sen,
Nam K. Tran,
Kwan-Liu Ma
Abstract:
Machine learning for data-driven diagnosis has been actively studied in medicine to provide better healthcare. Supporting analysis of a patient cohort similar to a patient under treatment is a key task for clinicians to make decisions with high confidence. However, such analysis is not straightforward due to the characteristics of medical records: high dimensionality, irregularity in time, and sparsity. To address this challenge, we introduce a method for similarity calculation of medical records. Our method employs event and sequence embeddings. While we use an autoencoder for the event embedding, we apply its variant with the self-attention mechanism for the sequence embedding. Moreover, in order to better handle the irregularity of data, we enhance the self-attention mechanism with consideration of different time intervals. We have developed a visual analytics system to support comparative studies of patient records. To make a comparison of sequences with different lengths easier, our system incorporates a sequence alignment method. Through its interactive interface, the user can quickly identify patients of interest and conveniently review both the temporal and multivariate aspects of the patient records. We demonstrate the effectiveness of our design and system with case studies using a real-world dataset from the neonatal intensive care unit of UC Davis.
Submitted 23 March, 2020; v1 submitted 18 February, 2020;
originally announced February 2020.
-
Model Extraction Attacks against Recurrent Neural Networks
Authors:
Tatsuya Takemura,
Naoto Yanai,
Toru Fujiwara
Abstract:
Model extraction attacks are a kind of attack in which an adversary obtains a new model, whose performance is equivalent to that of a target model, via query access to the target model efficiently, i.e., fewer datasets and computational resources than those of the target model. Existing works have dealt with only simple deep neural networks (DNNs), e.g., only three layers, as targets of model extraction attacks, and hence are not aware of the effectiveness of recurrent neural networks (RNNs) in dealing with time-series data. In this work, we shed light on the threats of model extraction attacks against RNNs. We discuss whether a model with a higher accuracy can be extracted with a simple RNN from a long short-term memory (LSTM), which is a more complicated and powerful RNN. Specifically, we tackle the following problems. First, in a case of a classification problem, such as image recognition, extraction of an RNN model without final outputs from an LSTM model is presented by utilizing outputs halfway through the sequence. Next, in a case of a regression problem, such as in weather forecasting, a new attack by newly configuring a loss function is presented. We conduct experiments on our model extraction attacks against an RNN and an LSTM trained with publicly available academic datasets. We then show that a model with a higher accuracy can be extracted efficiently, especially through configuring a loss function and a more complex architecture different from the target model.
Submitted 31 January, 2020;
originally announced February 2020.
-
A Visual Analytics Framework for Reviewing Streaming Performance Data
Authors:
Suraj P. Kesavan,
Takanori Fujiwara,
Jianping Kelvin Li,
Caitlin Ross,
Misbah Mubarak,
Christopher D. Carothers,
Robert B. Ross,
Kwan-Liu Ma
Abstract:
Understanding and tuning the performance of extreme-scale parallel computing systems demands a streaming approach due to the computational cost of applying offline algorithms to vast amounts of performance log data. Analyzing large streaming data is challenging because the rate of receiving data and limited time to comprehend data make it difficult for the analysts to sufficiently examine the data without missing important changes or patterns. To support streaming data analysis, we introduce a visual analytic framework comprising three modules: data management, analysis, and interactive visualization. The data management module collects various computing and communication performance metrics from the monitored system using streaming data processing techniques and feeds the data to the other two modules. The analysis module automatically identifies important changes and patterns at the required latency. In particular, we introduce a set of online and progressive analysis methods for not only controlling the computational costs but also helping analysts better follow the critical aspects of the analysis results. Finally, the interactive visualization module provides the analysts with a coherent view of the changes and patterns in the continuously captured performance data. Through a multi-faceted case study on performance analysis of parallel discrete-event simulation, we demonstrate the effectiveness of our framework for identifying bottlenecks and locating outliers.
Submitted 25 January, 2020;
originally announced January 2020.
-
An Incremental Dimensionality Reduction Method for Visualizing Streaming Multidimensional Data
Authors:
Takanori Fujiwara,
Jia-Kai Chou,
Shilpika,
Panpan Xu,
Liu Ren,
Kwan-Liu Ma
Abstract:
Dimensionality reduction (DR) methods are commonly used for analyzing and visualizing multidimensional data. However, when data is a live streaming feed, conventional DR methods cannot be directly used because of their computational complexity and inability to preserve the projected data positions at previous time points. In addition, the problem becomes even more challenging when the dynamic data records have a varying number of dimensions as often found in real-world applications. This paper presents an incremental DR solution. We enhance an existing incremental PCA method in several ways to ensure its usability for visualizing streaming multidimensional data. First, we use geometric transformation and animation methods to help preserve a viewer's mental map when visualizing the incremental results. Second, to handle data dimension variants, we use an optimization method to estimate the projected data positions, and also convey the resulting uncertainty in the visualization. We demonstrate the effectiveness of our design with two case studies using real-world datasets.
Submitted 15 October, 2019; v1 submitted 10 May, 2019;
originally announced May 2019.
-
Supporting Analysis of Dimensionality Reduction Results with Contrastive Learning
Authors:
Takanori Fujiwara,
Oh-Hyun Kwon,
Kwan-Liu Ma
Abstract:
Dimensionality reduction (DR) is frequently used for analyzing and visualizing high-dimensional data as it provides a good first glance of the data. However, to interpret the DR result for gaining useful insights from the data, it would take additional analysis effort such as identifying clusters and understanding their characteristics. While there are many automatic methods (e.g., density-based clustering methods) to identify clusters, effective methods for understanding a cluster's characteristics are still lacking. A cluster can be mostly characterized by its distribution of feature values. Reviewing the original feature values is not a straightforward task when the number of features is large. To address this challenge, we present a visual analytics method that effectively highlights the essential features of a cluster in a DR result. To extract the essential features, we introduce an enhanced usage of contrastive principal component analysis (cPCA). Our method, called ccPCA (contrasting clusters in PCA), can calculate each feature's relative contribution to the contrast between one cluster and other clusters. With ccPCA, we have created an interactive system including a scalable visualization of clusters' feature contributions. We demonstrate the effectiveness of our method and system with case studies using several publicly available datasets.
Submitted 14 October, 2019; v1 submitted 9 May, 2019;
originally announced May 2019.
-
Evaluation of Plane Detection with RANSAC According to Density of 3D Point Clouds
Authors:
Tomofumi Fujiwara,
Tetsushi Kamegawa,
Akio Gofuku
Abstract:
We have implemented a method that detects planar regions from 3D scan data using the Random Sample Consensus (RANSAC) algorithm to address the trade-off between scanning speed and point density in 3D scanning. However, the limitations of the implemented method have not yet been clarified. In this paper, we conducted an additional experiment to evaluate the implemented method by changing its parameter and environments in both high and low point density data. As a result, the number of detected planes in high point density data was different from that in low point density data with the same parameter value.
Submitted 17 December, 2013;
originally announced December 2013.
-
Toward Security Verification against Inference Attacks on Data Trees
Authors:
Ryo Iwase,
Yasunori Ishihara,
Toru Fujiwara
Abstract:
This paper describes our ongoing work on security verification against inference attacks on data trees. We focus on infinite secrecy against inference attacks, which means that attackers cannot narrow down the candidates for the value of the sensitive information to a finite set using the information available to them. Our purpose is to propose a model under which infinite secrecy is decidable. To be specific, we first propose tree transducers which are expressive enough to represent practical queries. Then, in order to represent attackers' knowledge, we propose data tree types such that type inference and inverse type inference on those tree transducers are possible with respect to data tree types, and infiniteness of data tree types is decidable.
Submitted 21 November, 2013;
originally announced December 2013.
-
XPath Satisfiability with Parent Axes or Qualifiers Is Tractable under Many of Real-World DTDs
Authors:
Yasunori Ishihara,
Nobutaka Suzuki,
Kenji Hashimoto,
Shogo Shimizu,
Toru Fujiwara
Abstract:
This paper aims at finding a subclass of DTDs that covers many of the real-world DTDs while offering a polynomial-time complexity for deciding the XPath satisfiability problem. In our previous work, we proposed RW-DTDs, which cover most of the real-world DTDs (26 out of 27 real-world DTDs and 1406 out of 1407 DTD rules). However, under RW-DTDs, XPath satisfiability with only child, descendant-or-self, and sibling axes is tractable. In this paper, we propose MRW-DTDs, which are slightly smaller than RW-DTDs but have tractability on XPath satisfiability with parent axes or qualifiers. MRW-DTDs are a proper superclass of duplicate-free DTDs proposed by Montazerian et al., and cover 24 out of the 27 real-world DTDs and 1403 out of the 1407 DTD rules. Under MRW-DTDs, we show that XPath satisfiability problems with (1) child, parent, and sibling axes, and (2) child and sibling axes and qualifiers are both tractable, which are known to be intractable under RW-DTDs.
Submitted 3 August, 2013;
originally announced August 2013.
-
Uncorrectable Errors of Weight Half the Minimum Distance for Binary Linear Codes
Authors:
Kenji Yasunaga,
Toru Fujiwara
Abstract:
A lower bound on the number of uncorrectable errors of weight half the minimum distance is derived for binary linear codes satisfying a certain condition. The condition is satisfied by some primitive BCH codes, extended primitive BCH codes, Reed-Muller codes, and random linear codes. The bound asymptotically coincides with the corresponding upper bound for Reed-Muller codes and random linear codes. By generalizing the idea of the lower bound, a lower bound on the number of uncorrectable errors for weights larger than half the minimum distance is also obtained, but the generalized lower bound is weak for large weights. The monotone error structure and its related notions, larger half and trial set, which were introduced by Helleseth, Kløve, and Levenshtein, are mainly used to derive the bounds.
Submitted 30 April, 2008; v1 submitted 25 April, 2008;
originally announced April 2008.
-
Relations between the Local Weight Distributions of a Linear Block Code, Its Extended Code, and Its Even Weight Subcode
Authors:
Kenji Yasunaga,
Toru Fujiwara
Abstract:
Relations between the local weight distributions of a binary linear code, its extended code, and its even weight subcode are presented. In particular, for a code of which the extended code is transitive invariant and contains only codewords with weight multiples of four, the local weight distribution can be obtained from that of the extended code. Using the relations, the local weight distributions of the $(127,k)$ primitive BCH codes for $k\leq50$, the $(127,64)$ punctured third-order Reed-Muller code, and their even weight subcodes are obtained from the local weight distribution of the $(128,k)$ extended primitive BCH codes for $k\leq50$ and the $(128,64)$ third-order Reed-Muller code. We also show an approach to improve a previously proposed algorithm for computing the local weight distribution.
Submitted 2 August, 2005;
originally announced August 2005.