Search | arXiv e-print repository

Two-Stage Holistic and Contrastive Explanation of Image Classification

Authors: Weiyan Xie, Xiao-Hui Li, Zhi Lin, Leonard K. M. Poon, Caleb Chen Cao, Nevin L. Zhang

Abstract: The need to explain the output of a deep neural network classifier is now widely recognized. While previous methods typically explain a single class in the output, we advocate explaining the whole output, which is a probability distribution over multiple classes. A whole-output explanation can help a human user gain an overall understanding of model behaviour instead of only one aspect of it. It can also provide a natural framework where one can examine the evidence used to discriminate between competing classes, and thereby obtain contrastive explanations. In this paper, we propose a contrastive whole-output explanation (CWOX) method for image classification, and evaluate it using quantitative metrics and through human subject studies. The source code of CWOX is available at https://github.com/vaynexie/CWOX.

Submitted 10 June, 2023; originally announced June 2023.

Comments: To appear at UAI 2023

arXiv:2206.08851 [pdf, other]

SMPL: Simulated Industrial Manufacturing and Process Control Learning Environments

Authors: Mohan Zhang, Xiaozhou Wang, Benjamin Decardi-Nelson, Song Bo, An Zhang, **feng Liu, Sile Tao, Jiayi Cheng, Xiaohong Liu, DengDeng Yu, Matthew Poon, Animesh Garg

Abstract: Traditional biological and pharmaceutical manufacturing plants are controlled by human workers or pre-defined thresholds. Modernized factories have advanced process control algorithms such as model predictive control (MPC). However, there is little exploration of applying deep reinforcement learning to control manufacturing plants. One of the reasons is the lack of high fidelity simulations and standard APIs for benchmarking. To bridge this gap, we develop an easy-to-use library that includes five high-fidelity simulation environments: BeerFMTEnv, ReactorEnv, AtropineEnv, PenSimEnv and mAbEnv, which cover a wide range of manufacturing processes. We build these environments on published dynamics models. Furthermore, we benchmark online and offline, model-based and model-free reinforcement learning algorithms for comparisons of follow-up research.

Submitted 15 January, 2023; v1 submitted 17 June, 2022; originally announced June 2022.

Comments: NeurIPS 2022. https://openreview.net/forum?id=TscdNx8udf5

ACM Class: I.2.6

arXiv:2206.00813 [pdf, other]

doi 10.3847/2041-8213/ac7b2d

Two Candidate KH 15D-like Systems from the Zwicky Transient Facility

Authors: Wei Zhu, Klaus Bernhard, Fei Dai, Min Fang, J. J. Zanazzi, Weicheng Zang, Subo Dong, Franz-Josef Hambsch, Tianjun Gan, Zexuan Wu, Michael Poon

Abstract: KH 15D contains a circumbinary disk that is tilted relative to the orbital plane of the central binary. The precession of the disk and the orbital motion of the binary together produce rich phenomena in the photometric light curve. In this work, we present the discovery and preliminary analysis of two objects that resemble the key features of KH 15D from the Zwicky Transient Facility. These new objects, Bernhard-1 and Bernhard-2, show large-amplitude ($>1.5\,$mag), long-duration (more than tens of days), and periodic dimming events. A one-sided screen model is developed to model the photometric behaviour of these objects, the physical interpretation of which is a tilted, warped circumbinary disk occulting the inner binary. Changes in the object light curves suggest potential precession periods over timescales longer than 10 years. Additional photometric and spectroscopic observations are encouraged to better understand the nature of these interesting systems.

Submitted 20 June, 2022; v1 submitted 1 June, 2022; originally announced June 2022.

Comments: 10 pages, 5 figures, 2 tables, accepted to ApJ Letters

arXiv:2204.08101 [pdf]

Ultrafast disinfection of SARS-CoV-2 viruses

Authors: Yang Xu, Alex Wing Hong Chin, Haosong Zhong, Connie Kong Wai Lee, Yi Chen, Timothy Yee Him Chan, Zhiyong Fan, Molong Duan, Leo Lit Man Poon, Mitch Guijun Li

Abstract: The wide use of surgical masks has been proven effective for mitigating the spread of respiratory diseases, such as COVID-19, alongside social distance control, vaccines, and other efforts. With the newly reported variants, such as Delta and Omicron, a higher spread rate has been found compared to the initial strains. People might get infected even by inhaling a lower viral load. More frequent sterilization of surgical masks is needed to protect the wearers. However, it is challenging to sterilize the commodity surgical masks with a fast and effective method. Herein, we report the sterilization of the SARS-CoV-2 viruses within an ultra-short time, while retaining the mask performance. Silver thin film is coated on commercial polyimide film by physical vapor deposition and patterned by laser scribing to form a Joule heating electrode. Another layer of gold thin film is coated onto the opposite side of the device to promote the uniformity of the Joule heating through nano-heat transfer regulation. As a result, the surgical mask can be heated to inactivation temperature within a short time and with high uniformity. By Joule-heating the surgical mask with the temperature at 90 °C for 3 minutes, the inactivation of the SARS-CoV-2 showed an efficacy of 99.89%. Normal commodity surgical masks can be sterilized faster, more frequently, and efficiently against SARS-CoV-2 viruses and the new variants.

Submitted 17 April, 2022; originally announced April 2022.

arXiv:2203.04294 [pdf, other]

NaviAirway: a Bronchiole-sensitive Deep Learning-based Airway Segmentation Pipeline

Authors: Andong Wang, Terence Chi Chun Tam, Ho Ming Poon, Kun-Chang Yu, Wei-Ning Lee

Abstract: Airway segmentation is essential for chest CT image analysis. Different from natural image segmentation, which pursues high pixel-wise accuracy, airway segmentation focuses on topology. The task is challenging not only because of its complex tree-like structure but also the severe pixel imbalance among airway branches of different generations. To tackle the problems, we present a NaviAirway method which consists of a bronchiole-sensitive loss function for airway topology preservation and an iterative training strategy for accurate model learning across different airway generations. To supplement the features of airway branches learned by the model, we distill the knowledge from numerous unlabeled chest CT images in a teacher-student manner. Experimental results show that NaviAirway outperforms existing methods, particularly in the identification of higher-generation bronchioles and robustness to new CT scans. Moreover, NaviAirway is general enough to be combined with different backbone models to significantly improve their performance. NaviAirway can generate an airway roadmap for Navigation Bronchoscopy and can also be applied to other scenarios when segmenting fine and long tubular structures in biomedical images. The code is publicly available on https://github.com/AntonotnaWang/NaviAirway.

Submitted 16 June, 2023; v1 submitted 7 March, 2022; originally announced March 2022.

arXiv:2106.09424 [pdf, other]

Interpretable Machine Learning Classifiers for Brain Tumour Survival Prediction

Authors: Colleen E. Charlton, Michael Tin Chung Poon, Paul M. Brennan, Jacques D. Fleuriot

Abstract: Prediction of survival in patients diagnosed with a brain tumour is challenging because of heterogeneous tumour behaviours and responses to treatment. Better estimations of prognosis would support treatment planning and patient support. Advances in machine learning have informed development of clinical predictive models, but their integration into clinical practice is almost non-existent. One reason for this is the lack of interpretability of models. In this paper, we use a novel brain tumour dataset to compare two interpretable rule list models against popular machine learning approaches for brain tumour survival prediction. All models are quantitatively evaluated using standard performance metrics. The rule lists are also qualitatively assessed for their interpretability and clinical utility. The interpretability of the black box machine learning models is evaluated using two post-hoc explanation techniques, LIME and SHAP. Our results show that the rule lists were only slightly outperformed by the black box models. We demonstrate that rule list algorithms produced simple decision lists that align with clinical expertise. By comparison, post-hoc interpretability methods applied to black box models may produce unreliable explanations of local model predictions. Model interpretability is essential for understanding differences in predictive performance and for integration into clinical practice.

Submitted 17 June, 2021; originally announced June 2021.

arXiv:2102.09553 [pdf, other]

doi 10.1186/s12911-021-01533-7

A Systematic Review of Natural Language Processing Applied to Radiology Reports

Authors: Arlene Casey, Emma Davidson, Michael Poon, Hang Dong, Daniel Duma, Andreas Grivas, Claire Grover, Víctor Suárez-Paniagua, Richard Tobin, William Whiteley, Honghan Wu, Beatrice Alex

Abstract: NLP has a significant role in advancing healthcare and has been found to be key in extracting structured information from radiology reports. Understanding recent developments in NLP application to radiology is of significance but recent reviews on this are limited. This study systematically assesses recent literature in NLP applied to radiology reports. Our automated literature search yields 4,799 results using automated filtering, metadata enriching steps and citation search combined with manual review. Our analysis is based on 21 variables including radiology characteristics, NLP methodology, performance, study, and clinical application characteristics. We present a comprehensive analysis of the 164 publications retrieved with each categorised into one of 6 clinical application categories. Deep learning use increases but conventional machine learning approaches are still prevalent. Deep learning remains challenged when data is scarce and there is little evidence of adoption into clinical practice. Despite 17% of studies reporting greater than 0.85 F1 scores, it is hard to comparatively evaluate these approaches given that most of them use different datasets. Only 14 studies made their data and 15 their code available with 10 externally validating results. Automated understanding of clinical narratives of the radiology reports has the potential to enhance the healthcare process but reproducibility and explainability of models are important if the domain is to move applications into clinical use. More could be done to share code enabling validation of methods on different institutional data and to reduce heterogeneity in reporting of study properties allowing inter-study comparisons. Our results have significance for researchers providing a systematic synthesis of existing work to build on, identify gaps, opportunities for collaboration and avoid duplication.

Submitted 18 February, 2021; originally announced February 2021.

Journal ref: BMC Medical Informatics and Decision Making 2021

arXiv:2009.14204 [pdf, other]

doi 10.1093/mnras/stab575

Constraining the Circumbinary Disk Tilt in the KH 15D system

Authors: Michael Poon, J. J. Zanazzi, Wei Zhu

Abstract: KH 15D is a system which consists of a young, eccentric binary, and a circumbinary disk which obscures the binary as the disk precesses. We develop a self-consistent model that provides a reasonable fit to the photometric variability that was observed in the KH 15D system over the past 60 years. Our model suggests that the circumbinary disk has an inner edge $r_{\rm in}\lesssim 1 \ {\rm au}$, an outer edge $r_{\rm out} \sim {\rm a \ few \ au}$, and that the disk is misaligned relative to the stellar binary by $\sim$5-16 degrees, with the inner edge more inclined than the outer edge. The difference between the inclinations (warp) and longitude of ascending nodes (twist) at the inner and outer edges of the disk are of order $\sim$10 degrees and $\sim$15 degrees, respectively. We also provide constraints on other properties of the disk, such as the precession period and surface density profile. Our work demonstrates the power of photometric data in constraining the physical properties of planet-forming circumbinary disks.

Submitted 22 March, 2021; v1 submitted 29 September, 2020; originally announced September 2020.

Comments: 17 pages, 9 figures

Journal ref: 2021MNRAS.503.1599P

arXiv:2008.07956 [pdf, other]

doi 10.1145/3366423.3380135

Learning the Structure of Auto-Encoding Recommenders

Authors: Farhan Khawar, Leonard Kin Man Poon, Nevin Lianwen Zhang

Abstract: Autoencoder recommenders have recently shown state-of-the-art performance in the recommendation task due to their ability to model non-linear item relationships effectively. However, existing autoencoder recommenders use fully-connected neural network layers and do not employ structure learning. This can lead to inefficient training, especially when the data is sparse as commonly found in collaborative filtering. This in turn results in lower generalization ability and reduced performance. In this paper, we introduce structure learning for autoencoder recommenders by taking advantage of the inherent item groups present in the collaborative filtering domain. Due to the nature of items in general, we know that certain items are more related to each other than to other items. Based on this, we propose a method that first learns groups of related items and then uses this information to determine the connectivity structure of an auto-encoding neural network. This results in a network that is sparsely connected. This sparse structure can be viewed as a prior that guides the network training. Empirically we demonstrate that the proposed structure learning enables the autoencoder to converge to a local optimum with a much smaller spectral norm and generalization error bound than the fully-connected network. The resultant sparse network considerably outperforms the state-of-the-art methods like Mult-VAE/Mult-DAE on multiple benchmarked datasets even when the same number of parameters and flops are used. It also has a better cold-start performance.

Submitted 18 August, 2020; originally announced August 2020.

Comments: Proceedings of The Web Conference 2020

arXiv:2007.05163 [pdf, other]

Handling Collocations in Hierarchical Latent Tree Analysis for Topic Modeling

Authors: Leonard K. M. Poon, Nevin L. Zhang, Haoran Xie, Gary Cheng

Abstract: Topic modeling has been one of the most active research areas in machine learning in recent years. Hierarchical latent tree analysis (HLTA) has been recently proposed for hierarchical topic modeling and has shown superior performance over state-of-the-art methods. However, the models used in HLTA have a tree structure and cannot represent the different meanings of multiword expressions sharing the same word appropriately. Therefore, we propose a method for extracting and selecting collocations as a preprocessing step for HLTA. The selected collocations are replaced with single tokens in the bag-of-words model before running HLTA. Our empirical evaluation shows that the proposed method led to better performance of HLTA on three of the four data sets tested.

Submitted 10 July, 2020; originally announced July 2020.
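
Illustration (not from the paper): the abstract above describes replacing selected collocations with single tokens before running HLTA. The sketch below shows only that replacement step for two-word collocations. The function name `merge_collocations`, the underscore joiner, and the example collocation set are hypothetical choices made for this illustration; how the paper extracts and selects collocations is not shown here.

```python
def merge_collocations(tokens, collocations, joiner="_"):
    """Replace each selected two-word collocation with a single token.

    tokens: list of word strings for one document.
    collocations: set of (first_word, second_word) tuples, assumed to have
        been selected beforehand by some separate procedure.
    """
    merged = []
    i = 0
    while i < len(tokens):
        pair = tuple(tokens[i:i + 2])
        if len(pair) == 2 and pair in collocations:
            # Emit one token for the whole collocation and skip both words.
            merged.append(joiner.join(pair))
            i += 2
        else:
            merged.append(tokens[i])
            i += 1
    return merged


# Hypothetical example data.
selected = {("neural", "network"), ("machine", "learning")}
doc = ["deep", "neural", "network", "for", "machine", "learning"]
print(merge_collocations(doc, selected))
# prints ['deep', 'neural_network', 'for', 'machine_learning']
```

After this step, a bag-of-words model would count `neural_network` as one vocabulary item rather than as two separate words.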

arXiv:1803.05206 [pdf, other]

Learning Latent Superstructures in Variational Autoencoders for Deep Multidimensional Clustering

Authors: Xiaopeng Li, Zhourong Chen, Leonard K. M. Poon, Nevin L. Zhang

Abstract: We investigate a variant of variational autoencoders where there is a superstructure of discrete latent variables on top of the latent features. In general, our superstructure is a tree structure of multiple super latent variables and it is automatically learned from data. When there is only one latent variable in the superstructure, our model reduces to one that assumes the latent features to be generated from a Gaussian mixture model. We call our model the latent tree variational autoencoder (LTVAE). Whereas previous deep learning methods for clustering produce only one partition of data, LTVAE produces multiple partitions of data, each being given by one super latent variable. This is desirable because high dimensional data usually have many different natural facets and can be meaningfully partitioned in multiple ways.

Submitted 22 February, 2019; v1 submitted 14 March, 2018; originally announced March 2018.

Comments: Published in ICLR 2019

arXiv:1610.00085 [pdf, other]

Latent Tree Analysis

Authors: Nevin L. Zhang, Leonard K. M. Poon

Abstract: Latent tree analysis seeks to model the correlations among a set of random variables using a tree of latent variables. It was proposed as an improvement to latent class analysis --- a method widely used in social sciences and medicine to identify homogeneous subgroups in a population. It provides new and fruitful perspectives on a number of machine learning areas, including cluster analysis, topic detection, and deep probabilistic modeling. This paper gives an overview of the research on latent tree analysis and various ways it is used in practice.

Submitted 1 October, 2016; originally announced October 2016.

Comments: 7 pages, 5 figures

arXiv:1609.09188 [pdf, other]

Topic Browsing for Research Papers with Hierarchical Latent Tree Analysis

Authors: Leonard K. M. Poon, Nevin L. Zhang

Abstract: Academic researchers often need to deal with a large collection of research papers in the literature. This problem may be even worse for postgraduate students who are new to a field and may not know where to start. To address this problem, we have developed an online catalog of research papers where the papers have been automatically categorized by a topic model. The catalog contains 7719 papers from the proceedings of two artificial intelligence conferences from 2000 to 2015. Rather than the commonly used Latent Dirichlet Allocation, we use a recently proposed method called hierarchical latent tree analysis for topic modeling. The resulting topic model contains a hierarchy of topics so that users can browse the topics from the top level to the bottom level. The topic model contains a manageable number of general topics at the top level and allows thousands of fine-grained topics at the bottom level. It also can detect topics that have emerged recently.

Submitted 28 September, 2016; originally announced September 2016.

arXiv:1605.06650 [pdf, other]

Latent Tree Models for Hierarchical Topic Detection

Authors: Peixian Chen, Nevin L. Zhang, Tengfei Liu, Leonard K. M. Poon, Zhourong Chen, Farhan Khawar

Abstract: We present a novel method for hierarchical topic detection where topics are obtained by clustering documents in multiple ways. Specifically, we model document collections using a class of graphical models called hierarchical latent tree models (HLTMs). The variables at the bottom level of an HLTM are observed binary variables that represent the presence/absence of words in a document. The variables at other levels are binary latent variables, with those at the lowest latent level representing word co-occurrence patterns and those at higher levels representing co-occurrence of patterns at the level below. Each latent variable gives a soft partition of the documents, and document clusters in the partitions are interpreted as topics. Latent variables at high levels of the hierarchy capture long-range word co-occurrence patterns and hence give thematically more general topics, while those at low levels of the hierarchy capture short-range word co-occurrence patterns and give thematically more specific topics. Unlike LDA-based topic models, HLTMs do not refer to a document generation process and use word variables instead of token variables. They use a tree structure to model the relationships between topics and words, which is conducive to the discovery of meaningful topics and topic hierarchies.

Submitted 21 December, 2016; v1 submitted 21 May, 2016; originally announced May 2016.

Comments: 46 pages
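
Illustration (not from the paper): the abstract above states that the bottom-level variables of an HLTM are observed binary variables recording the presence or absence of words in a document. The sketch below shows one simple way such a binary document-by-word matrix could be built. The function name `binary_word_matrix`, the whitespace tokenization, and the example vocabulary are assumptions made for this illustration, not the authors' preprocessing.

```python
def binary_word_matrix(documents, vocabulary):
    """Build one 0/1 row per document: 1 if the word occurs, else 0.

    documents: list of raw text strings.
    vocabulary: ordered list of lowercase words; column j corresponds to
        vocabulary[j]. Tokenization here is a plain whitespace split,
        which is an assumption of this sketch.
    """
    rows = []
    for text in documents:
        present = set(text.lower().split())
        # Word counts are ignored: only presence or absence is recorded.
        rows.append([1 if word in present else 0 for word in vocabulary])
    return rows


# Hypothetical example data.
vocab = ["latent", "tree", "topic"]
docs = ["Latent tree models", "topic topic detection"]
print(binary_word_matrix(docs, vocab))
# prints [[1, 1, 0], [0, 0, 1]]
```

The second document repeats "topic", yet its row still holds a single 1, which is the sense in which the representation uses word variables rather than token variables.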

arXiv:1508.00973 [pdf, other]

Progressive EM for Latent Tree Models and Hierarchical Topic Detection

Authors: Peixian Chen, Nevin L. Zhang, Leonard K. M. Poon, Zhourong Chen

Abstract: Hierarchical latent tree analysis (HLTA) was recently proposed as a new method for topic detection. It differs fundamentally from the LDA-based methods in terms of topic definition, topic-document relationship, and learning method. It has been shown to discover significantly more coherent topics and better topic hierarchies. However, HLTA relies on the Expectation-Maximization (EM) algorithm for parameter estimation and hence is not efficient enough to deal with large datasets. In this paper, we propose a method to drastically speed up HLTA using a technique inspired by recent advances in the moments method. Empirical experiments show that our method greatly improves the efficiency of HLTA. It is as efficient as the state-of-the-art LDA-based method for hierarchical topic detection and finds substantially better topics and topic hierarchies.

Submitted 5 August, 2015; originally announced August 2015.

arXiv:1410.7140 [pdf]

A data-driven method for syndrome type identification and classification in traditional Chinese medicine

Authors: Nevin L. Zhang, Chen Fu, Teng Fei Liu, Bao Xin Chen, Kin Man Poon, Pei Xian Chen, Yun Ling Zhang

Abstract: Objective: The efficacy of traditional Chinese medicine (TCM) treatments for Western medicine (WM) diseases relies heavily on the proper classification of patients into TCM syndrome types. We develop a data-driven method for solving the classification problem, where syndrome types are identified and quantified based on patterns detected in unlabeled symptom survey data. Method: Latent class analysis (LCA) has been applied in WM research to solve a similar problem, i.e., to identify subtypes of a patient population in the absence of a gold standard. A widely known weakness of LCA is that it makes an unrealistically strong independence assumption. We relax the assumption by first detecting symptom co-occurrence patterns from survey data and use those patterns instead of the symptoms as features for LCA. Results: The result of the investigation is a six-step method: Data collection, symptom co-occurrence pattern discovery, pattern interpretation, syndrome identification, syndrome type identification, and syndrome type classification. A software package called Lantern is developed to support the application of the method. The method is illustrated using a data set on Vascular Mild Cognitive Impairment (VMCI). Conclusions: A data-driven method for TCM syndrome identification and classification is presented. The method can be used to answer the following questions about a Western medicine disease: What TCM syndrome types are there among the patients with the disease? What is the prevalence of each syndrome type? What are the statistical characteristics of each syndrome type in terms of occurrence of symptoms? How can we determine the syndrome type(s) of a patient?

Submitted 24 February, 2016; v1 submitted 27 October, 2014; originally announced October 2014.

arXiv:1210.4883 [pdf]

A Model-Based Approach to Rounding in Spectral Clustering

Authors: Leonard K. M. Poon, April H. Liu, Tengfei Liu, Nevin Lianwen Zhang

Abstract: In spectral clustering, one defines a similarity matrix for a collection of data points, transforms the matrix to get the Laplacian matrix, finds the eigenvectors of the Laplacian matrix, and obtains a partition of the data using the leading eigenvectors. The last step is sometimes referred to as rounding, where one needs to decide how many leading eigenvectors to use, to determine the number of clusters, and to partition the data points. In this paper, we propose a novel method for rounding. The method differs from previous methods in three ways. First, we relax the assumption that the number of clusters equals the number of eigenvectors used. Second, when deciding the number of leading eigenvectors to use, we not only rely on information contained in the leading eigenvectors themselves, but also use subsequent eigenvectors. Third, our method is model-based and solves all the three subproblems of rounding using a class of graphical models called latent tree models. We evaluate our method on both synthetic and real-world data. The results show that our method works correctly in the ideal case where between-clusters similarity is 0, and degrades gracefully as one moves away from the ideal case.

Submitted 16 October, 2012; originally announced October 2012.

Comments: Appears in Proceedings of the Twenty-Eighth Conference on Uncertainty in Artificial Intelligence (UAI2012)

Report number: UAI-P-2012-PG-685-694
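
The generic pipeline summarized in the abstract above (similarity matrix, Laplacian, eigenvectors, rounding) can be illustrated with a minimal sketch. This is not the paper's method: the latent-tree-model rounding is replaced here by the simplest possible stand-in, thresholding one eigenvector at zero to get two clusters. The function name `spectral_bipartition`, the Gaussian kernel, the `sigma` parameter, and the toy points are all assumptions made for illustration.

```python
import numpy as np

def spectral_bipartition(points, sigma=1.0):
    """Split points into two groups with a basic spectral pipeline.

    Illustrative only: the rounding step is a sign threshold, not the
    model-based rounding proposed in the paper.
    """
    points = np.asarray(points, dtype=float)

    # 1. Similarity matrix: Gaussian kernel on squared Euclidean distances
    #    (an assumed choice; the abstract does not fix a kernel).
    diffs = points[:, None, :] - points[None, :, :]
    sq_dists = np.sum(diffs ** 2, axis=-1)
    similarity = np.exp(-sq_dists / (2.0 * sigma ** 2))
    np.fill_diagonal(similarity, 0.0)

    # 2. Unnormalized graph Laplacian L = D - W.
    laplacian = np.diag(similarity.sum(axis=1)) - similarity

    # 3. Eigenvectors. eigh returns eigenvalues in ascending order; for
    #    L = D - W the informative eigenvectors are those with the
    #    smallest eigenvalues, and column 1 is the Fiedler vector.
    _, eigvecs = np.linalg.eigh(laplacian)
    fiedler = eigvecs[:, 1]

    # 4. Rounding (stand-in): assign clusters by the sign of the entries.
    return fiedler > 0

# Hypothetical toy data: two small, well separated groups of points.
toy_points = [[0, 0], [0, 1], [1, 0], [4, 4], [4, 5], [5, 4]]
toy_labels = spectral_bipartition(toy_points, sigma=1.5)
print(toy_labels)
```

The sign of an eigenvector is arbitrary, so which group is labelled `True` can differ between runs or platforms; only the grouping itself is meaningful. The sketch also fixes the number of clusters at two in advance, which is exactly the kind of assumption the paper's rounding method aims to relax.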

arXiv:astro-ph/0302296 [pdf, ps, other]

doi 10.1086/382497

Chaotic Loss Cones, Black Hole Fueling and the M-Sigma Relation

Authors: David Merritt, M. Y. Poon

Abstract: In classical loss cone theory, stars are supplied to a central black hole via gravitational scattering onto low angular momentum orbits. Higher feeding rates are possible if the gravitational potential near the black hole is non-axisymmetric and the orbits are chaotic. Motivated by recently published, self-consistent models, we evaluate rates of stellar capture and disruption in triaxial nuclei. Rates are found to substantially exceed those in collisionally-resupplied loss cones, as long as an appreciable fraction of the orbits are centrophilic. The mass captured by a black hole after a given time in a steep nucleus scales as the fifth power of the velocity dispersion, and the accumulated mass in 10^10 yr is of the correct order to reproduce the M-sigma relation. Triaxiality can solve the "final parsec problem" of decaying black hole binaries by increasing the flux of stars into the binary's loss cone.

Submitted 13 August, 2003; v1 submitted 14 February, 2003; originally announced February 2003.

Comments: 29 pages, 6 figures

Report number: Rutgers Astrophysics Preprint Series No. 381

Journal ref: Astrophys.J. 606 (2004) 788-798

arXiv:astro-ph/0212581 [pdf, ps, other]

doi 10.1086/383190

A Self-Consistent Study of Triaxial Black-Hole Nuclei

Authors: M. Y. Poon, D. Merritt

Abstract: We construct models of triaxial galactic nuclei containing central black holes using the method of orbital superposition, then verify their stability by advancing N-body realizations of the models forward in time. We assume a power-law form for the stellar density, rho ~ 1/r and 1/r^2; these correspond approximately to the nuclear density profiles of bright and faint galaxies respectively. Equidensity surfaces are ellipsoids with fixed axis ratios. The central black hole is represented by a Newtonian point mass. We consider three triaxial shapes: almost prolate, almost oblate and maximally triaxial. Two kinds of orbital solution are attempted for each mass model: the first including only regular orbits, the second including chaotic orbits as well. We find that stable configurations exist in the maximally triaxial and nearly-oblate cases; however steady-state solutions in the nearly-prolate geometry could not be found. A large fraction of the mass, of order 50% or more, could be assigned to the chaotic orbits without inducing evolution. Our results demonstrate that triaxiality may persist even within the sphere of influence of the central black hole, and that chaotic orbits may constitute an important building block of galactic nuclei.

Submitted 30 December, 2002; originally announced December 2002.

Comments: 17 pages, 12 figures, uses emulateapj.sty

Journal ref: Astrophys.J. 606 (2004) 774-787

arXiv:astro-ph/0111020 [pdf, ps, other]

doi 10.1086/340395

Triaxial Black-Hole Nuclei

Authors: M. Y. Poon, David Merritt

Abstract: We demonstrate that the nuclei of galaxies containing supermassive black holes can be triaxial in shape. Schwarzschild's method was first used to construct self-consistent orbital superpositions representing nuclei with axis ratios of 1:0.79:0.5 and containing a central point mass representing a black hole. Two different density laws were considered, with power-law slopes of -1 and -2. We constructed two solutions for each power law: one containing only regular orbits and the other containing both regular and chaotic orbits. Monte-Carlo realizations of the models were then advanced in time using an N-body code to verify their stability. All four models were found to retain their triaxial shapes for many crossing times. The possibility that galactic nuclei may be triaxial complicates the interpretation of stellar-kinematical data from the centers of galaxies and may alter the inferred interaction rates between stars and supermassive black holes.

Submitted 3 April, 2002; v1 submitted 1 November, 2001; originally announced November 2001.

Comments: 4 pages, 4 postscript figures, uses emulateapj.sty

Report number: Rutgers Astrophysics Preprint Series No. 332

Journal ref: Astrophys.J. 568 (2002) L89

arXiv:astro-ph/0006447 [pdf, ps, other]

Orbital Dynamics of Triaxial Black-Hole Nuclei

Authors: M. Y. Poon, D. Merritt

Abstract: Orbital motion in triaxial nuclei with central point masses, representing supermassive black holes, is investigated. The stellar density is assumed to follow a power law, rho ~ 1/r^gamma, with gamma=1 or gamma=2. At low energies the motion is essentially regular; the major families of orbits are the tubes and the pyramids. Pyramid orbits are similar to box orbits but have their major elongation parallel to the short axis of the figure. A number of regular orbit families associated with resonances also exist, most prominently the banana orbits, which are also elongated with the short axis. At a radius where the enclosed stellar mass is a few times the black hole mass, the pyramid orbits become stochastic. The energy of transition to this "zone of chaos" is computed as a function of gamma and of the shape of the stellar figure; it occurs at lower energies in more elongated potentials. Our results suggest that supermassive black holes may place tight constraints on departures from triaxiality in galactic nuclei, both by limiting the allowed shapes of regular orbits and by inducing chaos.

Submitted 3 March, 2001; v1 submitted 30 June, 2000; originally announced June 2000.

Comments: Astrophysical Journal, Vol. 549, Number 1, Part 1, Page 192

Report number: Rutgers Astrophysics Preprint Series No. 276

Showing 1–21 of 21 results for author: Poon, M