-
Two-Stage Holistic and Contrastive Explanation of Image Classification
Authors:
Weiyan Xie,
Xiao-Hui Li,
Zhi Lin,
Leonard K. M. Poon,
Caleb Chen Cao,
Nevin L. Zhang
Abstract:
The need to explain the output of a deep neural network classifier is now widely recognized. While previous methods typically explain a single class in the output, we advocate explaining the whole output, which is a probability distribution over multiple classes. A whole-output explanation can help a human user gain an overall understanding of model behaviour instead of only one aspect of it. It can also provide a natural framework where one can examine the evidence used to discriminate between competing classes, and thereby obtain contrastive explanations. In this paper, we propose a contrastive whole-output explanation (CWOX) method for image classification, and evaluate it using quantitative metrics and through human subject studies. The source code of CWOX is available at https://github.com/vaynexie/CWOX.
Submitted 10 June, 2023;
originally announced June 2023.
-
SMPL: Simulated Industrial Manufacturing and Process Control Learning Environments
Authors:
Mohan Zhang,
Xiaozhou Wang,
Benjamin Decardi-Nelson,
Song Bo,
An Zhang,
**feng Liu,
Sile Tao,
Jiayi Cheng,
Xiaohong Liu,
DengDeng Yu,
Matthew Poon,
Animesh Garg
Abstract:
Traditional biological and pharmaceutical manufacturing plants are controlled by human workers or pre-defined thresholds. Modernized factories have advanced process control algorithms such as model predictive control (MPC). However, there is little exploration of applying deep reinforcement learning to control manufacturing plants. One of the reasons is the lack of high-fidelity simulations and standard APIs for benchmarking. To bridge this gap, we develop an easy-to-use library that includes five high-fidelity simulation environments: BeerFMTEnv, ReactorEnv, AtropineEnv, PenSimEnv and mAbEnv, which cover a wide range of manufacturing processes. We build these environments on published dynamics models. Furthermore, we benchmark online and offline, model-based and model-free reinforcement learning algorithms to provide comparisons for follow-up research.
Submitted 15 January, 2023; v1 submitted 17 June, 2022;
originally announced June 2022.
-
Two Candidate KH 15D-like Systems from the Zwicky Transient Facility
Authors:
Wei Zhu,
Klaus Bernhard,
Fei Dai,
Min Fang,
J. J. Zanazzi,
Weicheng Zang,
Subo Dong,
Franz-Josef Hambsch,
Tianjun Gan,
Zexuan Wu,
Michael Poon
Abstract:
KH 15D contains a circumbinary disk that is tilted relative to the orbital plane of the central binary. The precession of the disk and the orbital motion of the binary together produce rich phenomena in the photometric light curve. In this work, we present the discovery and preliminary analysis of two objects that resemble the key features of KH 15D from the Zwicky Transient Facility. These new objects, Bernhard-1 and Bernhard-2, show large-amplitude ($>1.5\,$mag), long-duration (more than tens of days), and periodic dimming events. A one-sided screen model is developed to model the photometric behaviour of these objects, the physical interpretation of which is a tilted, warped circumbinary disk occulting the inner binary. Changes in the object light curves suggest potential precession periods over timescales longer than 10 years. Additional photometric and spectroscopic observations are encouraged to better understand the nature of these interesting systems.
Submitted 20 June, 2022; v1 submitted 1 June, 2022;
originally announced June 2022.
-
Ultrafast disinfection of SARS-CoV-2 viruses
Authors:
Yang Xu,
Alex Wing Hong Chin,
Haosong Zhong,
Connie Kong Wai Lee,
Yi Chen,
Timothy Yee Him Chan,
Zhiyong Fan,
Molong Duan,
Leo Lit Man Poon,
Mitch Guijun Li
Abstract:
The wide use of surgical masks has proven effective in mitigating the spread of respiratory diseases such as COVID-19, alongside social distancing, vaccines, and other efforts. Newly reported variants, such as Delta and Omicron, spread faster than the initial strains, and people might become infected even by inhaling a lower viral load. More frequent sterilization of surgical masks is therefore needed to protect wearers. However, it is challenging to sterilize commodity surgical masks quickly and effectively. Herein, we report the sterilization of SARS-CoV-2 viruses within an ultra-short time while retaining mask performance. A silver thin film is coated on commercial polyimide film by physical vapor deposition and patterned by laser scribing to form a Joule heating electrode. A gold thin film is coated onto the opposite side of the device to promote the uniformity of the Joule heating through nano-heat transfer regulation. As a result, the surgical mask can be heated to the inactivation temperature within a short time and with high uniformity. Joule-heating the surgical mask at 90 °C for 3 minutes inactivated SARS-CoV-2 with an efficacy of 99.89%. Normal commodity surgical masks can thus be sterilized faster, more frequently, and more efficiently against SARS-CoV-2 viruses and new variants.
Submitted 17 April, 2022;
originally announced April 2022.
-
NaviAirway: a Bronchiole-sensitive Deep Learning-based Airway Segmentation Pipeline
Authors:
Andong Wang,
Terence Chi Chun Tam,
Ho Ming Poon,
Kun-Chang Yu,
Wei-Ning Lee
Abstract:
Airway segmentation is essential for chest CT image analysis. Different from natural image segmentation, which pursues high pixel-wise accuracy, airway segmentation focuses on topology. The task is challenging not only because of its complex tree-like structure but also the severe pixel imbalance among airway branches of different generations. To tackle the problems, we present a NaviAirway method which consists of a bronchiole-sensitive loss function for airway topology preservation and an iterative training strategy for accurate model learning across different airway generations. To supplement the features of airway branches learned by the model, we distill the knowledge from numerous unlabeled chest CT images in a teacher-student manner. Experimental results show that NaviAirway outperforms existing methods, particularly in the identification of higher-generation bronchioles and robustness to new CT scans. Moreover, NaviAirway is general enough to be combined with different backbone models to significantly improve their performance. NaviAirway can generate an airway roadmap for Navigation Bronchoscopy and can also be applied to other scenarios when segmenting fine and long tubular structures in biomedical images. The code is publicly available on https://github.com/AntonotnaWang/NaviAirway.
Submitted 16 June, 2023; v1 submitted 7 March, 2022;
originally announced March 2022.
-
Interpretable Machine Learning Classifiers for Brain Tumour Survival Prediction
Authors:
Colleen E. Charlton,
Michael Tin Chung Poon,
Paul M. Brennan,
Jacques D. Fleuriot
Abstract:
Prediction of survival in patients diagnosed with a brain tumour is challenging because of heterogeneous tumour behaviours and responses to treatment. Better estimations of prognosis would support treatment planning and patient support. Advances in machine learning have informed development of clinical predictive models, but their integration into clinical practice is almost non-existent. One reason for this is the lack of interpretability of models. In this paper, we use a novel brain tumour dataset to compare two interpretable rule list models against popular machine learning approaches for brain tumour survival prediction. All models are quantitatively evaluated using standard performance metrics. The rule lists are also qualitatively assessed for their interpretability and clinical utility. The interpretability of the black box machine learning models is evaluated using two post-hoc explanation techniques, LIME and SHAP. Our results show that the rule lists were only slightly outperformed by the black box models. We demonstrate that rule list algorithms produced simple decision lists that align with clinical expertise. By comparison, post-hoc interpretability methods applied to black box models may produce unreliable explanations of local model predictions. Model interpretability is essential for understanding differences in predictive performance and for integration into clinical practice.
Submitted 17 June, 2021;
originally announced June 2021.
-
A Systematic Review of Natural Language Processing Applied to Radiology Reports
Authors:
Arlene Casey,
Emma Davidson,
Michael Poon,
Hang Dong,
Daniel Duma,
Andreas Grivas,
Claire Grover,
Víctor Suárez-Paniagua,
Richard Tobin,
William Whiteley,
Honghan Wu,
Beatrice Alex
Abstract:
NLP has a significant role in advancing healthcare and has been found to be key in extracting structured information from radiology reports. Understanding recent developments in NLP application to radiology is of significance but recent reviews on this are limited. This study systematically assesses recent literature in NLP applied to radiology reports. Our automated literature search yields 4,799 results using automated filtering, metadata enriching steps and citation search combined with manual review. Our analysis is based on 21 variables including radiology characteristics, NLP methodology, performance, study, and clinical application characteristics. We present a comprehensive analysis of the 164 publications retrieved with each categorised into one of 6 clinical application categories. Deep learning use increases but conventional machine learning approaches are still prevalent. Deep learning remains challenged when data is scarce and there is little evidence of adoption into clinical practice. Despite 17% of studies reporting greater than 0.85 F1 scores, it is hard to comparatively evaluate these approaches given that most of them use different datasets. Only 14 studies made their data and 15 their code available with 10 externally validating results. Automated understanding of clinical narratives of the radiology reports has the potential to enhance the healthcare process but reproducibility and explainability of models are important if the domain is to move applications into clinical use. More could be done to share code enabling validation of methods on different institutional data and to reduce heterogeneity in reporting of study properties allowing inter-study comparisons. Our results have significance for researchers providing a systematic synthesis of existing work to build on, identify gaps, opportunities for collaboration and avoid duplication.
Submitted 18 February, 2021;
originally announced February 2021.
-
Constraining the Circumbinary Disk Tilt in the KH 15D system
Authors:
Michael Poon,
J. J. Zanazzi,
Wei Zhu
Abstract:
KH 15D is a system which consists of a young, eccentric binary, and a circumbinary disk which obscures the binary as the disk precesses. We develop a self-consistent model that provides a reasonable fit to the photometric variability that was observed in the KH 15D system over the past 60 years. Our model suggests that the circumbinary disk has an inner edge $r_{\rm in}\lesssim 1 \ {\rm au}$, an outer edge $r_{\rm out} \sim {\rm a \ few \ au}$, and that the disk is misaligned relative to the stellar binary by $\sim$5-16 degrees, with the inner edge more inclined than the outer edge. The differences between the inclinations (warp) and the longitudes of ascending node (twist) at the inner and outer edges of the disk are of order $\sim$10 degrees and $\sim$15 degrees, respectively. We also provide constraints on other properties of the disk, such as the precession period and surface density profile. Our work demonstrates the power of photometric data in constraining the physical properties of planet-forming circumbinary disks.
Submitted 22 March, 2021; v1 submitted 29 September, 2020;
originally announced September 2020.
-
Learning the Structure of Auto-Encoding Recommenders
Authors:
Farhan Khawar,
Leonard Kin Man Poon,
Nevin Lianwen Zhang
Abstract:
Autoencoder recommenders have recently shown state-of-the-art performance in the recommendation task due to their ability to model non-linear item relationships effectively. However, existing autoencoder recommenders use fully-connected neural network layers and do not employ structure learning. This can lead to inefficient training, especially when the data is sparse, as is commonly found in collaborative filtering. This in turn results in lower generalization ability and reduced performance. In this paper, we introduce structure learning for autoencoder recommenders by taking advantage of the inherent item groups present in the collaborative filtering domain. Due to the nature of items in general, we know that certain items are more related to each other than to other items. Based on this, we propose a method that first learns groups of related items and then uses this information to determine the connectivity structure of an auto-encoding neural network. This results in a network that is sparsely connected. This sparse structure can be viewed as a prior that guides the network training. Empirically, we demonstrate that the proposed structure learning enables the autoencoder to converge to a local optimum with a much smaller spectral norm and generalization error bound than the fully-connected network. The resultant sparse network considerably outperforms state-of-the-art methods such as Mult-VAE and Mult-DAE on multiple benchmark datasets, even when the same number of parameters and FLOPs are used. It also has better cold-start performance.
Submitted 18 August, 2020;
originally announced August 2020.
-
Handling Collocations in Hierarchical Latent Tree Analysis for Topic Modeling
Authors:
Leonard K. M. Poon,
Nevin L. Zhang,
Haoran Xie,
Gary Cheng
Abstract:
Topic modeling has been one of the most active research areas in machine learning in recent years. Hierarchical latent tree analysis (HLTA) has been recently proposed for hierarchical topic modeling and has shown superior performance over state-of-the-art methods. However, the models used in HLTA have a tree structure and cannot represent the different meanings of multiword expressions sharing the same word appropriately. Therefore, we propose a method for extracting and selecting collocations as a preprocessing step for HLTA. The selected collocations are replaced with single tokens in the bag-of-words model before running HLTA. Our empirical evaluation shows that the proposed method led to better performance of HLTA on three of the four data sets tested.
Submitted 10 July, 2020;
originally announced July 2020.
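The preprocessing step described in this abstract (replacing selected collocations with single tokens before building the bag-of-words model) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `merge_collocations` and `bag_of_words`, the restriction to two-word collocations, and the underscore joiner are all assumptions made for illustration, and how collocations are extracted and selected is not shown.

```python
from collections import Counter
from typing import List, Set, Tuple


def merge_collocations(tokens: List[str],
                       collocations: Set[Tuple[str, str]]) -> List[str]:
    """Replace each selected two-word collocation with one joined token.

    Hypothetical helper: scans left to right and merges greedily, so
    overlapping candidates are resolved in favour of the earlier pair.
    """
    merged: List[str] = []
    i = 0
    while i < len(tokens):
        pair = tuple(tokens[i:i + 2])
        if len(pair) == 2 and pair in collocations:
            merged.append("_".join(pair))  # joiner "_" is an assumption
            i += 2
        else:
            merged.append(tokens[i])
            i += 1
    return merged


def bag_of_words(tokens: List[str]) -> Counter:
    """Count token occurrences; word order is discarded."""
    return Counter(tokens)


doc = ["neural", "network", "training", "of", "a", "network"]
selected = {("neural", "network")}  # assumed output of a selection step
merged_doc = merge_collocations(doc, selected)
counts = bag_of_words(merged_doc)
```

In this sketch the standalone word "network" and the collocation "neural_network" end up as distinct entries in the bag of words, which is the kind of separation of meanings the abstract describes.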
-
Learning Latent Superstructures in Variational Autoencoders for Deep Multidimensional Clustering
Authors:
Xiaopeng Li,
Zhourong Chen,
Leonard K. M. Poon,
Nevin L. Zhang
Abstract:
We investigate a variant of variational autoencoders where there is a superstructure of discrete latent variables on top of the latent features. In general, our superstructure is a tree structure of multiple super latent variables and it is automatically learned from data. When there is only one latent variable in the superstructure, our model reduces to one that assumes the latent features to be generated from a Gaussian mixture model. We call our model the latent tree variational autoencoder (LTVAE). Whereas previous deep learning methods for clustering produce only one partition of data, LTVAE produces multiple partitions of data, each being given by one super latent variable. This is desirable because high dimensional data usually have many different natural facets and can be meaningfully partitioned in multiple ways.
Submitted 22 February, 2019; v1 submitted 14 March, 2018;
originally announced March 2018.
-
Latent Tree Analysis
Authors:
Nevin L. Zhang,
Leonard K. M. Poon
Abstract:
Latent tree analysis seeks to model the correlations among a set of random variables using a tree of latent variables. It was proposed as an improvement to latent class analysis --- a method widely used in social sciences and medicine to identify homogeneous subgroups in a population. It provides new and fruitful perspectives on a number of machine learning areas, including cluster analysis, topic detection, and deep probabilistic modeling. This paper gives an overview of the research on latent tree analysis and various ways it is used in practice.
Submitted 1 October, 2016;
originally announced October 2016.
-
Topic Browsing for Research Papers with Hierarchical Latent Tree Analysis
Authors:
Leonard K. M. Poon,
Nevin L. Zhang
Abstract:
Academic researchers often face a large collection of research papers in the literature. This problem may be even worse for postgraduate students who are new to a field and may not know where to start. To address this problem, we have developed an online catalog of research papers where the papers have been automatically categorized by a topic model. The catalog contains 7719 papers from the proceedings of two artificial intelligence conferences from 2000 to 2015. Rather than the commonly used Latent Dirichlet Allocation, we use a recently proposed method called hierarchical latent tree analysis for topic modeling. The resulting topic model contains a hierarchy of topics so that users can browse the topics from the top level to the bottom level. The topic model contains a manageable number of general topics at the top level and allows thousands of fine-grained topics at the bottom level. It can also detect topics that have emerged recently.
Submitted 28 September, 2016;
originally announced September 2016.
-
Latent Tree Models for Hierarchical Topic Detection
Authors:
Peixian Chen,
Nevin L. Zhang,
Tengfei Liu,
Leonard K. M. Poon,
Zhourong Chen,
Farhan Khawar
Abstract:
We present a novel method for hierarchical topic detection where topics are obtained by clustering documents in multiple ways. Specifically, we model document collections using a class of graphical models called hierarchical latent tree models (HLTMs). The variables at the bottom level of an HLTM are observed binary variables that represent the presence/absence of words in a document. The variables at other levels are binary latent variables, with those at the lowest latent level representing word co-occurrence patterns and those at higher levels representing co-occurrence of patterns at the level below. Each latent variable gives a soft partition of the documents, and document clusters in the partitions are interpreted as topics. Latent variables at high levels of the hierarchy capture long-range word co-occurrence patterns and hence give thematically more general topics, while those at low levels of the hierarchy capture short-range word co-occurrence patterns and give thematically more specific topics. Unlike LDA-based topic models, HLTMs do not refer to a document generation process and use word variables instead of token variables. They use a tree structure to model the relationships between topics and words, which is conducive to the discovery of meaningful topics and topic hierarchies.
Submitted 21 December, 2016; v1 submitted 21 May, 2016;
originally announced May 2016.
-
Progressive EM for Latent Tree Models and Hierarchical Topic Detection
Authors:
Peixian Chen,
Nevin L. Zhang,
Leonard K. M. Poon,
Zhourong Chen
Abstract:
Hierarchical latent tree analysis (HLTA) was recently proposed as a new method for topic detection. It differs fundamentally from the LDA-based methods in terms of topic definition, topic-document relationship, and learning method. It has been shown to discover significantly more coherent topics and better topic hierarchies. However, HLTA relies on the Expectation-Maximization (EM) algorithm for parameter estimation and hence is not efficient enough to deal with large datasets. In this paper, we propose a method to drastically speed up HLTA using a technique inspired by recent advances in the moments method. Empirical experiments show that our method greatly improves the efficiency of HLTA. It is as efficient as the state-of-the-art LDA-based method for hierarchical topic detection and finds substantially better topics and topic hierarchies.
Submitted 5 August, 2015;
originally announced August 2015.
-
A data-driven method for syndrome type identification and classification in traditional Chinese medicine
Authors:
Nevin L. Zhang,
Chen Fu,
Teng Fei Liu,
Bao Xin Chen,
Kin Man Poon,
Pei Xian Chen,
Yun Ling Zhang
Abstract:
Objective: The efficacy of traditional Chinese medicine (TCM) treatments for Western medicine (WM) diseases relies heavily on the proper classification of patients into TCM syndrome types. We develop a data-driven method for solving the classification problem, where syndrome types are identified and quantified based on patterns detected in unlabeled symptom survey data.
Method: Latent class analysis (LCA) has been applied in WM research to solve a similar problem, i.e., to identify subtypes of a patient population in the absence of a gold standard. A widely known weakness of LCA is that it makes an unrealistically strong independence assumption. We relax the assumption by first detecting symptom co-occurrence patterns from survey data and then using those patterns, instead of the symptoms, as features for LCA.
Results: The result of the investigation is a six-step method: data collection, symptom co-occurrence pattern discovery, pattern interpretation, syndrome identification, syndrome type identification, and syndrome type classification. A software package called Lantern is developed to support the application of the method. The method is illustrated using a data set on Vascular Mild Cognitive Impairment (VMCI).
Conclusions: A data-driven method for TCM syndrome identification and classification is presented. The method can be used to answer the following questions about a Western medicine disease: What TCM syndrome types are there among the patients with the disease? What is the prevalence of each syndrome type? What are the statistical characteristics of each syndrome type in terms of occurrence of symptoms? How can we determine the syndrome type(s) of a patient?
Submitted 24 February, 2016; v1 submitted 27 October, 2014;
originally announced October 2014.
-
A Model-Based Approach to Rounding in Spectral Clustering
Authors:
Leonard K. M. Poon,
April H. Liu,
Tengfei Liu,
Nevin Lianwen Zhang
Abstract:
In spectral clustering, one defines a similarity matrix for a collection of data points, transforms the matrix to get the Laplacian matrix, finds the eigenvectors of the Laplacian matrix, and obtains a partition of the data using the leading eigenvectors. The last step is sometimes referred to as rounding, where one needs to decide how many leading eigenvectors to use, to determine the number of clusters, and to partition the data points. In this paper, we propose a novel method for rounding. The method differs from previous methods in three ways. First, we relax the assumption that the number of clusters equals the number of eigenvectors used. Second, when deciding the number of leading eigenvectors to use, we not only rely on information contained in the leading eigenvectors themselves, but also use subsequent eigenvectors. Third, our method is model-based and solves all the three subproblems of rounding using a class of graphical models called latent tree models. We evaluate our method on both synthetic and real-world data. The results show that our method works correctly in the ideal case where between-clusters similarity is 0, and degrades gracefully as one moves away from the ideal case.
Submitted 16 October, 2012;
originally announced October 2012.
-
Chaotic Loss Cones, Black Hole Fueling and the M-Sigma Relation
Authors:
David Merritt,
M. Y. Poon
Abstract:
In classical loss cone theory, stars are supplied to a central black hole via gravitational scattering onto low angular momentum orbits. Higher feeding rates are possible if the gravitational potential near the black hole is non-axisymmetric and the orbits are chaotic. Motivated by recently published, self-consistent models, we evaluate rates of stellar capture and disruption in triaxial nuclei. Rates are found to substantially exceed those in collisionally-resupplied loss cones, as long as an appreciable fraction of the orbits are centrophilic. The mass captured by a black hole after a given time in a steep nucleus scales as the fifth power of the velocity dispersion, and the accumulated mass in 10^10 yr is of the correct order to reproduce the M-sigma relation. Triaxiality can solve the "final parsec problem" of decaying black hole binaries by increasing the flux of stars into the binary's loss cone.
Submitted 13 August, 2003; v1 submitted 14 February, 2003;
originally announced February 2003.
-
A Self-Consistent Study of Triaxial Black-Hole Nuclei
Authors:
M. Y. Poon,
D. Merritt
Abstract:
We construct models of triaxial galactic nuclei containing central black holes using the method of orbital superposition, then verify their stability by advancing N-body realizations of the models forward in time. We assume a power-law form for the stellar density, rho ~ 1/r and 1/r^2; these correspond approximately to the nuclear density profiles of bright and faint galaxies, respectively. Equidensity surfaces are ellipsoids with fixed axis ratios. The central black hole is represented by a Newtonian point mass. We consider three triaxial shapes: almost prolate, almost oblate and maximally triaxial. Two kinds of orbital solution are attempted for each mass model: the first including only regular orbits, the second including chaotic orbits as well. We find that stable configurations exist in the maximally triaxial and nearly oblate cases; however, steady-state solutions in the nearly prolate geometry could not be found. A large fraction of the mass, of order 50% or more, could be assigned to the chaotic orbits without inducing evolution. Our results demonstrate that triaxiality may persist even within the sphere of influence of the central black hole, and that chaotic orbits may constitute an important building block of galactic nuclei.
Submitted 30 December, 2002;
originally announced December 2002.
-
Triaxial Black-Hole Nuclei
Authors:
M. Y. Poon,
David Merritt
Abstract:
We demonstrate that the nuclei of galaxies containing supermassive black holes can be triaxial in shape. Schwarzschild's method was first used to construct self-consistent orbital superpositions representing nuclei with axis ratios of 1:0.79:0.5 and containing a central point mass representing a black hole. Two different density laws were considered, with power-law slopes of -1 and -2. We constructed two solutions for each power law: one containing only regular orbits and the other containing both regular and chaotic orbits. Monte-Carlo realizations of the models were then advanced in time using an N-body code to verify their stability. All four models were found to retain their triaxial shapes for many crossing times. The possibility that galactic nuclei may be triaxial complicates the interpretation of stellar-kinematical data from the centers of galaxies and may alter the inferred interaction rates between stars and supermassive black holes.
Submitted 3 April, 2002; v1 submitted 1 November, 2001;
originally announced November 2001.
-
Orbital Dynamics of Triaxial Black-Hole Nuclei
Authors:
M. Y. Poon,
D. Merritt
Abstract:
Orbital motion in triaxial nuclei with central point masses, representing supermassive black holes, is investigated. The stellar density is assumed to follow a power law, rho ~ 1/r^gamma, with gamma=1 or gamma=2. At low energies the motion is essentially regular; the major families of orbits are the tubes and the pyramids. Pyramid orbits are similar to box orbits but have their major elongation parallel to the short axis of the figure. A number of regular orbit families associated with resonances also exist, most prominently the banana orbits, which are also elongated parallel to the short axis. At a radius where the enclosed stellar mass is a few times the black hole mass, the pyramid orbits become stochastic. The energy of transition to this "zone of chaos" is computed as a function of gamma and of the shape of the stellar figure; it occurs at lower energies in more elongated potentials. Our results suggest that supermassive black holes may place tight constraints on departures from axisymmetry in galactic nuclei, both by limiting the allowed shapes of regular orbits and by inducing chaos.
Submitted 3 March, 2001; v1 submitted 30 June, 2000;
originally announced June 2000.