Search | arXiv e-print repository

arXiv:2405.11922 [pdf, other]

Effective Clustering on Large Attributed Bipartite Graphs

Authors: Renchi Yang, Yidu Wu, Xiaoyang Lin, Qichen Wang, Tsz Nam Chan, Jieming Shi

Abstract: Attributed bipartite graphs (ABGs) are an expressive data model for describing the interactions between two sets of heterogeneous nodes that are associated with rich attributes, such as customer-product purchase networks and author-paper authorship graphs. Partitioning the target node set in such graphs into k disjoint clusters (referred to as k-ABGC) finds widespread use in various domains, inclu… ▽ More Attributed bipartite graphs (ABGs) are an expressive data model for describing the interactions between two sets of heterogeneous nodes that are associated with rich attributes, such as customer-product purchase networks and author-paper authorship graphs. Partitioning the target node set in such graphs into k disjoint clusters (referred to as k-ABGC) finds widespread use in various domains, including social network analysis, recommendation systems, information retrieval, and bioinformatics. However, the majority of existing solutions towards k-ABGC either overlook attribute information or fail to capture bipartite graph structures accurately, engendering severely compromised result quality. The severity of these issues is accentuated in real ABGs, which often encompass millions of nodes and a sheer volume of attribute data, rendering effective k-ABGC over such graphs highly challenging. In this paper, we propose TPO, an effective and efficient approach to k-ABGC that achieves superb clustering performance on multiple real datasets. TPO obtains high clustering quality through two major contributions: (i) a novel formulation and transformation of the k-ABGC problem based on multi-scale attribute affinity specialized for capturing attribute affinities between nodes with the consideration of their multi-hop connections in ABGs, and (ii) a highly efficient solver that includes a suite of carefully-crafted optimizations for sidestep** explicit affinity matrix construction and facilitating faster convergence. Extensive experiments, comparing TPO against 19 baselines over 5 real ABGs, showcase the superior clustering quality of TPO measured against ground-truth labels. Moreover, compared to the state of the arts, TPO is often more than 40x faster over both small and large ABGs. △ Less

Submitted 20 May, 2024; originally announced May 2024.

Comments: The technical report for the paper was accepted to KDD 2024. 14 pages

arXiv:2307.12858 [pdf, other]

Treatment Outcome Prediction for Intracerebral Hemorrhage via Generative Prognostic Model with Imaging and Tabular Data

Authors: Wenao Ma, Cheng Chen, Jill Abrigo, Calvin Hoi-Kwan Mak, Yuqi Gong, Nga Yan Chan, Chu Han, Zaiyi Liu, Qi Dou

Abstract: Intracerebral hemorrhage (ICH) is the second most common and deadliest form of stroke. Despite medical advances, predicting treat ment outcomes for ICH remains a challenge. This paper proposes a novel prognostic model that utilizes both imaging and tabular data to predict treatment outcome for ICH. Our model is trained on observational data collected from non-randomized controlled trials, providin… ▽ More Intracerebral hemorrhage (ICH) is the second most common and deadliest form of stroke. Despite medical advances, predicting treat ment outcomes for ICH remains a challenge. This paper proposes a novel prognostic model that utilizes both imaging and tabular data to predict treatment outcome for ICH. Our model is trained on observational data collected from non-randomized controlled trials, providing reliable predictions of treatment success. Specifically, we propose to employ a variational autoencoder model to generate a low-dimensional prognostic score, which can effectively address the selection bias resulting from the non-randomized controlled trials. Importantly, we develop a variational distributions combination module that combines the information from imaging data, non-imaging clinical data, and treatment assignment to accurately generate the prognostic score. We conducted extensive experiments on a real-world clinical dataset of intracerebral hemorrhage. Our proposed method demonstrates a substantial improvement in treatment outcome prediction compared to existing state-of-the-art approaches. Code is available at https://github.com/med-air/TOP-GPM △ Less

Submitted 24 July, 2023; originally announced July 2023.

arXiv:2305.07594 [pdf, other]

Opti Code Pro: A Heuristic Search-based Approach to Code Refactoring

Authors: Sourena Khanzadeh, Samad Alias Nyein Chan, Richard Valenzano, Manar Alalfi

Abstract: This paper presents an approach that evaluates best-first search methods to code refactoring. The motivation for code refactoring could be to improve the design, structure, or implementation of an existing program without changing its functionality. To solve a very specific problem of coupling and cohesion, we propose using heuristic search-based techniques on an approximation of the full code ref… ▽ More This paper presents an approach that evaluates best-first search methods to code refactoring. The motivation for code refactoring could be to improve the design, structure, or implementation of an existing program without changing its functionality. To solve a very specific problem of coupling and cohesion, we propose using heuristic search-based techniques on an approximation of the full code refactoring problem, to guide the refactoring process toward solutions that have high cohesion and low coupling. We evaluated our approach by providing demonstrative examples of the effectiveness of this approach on random state problems and created a tool to implement the algorithm on Java projects. △ Less

Submitted 12 May, 2023; originally announced May 2023.

Comments: 12 pages, 2 figures

arXiv:2301.00409 [pdf, other]

Diffusion Model based Semi-supervised Learning on Brain Hemorrhage Images for Efficient Midline Shift Quantification

Authors: Shizhan Gong, Cheng Chen, Yuqi Gong, Nga Yan Chan, Wenao Ma, Calvin Hoi-Kwan Mak, Jill Abrigo, Qi Dou

Abstract: Brain midline shift (MLS) is one of the most critical factors to be considered for clinical diagnosis and treatment decision-making for intracranial hemorrhage. Existing computational methods on MLS quantification not only require intensive labeling in millimeter-level measurement but also suffer from poor performance due to their dependence on specific landmarks or simplified anatomical assumptio… ▽ More Brain midline shift (MLS) is one of the most critical factors to be considered for clinical diagnosis and treatment decision-making for intracranial hemorrhage. Existing computational methods on MLS quantification not only require intensive labeling in millimeter-level measurement but also suffer from poor performance due to their dependence on specific landmarks or simplified anatomical assumptions. In this paper, we propose a novel semi-supervised framework to accurately measure the scale of MLS from head CT scans. We formulate the MLS measurement task as a deformation estimation problem and solve it using a few MLS slices with sparse labels. Meanwhile, with the help of diffusion models, we are able to use a great number of unlabeled MLS data and 2793 non-MLS cases for representation learning and regularization. The extracted representation reflects how the image is different from a non-MLS image and regularization serves an important role in the sparse-to-dense refinement of the deformation field. Our experiment on a real clinical brain hemorrhage dataset has achieved state-of-the-art performance and can generate interpretable deformation fields. △ Less

Submitted 1 January, 2023; originally announced January 2023.

Comments: 12 pages, 5 figures

arXiv:2107.06211 [pdf, other]

doi 10.1109/TIP.2022.3160070

Attention-Guided Progressive Neural Texture Fusion for High Dynamic Range Image Restoration

Authors: Jie Chen, Zaifeng Yang, Tsz Nam Chan, Hui Li, Junhui Hou, Lap-Pui Chau

Abstract: High Dynamic Range (HDR) imaging via multi-exposure fusion is an important task for most modern imaging platforms. In spite of recent developments in both hardware and algorithm innovations, challenges remain over content association ambiguities caused by saturation, motion, and various artifacts introduced during multi-exposure fusion such as ghosting, noise, and blur. In this work, we propose an… ▽ More High Dynamic Range (HDR) imaging via multi-exposure fusion is an important task for most modern imaging platforms. In spite of recent developments in both hardware and algorithm innovations, challenges remain over content association ambiguities caused by saturation, motion, and various artifacts introduced during multi-exposure fusion such as ghosting, noise, and blur. In this work, we propose an Attention-guided Progressive Neural Texture Fusion (APNT-Fusion) HDR restoration model which aims to address these issues within one framework. An efficient two-stream structure is proposed which separately focuses on texture feature transfer over saturated regions and multi-exposure tonal and texture feature fusion. A neural feature transfer mechanism is proposed which establishes spatial correspondence between different exposures based on multi-scale VGG features in the masked saturated HDR domain for discriminative contextual clues over the ambiguous image areas. A progressive texture blending module is designed to blend the encoded two-stream features in a multi-scale and progressive manner. In addition, we introduce several novel attention mechanisms, i.e., the motion attention module detects and suppresses the content discrepancies among the reference images; the saturation attention module facilitates differentiating the misalignment caused by saturation from those caused by motion; and the scale attention module ensures texture blending consistency between different coder/decoder scales. We carry out comprehensive qualitative and quantitative evaluations and ablation studies, which validate that these novel modules work coherently under the same framework and outperform state-of-the-art methods. △ Less

Submitted 13 July, 2021; originally announced July 2021.

arXiv:2101.00474 [pdf, other]

Securing Isosceles Triangular Formations under Heterogeneous Sensing and Mixed Constraints

Authors: Nelson P. K. Chan, Bayu Jayawardhana, Hector Garcia de Marina

Abstract: This paper focuses on securing a triangular shape (up to translation) for a team of three mobile robots that uses heterogeneous sensing mechanism. Based on the available local information, each robot employs the popular gradient-based control law to attain the assigned individual task(s). In the current work, robots are assigned either distance and signed area task(s) or bearing task(s). We provid… ▽ More This paper focuses on securing a triangular shape (up to translation) for a team of three mobile robots that uses heterogeneous sensing mechanism. Based on the available local information, each robot employs the popular gradient-based control law to attain the assigned individual task(s). In the current work, robots are assigned either distance and signed area task(s) or bearing task(s). We provide a sufficient condition on the gain ratio $R_{\text{Ad}}$ between the signed area and the distance control term such that the desired formation shape, an isosceles triangle, is reached from all feasible starting positions. Numerical simulations are provided to support the theoretical analyses. △ Less

Submitted 2 January, 2021; originally announced January 2021.

Comments: 13 pages, submitted for journal publication

MSC Class: 93D99 ACM Class: I.2.9

arXiv:2007.06677 [pdf, ps, other]

Gradient Descent over Metagrammars for Syntax-Guided Synthesis

Authors: Nicolas Chan, Elizabeth Polgreen, Sanjit A. Seshia

Abstract: The performance of a syntax-guided synthesis algorithm is highly dependent on the provision of a good syntactic template, or grammar. Provision of such a template is often left to the user to do manually, though in the absence of such a grammar, state-of-the-art solvers will provide their own default grammar, which is dependent on the signature of the target program to be sythesized. In this work,… ▽ More The performance of a syntax-guided synthesis algorithm is highly dependent on the provision of a good syntactic template, or grammar. Provision of such a template is often left to the user to do manually, though in the absence of such a grammar, state-of-the-art solvers will provide their own default grammar, which is dependent on the signature of the target program to be sythesized. In this work, we speculate this default grammar could be improved upon substantially. We build sets of rules, or metagrammars, for constructing grammars, and perform a gradient descent over these metagrammars aiming to find a metagrammar which solves more benchmarks and on average faster. We show the resulting metagrammar enables CVC4 to solve 26% more benchmarks than the default grammar within a 300s time-out, and that metagrammars learnt from tens of benchmarks generalize to performance on 100s of benchmarks. △ Less

Submitted 16 July, 2020; v1 submitted 13 July, 2020; originally announced July 2020.

Comments: 5 pages, SYNT 2020

arXiv:2006.02518 [pdf, other]

Autonomous Vehicle Benchmarking using Unbiased Metrics

Authors: David Paz, Po-jung Lai, Nathan Chan, Yuqing Jiang, Henrik I. Christensen

Abstract: With the recent development of autonomous vehicle technology, there have been active efforts on the deployment of this technology at different scales that include urban and highway driving. While many of the prototypes showcased have been shown to operate under specific cases, little effort has been made to better understand their shortcomings and generalizability to new areas. Distance, uptime an… ▽ More With the recent development of autonomous vehicle technology, there have been active efforts on the deployment of this technology at different scales that include urban and highway driving. While many of the prototypes showcased have been shown to operate under specific cases, little effort has been made to better understand their shortcomings and generalizability to new areas. Distance, uptime and number of manual disengagements performed during autonomous driving provide a high-level idea on the performance of an autonomous system but without proper data normalization, testing location information, and the number of vehicles involved in testing, the disengagement reports alone do not fully encompass system performance and robustness. Thus, in this study a complete set of metrics are applied for benchmarking autonomous vehicle systems in a variety of scenarios that can be extended for comparison with human drivers and other autonomous vehicle systems. These metrics have been used to benchmark UC San Diego's autonomous vehicle platforms during early deployments for micro-transit and autonomous mail delivery applications. △ Less

Submitted 11 September, 2020; v1 submitted 3 June, 2020; originally announced June 2020.

Comments: 6 pages, 7 figures, IROS 2020

arXiv:2005.04694 [pdf, other]

doi 10.1109/LCSYS.2020.3000061

Angle-Constrained Formation Control for Circular Mobile Robots

Authors: Nelson P. K. Chan, Bayu Jayawardhana, Hector Garcia de Marina

Abstract: In this letter, we investigate the formation control problem of mobile robots moving in the plane where, instead of assuming robots to be simple points, each robot is assumed to have the form of a disk with equal radius. Based on interior angle measurements of the neighboring robots' disk, which can be obtained from low-cost vision sensors, we propose a gradient-based distributed control law and s… ▽ More In this letter, we investigate the formation control problem of mobile robots moving in the plane where, instead of assuming robots to be simple points, each robot is assumed to have the form of a disk with equal radius. Based on interior angle measurements of the neighboring robots' disk, which can be obtained from low-cost vision sensors, we propose a gradient-based distributed control law and show the exponential convergence property of the associated error system. By construction, the proposed control law has the appealing property of ensuring collision avoidance between neighboring robots. We also present simulation results for {a team} of four circular mobile robots forming a rectangular shape. △ Less

Submitted 10 May, 2020; originally announced May 2020.

Comments: 6 pages, submitted to the 59th IEEE Conf. Dec. Control and IEEE Control Systems Letters

MSC Class: 34H05; 70E60; 93C10; 93C85

arXiv:2003.08031 [pdf, other]

PolyFit: Polynomial-based Indexing Approach for Fast Approximate Range Aggregate Queries

Authors: Zhe Li, Tsz Nam Chan, Man Lung Yiu, Christian S. Jensen

Abstract: Range aggregate queries find frequent application in data analytics. In some use cases, approximate results are preferred over accurate results if they can be computed rapidly and satisfy approximation guarantees. Inspired by a recent indexing approach, we provide means of representing a discrete point data set by continuous functions that can then serve as compact index structures. More specifica… ▽ More Range aggregate queries find frequent application in data analytics. In some use cases, approximate results are preferred over accurate results if they can be computed rapidly and satisfy approximation guarantees. Inspired by a recent indexing approach, we provide means of representing a discrete point data set by continuous functions that can then serve as compact index structures. More specifically, we develop a polynomial-based indexing approach, called PolyFit, for processing approximate range aggregate queries. PolyFit is capable of supporting multiple types of range aggregate queries, including COUNT, SUM, MIN and MAX aggregates, with guaranteed absolute and relative error bounds. Experiment results show that PolyFit is faster and more accurate and compact than existing learned index structures. △ Less

Submitted 10 February, 2021; v1 submitted 17 March, 2020; originally announced March 2020.

Comments: 13 pages

arXiv:1907.06688 [pdf, ps, other]

Matrices of optimal tree-depth and a row-invariant parameterized algorithm for integer programming

Authors: Timothy F. N. Chan, Jacob W. Cooper, Martin Koutecky, Daniel Kral, Kristyna Pekarkova

Abstract: A long line of research on fixed parameter tractability of integer programming culminated with showing that integer programs with n variables and a constraint matrix with dual tree-depth d and largest entry D are solvable in time g(d,D)poly(n) for some function g. However, the dual tree-depth of a constraint matrix is not preserved by row operations, i.e., a given integer program can be equivalent… ▽ More A long line of research on fixed parameter tractability of integer programming culminated with showing that integer programs with n variables and a constraint matrix with dual tree-depth d and largest entry D are solvable in time g(d,D)poly(n) for some function g. However, the dual tree-depth of a constraint matrix is not preserved by row operations, i.e., a given integer program can be equivalent to another with a smaller dual tree-depth, and thus does not reflect its geometric structure. We prove that the minimum dual tree-depth of a row-equivalent matrix is equal to the branch-depth of the matroid defined by the columns of the matrix. We design a fixed parameter algorithm for computing branch-depth of matroids represented over a finite field and a fixed parameter algorithm for computing a row-equivalent matrix with minimum dual tree-depth. Finally, we use these results to obtain an algorithm for integer programming running in time g(d*,D)poly(n) where d* is the branch-depth of the constraint matrix; the branch-depth cannot be replaced by the more permissive notion of branch-width. △ Less

Submitted 31 January, 2022; v1 submitted 15 July, 2019; originally announced July 2019.

Comments: Full version. 48 pages, 7 figures

arXiv:1710.08632 [pdf, other]

Distributed estimation from relative measurements of heterogeneous and uncertain quality

Authors: Chiara Ravazzi, Nelson P. K. Chan, Paolo Frasca

Abstract: This paper studies the problem of estimation from relative measurements in a graph, in which a vector indexed over the nodes has to be reconstructed from pairwise measurements of differences between its components associated to nodes connected by an edge. In order to model heterogeneity and uncertainty of the measurements, we assume them to be affected by additive noise distributed according to a… ▽ More This paper studies the problem of estimation from relative measurements in a graph, in which a vector indexed over the nodes has to be reconstructed from pairwise measurements of differences between its components associated to nodes connected by an edge. In order to model heterogeneity and uncertainty of the measurements, we assume them to be affected by additive noise distributed according to a Gaussian mixture. In this original setup, we formulate the problem of computing the Maximum-Likelihood (ML) estimates and we design two novel algorithms, based on Least Squares regression and Expectation-Maximization (EM). The first algorithm (LS- EM) is centralized and performs the estimation from relative measurements, the soft classification of the measurements, and the estimation of the noise parameters. The second algorithm (Distributed LS-EM) is distributed and performs estimation and soft classification of the measurements, but requires the knowledge of the noise parameters. We provide rigorous proofs of convergence of both algorithms and we present numerical experiments to evaluate and compare their performance with classical solutions. The experiments show the robustness of the proposed methods against different kinds of noise and, for the Distributed LS-EM, against errors in the knowledge of noise parameters. △ Less

Submitted 26 July, 2018; v1 submitted 24 October, 2017; originally announced October 2017.

Comments: Submitted to IEEE transactions

Showing 1–12 of 12 results for author: Chan, N