Search | arXiv e-print repository

arXiv:2406.05349

Blurry-Consistency Segmentation Framework with Selective Stacking on Differential Interference Contrast 3D Breast Cancer Spheroid

Authors: Thanh-Huy Nguyen, Thi Kim Ngan Ngo, Mai Anh Vu, Ting-Yuan Tu

Abstract: The ability of three-dimensional (3D) spheroid modeling to study the invasive behavior of breast cancer cells has drawn increased attention. The deep learning-based image processing framework is very effective at speeding up the cell morphological analysis process. Out-of-focus photos taken while capturing 3D cells under several z-slices, however, could negatively impact the deep learning model. In this work, we created a new algorithm to handle blurry images while preserving the stacked image quality. Furthermore, we proposed a unique training architecture that leverages consistency training to help reduce the bias of the model when dense-slice stacking is applied. Additionally, the model's stability is increased under the sparse-slice stacking effect by utilizing the self-training approach. The new blurring stacking technique and training flow are combined with the suggested architecture and self-training mechanism to provide an innovative yet easy-to-use framework. Our methods produced noteworthy experimental outcomes in terms of both quantitative and qualitative aspects.

Submitted 8 June, 2024; originally announced June 2024.

arXiv:2305.06044

Correlation visualization under missing values: a comparison between imputation and direct parameter estimation methods

Authors: Nhat-Hao Pham, Khanh-Linh Vo, Mai Anh Vu, Thu Nguyen, Michael A. Riegler, Pål Halvorsen, Binh T. Nguyen

Abstract: Correlation matrix visualization is essential for understanding the relationships between variables in a dataset, but missing data can pose a significant challenge in estimating correlation coefficients. In this paper, we compare the effects of various missing data methods on the correlation plot, focusing on two common missing patterns: random and monotone. We aim to provide practical strategies and recommendations for researchers and practitioners in creating and analyzing the correlation plot. Our experimental results suggest that while imputation is commonly used for missing data, using imputed data for plotting the correlation matrix may lead to a significantly misleading inference of the relation between the features. We recommend using DPER, a direct parameter estimation approach, for plotting the correlation matrix based on its performance in the experiments.

Submitted 5 September, 2023; v1 submitted 10 May, 2023; originally announced May 2023.

arXiv:2305.06042

Blockwise Principal Component Analysis for monotone missing data imputation and dimensionality reduction

Authors: Tu T. Do, Mai Anh Vu, Tuan L. Vo, Hoang Thien Ly, Thu Nguyen, Steven A. Hicks, Michael A. Riegler, Pål Halvorsen, Binh T. Nguyen

Abstract: Monotone missing data is a common problem in data analysis. However, imputation combined with dimensionality reduction can be computationally expensive, especially with the increasing size of datasets. To address this issue, we propose a Blockwise principal component analysis Imputation (BPI) framework for dimensionality reduction and imputation of monotone missing data. The framework conducts Principal Component Analysis (PCA) on the observed part of each monotone block of the data and then imputes on merging the obtained principal components using a chosen imputation technique. BPI can work with various imputation techniques and can significantly reduce imputation time compared to conducting dimensionality reduction after imputation. This makes it a practical and efficient approach for large datasets with monotone missing data. Our experiments validate the improvement in speed. In addition, our experiments also show that while applying MICE imputation directly on missing data may not yield convergence, applying BPI with MICE for the data may lead to convergence.

Submitted 10 January, 2024; v1 submitted 10 May, 2023; originally announced May 2023.

arXiv:2302.00911

Conditional expectation with regularization for missing data imputation

Authors: Mai Anh Vu, Thu Nguyen, Tu T. Do, Nhan Phan, Nitesh V. Chawla, Pål Halvorsen, Michael A. Riegler, Binh T. Nguyen

Abstract: Missing data frequently occurs in datasets across various domains, such as medicine, sports, and finance. In many cases, to enable proper and reliable analyses of such data, the missing values are often imputed, and it is necessary that the method used has a low root mean square error (RMSE) between the imputed and the true values. In addition, for some critical applications, it is also often a requirement that the imputation method is scalable and the logic behind the imputation is explainable, which is especially difficult for complex methods that are, for example, based on deep learning. Based on these considerations, we propose a new algorithm named "conditional Distribution-based Imputation of Missing Values with Regularization" (DIMV). DIMV operates by determining the conditional distribution of a feature that has missing entries, using the information from the fully observed features as a basis.
As will be illustrated via experiments in the paper, DIMV (i) gives a low RMSE for the imputed values compared to state-of-the-art methods; (ii) is fast and scalable; (iii) is explainable as coefficients in a regression model, allowing reliable and trustworthy analysis, which makes it a suitable choice for critical domains where understanding is important, such as in medical fields and finance; (iv) can provide an approximated confidence region for the missing values in a given sample; (v) is suitable for both small and large scale data; (vi) in many scenarios, does not require a huge number of parameters as deep learning approaches do; (vii) handles multicollinearity in imputation effectively; and (viii) is robust to the normality assumption that its theoretical grounds rely on.

Submitted 11 September, 2023; v1 submitted 2 February, 2023; originally announced February 2023.

arXiv:2104.02983

Optimal fire allocation in a combat model of mixed NCW type

Authors: My A. Vu, Nam H. Nguyen, Hanh Le T. Nguyen, Anh N. Ta, Mong H. Nguyen

Abstract: In this work, we introduce a nonlinear Lanchester model of NCW-type and study the problem of finding the optimal fire allocation for this model. A Blue party $B$ fights against a Red party consisting of $A$ and $R$, where $A$ is an independent force and $R$ fights with support from a supply unit $N$. A battle may consist of several stages, but we consider the problem of finding the optimal fire allocation for $B$ in the first stage only. An optimal fire allocation is a set of three non-negative numbers whose sum equals one, such that the remaining force of $B$ is maximal at every instant. In order to tackle this problem, we introduce the notion of \textit{threatening rates}, which are computed for $A, R, N$ at the beginning of the battle. Numerical illustrations are presented to justify the theoretical findings.

Submitted 7 April, 2021; originally announced April 2021.

arXiv:2008.05250

Optimizing fire allocation in a NCW-type model

Authors: Nam Hong Nguyen, My Anh Vu, Dinh Van Bui, Anh Ngoc Ta, Manh Duc Hy

Abstract: In this paper, we introduce a non-linear Lanchester model of NCW-type and investigate an optimization problem for this model, where only the Red force is supplied by several supply agents. Optimal fire allocation of the Blue force is sought in the form of a piece-wise constant function of time. A threatening rate is computed for the Red force and each of its supply agents at the beginning of each stage of the combat. These rates can be used to derive the optimal decision for the Blue force: whether to focus its firepower on the Red force itself or on one of its supply agents. This optimal fire allocation is derived and proved by considering an optimization problem on the number of Blue force troops. Numerical experiments are included to demonstrate the theoretical results.

Submitted 12 August, 2020; originally announced August 2020.

Comments: 6 pages on NCW-type model

arXiv:1809.01564

Traffic Density Estimation using a Convolutional Neural Network

Authors: Julian Nubert, Nicholas Giai Truong, Abel Lim, Herbert Ilhan Tanujaya, Leah Lim, Mai Anh Vu

Abstract: The goal of this project is to introduce and present a machine learning application that aims to improve the quality of life of people in Singapore. In particular, we investigate the use of machine learning solutions to tackle the problem of traffic congestion in Singapore. In layman's terms, we seek to make Singapore (or any other city) a smoother place. To accomplish this aim, we present an end-to-end system comprising (1) a traffic density estimation algorithm at traffic lights/junctions and (2) a suitable traffic signal control algorithm that makes use of the density information for better traffic control. Traffic density estimates can be obtained from traffic junction images using various machine learning techniques (combined with CV tools). After research into various advanced machine learning methods, we decided on convolutional neural networks (CNNs). We conducted experiments on our algorithms, using the publicly available traffic camera dataset published by the Land Transport Authority (LTA) to demonstrate the feasibility of this approach. With these traffic density estimates, different traffic algorithms can be applied to minimize congestion at traffic junctions in general.

Submitted 5 September, 2018; originally announced September 2018.

Comments: Machine Learning Project National University of Singapore. 6 pages, 5 figures

Showing 1–7 of 7 results for author: Vu, M A