-
SarcNet: A Novel AI-based Framework to Automatically Analyze and Score Sarcomere Organizations in Fluorescently Tagged hiPSC-CMs
Authors:
Huyen Le,
Khiet Dang,
Tien Lai,
Nhung Nguyen,
Mai Tran,
Hieu Pham
Abstract:
Quantifying sarcomere structure organization in human-induced pluripotent stem cell-derived cardiomyocytes (hiPSC-CMs) is crucial for understanding cardiac disease pathology, improving drug screening, and advancing regenerative medicine. Traditional methods, such as manual annotation and Fourier transform analysis, are labor-intensive, error-prone, and lack high-throughput capabilities. In this st…
▽ More
Quantifying sarcomere structure organization in human-induced pluripotent stem cell-derived cardiomyocytes (hiPSC-CMs) is crucial for understanding cardiac disease pathology, improving drug screening, and advancing regenerative medicine. Traditional methods, such as manual annotation and Fourier transform analysis, are labor-intensive, error-prone, and lack high-throughput capabilities. In this study, we present a novel deep learning-based framework that leverages cell images and integrates cell features to automatically evaluate the sarcomere structure of hiPSC-CMs from the onset of differentiation. This framework overcomes the limitations of traditional methods through automated, high-throughput analysis, providing consistent, reliable results while accurately detecting complex sarcomere patterns across diverse samples. The proposed framework contains the SarcNet, a linear layers-added ResNet-18 module, to output a continuous score ranging from one to five that captures the level of sarcomere structure organization. It is trained and validated on an open-source dataset of hiPSC-CMs images with the endogenously GFP-tagged alpha-actinin-2 structure developed by the Allen Institute for Cell Science (AICS). SarcNet achieves a Spearman correlation of 0.831 with expert evaluations, demonstrating superior performance and an improvement of 0.075 over the current state-of-the-art approach, which uses Linear Regression. Our results also show a consistent pattern of increasing organization from day 18 to day 32 of differentiation, aligning with expert evaluations. By integrating the quantitative features calculated directly from the images with the visual features learned during the deep learning model, our framework offers a more comprehensive and accurate assessment, thereby enhancing the further utility of hiPSC-CMs in medical research and therapy development.
△ Less
Submitted 28 May, 2024;
originally announced May 2024.
-
Can we Defend Against the Unknown? An Empirical Study About Threshold Selection for Neural Network Monitoring
Authors:
Khoi Tran Dang,
Kevin Delmas,
Jérémie Guiochet,
Joris Guérin
Abstract:
With the increasing use of neural networks in critical systems, runtime monitoring becomes essential to reject unsafe predictions during inference. Various techniques have emerged to establish rejection scores that maximize the separability between the distributions of safe and unsafe predictions. The efficacy of these approaches is mostly evaluated using threshold-agnostic metrics, such as the ar…
▽ More
With the increasing use of neural networks in critical systems, runtime monitoring becomes essential to reject unsafe predictions during inference. Various techniques have emerged to establish rejection scores that maximize the separability between the distributions of safe and unsafe predictions. The efficacy of these approaches is mostly evaluated using threshold-agnostic metrics, such as the area under the receiver operating characteristic curve. However, in real-world applications, an effective monitor also requires identifying a good threshold to transform these scores into meaningful binary decisions. Despite the pivotal importance of threshold optimization, this problem has received little attention. A few studies touch upon this question, but they typically assume that the runtime data distribution mirrors the training distribution, which is a strong assumption as monitors are supposed to safeguard a system against potentially unforeseen threats. In this work, we present rigorous experiments on various image datasets to investigate: 1. The effectiveness of monitors in handling unforeseen threats, which are not available during threshold adjustments. 2. Whether integrating generic threats into the threshold optimization scheme can enhance the robustness of monitors.
△ Less
Submitted 21 May, 2024; v1 submitted 14 May, 2024;
originally announced May 2024.
-
Fuzzy Fault Trees Formalized
Authors:
Thi Kim Nhung Dang,
Milan Lopuhaä-Zwakenberg,
Mariëlle Stoelinga
Abstract:
Fault tree analysis is a vital method of assessing safety risks. It helps to identify potential causes of accidents, assess their likelihood and severity, and suggest preventive measures. Quantitative analysis of fault trees is often done via the dependability metrics that compute the system's failure behaviour over time. However, the lack of precise data is a major obstacle to quantitative analys…
▽ More
Fault tree analysis is a vital method of assessing safety risks. It helps to identify potential causes of accidents, assess their likelihood and severity, and suggest preventive measures. Quantitative analysis of fault trees is often done via the dependability metrics that compute the system's failure behaviour over time. However, the lack of precise data is a major obstacle to quantitative analysis, and so to reliability analysis. Fuzzy logic is a popular framework for dealing with ambiguous values and has applications in many domains. A number of fuzzy approaches have been proposed to fault tree analysis, but -- to the best of our knowledge -- none of them provide rigorous definitions or algorithms for computing fuzzy unreliability values. In this paper, we define a rigorous framework for fuzzy unreliability values. In addition, we provide a bottom-up algorithm to efficiently calculate fuzzy reliability for a system. The algorithm incorporates the concept of $α$-cuts method. That is, performing binary algebraic operations on intervals on horizontally discretised $α$-cut representations of fuzzy numbers. The method preserves the nonlinearity of fuzzy unreliability. Finally, we illustrate the results obtained from two case studies.
△ Less
Submitted 13 March, 2024;
originally announced March 2024.
-
VlogQA: Task, Dataset, and Baseline Models for Vietnamese Spoken-Based Machine Reading Comprehension
Authors:
Thinh Phuoc Ngo,
Khoa Tran Anh Dang,
Son T. Luu,
Kiet Van Nguyen,
Ngan Luu-Thuy Nguyen
Abstract:
This paper presents the development process of a Vietnamese spoken language corpus for machine reading comprehension (MRC) tasks and provides insights into the challenges and opportunities associated with using real-world data for machine reading comprehension tasks. The existing MRC corpora in Vietnamese mainly focus on formal written documents such as Wikipedia articles, online newspapers, or te…
▽ More
This paper presents the development process of a Vietnamese spoken language corpus for machine reading comprehension (MRC) tasks and provides insights into the challenges and opportunities associated with using real-world data for machine reading comprehension tasks. The existing MRC corpora in Vietnamese mainly focus on formal written documents such as Wikipedia articles, online newspapers, or textbooks. In contrast, the VlogQA consists of 10,076 question-answer pairs based on 1,230 transcript documents sourced from YouTube -- an extensive source of user-uploaded content, covering the topics of food and travel. By capturing the spoken language of native Vietnamese speakers in natural settings, an obscure corner overlooked in Vietnamese research, the corpus provides a valuable resource for future research in reading comprehension tasks for the Vietnamese language. Regarding performance evaluation, our deep-learning models achieved the highest F1 score of 75.34% on the test set, indicating significant progress in machine reading comprehension for Vietnamese spoken language data. In terms of EM, the highest score we accomplished is 53.97%, which reflects the challenge in processing spoken-based content and highlights the need for further improvement.
△ Less
Submitted 6 April, 2024; v1 submitted 4 February, 2024;
originally announced February 2024.
-
Fuzzy quantitative attack tree analysis
Authors:
Thi Kim Nhung Dang,
Milan Lopuhaä-Zwakenberg,
Mariëlle Stoelinga
Abstract:
Attack trees are important for security, as they help to identify weaknesses and vulnerabilities in a system. Quantitative attack tree analysis supports a number security metrics, which formulate important KPIs such as the shortest, most likely and cheapest attacks.
A key bottleneck in quantitative analysis is that the values are usually not known exactly, due to insufficient data and/or lack of…
▽ More
Attack trees are important for security, as they help to identify weaknesses and vulnerabilities in a system. Quantitative attack tree analysis supports a number security metrics, which formulate important KPIs such as the shortest, most likely and cheapest attacks.
A key bottleneck in quantitative analysis is that the values are usually not known exactly, due to insufficient data and/or lack of knowledge. Fuzzy logic is a prominent framework to handle such uncertain values, with applications in numerous domains. While several studies proposed fuzzy approaches to attack tree analysis, none of them provided a firm definition of fuzzy metric values or generic algorithms for computation of fuzzy metrics.
In this work, we define a generic formulation for fuzzy metric values that applies to most quantitative metrics. The resulting metric value is a fuzzy number obtained by following Zadeh's extension principle, obtained when we equip the basis attack steps, i.e., the leaves of the attack trees, with fuzzy numbers. In addition, we prove a modular decomposition theorem that yields a bottom-up algorithm to efficiently calculate the top fuzzy metric value.
△ Less
Submitted 22 January, 2024;
originally announced January 2024.
-
ReSynthDetect: A Fundus Anomaly Detection Network with Reconstruction and Synthetic Features
Authors:
**gqi Niu,
Qinji Yu,
Shiwen Dong,
Zilong Wang,
Kang Dang,
Xiaowei Ding
Abstract:
Detecting anomalies in fundus images through unsupervised methods is a challenging task due to the similarity between normal and abnormal tissues, as well as their indistinct boundaries. The current methods have limitations in accurately detecting subtle anomalies while avoiding false positives. To address these challenges, we propose the ReSynthDetect network which utilizes a reconstruction netwo…
▽ More
Detecting anomalies in fundus images through unsupervised methods is a challenging task due to the similarity between normal and abnormal tissues, as well as their indistinct boundaries. The current methods have limitations in accurately detecting subtle anomalies while avoiding false positives. To address these challenges, we propose the ReSynthDetect network which utilizes a reconstruction network for modeling normal images, and an anomaly generator that produces synthetic anomalies consistent with the appearance of fundus images. By combining the features of consistent anomaly generation and image reconstruction, our method is suited for detecting fundus abnormalities. The proposed approach has been extensively tested on benchmark datasets such as EyeQ and IDRiD, demonstrating state-of-the-art performance in both image-level and pixel-level anomaly detection. Our experiments indicate a substantial 9% improvement in AUROC on EyeQ and a significant 17.1% improvement in AUPR on IDRiD.
△ Less
Submitted 27 December, 2023;
originally announced December 2023.
-
Zero-Shot Object Goal Visual Navigation With Class-Independent Relationship Network
Authors:
Xinting Li,
Shiguang Zhang,
Yue LU,
Kerry Dang,
Lingyan Ran
Abstract:
This paper investigates the zero-shot object goal visual navigation problem. In the object goal visual navigation task, the agent needs to locate navigation targets from its egocentric visual input. "Zero-shot" means that the target the agent needs to find is not trained during the training phase. To address the issue of coupling navigation ability with target features during training, we propose…
▽ More
This paper investigates the zero-shot object goal visual navigation problem. In the object goal visual navigation task, the agent needs to locate navigation targets from its egocentric visual input. "Zero-shot" means that the target the agent needs to find is not trained during the training phase. To address the issue of coupling navigation ability with target features during training, we propose the Class-Independent Relationship Network (CIRN). This method combines target detection information with the relative semantic similarity between the target and the navigation target, and constructs a brand new state representation based on similarity ranking, this state representation does not include target feature or environment feature, effectively decoupling the agent's navigation ability from target features. And a Graph Convolutional Network (GCN) is employed to learn the relationships between different objects based on their similarities. During testing, our approach demonstrates strong generalization capabilities, including zero-shot navigation tasks with different targets and environments. Through extensive experiments in the AI2-THOR virtual environment, our method outperforms the current state-of-the-art approaches in the zero-shot object goal visual navigation task. Furthermore, we conducted experiments in more challenging cross-target and cross-scene settings, which further validate the robustness and generalization ability of our method. Our code is available at: https://github.com/SmartAndCleverRobot/ICRA-CIRN.
△ Less
Submitted 14 March, 2024; v1 submitted 15 October, 2023;
originally announced October 2023.
-
Qwen Technical Report
Authors:
**ze Bai,
Shuai Bai,
Yunfei Chu,
Zeyu Cui,
Kai Dang,
Xiaodong Deng,
Yang Fan,
Wenbin Ge,
Yu Han,
Fei Huang,
Binyuan Hui,
Luo Ji,
Mei Li,
Junyang Lin,
Runji Lin,
Dayiheng Liu,
Gao Liu,
Chengqiang Lu,
Keming Lu,
Jianxin Ma,
Rui Men,
Xingzhang Ren,
Xuancheng Ren,
Chuanqi Tan,
Sinan Tan
, et al. (23 additional authors not shown)
Abstract:
Large language models (LLMs) have revolutionized the field of artificial intelligence, enabling natural language processing tasks that were previously thought to be exclusive to humans. In this work, we introduce Qwen, the first installment of our large language model series. Qwen is a comprehensive language model series that encompasses distinct models with varying parameter counts. It includes Q…
▽ More
Large language models (LLMs) have revolutionized the field of artificial intelligence, enabling natural language processing tasks that were previously thought to be exclusive to humans. In this work, we introduce Qwen, the first installment of our large language model series. Qwen is a comprehensive language model series that encompasses distinct models with varying parameter counts. It includes Qwen, the base pretrained language models, and Qwen-Chat, the chat models finetuned with human alignment techniques. The base language models consistently demonstrate superior performance across a multitude of downstream tasks, and the chat models, particularly those trained using Reinforcement Learning from Human Feedback (RLHF), are highly competitive. The chat models possess advanced tool-use and planning capabilities for creating agent applications, showcasing impressive performance even when compared to bigger models on complex tasks like utilizing a code interpreter. Furthermore, we have developed coding-specialized models, Code-Qwen and Code-Qwen-Chat, as well as mathematics-focused models, Math-Qwen-Chat, which are built upon base language models. These models demonstrate significantly improved performance in comparison with open-source models, and slightly fall behind the proprietary models.
△ Less
Submitted 28 September, 2023;
originally announced September 2023.
-
Source-Free Domain Adaptation for Medical Image Segmentation via Prototype-Anchored Feature Alignment and Contrastive Learning
Authors:
Qinji Yu,
Nan Xi,
Junsong Yuan,
Ziyu Zhou,
Kang Dang,
Xiaowei Ding
Abstract:
Unsupervised domain adaptation (UDA) has increasingly gained interests for its capacity to transfer the knowledge learned from a labeled source domain to an unlabeled target domain. However, typical UDA methods require concurrent access to both the source and target domain data, which largely limits its application in medical scenarios where source data is often unavailable due to privacy concern.…
▽ More
Unsupervised domain adaptation (UDA) has increasingly gained interests for its capacity to transfer the knowledge learned from a labeled source domain to an unlabeled target domain. However, typical UDA methods require concurrent access to both the source and target domain data, which largely limits its application in medical scenarios where source data is often unavailable due to privacy concern. To tackle the source data-absent problem, we present a novel two-stage source-free domain adaptation (SFDA) framework for medical image segmentation, where only a well-trained source segmentation model and unlabeled target data are available during domain adaptation. Specifically, in the prototype-anchored feature alignment stage, we first utilize the weights of the pre-trained pixel-wise classifier as source prototypes, which preserve the information of source features. Then, we introduce the bi-directional transport to align the target features with class prototypes by minimizing its expected cost. On top of that, a contrastive learning stage is further devised to utilize those pixels with unreliable predictions for a more compact target feature distribution. Extensive experiments on a cross-modality medical segmentation task demonstrate the superiority of our method in large domain discrepancy settings compared with the state-of-the-art SFDA approaches and even some UDA methods. Code is available at https://github.com/CSCYQJ/MICCAI23-ProtoContra-SFDA.
△ Less
Submitted 19 July, 2023;
originally announced July 2023.
-
Mixtures of Gaussian process experts based on kernel stick-breaking processes
Authors:
Yuji Saikai,
Khue-Dung Dang
Abstract:
Mixtures of Gaussian process experts is a class of models that can simultaneously address two of the key limitations inherent in standard Gaussian processes: scalability and predictive performance. In particular, models that use Dirichlet processes as gating functions permit straightforward interpretation and automatic selection of the number of experts in a mixture. While the existing models are…
▽ More
Mixtures of Gaussian process experts is a class of models that can simultaneously address two of the key limitations inherent in standard Gaussian processes: scalability and predictive performance. In particular, models that use Dirichlet processes as gating functions permit straightforward interpretation and automatic selection of the number of experts in a mixture. While the existing models are intuitive and capable of capturing non-stationarity, multi-modality and heteroskedasticity, the simplicity of their gating functions may limit the predictive performance when applied to complex data-generating processes. Capitalising on the recent advancement in the dependent Dirichlet processes literature, we propose a new mixture model of Gaussian process experts based on kernel stick-breaking processes. Our model maintains the intuitive appeal yet improve the performance of the existing models. To make it practical, we design a sampler for posterior computation based on the slice sampling. The model behaviour and improved predictive performance are demonstrated in experiments using six datasets.
△ Less
Submitted 5 May, 2023; v1 submitted 26 April, 2023;
originally announced April 2023.
-
Region and Spatial Aware Anomaly Detection for Fundus Images
Authors:
**gqi Niu,
Shiwen Dong,
Qinji Yu,
Kang Dang,
Xiaowei Ding
Abstract:
Recently anomaly detection has drawn much attention in diagnosing ocular diseases. Most existing anomaly detection research in fundus images has relatively large anomaly scores in the salient retinal structures, such as blood vessels, optical cups and discs. In this paper, we propose a Region and Spatial Aware Anomaly Detection (ReSAD) method for fundus images, which obtains local region and long-…
▽ More
Recently anomaly detection has drawn much attention in diagnosing ocular diseases. Most existing anomaly detection research in fundus images has relatively large anomaly scores in the salient retinal structures, such as blood vessels, optical cups and discs. In this paper, we propose a Region and Spatial Aware Anomaly Detection (ReSAD) method for fundus images, which obtains local region and long-range spatial information to reduce the false positives in the normal structure. ReSAD transfers a pre-trained model to extract the features of normal fundus images and applies the Region-and-Spatial-Aware feature Combination module (ReSC) for pixel-level features to build a memory bank. In the testing phase, ReSAD uses the memory bank to determine out-of-distribution samples as abnormalities. Our method significantly outperforms the existing anomaly detection methods for fundus images on two publicly benchmark datasets.
△ Less
Submitted 7 March, 2023;
originally announced March 2023.
-
DualStreamFoveaNet: A Dual Stream Fusion Architecture with Anatomical Awareness for Robust Fovea Localization
Authors:
Sifan Song,
**feng Wang,
Zilong Wang,
Jionglong Su,
Xiaowei Ding,
Kang Dang
Abstract:
Accurate fovea localization is essential for analyzing retinal diseases to prevent irreversible vision loss. While current deep learning-based methods outperform traditional ones, they still face challenges such as the lack of local anatomical landmarks around the fovea, the inability to robustly handle diseased retinal images, and the variations in image conditions. In this paper, we propose a no…
▽ More
Accurate fovea localization is essential for analyzing retinal diseases to prevent irreversible vision loss. While current deep learning-based methods outperform traditional ones, they still face challenges such as the lack of local anatomical landmarks around the fovea, the inability to robustly handle diseased retinal images, and the variations in image conditions. In this paper, we propose a novel transformer-based architecture called DualStreamFoveaNet (DSFN) for multi-cue fusion. This architecture explicitly incorporates long-range connections and global features using retina and vessel distributions for robust fovea localization. We introduce a spatial attention mechanism in the dual-stream encoder to extract and fuse self-learned anatomical information, focusing more on features distributed along blood vessels and significantly reducing computational costs by decreasing token numbers. Our extensive experiments show that the proposed architecture achieves state-of-the-art performance on two public datasets and one large-scale private dataset. Furthermore, we demonstrate that the DSFN is more robust on both normal and diseased retina images and has better generalization capacity in cross-dataset experiments.
△ Less
Submitted 26 December, 2023; v1 submitted 14 February, 2023;
originally announced February 2023.
-
OFASys: A Multi-Modal Multi-Task Learning System for Building Generalist Models
Authors:
**ze Bai,
Rui Men,
Hao Yang,
Xuancheng Ren,
Kai Dang,
Yichang Zhang,
Xiaohuan Zhou,
Peng Wang,
Sinan Tan,
An Yang,
Zeyu Cui,
Yu Han,
Shuai Bai,
Wenbin Ge,
Jianxin Ma,
Junyang Lin,
**gren Zhou,
Chang Zhou
Abstract:
Generalist models, which are capable of performing diverse multi-modal tasks in a task-agnostic way within a single model, have been explored recently. Being, hopefully, an alternative to approaching general-purpose AI, existing generalist models are still at an early stage, where modality and task coverage is limited. To empower multi-modal task-scaling and speed up this line of research, we rele…
▽ More
Generalist models, which are capable of performing diverse multi-modal tasks in a task-agnostic way within a single model, have been explored recently. Being, hopefully, an alternative to approaching general-purpose AI, existing generalist models are still at an early stage, where modality and task coverage is limited. To empower multi-modal task-scaling and speed up this line of research, we release a generalist model learning system, OFASys, built on top of a declarative task interface named multi-modal instruction. At the core of OFASys is the idea of decoupling multi-modal task representations from the underlying model implementations. In OFASys, a task involving multiple modalities can be defined declaratively even with just a single line of code. The system automatically generates task plans from such instructions for training and inference. It also facilitates multi-task training for diverse multi-modal workloads. As a starting point, we provide presets of 7 different modalities and 23 highly-diverse example tasks in OFASys, with which we also develop a first-in-kind, single model, OFA+, that can handle text, image, speech, video, and motion data. The single OFA+ model achieves 95% performance in average with only 16% parameters of 15 task-finetuned models, showcasing the performance reliability of multi-modal task-scaling provided by OFASys. Available at https://github.com/OFA-Sys/OFASys
△ Less
Submitted 8 December, 2022;
originally announced December 2022.
-
Coarse Retinal Lesion Annotations Refinement via Prototypical Learning
Authors:
Qinji Yu,
Kang Dang,
Ziyu Zhou,
Yongwei Chen,
Xiaowei Ding
Abstract:
Deep-learning-based approaches for retinal lesion segmentation often require an abundant amount of precise pixel-wise annotated data. However, coarse annotations such as circles or ellipses for outlining the lesion area can be six times more efficient than pixel-level annotation. Therefore, this paper proposes an annotation refinement network to convert a coarse annotation into a pixel-level segme…
▽ More
Deep-learning-based approaches for retinal lesion segmentation often require an abundant amount of precise pixel-wise annotated data. However, coarse annotations such as circles or ellipses for outlining the lesion area can be six times more efficient than pixel-level annotation. Therefore, this paper proposes an annotation refinement network to convert a coarse annotation into a pixel-level segmentation mask. Our main novelty is the application of the prototype learning paradigm to enhance the generalization ability across different datasets or types of lesions. We also introduce a prototype weighing module to handle challenging cases where the lesion is overly small. The proposed method was trained on the publicly available IDRiD dataset and then generalized to the public DDR and our real-world private datasets. Experiments show that our approach substantially improved the initial coarse mask and outperformed the non-prototypical baseline by a large margin. Moreover, we demonstrate the usefulness of the prototype weighing module in both cross-dataset and cross-class settings.
△ Less
Submitted 30 August, 2022;
originally announced August 2022.
-
Weakly Supervised Online Action Detection for Infant General Movements
Authors:
Tongyi Luo,
Jia Xiao,
Chuncao Zhang,
Siheng Chen,
Yuan Tian,
Guangjun Yu,
Kang Dang,
Xiaowei Ding
Abstract:
To make the earlier medical intervention of infants' cerebral palsy (CP), early diagnosis of brain damage is critical. Although general movements assessment(GMA) has shown promising results in early CP detection, it is laborious. Most existing works take videos as input to make fidgety movements(FMs) classification for the GMA automation. Those methods require a complete observation of videos and…
▽ More
To make the earlier medical intervention of infants' cerebral palsy (CP), early diagnosis of brain damage is critical. Although general movements assessment(GMA) has shown promising results in early CP detection, it is laborious. Most existing works take videos as input to make fidgety movements(FMs) classification for the GMA automation. Those methods require a complete observation of videos and can not localize video frames containing normal FMs. Therefore we propose a novel approach named WO-GMA to perform FMs localization in the weakly supervised online setting. Infant body keypoints are first extracted as the inputs to WO-GMA. Then WO-GMA performs local spatio-temporal extraction followed by two network branches to generate pseudo clip labels and model online actions. With the clip-level pseudo labels, the action modeling branch learns to detect FMs in an online fashion. Experimental results on a dataset with 757 videos of different infants show that WO-GMA can get state-of-the-art video-level classification and cliplevel detection results. Moreover, only the first 20% duration of the video is needed to get classification results as good as fully observed, implying a significantly shortened FMs diagnosis time. Code is available at: https://github.com/scofiedluo/WO-GMA.
△ Less
Submitted 7 August, 2022;
originally announced August 2022.
-
A Robust Framework of Chromosome Straightening with ViT-Patch GAN
Authors:
Sifan Song,
**feng Wang,
Fengrui Cheng,
Qirui Cao,
Yihan Zuo,
Yongteng Lei,
Ruomai Yang,
Chunxiao Yang,
Frans Coenen,
Jia Meng,
Kang Dang,
Jionglong Su
Abstract:
Chromosomes carry the genetic information of humans. They exhibit non-rigid and non-articulated nature with varying degrees of curvature. Chromosome straightening is an important step for subsequent karyotype construction, pathological diagnosis and cytogenetic map development. However, robust chromosome straightening remains challenging, due to the unavailability of training images, distorted chr…
▽ More
Chromosomes carry the genetic information of humans. They exhibit non-rigid and non-articulated nature with varying degrees of curvature. Chromosome straightening is an important step for subsequent karyotype construction, pathological diagnosis and cytogenetic map development. However, robust chromosome straightening remains challenging, due to the unavailability of training images, distorted chromosome details and shapes after straightening, as well as poor generalization capability. In this paper, we propose a novel architecture, ViT-Patch GAN, consisting of a self-learned motion transformation generator and a Vision Transformer-based patch (ViT-Patch) discriminator. The generator learns the motion representation of chromosomes for straightening. With the help of the ViT-Patch discriminator, the straightened chromosomes retain more shape and banding pattern details. The experimental results show that the proposed method achieves better performance on Fréchet Inception Distance (FID), Learned Perceptual Image Patch Similarity (LPIPS) and downstream chromosome classification accuracy, and shows excellent generalization capability on a large dataset.
△ Less
Submitted 16 May, 2023; v1 submitted 6 March, 2022;
originally announced March 2022.
-
ADAM Challenge: Detecting Age-related Macular Degeneration from Fundus Images
Authors:
Huihui Fang,
Fei Li,
Huazhu Fu,
Xu Sun,
Xingxing Cao,
Fengbin Lin,
Jaemin Son,
Sunho Kim,
Gwenole Quellec,
Sarah Matta,
Sharath M Shankaranarayana,
Yi-Ting Chen,
Chuen-heng Wang,
Nisarg A. Shah,
Chia-Yen Lee,
Chih-Chung Hsu,
Hai Xie,
Baiying Lei,
Ujjwal Baid,
Shubham Innani,
Kang Dang,
Wenxiu Shi,
Ravi Kamble,
Nitin Singhal,
Ching-Wei Wang
, et al. (6 additional authors not shown)
Abstract:
Age-related macular degeneration (AMD) is the leading cause of visual impairment among elderly in the world. Early detection of AMD is of great importance, as the vision loss caused by this disease is irreversible and permanent. Color fundus photography is the most cost-effective imaging modality to screen for retinal disorders. Cutting edge deep learning based algorithms have been recently develo…
▽ More
Age-related macular degeneration (AMD) is the leading cause of visual impairment among elderly in the world. Early detection of AMD is of great importance, as the vision loss caused by this disease is irreversible and permanent. Color fundus photography is the most cost-effective imaging modality to screen for retinal disorders. Cutting edge deep learning based algorithms have been recently developed for automatically detecting AMD from fundus images. However, there are still lack of a comprehensive annotated dataset and standard evaluation benchmarks. To deal with this issue, we set up the Automatic Detection challenge on Age-related Macular degeneration (ADAM), which was held as a satellite event of the ISBI 2020 conference. The ADAM challenge consisted of four tasks which cover the main aspects of detecting and characterizing AMD from fundus images, including detection of AMD, detection and segmentation of optic disc, localization of fovea, and detection and segmentation of lesions. As part of the challenge, we have released a comprehensive dataset of 1200 fundus images with AMD diagnostic labels, pixel-wise segmentation masks for both optic disc and AMD-related lesions (drusen, exudates, hemorrhages and scars, among others), as well as the coordinates corresponding to the location of the macular fovea. A uniform evaluation framework has been built to make a fair comparison of different models using this dataset. During the challenge, 610 results were submitted for online evaluation, with 11 teams finally participating in the onsite challenge. This paper introduces the challenge, the dataset and the evaluation methods, as well as summarizes the participating methods and analyzes their results for each task. In particular, we observed that the ensembling strategy and the incorporation of clinical domain knowledge were the key to improve the performance of the deep learning models.
△ Less
Submitted 6 May, 2022; v1 submitted 16 February, 2022;
originally announced February 2022.
-
Look back, look around: a systematic analysis of effective predictors for new outlinks in focused Web crawling
Authors:
Thi Kim Nhung Dang,
Doina Bucur,
Berk Atil,
Guillaume Pitel,
Frank Ruis,
Hamidreza Kadkhodaei,
Nelly Litvak
Abstract:
Small and medium enterprises rely on detailed Web analytics to be informed about their market and competition. Focused crawlers meet this demand by crawling and indexing specific parts of the Web. Critically, a focused crawler must quickly find new pages that have not yet been indexed. Since a new page can be discovered only by following a new outlink, predicting new outlinks is very relevant in p…
▽ More
Small and medium enterprises rely on detailed Web analytics to be informed about their market and competition. Focused crawlers meet this demand by crawling and indexing specific parts of the Web. Critically, a focused crawler must quickly find new pages that have not yet been indexed. Since a new page can be discovered only by following a new outlink, predicting new outlinks is very relevant in practice. In the literature, many feature designs have been proposed for predicting changes in the Web. In this work we provide a structured analysis of this problem, using new outlinks as our running prediction target. Specifically, we unify earlier feature designs in a taxonomic arrangement of features along two dimensions: static versus dynamic features, and features of a page versus features of the network around it. Within this taxonomy, complemented by our new (mainly, dynamic network) features, we identify best predictors for new outlinks. Our main conclusion is that most informative features are the recent history of new outlinks on a page itself, and of its content-related pages. Hence, we propose a new 'look back, look around' (LBLA) model, that uses only these features. With the obtained predictions, we design a number of scoring functions to guide a focused crawler to pages with most new outlinks, and compare their performance. The LBLA approach proved extremely effective, outperforming other models including those that use a most complete set of features. One of the learners we use, is the recent NGBoost method that assumes a Poisson distribution for the number of new outlinks on a page, and learns its parameters. This connects the two so far unrelated avenues in the literature: predictions based on features of a page, and those based on probabilistic modelling. All experiments were carried out on an original dataset, made available by a commercial focused crawler.
△ Less
Submitted 15 November, 2022; v1 submitted 9 November, 2021;
originally announced November 2021.
-
Bilateral-ViT for Robust Fovea Localization
Authors:
Sifan Song,
Kang Dang,
Qinji Yu,
Zilong Wang,
Frans Coenen,
Jionglong Su,
Xiaowei Ding
Abstract:
The fovea is an important anatomical landmark of the retina. Detecting the location of the fovea is essential for the analysis of many retinal diseases. However, robust fovea localization remains a challenging problem, as the fovea region often appears fuzzy, and retina diseases may further obscure its appearance. This paper proposes a novel Vision Transformer (ViT) approach that integrates inform…
▽ More
The fovea is an important anatomical landmark of the retina. Detecting the location of the fovea is essential for the analysis of many retinal diseases. However, robust fovea localization remains a challenging problem, as the fovea region often appears fuzzy, and retina diseases may further obscure its appearance. This paper proposes a novel Vision Transformer (ViT) approach that integrates information both inside and outside the fovea region to achieve robust fovea localization. Our proposed network, named Bilateral-Vision-Transformer (Bilateral-ViT), consists of two network branches: a transformer-based main network branch for integrating global context across the entire fundus image and a vessel branch for explicitly incorporating the structure of blood vessels. The encoded features from both network branches are subsequently merged with a customized Multi-scale Feature Fusion (MFF) module. Our comprehensive experiments demonstrate that the proposed approach is significantly more robust for diseased images and establishes the new state of the arts using the Messidor and PALM datasets.
△ Less
Submitted 3 March, 2022; v1 submitted 19 October, 2021;
originally announced October 2021.
-
A Location-Sensitive Local Prototype Network for Few-Shot Medical Image Segmentation
Authors:
Qinji Yu,
Kang Dang,
Nima Tajbakhsh,
Demetri Terzopoulos,
Xiaowei Ding
Abstract:
Despite the tremendous success of deep neural networks in medical image segmentation, they typically require a large amount of costly, expert-level annotated data. Few-shot segmentation approaches address this issue by learning to transfer knowledge from limited quantities of labeled examples. Incorporating appropriate prior knowledge is critical in designing high-performance few-shot segmentation…
▽ More
Despite the tremendous success of deep neural networks in medical image segmentation, they typically require a large amount of costly, expert-level annotated data. Few-shot segmentation approaches address this issue by learning to transfer knowledge from limited quantities of labeled examples. Incorporating appropriate prior knowledge is critical in designing high-performance few-shot segmentation algorithms. Since strong spatial priors exist in many medical imaging modalities, we propose a prototype-based method -- namely, the location-sensitive local prototype network -- that leverages spatial priors to perform few-shot medical image segmentation. Our approach divides the difficult problem of segmenting the entire image with global prototypes into easily solvable subproblems of local region segmentation with local prototypes. For organ segmentation experiments on the VISCERAL CT image dataset, our method outperforms the state-of-the-art approaches by 10% in the mean Dice coefficient. Extensive ablation studies demonstrate the substantial benefits of incorporating spatial information and confirm the effectiveness of our approach.
△ Less
Submitted 18 March, 2021;
originally announced March 2021.
-
XACs-DyPol: Towards an XACML-based Access Control Model for Dynamic Security Policy
Authors:
Tran Khanh Dang,
Xuan Son Ha,
Luong Khiem Tran
Abstract:
Authorization and access control play an essential role in protecting sensitive information from malicious users. The system is based on security policies to determine if an access request is allowed. However, of late, the growing popularity of big data has created a new challenge which the security policy management is facing with such as dynamic and update policies in run time. Applications of d…
▽ More
Authorization and access control play an essential role in protecting sensitive information from malicious users. The system is based on security policies to determine if an access request is allowed. However, of late, the growing popularity of big data has created a new challenge which the security policy management is facing with such as dynamic and update policies in run time. Applications of dynamic policies have brought many benefits to modern domains. To the best of our knowledge, there are no previous studies focusing on solving authorization problems in the dynamic policy environments. In this article, we focus on analyzing and classifying when an update policy occurs, and provide a pragmatic solution for such dynamic policies. The contribution of this work is twofold: a novel solution for managing the policy changes even when the access request has been granted, and an XACML-based implementation to empirically evaluate the proposed solution. The experimental results show the comparison between the newly introduced XACs-DyPol framework with Balana (an open source framework supporting XACML 3.0). The datasets are XACML 3.0-based policies, including three samples of real-world policy sets. According to the comparison results, our XACs-DyPol framework performs better than Balana in terms of all updates in dynamic security policy cases. Specially, our proposed solution outperforms by an order of magnitude when the policy structure includes complex policy sets, policies, and rules or some complicated comparison expression which contains higher than function and less than function.
△ Less
Submitted 10 April, 2020;
originally announced May 2020.
-
A low-overhead soft-hard fault-tolerant architecture, design and management scheme for reliable high-performance many-core 3D-NoC systems
Authors:
Khanh N Dang,
Michael Meyer,
Yuichi Okuyama,
Abderazek Ben Abdallah
Abstract:
The Network-on-Chip (NoC) paradigm has been proposed as a favorable solution to handle the strict communication requirements between the increasingly large number of cores on a single chip. However, NoC systems are exposed to the aggressive scaling down of transistors, low operating voltages, and high integration and power densities, making them vulnerable to permanent (hard) faults and transient…
▽ More
The Network-on-Chip (NoC) paradigm has been proposed as a favorable solution to handle the strict communication requirements between the increasingly large number of cores on a single chip. However, NoC systems are exposed to the aggressive scaling down of transistors, low operating voltages, and high integration and power densities, making them vulnerable to permanent (hard) faults and transient (soft) errors. A hard fault in a NoC can lead to external blocking, causing congestion across the whole network. A soft error is more challenging because of its silent data corruption, which leads to a large area of erroneous data due to error propagation, packet re-transmission, and deadlock. In this paper, we present the architecture and design of a comprehensive soft error and hard fault-tolerant 3D-NoC system, named 3D-Hard-Fault-Soft-Error-Tolerant-OASIS-NoC (3D-FETO). With the aid of efficient mechanisms and algorithms, 3D-FETO is capable of detecting and recovering from soft errors which occur in the routing pipeline stages and leverages reconfigurable components to handle permanent faults in links, input buffers, and crossbars. In-depth evaluation results show that the 3D-FETO system is able to work around different kinds of hard faults and soft errors, ensuring graceful performance degradation, while minimizing additional hardware complexity and remaining power efficient.
△ Less
Submitted 21 March, 2020;
originally announced March 2020.
-
An Efficient Software-Hardware Design Framework for Spiking Neural Network Systems
Authors:
Khanh N. Dang,
Abderazek Ben Abdallah
Abstract:
Spiking Neural Network (SNN) is the third generation of Neural Network (NN) mimicking the natural behavior of the brain. By processing based on binary input/output, SNNs offer lower complexity, higher density and lower power consumption. This work presents an efficient software-hardware design framework for develo** SNN systems in hardware. In addition, a design of low-cost neurosynaptic core is…
▽ More
Spiking Neural Network (SNN) is the third generation of Neural Network (NN) mimicking the natural behavior of the brain. By processing based on binary input/output, SNNs offer lower complexity, higher density and lower power consumption. This work presents an efficient software-hardware design framework for develo** SNN systems in hardware. In addition, a design of low-cost neurosynaptic core is presented based on packet-switching communication approach. The evaluation results show that the ANN to SNN conversion method with the size 784:1200:1200:10 performs 99% accuracy for MNIST while the unsupervised STDP archives 89% with the size 784:400 with recurrent connections. The design of 256-neurons and 65k synapses is also implemented in ASIC 45nm technology with an area cost of 0.205 $m m^2$.
△ Less
Submitted 22 March, 2020;
originally announced March 2020.
-
Reliability Assessment and Quantitative Evaluation of Soft-Error Resilient 3D Network-on-Chip Systems
Authors:
Khanh N Dang,
Michael Meyer,
Yuichi Okuyama,
Abderazek Ben Abdallah
Abstract:
Three-Dimensional Networks-on-Chips (3D-NoCs) have been proposed as an auspicious solution, merging the high parallelism of the Network-on-Chip (NoC) paradigm with the high-performance and low-power cost of 3D-ICs. However, as technology scales down, the reliability issues are becoming more crucial, especially for complex 3D-NoC which provides the communication requirements of multi and many-core…
▽ More
Three-Dimensional Networks-on-Chips (3D-NoCs) have been proposed as an auspicious solution, merging the high parallelism of the Network-on-Chip (NoC) paradigm with the high-performance and low-power cost of 3D-ICs. However, as technology scales down, the reliability issues are becoming more crucial, especially for complex 3D-NoC which provides the communication requirements of multi and many-core systems-on-chip. Reliability assessment is prominent for early stages of the manufacturing process to prevent costly redesigns of a target system. In this paper, we present an accurate reliability assessment and quantitative evaluation of a soft-error resilient 3D-NoC based on a soft-error resilient mechanism. The system can recover from transient errors occurring in different pipeline stages of the router. Based on this analysis, the effects of failures in the network's principal components are determined.
△ Less
Submitted 21 March, 2020;
originally announced March 2020.
-
Soft-Error and Hard-fault Tolerant Architecture and Routing Algorithm for Reliable 3D-NoC Systems
Authors:
Khanh N. Dang,
Yuichi Okuyama,
Abderazek Ben Abdallah
Abstract:
Network-on-Chip (NoC) paradigm has been proposed as an auspicious solution to handle the strict communication requirements between the increasingly large number of cores on a single multi and many-core chips. However, NoC systems are exposed to a variety of manufacturing, design and energetic particles factors making them vulnerable to permanent (hard) faults and transient (soft) errors. In this p…
▽ More
Network-on-Chip (NoC) paradigm has been proposed as an auspicious solution to handle the strict communication requirements between the increasingly large number of cores on a single multi and many-core chips. However, NoC systems are exposed to a variety of manufacturing, design and energetic particles factors making them vulnerable to permanent (hard) faults and transient (soft) errors. In this paper, we present a comprehensive soft error and hard fault tolerant 3D-NoC architecture, named 3D-Hard-Fault-Soft-Error-Tolerant-OASIS-NoC (3D-FETO). With the aid of adaptive algorithms, 3D-FETO is capable of detecting and recovering from soft errors occurring in the routing pipeline stages and is leveraging on reconfigurable components to handle permanent faults occurrence in links, input buffers, and crossbar. In-depth evaluation results show that the 3D-FETO system is able to work around different kinds of hard faults and soft errors while ensuring graceful performance degradation, minimizing the additional hardware complexity and remaining power-efficient.
△ Less
Submitted 21 March, 2020;
originally announced March 2020.
-
Report on power, thermal and reliability prediction for 3D Networks-on-Chip
Authors:
Khanh N. Dang,
Akram Ben Ahmed,
Abderazek Ben Abdallah,
Xuan-Tu Tran
Abstract:
By combining Three Dimensional Integrated Circuits with the Network-on-Chip infrastructure to obtain 3D Networks-on-Chip (3D-NoCs), the new on-chip communication paradigm brings several advantages on lower power, smaller footprint and lower latency. However, thermal dissipation is one of the most critical challenges for 3D-ICs where the heat cannot easily transfer through several layers of silicon…
▽ More
By combining Three Dimensional Integrated Circuits with the Network-on-Chip infrastructure to obtain 3D Networks-on-Chip (3D-NoCs), the new on-chip communication paradigm brings several advantages on lower power, smaller footprint and lower latency. However, thermal dissipation is one of the most critical challenges for 3D-ICs where the heat cannot easily transfer through several layers of silicon. Consequently, the high-temperature area also confronts the reliability threat as the Mean Time to Failure (MTTF) decreases exponentially with the operating temperature. Apparently, 3D-NoCs must tackle this fundamental problem in order to be widely used. Therefore, in this work, we investigate the thermal distribution and reliability prediction of 3D-NoCs. We first present a new method to help simulate the temperature (both steady and transient) using traffics value from realistic and synthetic benchmarks and the power consumption from standard VLSI design flow. Then, based on the proposed method, we further predict the relative reliability between different parts of the network. Experimental results show that the method has an extremely fast execution time in comparison to the acceleration lifetime test. Furthermore, we compare the thermal behavior and reliability between Monolithic design and TSV-based TSV. We also explorer the ability to implement the thermal via a mechanism to help reduce the operating temperature.
△ Less
Submitted 19 March, 2020;
originally announced March 2020.
-
Context Model for Pedestrian Intention Prediction using Factored Latent-Dynamic Conditional Random Fields
Authors:
Satyajit Neogi,
Michael Hoy,
Kang Dang,
Hang Yu,
Justin Dauwels
Abstract:
Smooth handling of pedestrian interactions is a key requirement for Autonomous Vehicles (AV) and Advanced Driver Assistance Systems (ADAS). Such systems call for early and accurate prediction of a pedestrian's crossing/not-crossing behaviour in front of the vehicle. Existing approaches to pedestrian behaviour prediction make use of pedestrian motion, his/her location in a scene and static context…
▽ More
Smooth handling of pedestrian interactions is a key requirement for Autonomous Vehicles (AV) and Advanced Driver Assistance Systems (ADAS). Such systems call for early and accurate prediction of a pedestrian's crossing/not-crossing behaviour in front of the vehicle. Existing approaches to pedestrian behaviour prediction make use of pedestrian motion, his/her location in a scene and static context variables such as traffic lights, zebra crossings etc. We stress on the necessity of early prediction for smooth operation of such systems. We introduce the influence of vehicle interactions on pedestrian intention for this purpose. In this paper, we show a discernible advance in prediction time aided by the inclusion of such vehicle interaction context. We apply our methods to two different datasets, one in-house collected - NTU dataset and another public real-life benchmark - JAAD dataset. We also propose a generic graphical model Factored Latent-Dynamic Conditional Random Fields (FLDCRF) for single and multi-label sequence prediction as well as joint interaction modeling tasks. FLDCRF outperforms Long Short-Term Memory (LSTM) networks across the datasets ($\sim$100 sequences per dataset) over identical time-series features. While the existing best system predicts pedestrian stop** behaviour with 70\% accuracy 0.38 seconds before the actual events, our system achieves such accuracy at least 0.9 seconds on an average before the actual events across datasets.
△ Less
Submitted 15 September, 2020; v1 submitted 27 July, 2019;
originally announced July 2019.
-
Secure Biometric-based Remote Authentication Protocol using Chebyshev Polynomials and Fuzzy Extractor
Authors:
Thi Ai Thao Nguyen,
Tran Khanh Dang,
Quynh Chi Truong,
Dinh Thanh Nguyen
Abstract:
In this paper, we have proposed a multi factor biometric-based remote authentication protocol. Our proposal overcomes the vulnerabilities of some previous works. At the same time, the protocol also obtains a low false accept rate (FAR) and false reject rate (FRR).
In this paper, we have proposed a multi factor biometric-based remote authentication protocol. Our proposal overcomes the vulnerabilities of some previous works. At the same time, the protocol also obtains a low false accept rate (FAR) and false reject rate (FRR).
△ Less
Submitted 9 April, 2019;
originally announced April 2019.
-
A Visual Model for Web Applications Security Monitoring
Authors:
Tran Tri Dang,
Tran Khanh Dang
Abstract:
This paper proposes a novel visual model for web applications security monitoring. Although an automated intrusion detection system can shield a web application from common attacks, it usually cannot detect more complicated break-ins. So, a human-assisted monitoring system is an indispensable complement, following the "Defense in depth" strategy. To support human operators working more effectively…
▽ More
This paper proposes a novel visual model for web applications security monitoring. Although an automated intrusion detection system can shield a web application from common attacks, it usually cannot detect more complicated break-ins. So, a human-assisted monitoring system is an indispensable complement, following the "Defense in depth" strategy. To support human operators working more effectively and efficiently, information visualization techniques are utilized in this model. A prototype implementation of this model is created and is used to test against a popular open source web application. Testing results prove the model's usefulness, at least in understanding the web application security structure.
△ Less
Submitted 5 April, 2019;
originally announced April 2019.
-
A New Biometric Template Protection using Random Orthonormal Projection and Fuzzy Commitment
Authors:
Thi Ai Thao Nguyen,
Tran Khanh Dang,
Dinh Thanh Nguyen
Abstract:
Biometric template protection is one of most essential parts in putting a biometric-based authentication system into practice. There have been many researches proposing different solutions to secure biometric templates of users. They can be categorized into two approaches: feature transformation and biometric cryptosystem. However, no one single template protection approach can satisfy all the req…
▽ More
Biometric template protection is one of most essential parts in putting a biometric-based authentication system into practice. There have been many researches proposing different solutions to secure biometric templates of users. They can be categorized into two approaches: feature transformation and biometric cryptosystem. However, no one single template protection approach can satisfy all the requirements of a secure biometric-based authentication system. In this work, we will propose a novel hybrid biometric template protection which takes benefits of both approaches while preventing their limitations. The experiments demonstrate that the performance of the system can be maintained with the support of a new random orthonormal project technique, which reduces the computational complexity while preserving the accuracy. Meanwhile, the security of biometric templates is guaranteed by employing fuzzy commitment protocol.
△ Less
Submitted 30 March, 2019;
originally announced April 2019.
-
The Meeting of Acquaintances: A Cost-efficient Authentication Scheme for Light-weight Objects with Transient Trust Level and Plurality Approach
Authors:
Tran Khanh Dang,
Khanh T. K. Tran
Abstract:
Wireless sensor networks consist of a large number of distributed sensor nodes so that potential risks are becoming more and more unpredictable. The new entrants pose the potential risks when they move into the secure zone. To build a door wall that provides safe and secured for the system, many recent research works applied the initial authentication process. However, the majority of the previous…
▽ More
Wireless sensor networks consist of a large number of distributed sensor nodes so that potential risks are becoming more and more unpredictable. The new entrants pose the potential risks when they move into the secure zone. To build a door wall that provides safe and secured for the system, many recent research works applied the initial authentication process. However, the majority of the previous articles only focused on the Central Authority (CA) since this leads to an increase in the computation cost and energy consumption for the specific cases on the Internet of Things (IoT). Hence, in this article, we will lessen the importance of these third parties through proposing an enhanced authentication mechanism that includes key management and evaluation based on the past interactions to assist the objects joining a secured area without any nearby CA. We refer to a mobility dataset from CRAWDAD collected at the University Politehnica of Bucharest and rebuild into a new random dataset larger than the old one. The new one is an input for a simulated authenticating algorithm to observe the communication cost and resource usage of devices. Our proposal helps the authenticating flexible, being strict with unknown devices into the secured zone. The threshold of maximum friends can modify based on the optimization of the symmetric-key algorithm to diminish communication costs (our experimental results compare to previous schemes less than 2000 bits) and raise flexibility in resource-constrained environments.
△ Less
Submitted 24 March, 2019;
originally announced March 2019.
-
Actor-Action Semantic Segmentation with Region Masks
Authors:
Kang Dang,
Chunluan Zhou,
Zhigang Tu,
Michael Hoy,
Justin Dauwels,
Junsong Yuan
Abstract:
In this paper, we study the actor-action semantic segmentation problem, which requires joint labeling of both actor and action categories in video frames. One major challenge for this task is that when an actor performs an action, different body parts of the actor provide different types of cues for the action category and may receive inconsistent action labeling when they are labeled independentl…
▽ More
In this paper, we study the actor-action semantic segmentation problem, which requires joint labeling of both actor and action categories in video frames. One major challenge for this task is that when an actor performs an action, different body parts of the actor provide different types of cues for the action category and may receive inconsistent action labeling when they are labeled independently. To address this issue, we propose an end-to-end region-based actor-action segmentation approach which relies on region masks from an instance segmentation algorithm. Our main novelty is to avoid labeling pixels in a region mask independently - instead we assign a single action label to these pixels to achieve consistent action labeling. When a pixel belongs to multiple region masks, max pooling is applied to resolve labeling conflicts. Our approach uses a two-stream network as the front-end (which learns features capturing both appearance and motion information), and uses two region-based segmentation networks as the back-end (which takes the fused features from the two-stream network as the input and predicts actor-action labeling). Experiments on the A2D dataset demonstrate that both the region-based segmentation strategy and the fused features from the two-stream network contribute to the performance improvements. The proposed approach outperforms the state-of-the-art results by more than 8% in mean class accuracy, and more than 5% in mean class IOU, which validates its effectiveness.
△ Less
Submitted 23 July, 2018;
originally announced July 2018.
-
A NoSQL Data-based Personalized Recommendation System for C2C e-Commerce
Authors:
Khanh Dang,
Khuong Vo,
Josef Küng
Abstract:
With the considerable development of customer-to-customer (C2C) e-commerce in the recent years, there is a big demand for an effective recommendation system that suggests suitable websites for users to sell their items with some specified needs. Nonetheless, e-commerce recommendation systems are mostly designed for business-to-customer (B2C) websites, where the systems offer the consumers the prod…
▽ More
With the considerable development of customer-to-customer (C2C) e-commerce in the recent years, there is a big demand for an effective recommendation system that suggests suitable websites for users to sell their items with some specified needs. Nonetheless, e-commerce recommendation systems are mostly designed for business-to-customer (B2C) websites, where the systems offer the consumers the products that they might like to buy. Almost none of the related research works focus on choosing selling sites for target items. In this paper, we introduce an approach that recommends the selling websites based upon the item's description, category, and desired selling price. This approach employs NoSQL data-based machine learning techniques for building and training topic models and classification models. The trained models can then be used to rank the websites dynamically with respect to the user needs. The experimental results with real-world datasets from Vietnam C2C websites will demonstrate the effectiveness of our proposed method.
△ Less
Submitted 26 June, 2018;
originally announced June 2018.
-
Security Visualization for peer-to-peer resource sharing applications
Authors:
Dand Tran Tri,
Tran Khanh Dang
Abstract:
Security of an information system is only as strong as its weakest element. Popular elements of such system include hardware, software, network and people. Current approaches to computer security problems usually exclude people in their studies even though it is an integral part of these systems. To fill that gap, this paper discusses crucial people-related problems in computer security and prop…
▽ More
Security of an information system is only as strong as its weakest element. Popular elements of such system include hardware, software, network and people. Current approaches to computer security problems usually exclude people in their studies even though it is an integral part of these systems. To fill that gap, this paper discusses crucial people-related problems in computer security and proposes a method of improving security in such systems by integrating people tightly into the whole system. The integration is implemented via visualization to provide visual feedbacks and capture people's awareness of their actions and consequent results. By doing it, we can improve system usability, shorten user's learning curve, and hence enable user uses computer systems more securely.
△ Less
Submitted 11 December, 2009;
originally announced December 2009.