Search | arXiv e-print repository

arXiv:2405.19538 [pdf, other]

CheXpert Plus: Augmenting a Large Chest X-ray Dataset with Text Radiology Reports, Patient Demographics and Additional Image Formats

Authors: Pierre Chambon, Jean-Benoit Delbrouck, Thomas Sounack, Shih-Cheng Huang, Zhihong Chen, Maya Varma, Steven QH Truong, Chu The Chuong, Curtis P. Langlotz

Abstract: Since the release of the original CheXpert paper five years ago, CheXpert has become one of the most widely used and cited clinical AI datasets. The emergence of vision language models has sparked an increase in demands for sharing reports linked to CheXpert images, along with a growing interest among AI fairness researchers in obtaining demographic data. To address this, CheXpert Plus serves as a new collection of radiology data sources, made publicly available to enhance the scaling, performance, robustness, and fairness of models for all subsequent machine learning tasks in the field of radiology. CheXpert Plus is the largest text dataset publicly released in radiology, with a total of 36 million text tokens, including 13 million impression tokens. To the best of our knowledge, it represents the largest text de-identification effort in radiology, with almost 1 million PHI spans anonymized. It is only the second time that a large-scale English paired dataset has been released in radiology, thereby enabling, for the first time, cross-institution training at scale. All reports are paired with high-quality images in DICOM format, along with numerous image and patient metadata covering various clinical and socio-economic groups, as well as many pathology labels and RadGraph annotations. We hope this dataset will boost research for AI models that can further assist radiologists and help improve medical care.
Data is available at the following URL: https://stanfordaimi.azurewebsites.net/datasets/5158c524-d3ab-4e02-96e9-6ee9efc110a1
Models are available at the following URL: https://github.com/Stanford-AIMI/chexpert-plus

Submitted 3 June, 2024; v1 submitted 29 May, 2024; originally announced May 2024.

Comments: 13 pages. Updated title

arXiv:2404.11152 [pdf, other]

Multi-target and multi-stage liver lesion segmentation and detection in multi-phase computed tomography scans

Authors: Abdullah F. Al-Battal, Soan T. M. Duong, Van Ha Tang, Quang Duc Tran, Steven Q. H. Truong, Chien Phan, Truong Q. Nguyen, Cheolhong An

Abstract: Multi-phase computed tomography (CT) scans use contrast agents to highlight different anatomical structures within the body to improve the probability of identifying and detecting anatomical structures of interest and abnormalities such as liver lesions. Yet, detecting these lesions remains a challenging task as these lesions vary significantly in their size, shape, texture, and contrast with respect to surrounding tissue. Therefore, radiologists need to have an extensive experience to be able to identify and detect these lesions. Segmentation-based neural networks can assist radiologists with this task. Current state-of-the-art lesion segmentation networks use the encoder-decoder design paradigm based on the UNet architecture where the multi-phase CT scan volume is fed to the network as a multi-channel input. Although this approach utilizes information from all the phases and outperform single-phase segmentation networks, we demonstrate that their performance is not optimal and can be further improved by incorporating the learning from models trained on each single-phase individually. Our approach comprises three stages. The first stage identifies the regions within the liver where there might be lesions at three different scales (4, 8, and 16 mm). The second stage includes the main segmentation model trained using all the phases as well as a segmentation model trained on each of the phases individually. The third stage uses the multi-phase CT volumes together with the predictions from each of the segmentation models to generate the final segmentation map. Overall, our approach improves relative liver lesion segmentation performance by 1.6% while reducing performance variability across subjects by 8% when compared to the current state-of-the-art models.

Submitted 17 April, 2024; originally announced April 2024.

arXiv:2402.02441 [pdf, other]

TopoX: A Suite of Python Packages for Machine Learning on Topological Domains

Authors: Mustafa Hajij, Mathilde Papillon, Florian Frantzen, Jens Agerberg, Ibrahem AlJabea, Ruben Ballester, Claudio Battiloro, Guillermo Bernárdez, Tolga Birdal, Aiden Brent, Peter Chin, Sergio Escalera, Simone Fiorellino, Odin Hoff Gardaa, Gurusankar Gopalakrishnan, Devendra Govil, Josef Hoppe, Maneel Reddy Karri, Jude Khouja, Manuel Lecha, Neal Livesay, Jan Meißner, Soham Mukherjee, Alexander Nikitin, Theodore Papamarkou, et al. (18 additional authors not shown)

Abstract: We introduce TopoX, a Python software suite that provides reliable and user-friendly building blocks for computing and machine learning on topological domains that extend graphs: hypergraphs, simplicial, cellular, path and combinatorial complexes. TopoX consists of three packages: TopoNetX facilitates constructing and computing on these domains, including working with nodes, edges and higher-order cells; TopoEmbedX provides methods to embed topological domains into vector spaces, akin to popular graph-based embedding algorithms such as node2vec; TopoModelx is built on top of PyTorch and offers a comprehensive toolbox of higher-order message passing functions for neural networks on topological domains. The extensively documented and unit-tested source code of TopoX is available under MIT license at https://pyt-team.github.io/.

Submitted 17 February, 2024; v1 submitted 4 February, 2024; originally announced February 2024.

arXiv:2401.13937 [pdf, other]

Self-supervised Video Object Segmentation with Distillation Learning of Deformable Attention

Authors: Quang-Trung Truong, Duc Thanh Nguyen, Binh-Son Hua, Sai-Kit Yeung

Abstract: Video object segmentation is a fundamental research problem in computer vision. Recent techniques have often applied attention mechanism to object representation learning from video sequences. However, due to temporal changes in the video data, attention maps may not well align with the objects of interest across video frames, causing accumulated errors in long-term video processing. In addition, existing techniques have utilised complex architectures, requiring highly computational complexity and hence limiting the ability to integrate video object segmentation into low-powered devices. To address these issues, we propose a new method for self-supervised video object segmentation based on distillation learning of deformable attention. Specifically, we devise a lightweight architecture for video object segmentation that is effectively adapted to temporal changes. This is enabled by deformable attention mechanism, where the keys and values capturing the memory of a video sequence in the attention module have flexible locations updated across frames. The learnt object representations are thus adaptive to both the spatial and temporal dimensions. We train the proposed architecture in a self-supervised fashion through a new knowledge distillation paradigm where deformable attention maps are integrated into the distillation loss. We qualitatively and quantitatively evaluate our method and compare it with existing methods on benchmark datasets including DAVIS 2016/2017 and YouTube-VOS 2018/2019. Experimental results verify the superiority of our method via its achieved state-of-the-art performance and optimal memory usage.

Submitted 18 March, 2024; v1 submitted 24 January, 2024; originally announced January 2024.

Comments: under review

arXiv:2309.04646 [pdf, ps, other]

Efficient Finetuning Large Language Models For Vietnamese Chatbot

Authors: Vu-Thuan Doan, Quoc-Truong Truong, Duc-Vu Nguyen, Vinh-Tiep Nguyen, Thuy-Ngan Nguyen Luu

Abstract: Large language models (LLMs), such as GPT-4, PaLM, and LLaMa, have been shown to achieve remarkable performance across a variety of natural language tasks. Recent advancements in instruction tuning bring LLMs with ability in following user's instructions and producing human-like responses. However, the high costs associated with training and implementing LLMs pose challenges to academic research. Furthermore, the availability of pretrained LLMs and instruction-tune datasets for Vietnamese language is limited. To tackle these concerns, we leverage large-scale instruction-following datasets from open-source projects, namely Alpaca, GPT4All, and Chat-Doctor, which cover general domain and specific medical domain. To the best of our knowledge, these are the first instructional dataset for Vietnamese. Subsequently, we utilize parameter-efficient tuning through Low-Rank Adaptation (LoRA) on two open LLMs: Bloomz (Multilingual) and GPTJ-6B (Vietnamese), resulting four models: Bloomz-Chat, Bloomz-Doctor, GPTJ-Chat, GPTJ-Doctor. Finally, we assess the effectiveness of our methodology on a per-sample basis, taking into consideration the helpfulness, relevance, accuracy, level of detail in their responses. This evaluation process entails the utilization of GPT-4 as an automated scoring mechanism. Despite utilizing a low-cost setup, our method demonstrates about 20-30% improvement over the original models in our evaluation tasks.

Submitted 8 September, 2023; originally announced September 2023.

Comments: arXiv admin note: text overlap with arXiv:2304.08177, arXiv:2303.16199 by other authors

arXiv:2308.06838 [pdf, other]

doi 10.1609/aaai.v38i14.29463

Weisfeiler and Lehman Go Paths: Learning Topological Features via Path Complexes

Authors: Quang Truong, Peter Chin

Abstract: Graph Neural Networks (GNNs), despite achieving remarkable performance across different tasks, are theoretically bounded by the 1-Weisfeiler-Lehman test, resulting in limitations in terms of graph expressivity. Even though prior works on topological higher-order GNNs overcome that boundary, these models often depend on assumptions about sub-structures of graphs. Specifically, topological GNNs leverage the prevalence of cliques, cycles, and rings to enhance the message-passing procedure. Our study presents a novel perspective by focusing on simple paths within graphs during the topological message-passing process, thus liberating the model from restrictive inductive biases. We prove that by lifting graphs to path complexes, our model can generalize the existing works on topology while inheriting several theoretical results on simplicial complexes and regular cell complexes. Without making prior assumptions about graph sub-structures, our method outperforms earlier works in other topological domains and achieves state-of-the-art results on various benchmarks.

Submitted 31 March, 2024; v1 submitted 13 August, 2023; originally announced August 2023.

Comments: AAAI'24. Contains 17 pages, 4 figures

Journal ref: Proceedings of the AAAI Conference on Artificial Intelligence 38 (14), 15382-15391, 2024

arXiv:2308.05046 [pdf, other]

RadGraph2: Modeling Disease Progression in Radiology Reports via Hierarchical Information Extraction

Authors: Sameer Khanna, Adam Dejl, Kibo Yoon, Quoc Hung Truong, Hanh Duong, Agustina Saenz, Pranav Rajpurkar

Abstract: We present RadGraph2, a novel dataset for extracting information from radiology reports that focuses on capturing changes in disease state and device placement over time. We introduce a hierarchical schema that organizes entities based on their relationships and show that using this hierarchy during training improves the performance of an information extraction model. Specifically, we propose a modification to the DyGIE++ framework, resulting in our model HGIE, which outperforms previous models in entity and relation extraction tasks. We demonstrate that RadGraph2 enables models to capture a wider variety of findings and perform better at relation extraction compared to those trained on the original RadGraph dataset. Our work provides the foundation for developing automated systems that can track disease progression over time and develop information extraction models that leverage the natural hierarchy of labels in the medical domain.

Submitted 9 August, 2023; originally announced August 2023.

Comments: Accepted at Machine Learning for Healthcare 2023

arXiv:2304.14405 [pdf, other]

doi 10.1007/978-3-030-92310-5_76

ViMQ: A Vietnamese Medical Question Dataset for Healthcare Dialogue System Development

Authors: Ta Duc Huy, Nguyen Anh Tu, Tran Hoang Vu, Nguyen Phuc Minh, Nguyen Phan, Trung H. Bui, Steven Q. H. Truong

Abstract: Existing medical text datasets usually take the form of question and answer pairs that support the task of natural language generation, but lacking the composite annotations of the medical terms. In this study, we publish a Vietnamese dataset of medical questions from patients with sentence-level and entity-level annotations for the Intent Classification and Named Entity Recognition tasks. The tag sets for two tasks are in medical domain and can facilitate the development of task-oriented healthcare chatbots with better comprehension of queries from patients. We train baseline models for the two tasks and propose a simple self-supervised training strategy with span-noise modelling that substantially improves the performance. Dataset and code will be published at https://github.com/tadeephuy/ViMQ

Submitted 27 April, 2023; originally announced April 2023.

Comments: accepted at ICONIP 2021

arXiv:2301.02910 [pdf, other]

doi 10.1103/PhysRevA.108.023109

Universality in odd-even harmonic generation and application in terahertz waveform sampling

Authors: Doan-An Trieu, Ngoc-Loan Phan, Quan-Hao Truong, Hien T. Nguyen, Cam-Tu Le, DinhDuy Vu, Van-Hoang Le

Abstract: Odd-even harmonics emitted from a laser-target system imprint rich, subtle information characterizing the system's dynamical asymmetry, which is desirable to decipher. In this Letter, we discover a simple universal relation between the odd-even harmonics and the asymmetry of the THz-assisted laser-atomic system -- atoms in a fundamental mid-IR laser pulse combined with a THz laser. First, we demonstrate numerically and then analytically formulize the harmonic even-to-odd ratio as a function of the THz electric field, the source of the system's asymmetry. Notably, we suggest a scaling that makes the obtained rule universal, independent of the parameters of both the fundamental pulse and atomic target. This universality facilitates us to propose a general pump-probe scheme for THz waveform sampling from the even-to-odd ratio, measurable within a conventional compact setup.

Submitted 16 January, 2023; v1 submitted 7 January, 2023; originally announced January 2023.

arXiv:2209.11518 [pdf, other]

Marine Video Kit: A New Marine Video Dataset for Content-based Analysis and Retrieval

Authors: Quang-Trung Truong, Tuan-Anh Vu, Tan-Sang Ha, Lokoc Jakub, Yue Him Wong Tim, Ajay Joneja, Sai-Kit Yeung

Abstract: Effective analysis of unusual domain specific video collections represents an important practical problem, where state-of-the-art general purpose models still face limitations. Hence, it is desirable to design benchmark datasets that challenge novel powerful models for specific domains with additional constraints. It is important to remember that domain specific data may be noisier (e.g., endoscopic or underwater videos) and often require more experienced users for effective search. In this paper, we focus on single-shot videos taken from moving cameras in underwater environments, which constitute a nontrivial challenge for research purposes. The first shard of a new Marine Video Kit dataset is presented to serve for video retrieval and other computer vision challenges. Our dataset is used in a special session during Video Browser Showdown 2023. In addition to basic meta-data statistics, we present several insights based on low-level features as well as semantic annotations of selected keyframes. The analysis also contains experiments showing limitations of respected general purpose models for retrieval. Our dataset and code are publicly available at https://hkust-vgd.github.io/marinevideokit.

Submitted 6 December, 2022; v1 submitted 23 September, 2022; originally announced September 2022.

Comments: Camera Ready for MMM 2023, Bergen, Norway

arXiv:2206.06992 [pdf, other]

An Experimental Investigation of Part-Of-Speech Taggers for Vietnamese

Authors: Tuan-Phong Nguyen, Quoc-Tuan Truong, Xuan-Nam Nguyen, Anh-Cuong Le

Abstract: Part-of-speech (POS) tagging plays an important role in Natural Language Processing (NLP). Its applications can be found in many NLP tasks such as named entity recognition, syntactic parsing, dependency parsing and text chunking. In the investigation conducted in this paper, we utilize the technologies of two widely-used toolkits, ClearNLP and Stanford POS Tagger, as well as develop two new POS taggers for Vietnamese, then compare them to three well-known Vietnamese taggers, namely JVnTagger, vnTagger and RDRPOSTagger. We make a systematic comparison to find out the tagger having the best performance. We also design a new feature set to measure the performance of the statistical taggers. Our new taggers built from Stanford Tagger and ClearNLP with the new feature set can outperform all other current Vietnamese taggers in term of tagging accuracy. Moreover, we also analyze the affection of some features to the performance of statistical taggers. Lastly, the experimental results also reveal that the transformation-based tagger, RDRPOSTagger, can run significantly faster than any other statistical tagger.

Submitted 14 June, 2022; originally announced June 2022.

Journal ref: VNU Journal of Science Computer Science and Communication Engineering, Vol. 32, No. 3 (2016), 11-25

arXiv:2106.14463 [pdf, other]

RadGraph: Extracting Clinical Entities and Relations from Radiology Reports

Authors: Saahil Jain, Ashwin Agrawal, Adriel Saporta, Steven QH Truong, Du Nguyen Duong, Tan Bui, Pierre Chambon, Yuhao Zhang, Matthew P. Lungren, Andrew Y. Ng, Curtis P. Langlotz, Pranav Rajpurkar

Abstract: Extracting structured clinical information from free-text radiology reports can enable the use of radiology report information for a variety of critical healthcare applications. In our work, we present RadGraph, a dataset of entities and relations in full-text chest X-ray radiology reports based on a novel information extraction schema we designed to structure radiology reports. We release a development dataset, which contains board-certified radiologist annotations for 500 radiology reports from the MIMIC-CXR dataset (14,579 entities and 10,889 relations), and a test dataset, which contains two independent sets of board-certified radiologist annotations for 100 radiology reports split equally across the MIMIC-CXR and CheXpert datasets. Using these datasets, we train and test a deep learning model, RadGraph Benchmark, that achieves a micro F1 of 0.82 and 0.73 on relation extraction on the MIMIC-CXR and CheXpert test sets respectively. Additionally, we release an inference dataset, which contains annotations automatically generated by RadGraph Benchmark across 220,763 MIMIC-CXR reports (around 6 million entities and 4 million relations) and 500 CheXpert reports (13,783 entities and 9,908 relations) with mappings to associated chest radiographs. Our freely available dataset can facilitate a wide range of research in medical natural language processing, as well as computer vision and multi-modal learning when linked to chest radiographs.

Submitted 29 August, 2021; v1 submitted 28 June, 2021; originally announced June 2021.

Comments: Accepted to the 35th Conference on Neural Information Processing Systems (NeurIPS 2021) Track on Datasets and Benchmarks

arXiv:2102.11467 [pdf, other]

doi 10.1145/3450439.3451862

VisualCheXbert: Addressing the Discrepancy Between Radiology Report Labels and Image Labels

Authors: Saahil Jain, Akshay Smit, Steven QH Truong, Chanh DT Nguyen, Minh-Thanh Huynh, Mudit Jain, Victoria A. Young, Andrew Y. Ng, Matthew P. Lungren, Pranav Rajpurkar

Abstract: Automatic extraction of medical conditions from free-text radiology reports is critical for supervising computer vision models to interpret medical images. In this work, we show that radiologists labeling reports significantly disagree with radiologists labeling corresponding chest X-ray images, which reduces the quality of report labels as proxies for image labels. We develop and evaluate methods to produce labels from radiology reports that have better agreement with radiologists labeling images. Our best performing method, called VisualCheXbert, uses a biomedically-pretrained BERT model to directly map from a radiology report to the image labels, with a supervisory signal determined by a computer vision model trained to detect medical conditions from chest X-ray images. We find that VisualCheXbert outperforms an approach using an existing radiology report labeler by an average F1 score of 0.14 (95% CI 0.12, 0.17). We also find that VisualCheXbert better agrees with radiologists labeling chest X-ray images than do radiologists labeling the corresponding radiology reports by an average F1 score across several medical conditions of between 0.12 (95% CI 0.09, 0.15) and 0.21 (95% CI 0.18, 0.24).

Submitted 15 March, 2021; v1 submitted 22 February, 2021; originally announced February 2021.

Comments: Accepted to ACM Conference on Health, Inference, and Learning (ACM-CHIL) 2021

arXiv:2007.06199 [pdf, other]

CheXphoto: 10,000+ Photos and Transformations of Chest X-rays for Benchmarking Deep Learning Robustness

Authors: Nick A. Phillips, Pranav Rajpurkar, Mark Sabini, Rayan Krishnan, Sharon Zhou, Anuj Pareek, Nguyet Minh Phu, Chris Wang, Mudit Jain, Nguyen Duong Du, Steven QH Truong, Andrew Y. Ng, Matthew P. Lungren

Abstract: Clinical deployment of deep learning algorithms for chest x-ray interpretation requires a solution that can integrate into the vast spectrum of clinical workflows across the world. An appealing approach to scaled deployment is to leverage the ubiquity of smartphones by capturing photos of x-rays to share with clinicians using messaging services like WhatsApp. However, the application of chest x-ray algorithms to photos of chest x-rays requires reliable classification in the presence of artifacts not typically encountered in digital x-rays used to train machine learning models. We introduce CheXphoto, a dataset of smartphone photos and synthetic photographic transformations of chest x-rays sampled from the CheXpert dataset. To generate CheXphoto we (1) automatically and manually captured photos of digital x-rays under different settings, and (2) generated synthetic transformations of digital x-rays targeted to make them look like photos of digital x-rays and x-ray films. We release this dataset as a resource for testing and improving the robustness of deep learning algorithms for automated chest x-ray interpretation on smartphone photos of chest x-rays.

Submitted 11 December, 2020; v1 submitted 13 July, 2020; originally announced July 2020.

arXiv:2007.01818 [pdf, other]

Image-based Vehicle Re-identification Model with Adaptive Attention Modules and Metadata Re-ranking

Authors: Quang Truong, Hy Dang, Zhankai Ye, Minh Nguyen, Bo Mei

Abstract: Vehicle Re-identification is a challenging task due to intra-class variability and inter-class similarity across non-overlapping cameras. To tackle these problems, recently proposed methods require additional annotation to extract more features for false positive image exclusion. In this paper, we propose a model powered by adaptive attention modules that requires fewer label annotations but still out-performs the previous models. We also include a re-ranking method that takes account of the importance of metadata feature embeddings in our paper. The proposed method is evaluated on CVPR AI City Challenge 2020 dataset and achieves mAP of 37.25% in Track 2.

Submitted 3 July, 2020; originally announced July 2020.

arXiv:2006.07100 [pdf, other]

Reinforced Data Sampling for Model Diversification

Authors: Hoang D. Nguyen, Xuan-Son Vu, Quoc-Tuan Truong, Duc-Trong Le

Abstract: With the rising number of machine learning competitions, the world has witnessed an exciting race for the best algorithms. However, the involved data selection process may fundamentally suffer from evidence ambiguity and concept drift issues, thereby possibly leading to deleterious effects on the performance of various models. This paper proposes a new Reinforced Data Sampling (RDS) method to learn how to sample data adequately on the search for useful models and insights. We formulate the optimisation problem of model diversification $δ{-div}$ in data sampling to maximise learning potentials and optimum allocation by injecting model diversity. This work advocates the employment of diverse base learners as value functions such as neural networks, decision trees, or logistic regressions to reinforce the selection process of data subsets with multi-modal belief. We introduce different ensemble reward mechanisms, including soft voting and stochastic choice to approximate optimal sampling policy. The evaluation conducted on four datasets evidently highlights the benefits of using RDS method over traditional sampling approaches. Our experimental results suggest that the trainable sampling for model diversification is useful for competition organisers, researchers, or even starters to pursue full potentials of various machine learning tasks such as classification and regression. The source code is available at https://github.com/probeu/RDS.

Submitted 12 June, 2020; originally announced June 2020.

arXiv:2002.02634 [pdf, other]

SideInfNet: A Deep Neural Network for Semi-Automatic Semantic Segmentation with Side Information

Authors: Jing Yu Koh, Duc Thanh Nguyen, Quang-Trung Truong, Sai-Kit Yeung, Alexander Binder

Abstract: Fully-automatic execution is the ultimate goal for many Computer Vision applications. However, this objective is not always realistic in tasks associated with high failure costs, such as medical applications. For these tasks, semi-automatic methods allowing minimal effort from users to guide computer algorithms are often preferred due to desirable accuracy and performance. Inspired by the practicality and applicability of the semi-automatic approach, this paper proposes a novel deep neural network architecture, namely SideInfNet that effectively integrates features learnt from images with side information extracted from user annotations. To evaluate our method, we applied the proposed network to three semantic segmentation tasks and conducted extensive experiments on benchmark datasets. Experimental results and comparison with prior work have verified the superiority of our model, suggesting the generality and effectiveness of the model in semi-automatic semantic segmentation.

Submitted 17 July, 2020; v1 submitted 7 February, 2020; originally announced February 2020.

Comments: ECCV 2020

arXiv:1904.04710 [pdf]

Secure Biometric-based Remote Authentication Protocol using Chebyshev Polynomials and Fuzzy Extractor

Authors: Thi Ai Thao Nguyen, Tran Khanh Dang, Quynh Chi Truong, Dinh Thanh Nguyen

Abstract: In this paper, we propose a multi-factor biometric-based remote authentication protocol. Our proposal overcomes the vulnerabilities of some previous works. At the same time, the protocol also obtains a low false accept rate (FAR) and false reject rate (FRR).

Submitted 9 April, 2019; originally announced April 2019.

Comments: RCCIE17

arXiv:1702.02361 [pdf, ps, other]

Theta functions for Holomorphic triples

Authors: Georg Hein, Thang Quyet Truong

Abstract: We introduce a generalization of the theta divisor to the theory of holomorphic triples on a smooth projective curve $X$. We show that a given triple $T=(E_1 \to E_0)$ is $α$-semistable iff there exists an orthogonal triple $S=(F_1 \to F_0)$ with given numerical invariants. This yields globally generated theta line bundles on the moduli space of semistable triples.

Submitted 8 February, 2017; originally announced February 2017.

Comments: 18 pages

MSC Class: 14D20; 14H60

arXiv:1609.01348 [pdf, other]

Incentive Engineering Framework for Crowdsourcing Systems

Authors: Nhat V. Q. Truong, Sebastian Stein, Long Tran-Thanh, Nicholas R. Jennings

Abstract: Significant effort has been made to understand user motivation and to elicit user participation in crowdsourcing systems. However, incentive engineering, i.e., designing incentives that can purposefully motivate users, is still an open question and remains one of the key challenges of crowdsourcing initiatives. In this work in progress, we propose a general and systematic incentive engineering framework that system designers can use to implement appropriate incentives in order to effect desirable user behaviours.

Submitted 5 September, 2016; originally announced September 2016.

arXiv:1003.0422 [pdf, ps, other]

Uniform Parametrization in Pseudo-Complex Hyperbolic Space

Authors: Minh Q. Truong

Abstract: The parametrization theorem is derived in a flat nD pseudo-complex affine space. The pseudo-complex hyperbolic space accommodates n uncompactified time-like extra dimensions with signature (s,r), where s and r are the numbers of minus and plus signs associated with the diagonalized metric matrix. The main result of the theorem suggests a uniform parametrization for both time-like and space-like dimensions. The uniformization requirement preserves the complex-hyperbolic inner product associated with the space. As an application, the elements of the space are shown to be invariant under linear transformation.

Submitted 1 March, 2010; originally announced March 2010.

arXiv:1002.3821 [pdf, ps, other]

Lorentz Transformation in Flat 5D Complex-Hyperbolic Space

Authors: Minh Q. Truong

Abstract: The Lorentz transformation is derived in 5D flat pseudo-complex affine space, or TT space. The TT space, or pseudo-complex space, accommodates one uncompactified time-like extra dimension. It is shown that the maximum allowable speed for particles living in TT space exceeds the speed of light, c, the absolute speed of the Minkowski space.

Submitted 23 February, 2010; v1 submitted 19 February, 2010; originally announced February 2010.

Comments: Removal of non-alpha numeric characters from the title and abstract

arXiv:1001.4098 [pdf, ps, other]

Extra-Dimensional Approach to Option Pricing and Stochastic Volatility

Authors: Minh Q. Truong

Abstract: The generalized 5D Black-Scholes differential equation with stochastic volatility is derived. The projections of the stochastic evolutions associated with the random variables from an enlarged space or superspace onto an ordinary space can be achieved via higher-dimensional operators. The stochastic nature of the securities and volatility associated with the 3D Merton-Garman equation can then be interpreted as the effects of the extra dimensions. We show that the Merton-Garman equation is the first excited state, i.e. n=m=1, within a family which contains an infinite number of Merton-Garman-like equations.

Submitted 5 February, 2010; v1 submitted 23 January, 2010; originally announced January 2010.

Comments: Ease the time-independent restriction on the extra dimensional coordinates. Fixed typos and expand the conclusion

arXiv:0805.1374 [pdf, ps, other]

Mining a medieval social network by kernel SOM and related methods

Authors: Nathalie Villa, Fabrice Rossi, Quoc-Dinh Truong

Abstract: This paper briefly presents several ways to understand the organization of a large social network (several hundred people). We compare approaches coming from data mining for clustering the vertices of a graph (spectral clustering, self-organizing algorithms, etc.) and provide methods for representing the graph from these analyses. All these methods are illustrated on a medieval social network and the way they can help to understand its organization is underlined.

Submitted 9 May, 2008; originally announced May 2008.

Journal ref: MASHS 2008 (Modèles et Apprentissages en Sciences Humaines et Sociales), Créteil : France (2008)

arXiv:hep-ph/0602088 [pdf, ps, other]

doi 10.1103/PhysRevD.74.035008

Superfield Calculation of Loop Contribution in Extra Dimensional Theories

Authors: Minh Q. Truong

Abstract: Superfields provide a compact description of supersymmetry representations. Loop corrections with superfield formalism are simpler and much more manageable than calculation in terms of component fields. In this paper we calculate the contribution of the Kaluza-Klein states, associated with extra dimensions, to the renormalization group beta function. These Kaluza-Klein particles circulate in the virtual loop, hence affecting the overall corrections at any order. We obtain the one-loop correction, which checks with the result previously obtained using the more laborious component field method. In addition, we calculate the two-loop correction coming from chiral KK states.

Submitted 24 April, 2006; v1 submitted 9 February, 2006; originally announced February 2006.

Comments: 38 pages, 7 figures. Added references

Journal ref: Phys.Rev. D74 (2006) 035008

Showing 1–25 of 25 results for author: Truong, Q