-
How good are my search strings? Reflections on using an existing review as a quasi-gold standard
Authors:
Huynh Khanh Vi Tran,
Jürgen Börstler,
Nauman Bin Ali,
Michael Unterkalmsteiner
Abstract:
Background: Systematic literature studies (SLS) have become a core research methodology in Evidence-based Software Engineering (EBSE). Search completeness, i.e., finding all relevant papers on the topic of interest, has been recognized as one of the most commonly discussed validity issues of SLSs. Aim: This study aims to raise awareness of the issues related to search string construction and search validation using a quasi-gold standard (QGS). Furthermore, we aim to provide guidelines for search string validation. Method: We use a recently completed tertiary study as a case and complement our findings with the observations from other researchers studying and advancing EBSE. Results: We found that the issue of assessing QGS quality has not received much attention in the literature, and the validation of automated searches in SLSs could be improved. Hence, we propose to extend the current search validation approach with an additional step that analyzes the automated search validation results, and we provide recommendations for QGS construction. Conclusion: In this paper, we report on new issues that could affect search completeness in SLSs. Furthermore, the proposed guideline and recommendations could help researchers implement a more reliable search strategy in their SLSs.
Submitted 16 February, 2024;
originally announced February 2024.
-
Assessing test artifact quality -- A tertiary study
Authors:
Huynh Khanh Vi Tran,
Michael Unterkalmsteiner,
Jürgen Börstler,
Nauman bin Ali
Abstract:
Context: Modern software development increasingly relies on software testing for an ever more frequent delivery of high-quality software. This puts high demands on the quality of the central artifacts in software testing, test suites and test cases. Objective: We aim to develop a comprehensive model for capturing the dimensions of test case/suite quality, which are relevant for a variety of perspectives. Method: We have carried out a systematic literature review to identify and analyze existing secondary studies on quality aspects of software testing artifacts. Results: We identified 49 relevant secondary studies. Of these 49 studies, fewer than half did some form of quality appraisal of the included primary studies and only 3 took into account the quality of the primary study when synthesizing the results. We present an aggregation of the context dimensions and factors that can be used to characterize the environment in which the test case/suite quality is investigated. We also provide a comprehensive model of test case/suite quality with definitions for the quality attributes and measurements based on findings in the literature and ISO/IEC 25010:2011. Conclusion: The test artifact quality model presented in the paper can be used to support test artifact quality assessment and improvement initiatives in practice. Furthermore, the model can also be used as a framework for documenting context characteristics to make research results more accessible for research and practice. Information and Software Technology 139 (2021): 106620.
Submitted 14 February, 2024;
originally announced February 2024.
-
ViCLEVR: A Visual Reasoning Dataset and Hybrid Multimodal Fusion Model for Visual Question Answering in Vietnamese
Authors:
Khiem Vinh Tran,
Hao Phu Phan,
Kiet Van Nguyen,
Ngan Luu Thuy Nguyen
Abstract:
In recent years, Visual Question Answering (VQA) has gained significant attention for its diverse applications, including intelligent car assistance, aiding visually impaired individuals, and document image information retrieval using natural language queries. VQA requires effective integration of information from questions and images to generate accurate answers. Neural models for VQA have made remarkable progress on large-scale datasets, with a primary focus on resource-rich languages like English. To address this gap, we introduce the ViCLEVR dataset, a pioneering collection for evaluating various visual reasoning capabilities in Vietnamese while mitigating biases. The dataset comprises over 26,000 images and 30,000 question-answer pairs (QAs), each question annotated to specify the type of reasoning involved. Leveraging this dataset, we conduct a comprehensive analysis of contemporary visual reasoning systems, offering valuable insights into their strengths and limitations. Furthermore, we present PhoVIT, a comprehensive multimodal fusion model that identifies objects in images based on questions. The architecture effectively employs transformers to enable simultaneous reasoning over textual and visual data, merging both modalities at an early model stage. The experimental findings demonstrate that our proposed model achieves state-of-the-art performance across four evaluation metrics. The accompanying code and dataset have been made publicly accessible at https://github.com/kvt0012/ViCLEVR. This provision seeks to stimulate advancements within the research community, fostering the development of more multimodal fusion algorithms specifically tailored to address the nuances of low-resource languages, exemplified by Vietnamese.
Submitted 27 October, 2023;
originally announced October 2023.
-
Generative Pre-trained Transformer for Vietnamese Community-based COVID-19 Question Answering
Authors:
Tam Minh Vo,
Khiem Vinh Tran
Abstract:
Recent studies have provided empirical evidence of the wide-ranging potential of Generative Pre-trained Transformer (GPT), a pretrained language model, in the field of natural language processing. GPT has been effectively employed as a decoder within state-of-the-art (SOTA) question answering systems, yielding exceptional performance across various tasks. However, the current research landscape concerning GPT's application in Vietnamese remains limited. This paper aims to address this gap by presenting an implementation of GPT-2 for community-based question answering specifically focused on COVID-19 related queries in Vietnamese. We introduce a novel approach by conducting a comparative analysis of different Transformers vs SOTA models in the community-based COVID-19 question answering dataset. The experimental findings demonstrate that the GPT-2 models exhibit highly promising outcomes, outperforming other SOTA models as well as previous community-based COVID-19 question answering models developed for Vietnamese.
Submitted 31 October, 2023; v1 submitted 23 October, 2023;
originally announced October 2023.
-
Test-Case Quality -- Understanding Practitioners' Perspectives
Authors:
Huynh Khanh Vi Tran,
Nauman Bin Ali,
Jürgen Börstler,
Michael Unterkalmsteiner
Abstract:
Background: Test-case quality has always been one of the major concerns in software testing. To improve test-case quality, it is important to better understand how practitioners perceive the quality of test cases. Objective: Motivated by that need, we investigated how practitioners define test-case quality and which aspects of test cases are important for quality assessment. Method: We conducted semi-structured interviews with professional developers, testers and test architects from a multinational software company in Sweden. Before the interviews, we asked participants for actual test cases (written in natural language) that they perceive as good, normal, and bad, respectively, together with rationales for their assessment. We also compared their opinions on shared test cases and contrasted their views with the relevant literature. Results: We present a quality model which consists of 11 test-case quality attributes. We also identify a misalignment in defining test-case quality among practitioners and between academia and industry, along with suggestions for improving test-case quality in industry. Conclusion: The results show that practitioners' backgrounds, including roles and working experience, are critical dimensions of how test-case quality is defined and assessed.
Submitted 28 September, 2023;
originally announced September 2023.
-
BARTPhoBEiT: Pre-trained Sequence-to-Sequence and Image Transformers Models for Vietnamese Visual Question Answering
Authors:
Khiem Vinh Tran,
Kiet Van Nguyen,
Ngan Luu Thuy Nguyen
Abstract:
Visual Question Answering (VQA) is an intricate and demanding task that integrates natural language processing (NLP) and computer vision (CV), capturing the interest of researchers. The English language, renowned for its wealth of resources, has witnessed notable advancements in both datasets and models designed for VQA. However, there is a lack of models that target specific countries such as Vietnam. To address this limitation, we introduce a transformer-based Vietnamese model named BARTPhoBEiT. This model includes pre-trained Sequence-to-Sequence and bidirectional encoder representation from Image Transformers in Vietnamese and is evaluated on Vietnamese VQA datasets. Experimental results demonstrate that our proposed model outperforms the strong baseline and improves the state-of-the-art in six metrics: Accuracy, Precision, Recall, F1-score, WUPS 0.0, and WUPS 0.9.
Submitted 28 July, 2023;
originally announced July 2023.
-
A Comparative Study of Question Answering over Knowledge Bases
Authors:
Khiem Vinh Tran,
Hao Phu Phan,
Khang Nguyen Duc Quach,
Ngan Luu-Thuy Nguyen,
Jun Jo,
Thanh Tam Nguyen
Abstract:
Question answering over knowledge bases (KBQA) has become a popular approach to help users extract information from knowledge bases. Although several systems exist, choosing one suitable for a particular application scenario is difficult. In this article, we provide a comparative study of six representative KBQA systems on eight benchmark datasets. In this study, we examine various question types, properties, languages, and domains to provide insights on where existing systems struggle. On top of that, we propose an advanced mapping algorithm to aid existing models in achieving superior results. Moreover, we also develop a multilingual corpus, COVID-KGQA, which encourages COVID-19 research and multilingualism for the diversity of future AI. Finally, we discuss the key findings and their implications as well as performance guidelines and some future improvements. Our source code is available at https://github.com/tamlhp/kbqa.
Submitted 15 November, 2022;
originally announced November 2022.
-
Conversational Machine Reading Comprehension for Vietnamese Healthcare Texts
Authors:
Son T. Luu,
Mao Nguyen Bui,
Loi Duc Nguyen,
Khiem Vinh Tran,
Kiet Van Nguyen,
Ngan Luu-Thuy Nguyen
Abstract:
Machine reading comprehension (MRC) is a sub-field of natural language processing that aims to help computers understand unstructured texts and then answer questions related to them. In practice, conversation is an essential way to communicate and transfer information. To help machines understand conversation texts, we present UIT-ViCoQA, a new corpus for conversational machine reading comprehension in the Vietnamese language. This corpus consists of 10,000 questions with answers over 2,000 conversations about health news articles. We then evaluate several baseline approaches for conversational machine comprehension on the UIT-ViCoQA corpus. The best model obtains an F1 score of 45.27%, which is 30.91 points behind human performance (76.18%), indicating that there is ample room for improvement. Our dataset is available on our website, http://nlp.uit.edu.vn/datasets/, for research purposes.
Submitted 30 September, 2021; v1 submitted 4 May, 2021;
originally announced May 2021.
-
On Sampling-Based Training Criteria for Neural Language Modeling
Authors:
Yingbo Gao,
David Thulke,
Alexander Gerstenberger,
Khoa Viet Tran,
Ralf Schlüter,
Hermann Ney
Abstract:
As the vocabulary size of modern word-based language models becomes ever larger, many sampling-based training criteria are proposed and investigated. The essence of these sampling methods is that the softmax-related traversal over the entire vocabulary can be simplified, giving speedups compared to the baseline. A problem we notice about the current landscape of such sampling methods is the lack of a systematic comparison and some myths about preferring one over another. In this work, we consider Monte Carlo sampling, importance sampling, a novel method we call compensated partial summation, and noise contrastive estimation. Linking back to the three traditional criteria, namely mean squared error, binary cross-entropy, and cross-entropy, we derive the theoretical solutions to the training problems. Contrary to some common belief, we show that all these sampling methods can perform equally well, as long as we correct for the intended class posterior probabilities. Experimental results in language modeling and automatic speech recognition on Switchboard and LibriSpeech support our claim, with all sampling-based methods showing similar perplexities and word error rates while giving the expected speedups.
Submitted 17 June, 2021; v1 submitted 21 April, 2021;
originally announced April 2021.
-
UIT-HSE at WNUT-2020 Task 2: Exploiting CT-BERT for Identifying COVID-19 Information on the Twitter Social Network
Authors:
Khiem Vinh Tran,
Hao Phu Phan,
Kiet Van Nguyen,
Ngan Luu-Thuy Nguyen
Abstract:
Recently, COVID-19 has affected a variety of real-life aspects of the world and led to dreadful consequences. More and more tweets about COVID-19 have been shared publicly on Twitter. However, the plurality of those tweets are uninformative, which makes it challenging to build automatic systems that detect the informative ones for useful AI applications. In this paper, we present our results at the W-NUT 2020 Shared Task 2: Identification of Informative COVID-19 English Tweets. In particular, we propose a simple but effective approach using transformer-based models based on COVID-Twitter-BERT (CT-BERT) with different fine-tuning techniques. As a result, we achieve an F1-score of 90.94%, placing third on the leaderboard of this task, which attracted 56 submitting teams in total.
Submitted 13 November, 2020; v1 submitted 7 September, 2020;
originally announced September 2020.
-
Enhancing lexical-based approach with external knowledge for Vietnamese multiple-choice machine reading comprehension
Authors:
Kiet Van Nguyen,
Khiem Vinh Tran,
Son T. Luu,
Anh Gia-Tuan Nguyen,
Ngan Luu-Thuy Nguyen
Abstract:
Although Vietnamese is the 17th most popular native-speaker language in the world, there are not many research studies on Vietnamese machine reading comprehension (MRC), the task of understanding a text and answering questions about it. One of the reasons is the lack of high-quality benchmark datasets for this task. In this work, we construct a dataset which consists of 2,783 pairs of multiple-choice questions and answers based on 417 Vietnamese texts which are commonly used for teaching reading comprehension to elementary school pupils. In addition, we propose a lexical-based MRC method that utilizes semantic similarity measures and external knowledge sources to analyze questions and extract answers from the given text. We compare the performance of the proposed model with several baseline lexical-based and neural network-based models. Our proposed method achieves an accuracy of 61.81%, which is 5.51% higher than the best baseline model. We also measure human performance on our dataset and find that there is a big gap between machine-model and human performance. This indicates that significant progress can be made on this task. The dataset is freely available on our website for research purposes.
Submitted 1 November, 2020; v1 submitted 16 January, 2020;
originally announced January 2020.