An active learning model to classify animal species in Hong Kong

Authors: Gareth Lamb, Ching Hei Lo, Jin Wu, Calvin K. F. Lee

Abstract: Camera traps are used by ecologists globally as an efficient and non-invasive method to monitor animals. While it is time-consuming to manually label the collected images, recent advances in deep learning and computer vision have made it possible to automate this process [1]. A major obstacle is the generalisability of these models when they are applied to independently collected data from other parts of the world [2]. Here, we use a deep active learning workflow [3] and train a model that is applicable to camera trap images collected in Hong Kong.

Submitted 22 March, 2024; originally announced March 2024.

Comments: 6 pages, 2 figures, 1 table

arXiv:2309.08325 [pdf, other]

Distributional Inclusion Hypothesis and Quantifications: Probing for Hypernymy in Functional Distributional Semantics

Authors: Chun Hei Lo, Wai Lam, Hong Cheng, Guy Emerson

Abstract: Functional Distributional Semantics (FDS) models the meaning of words by truth-conditional functions. This provides a natural representation for hypernymy but no guarantee that it can be learnt when FDS models are trained on a corpus. In this paper, we probe into FDS models and study the representations learnt, drawing connections between quantifications, the Distributional Inclusion Hypothesis (DIH), and the variational-autoencoding objective of FDS model training. Using synthetic data sets, we reveal that FDS models learn hypernymy on a restricted class of corpus that strictly follows the DIH. We further introduce a training objective that both enables hypernymy learning under the reverse of the DIH and improves hypernymy detection from real corpora.

Submitted 10 February, 2024; v1 submitted 15 September, 2023; originally announced September 2023.

Comments: 12 pages

arXiv:2101.06066 [pdf, other]

Unstructured Knowledge Access in Task-oriented Dialog Modeling using Language Inference, Knowledge Retrieval and Knowledge-Integrative Response Generation

Authors: Mudit Chaudhary, Borislav Dzodzo, Sida Huang, Chun Hei Lo, Mingzhi Lyu, Lun Yiu Nie, Jinbo Xing, Tianhua Zhang, Xiaoying Zhang, Jingyan Zhou, Hong Cheng, Wai Lam, Helen Meng

Abstract: Dialog systems enriched with external knowledge can handle user queries that are outside the scope of the supporting databases/APIs. In this paper, we follow the baseline provided in DSTC9 Track 1 and propose three subsystems, KDEAK, KnowleDgEFactor, and Ens-GPT, which form the pipeline for a task-oriented dialog system capable of accessing unstructured knowledge. Specifically, KDEAK performs knowledge-seeking turn detection by formulating the problem as natural language inference using knowledge from dialogs, databases and FAQs. KnowleDgEFactor accomplishes the knowledge selection task by formulating a factorized knowledge/document retrieval problem with three modules performing domain, entity and knowledge level analyses. Ens-GPT generates a response by first processing multiple knowledge snippets, followed by an ensemble algorithm that decides if the response should be solely derived from a GPT2-XL model, or regenerated in combination with the top-ranking knowledge snippet. Experimental results demonstrate that the proposed pipeline system outperforms the baseline and generates high-quality responses, achieving at least 58.77% improvement on BLEU-4 score.

Submitted 15 January, 2021; originally announced January 2021.

Showing 1–3 of 3 results for author: Lo, C H