-
Harvesting Events from Multiple Sources: Towards a Cross-Document Event Extraction Paradigm
Authors:
Qiang Gao,
Zixiang Meng,
Bobo Li,
Jun Zhou,
Fei Li,
Chong Teng,
Donghong Ji
Abstract:
Document-level event extraction aims to extract structured event information from unstructured text. However, a single document often contains limited event information and the roles of different event arguments may be biased due to the influence of the information source. This paper addresses the limitations of traditional document-level event extraction by proposing the task of cross-document event extraction (CDEE) to integrate event information from multiple documents and provide a comprehensive perspective on events. We construct a novel cross-document event extraction dataset, namely CLES, which contains 20,059 documents and 37,688 mention-level events, where over 70% of them are cross-document. To build a benchmark, we propose a CDEE pipeline that includes 5 steps, namely event extraction, coreference resolution, entity normalization, role normalization and entity-role resolution. Our CDEE pipeline achieves about 72% F1 in end-to-end cross-document event extraction, suggesting the challenge of this task. Our work builds a new line of information extraction research and will attract new research attention.
Submitted 23 June, 2024;
originally announced June 2024.
-
Enhancing Cross-Document Event Coreference Resolution by Discourse Structure and Semantic Information
Authors:
Qiang Gao,
Bobo Li,
Zixiang Meng,
Yunlong Li,
Jun Zhou,
Fei Li,
Chong Teng,
Donghong Ji
Abstract:
Existing cross-document event coreference resolution models either compute mention similarity directly or enhance mention representations by extracting event arguments (such as location, time, agent, and patient), and thus lack the ability to utilize document-level information. As a result, they struggle to capture long-distance dependencies. This shortcoming leads to underwhelming performance in determining coreference for events whose argument information relies on long-distance dependencies. In light of these limitations, we propose the construction of document-level Rhetorical Structure Theory (RST) trees and cross-document Lexical Chains to model the structural and semantic information of documents. Subsequently, cross-document heterogeneous graphs are constructed and a graph attention network (GAT) is utilized to learn the representations of events. Finally, a pair scorer calculates the similarity between each pair of events, and co-referring events are recognized using a standard clustering algorithm. Additionally, as the existing cross-document event coreference datasets are limited to English, we have developed a large-scale Chinese cross-document event coreference dataset to fill this gap, which comprises 53,066 event mentions and 4,476 clusters. Applied to the English and Chinese datasets respectively, our model outperforms all baselines by large margins.
Submitted 22 June, 2024;
originally announced June 2024.
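The final step described in the abstract above (a pair scorer followed by clustering) can be illustrated with a minimal sketch. The abstract only says "standard clustering algorithm", so the choice here is an assumption: pairs whose score reaches a threshold are linked, and connected components are taken as coreference clusters. The function name `cluster_events`, the threshold value, and the toy scores are all hypothetical.

```python
def cluster_events(num_events, pair_scores, threshold=0.5):
    """Group event mentions into clusters from pairwise similarity scores.

    Assumption: two mentions are linked when their score >= threshold, and
    clusters are the connected components of the resulting graph (union-find).
    """
    parent = list(range(num_events))

    def find(i):
        # Follow parent pointers to the root, halving the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for (a, b), score in pair_scores.items():
        if score >= threshold:
            ra, rb = find(a), find(b)
            if ra != rb:
                # Keep the smaller index as the root for stable output.
                parent[max(ra, rb)] = min(ra, rb)

    clusters = {}
    for i in range(num_events):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


# Toy scores for five event mentions (hypothetical values).
scores = {(0, 1): 0.9, (1, 2): 0.8, (0, 2): 0.3, (2, 3): 0.1, (3, 4): 0.7}
clusters = cluster_events(5, scores)
print(clusters)
```

Note that mentions 0 and 2 end up in the same cluster even though their direct score is low, because both are linked to mention 1; a stricter scheme (for example, average-link agglomerative clustering) would behave differently.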
-
Modeling Unified Semantic Discourse Structure for High-quality Headline Generation
Authors:
Minghui Xu,
Hao Fei,
Fei Li,
Shengqiong Wu,
Rui Sun,
Chong Teng,
Donghong Ji
Abstract:
Headline generation aims to summarize a long document with a short, catchy title that reflects the main idea. This requires accurately capturing the core document semantics, which is challenging due to the lengthy and background information-rich nature of the texts. In this work, we propose using a unified semantic discourse structure (S3) to represent document semantics, achieved by combining document-level rhetorical structure theory (RST) trees with sentence-level abstract meaning representation (AMR) graphs to construct S3 graphs. The hierarchical composition of sentence, clause, and word intrinsically characterizes the semantic meaning of the overall document. We then develop a headline generation framework, in which the S3 graphs are encoded as contextual features. To consolidate the efficacy of S3 graphs, we further devise a hierarchical structure pruning mechanism to dynamically screen out redundant and nonessential nodes within the graph. Experimental results on two headline generation datasets demonstrate that our method consistently outperforms existing state-of-the-art methods. Our work can be instructive for a broad range of document modeling tasks beyond headline generation and summarization.
Submitted 23 March, 2024;
originally announced March 2024.
-
CMNER: A Chinese Multimodal NER Dataset based on Social Media
Authors:
Yuanze Ji,
Bobo Li,
Jun Zhou,
Fei Li,
Chong Teng,
Donghong Ji
Abstract:
Multimodal Named Entity Recognition (MNER) is a pivotal task designed to extract named entities from text with the support of pertinent images. Nonetheless, a notable paucity of data for Chinese MNER has considerably impeded the progress of this natural language processing task within the Chinese domain. Consequently, in this study, we compile a Chinese Multimodal NER dataset (CMNER) utilizing data sourced from Weibo, China's largest social media platform. Our dataset encompasses 5,000 Weibo posts paired with 18,326 corresponding images. The entities are classified into four distinct categories: person, location, organization, and miscellaneous. We perform baseline experiments on CMNER, and the outcomes underscore the effectiveness of incorporating images for NER. Furthermore, we conduct cross-lingual experiments on the publicly available English MNER dataset (Twitter2015), and the results substantiate our hypothesis that Chinese and English multimodal NER data can mutually enhance the performance of the NER model.
Submitted 1 March, 2024; v1 submitted 21 February, 2024;
originally announced February 2024.
-
Reverse Multi-Choice Dialogue Commonsense Inference with Graph-of-Thought
Authors:
Li Zheng,
Hao Fei,
Fei Li,
Bobo Li,
Lizi Liao,
Donghong Ji,
Chong Teng
Abstract:
With the proliferation of dialogic data across the Internet, the Dialogue Commonsense Multi-choice Question Answering (DC-MCQ) task has emerged as a response to the challenge of comprehending user queries and intentions. Although prevailing methodologies exhibit effectiveness in addressing single-choice questions, they encounter difficulties in handling multi-choice queries due to the heightened intricacy and informational density. In this paper, inspired by the human cognitive process of progressively excluding options, we propose a three-step Reverse Exclusion Graph-of-Thought (ReX-GoT) framework, including Option Exclusion, Error Analysis, and Combine Information. Specifically, our ReX-GoT mimics human reasoning by gradually excluding irrelevant options and learning the reasons for option errors to choose the optimal path of the GoT and ultimately infer the correct answer. By progressively integrating intricate clues, our method effectively reduces the difficulty of multi-choice reasoning and provides a novel solution for DC-MCQ. Extensive experiments on the CICERO and CICERO$_{v2}$ datasets validate the significant improvement of our approach on the DC-MCQ task. In the zero-shot setting, our model outperforms the best baseline by 17.67% in terms of F1 score for the multi-choice task. Most strikingly, our GPT3.5-based ReX-GoT framework achieves a remarkable 39.44% increase in F1 score.
Submitted 26 December, 2023; v1 submitted 23 December, 2023;
originally announced December 2023.
-
Subject-Oriented Video Captioning
Authors:
Yunchuan Ma,
Chang Teng,
Yuankai Qi,
Guorong Li,
Laiyu Qing,
Qi Wu,
Qingming Huang
Abstract:
Describing video content according to users' needs is a long-held goal. Although existing video captioning methods have made significant progress, the generated captions may not focus on the entity that users are particularly interested in. To address this problem, we propose a new video captioning task, subject-oriented video captioning, which allows users to specify the target to describe via a bounding box. To support this task, we construct two subject-oriented video captioning datasets based on two widely used video captioning datasets, MSVD and MSRVTT, by annotating subjects in each video for each caption. These datasets pave the way for future technique development. As a first attempt, we evaluate four state-of-the-art general video captioning models and observe a large performance drop. We then explore several strategies to enable them to describe the desired target. Experimental results show obvious improvement, but there is still considerable room for further exploration in this field.
Submitted 20 December, 2023;
originally announced December 2023.
-
Compositional Generalization for Multi-label Text Classification: A Data-Augmentation Approach
Authors:
Yuyang Chai,
Zhuang Li,
Jiahui Liu,
Lei Chen,
Fei Li,
Donghong Ji,
Chong Teng
Abstract:
Despite significant advancements in multi-label text classification, the ability of existing models to generalize to novel and seldom-encountered complex concepts, which are compositions of elementary ones, remains underexplored. This research addresses this gap. By creating unique data splits across three benchmarks, we assess the compositional generalization ability of existing multi-label text classification models. Our results show that these models often fail to generalize to compositional concepts encountered infrequently during training, leading to inferior performance on tests with these new combinations. To address this, we introduce a data augmentation method that leverages two innovative text generation models designed to enhance the classification models' capacity for compositional generalization. Our experiments show that this data augmentation approach significantly improves the compositional generalization capabilities of classification models on our benchmarks, with both generation models surpassing other text generation baselines.
Submitted 20 December, 2023; v1 submitted 18 December, 2023;
originally announced December 2023.
-
How to Data in Datathons
Authors:
Carlos Mougan,
Richard Plant,
Clare Teng,
Marya Bazzi,
Alvaro Cabrejas-Egea,
Ryan Sze-Yin Chan,
David Salvador Jasin,
Martin Stoffel,
Kirstie Jane Whitaker,
Jules Manser
Abstract:
The rise of datathons, also known as data or data science hackathons, has provided a platform to collaborate, learn, and innovate in a short timeframe. Despite their significant potential benefits, organizations often struggle to effectively work with data due to a lack of clear guidelines and best practices for potential issues that might arise. Drawing on our own experiences and insights from organizing >80 datathon challenges with >60 partnership organizations since 2016, we provide guidelines and recommendations that serve as a resource for organizers to navigate the data-related complexities of datathons. We apply our proposed framework to 10 case studies.
Submitted 25 October, 2023; v1 submitted 18 September, 2023;
originally announced September 2023.
-
Revisiting Disentanglement and Fusion on Modality and Context in Conversational Multimodal Emotion Recognition
Authors:
Bobo Li,
Hao Fei,
Lizi Liao,
Yu Zhao,
Chong Teng,
Tat-Seng Chua,
Donghong Ji,
Fei Li
Abstract:
Enabling machines to understand human emotions in multimodal contexts under dialogue scenarios, a task known as multimodal emotion analysis in conversation (MM-ERC), has been a hot research topic. MM-ERC has received consistent attention in recent years, and a diverse range of methods has been proposed to secure better task performance. Most existing works treat MM-ERC as a standard multimodal classification problem and perform multimodal feature disentanglement and fusion to maximize feature utility. Yet after revisiting the characteristics of MM-ERC, we argue that both feature multimodality and conversational contextualization should be properly modeled simultaneously during the feature disentanglement and fusion steps. In this work, we aim to further push task performance by taking full consideration of the above insights. On the one hand, during feature disentanglement, based on the contrastive learning technique, we devise a Dual-level Disentanglement Mechanism (DDM) to decouple the features into both the modality space and utterance space. On the other hand, during the feature fusion stage, we propose a Contribution-aware Fusion Mechanism (CFM) and a Context Refusion Mechanism (CRM) for multimodal and context integration, respectively. Together they schedule the proper integration of multimodal and context features. Specifically, CFM explicitly manages the multimodal feature contributions dynamically, while CRM flexibly coordinates the introduction of dialogue contexts. On two public MM-ERC datasets, our system consistently achieves new state-of-the-art performance. Further analyses demonstrate that all our proposed mechanisms greatly facilitate the MM-ERC task by making full use of the multimodal and context features adaptively. Note that our proposed methods have great potential to facilitate a broader range of other conversational multimodal tasks.
Submitted 12 August, 2023; v1 submitted 8 August, 2023;
originally announced August 2023.
-
DialogRE^C+: An Extension of DialogRE to Investigate How Much Coreference Helps Relation Extraction in Dialogs
Authors:
Yiyun Xiong,
Mengwei Dai,
Fei Li,
Hao Fei,
Bobo Li,
Shengqiong Wu,
Donghong Ji,
Chong Teng
Abstract:
Dialogue relation extraction (DRE), which identifies the relations between argument pairs in dialogue text, suffers greatly from the frequent occurrence of personal pronouns, or entity and speaker coreference. This work introduces a new benchmark dataset, DialogRE^C+, which brings coreference resolution into the DRE scenario. With the aid of high-quality coreference knowledge, the reasoning of argument relations is expected to be enhanced. In the DialogRE^C+ dataset, we manually annotate a total of 5,068 coreference chains over 36,369 argument mentions based on the existing DialogRE data, where four different coreference chain types, namely speaker chain, person chain, location chain and organization chain, are explicitly marked. We further develop 4 coreference-enhanced graph-based DRE models, which learn effective coreference representations for improving the DRE task. We also train a coreference resolution model based on our annotations and evaluate the effect of automatically extracted coreference chains, demonstrating the practicality of our dataset and its potential for other domains and tasks.
Submitted 12 August, 2023; v1 submitted 8 August, 2023;
originally announced August 2023.
-
A Bi-directional Multi-hop Inference Model for Joint Dialog Sentiment Classification and Act Recognition
Authors:
Li Zheng,
Fei Li,
Yuyang Chai,
Chong Teng,
Donghong Ji
Abstract:
The joint task of Dialog Sentiment Classification (DSC) and Act Recognition (DAR) aims to predict the sentiment label and act label for each utterance in a dialog simultaneously. However, current methods encode the dialog context in only one direction, which limits their ability to thoroughly comprehend the context. Moreover, these methods overlook the explicit correlations between sentiment and act labels, which leads to an insufficient ability to capture rich sentiment and act clues and hinders effective and accurate reasoning. To address these issues, we propose a Bi-directional Multi-hop Inference Model (BMIM) that leverages a feature selection network and a bi-directional multi-hop inference network to iteratively extract and integrate rich sentiment and act clues in a bi-directional manner. We also employ contrastive learning and dual learning to explicitly model the correlations of sentiment and act labels. Our experiments on two widely-used datasets show that BMIM outperforms state-of-the-art baselines by at least 2.6% on F1 score in DAR and 1.4% on F1 score in DSC. Additionally, our proposed model not only improves the performance but also enhances the interpretability of the joint sentiment and act prediction task.
Submitted 12 August, 2023; v1 submitted 8 August, 2023;
originally announced August 2023.
-
TKDP: Threefold Knowledge-enriched Deep Prompt Tuning for Few-shot Named Entity Recognition
Authors:
Jiang Liu,
Hao Fei,
Fei Li,
Jingye Li,
Bobo Li,
Liang Zhao,
Chong Teng,
Donghong Ji
Abstract:
Few-shot named entity recognition (NER) exploits limited annotated instances to identify named mentions. Effectively transferring the internal or external resources thus becomes the key to few-shot NER. While the existing prompt tuning methods have shown remarkable few-shot performances, they still fail to make full use of knowledge. In this work, we investigate the integration of rich knowledge to prompt tuning for stronger few-shot NER. We propose incorporating the deep prompt tuning framework with threefold knowledge (namely TKDP), including the internal 1) context knowledge and the external 2) label knowledge & 3) sememe knowledge. TKDP encodes the three feature sources and incorporates them into the soft prompt embeddings, which are further injected into an existing pre-trained language model to facilitate predictions. On five benchmark datasets, our knowledge-enriched model boosts by at most 11.53% F1 over the raw deep prompt method, and significantly outperforms 8 strong-performing baseline systems in 5-/10-/20-shot settings, showing great potential in few-shot NER. Our TKDP can be broadly adapted to other few-shot tasks without effort.
Submitted 10 June, 2023; v1 submitted 6 June, 2023;
originally announced June 2023.
-
ECQED: Emotion-Cause Quadruple Extraction in Dialogs
Authors:
Li Zheng,
Donghong Ji,
Fei Li,
Hao Fei,
Shengqiong Wu,
Jingye Li,
Bobo Li,
Chong Teng
Abstract:
The existing emotion-cause pair extraction (ECPE) task unfortunately ignores the emotion type and cause type, although such fine-grained meta-information can be practically useful in real-world applications, e.g., chat robots and empathic dialog generation. Moreover, current ECPE is limited to the single-text scenario and neglects the dialog level, which has more realistic value. In this paper, we extend the ECPE task with a broader definition and scenario, presenting a new task, Emotion-Cause Quadruple Extraction in Dialogs (ECQED), which requires detecting emotion-cause utterance pairs and emotion and cause types. We present an ECQED model based on a structural and semantic heterogeneous graph as well as a parallel grid tagging scheme, which effectively incorporates the dialog context structure while solving the challenging overlapped quadruple issue. Via experiments, we show that introducing the fine-grained emotion and cause features evidently helps produce better dialog generation. Our proposed ECQED system also shows exceptional superiority over baselines on both the emotion-cause quadruple and pair extraction tasks, while being highly efficient.
Submitted 10 June, 2023; v1 submitted 6 June, 2023;
originally announced June 2023.
-
Towards Large-scale Single-shot Millimeter-wave Imaging for Low-cost Security Inspection
Authors:
Liheng Bian,
Daoyu Li,
Shuoguang Wang,
Chunyang Teng,
Huteng Liu,
Hanwen Xu,
Xuyang Chang,
Guoqiang Zhao,
Shiyong Li,
Jun Zhang
Abstract:
Millimeter-wave (MMW) imaging is emerging as a promising technique for safe security inspection. It achieves a delicate balance between imaging resolution, penetrability and human safety, resulting in higher resolution compared to low-frequency microwave, stronger penetrability compared to visible light, and stronger safety compared to X-ray. Despite recent advances over the last decades, the high cost of the requisite large-scale antenna array hinders widespread adoption of MMW imaging in practice. To tackle this challenge, we report a large-scale single-shot MMW imaging framework using a sparse antenna array, achieving low-cost but high-fidelity security inspection under an interpretable learning scheme. We first collected extensive fully sampled MMW echoes to study the statistical ranking of each element in the large-scale array. These elements are then sampled based on the ranking, building the experimentally optimal sparse sampling strategy that reduces the cost of the antenna array by up to one order of magnitude. Additionally, we derived an untrained interpretable learning scheme, which realizes robust and accurate image reconstruction from sparsely sampled echoes. Last, we developed a neural network for automatic object detection, and experimentally demonstrated successful detection of concealed centimeter-sized targets using a 10% sparse array, whereas all the other contemporary approaches failed at the same sampling ratio. The reported technique shows more than 50% superiority over existing MMW imaging schemes on various metrics including precision, recall, and mAP50. With such strong detection ability and order-of-magnitude cost reduction, we anticipate that this technique provides a practical way for large-scale single-shot MMW imaging, and could promote its further practical applications.
Submitted 18 June, 2023; v1 submitted 25 May, 2023;
originally announced May 2023.
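The ranking-based sparse sampling mentioned in the abstract above can be pictured with a small sketch. The abstract does not say how the per-element ranking is computed, so the importance scores below are placeholder values, and `select_sparse_elements` is a hypothetical helper that simply keeps the top-ranked fraction of array elements.

```python
import numpy as np


def select_sparse_elements(scores, ratio=0.1):
    """Return a boolean mask keeping the top `ratio` fraction of array elements.

    `scores` is a 2-D array of per-element importance values (assumed given;
    the paper's statistical ranking procedure is not reproduced here).
    """
    k = max(1, int(round(scores.size * ratio)))
    # Indices of the k largest scores in the flattened array.
    top_flat = np.argsort(scores, axis=None)[::-1][:k]
    mask = np.zeros(scores.shape, dtype=bool)
    mask[np.unravel_index(top_flat, scores.shape)] = True
    return mask


# Placeholder scores for a 10 x 10 array; a real ranking would come from data.
scores = np.arange(100, dtype=float).reshape(10, 10)
mask = select_sparse_elements(scores, ratio=0.1)
print(mask.sum(), "of", mask.size, "elements kept")
```

With a 10% ratio the mask keeps 10 of the 100 elements, which mirrors the sampling ratio quoted in the abstract; everything else about the sketch is illustrative.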
-
The Role of Semantic Parsing in Understanding Procedural Text
Authors:
Hossein Rajaby Faghihi,
Parisa Kordjamshidi,
Choh Man Teng,
James Allen
Abstract:
In this paper, we investigate whether symbolic semantic representations, extracted from deep semantic parsers, can help reasoning over the states of involved entities in a procedural text. We consider a deep semantic parser (TRIPS) and semantic role labeling as two sources of semantic parsing knowledge. First, we propose PROPOLIS, a symbolic parsing-based procedural reasoning framework. Second, we integrate semantic parsing information into state-of-the-art neural models to conduct procedural reasoning. Our experiments indicate that explicitly incorporating such semantic knowledge improves procedural understanding. This paper presents new metrics for evaluating procedural reasoning tasks that clarify the challenges and identify differences among neural, symbolic, and integrated models.
Submitted 17 May, 2023; v1 submitted 13 February, 2023;
originally announced February 2023.
-
Establishment of Neural Networks Robust to Label Noise
Authors:
Pengwei Yang,
Chongyangzi Teng,
Jack George Mangos
Abstract:
Label noise is a significant obstacle in deep learning model training. It can have a considerable impact on the performance of image classification models, particularly deep neural networks, which are especially susceptible because they have a strong propensity to memorise noisy labels. In this paper, we have examined the fundamental concept underlying related label noise approaches. A transition matrix estimator has been created, and its effectiveness against the actual transition matrix has been demonstrated. In addition, we examined the label noise robustness of two convolutional neural network classifiers with LeNet and AlexNet designs. The two FashionMNIST datasets have revealed the robustness of both models. We were unable to demonstrate the influence of the transition matrix noise correction on robustness because time and computing resource constraints prevented us from properly tuning the complex convolutional neural network model. Additional effort is needed to fine-tune the neural network model and to explore the precision of the estimated transition model in future research.
Submitted 23 April, 2023; v1 submitted 28 November, 2022;
originally announced November 2022.
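To make the transition-matrix idea in the abstract above concrete, here is a minimal sketch. It assumes the common convention T[i, j] = P(noisy label = j | clean label = i), shows the forward correction that maps clean class posteriors to noisy ones, and uses an anchor-point style estimator as a stand-in; the abstract does not state which estimator the authors built, so `estimate_T_from_anchors` is an illustrative assumption, as are all the numbers.

```python
import numpy as np


def forward_correct(clean_probs, T):
    """Map clean posteriors P(y | x) to noisy posteriors P(noisy y | x).

    Assumes T[i, j] = P(noisy = j | clean = i), so each row of T sums to 1.
    """
    return clean_probs @ T


def estimate_T_from_anchors(noisy_probs):
    """Anchor-point style estimate (an assumption, not the paper's method).

    For each class i, take the sample with the highest noisy posterior for
    class i and use its full posterior vector as row i of the estimate.
    """
    anchors = noisy_probs.argmax(axis=0)
    return noisy_probs[anchors]


# Hypothetical 2-class transition matrix and clean posteriors.
T = np.array([[0.8, 0.2],
              [0.3, 0.7]])
clean = np.array([[1.0, 0.0],
                  [0.0, 1.0],
                  [0.5, 0.5]])
noisy = forward_correct(clean, T)
T_hat = estimate_T_from_anchors(noisy)
print(noisy)
print(T_hat)
```

In this idealised setting the first two samples are perfect anchors, so the estimate recovers T exactly; with a trained network the posteriors are only approximate and the estimate inherits that error.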
-
Contaminated Images Recovery by Implementing Non-negative Matrix Factorisation
Authors:
Pengwei Yang,
Chongyangzi Teng,
Jack George Mangos
Abstract:
Non-negative matrix factorisation (NMF) has been extensively applied to the problem of corrupted image data. The standard NMF approach minimises the Euclidean distance between the data matrix and its factorised approximation. Although this method has proven effective, it is sensitive to outliers because it uses the squared error of each data point. In this study, we theoretically examine the robustness of the traditional NMF, HCNMF, and L2,1-NMF algorithms and execute sets of experiments to demonstrate the robustness on ORL and Extended YaleB datasets. Our research indicates that each algorithm requires a different number of iterations to converge. Due to the computational cost of these approaches, our final models, such as the HCNMF and L2,1-NMF model, fail to converge within the iteration parameters of this work. Nonetheless, the experimental results illustrate, to some extent, the robustness of the aforementioned techniques.
Submitted 1 May, 2023; v1 submitted 8 November, 2022;
originally announced November 2022.
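The standard NMF objective referred to in the abstract above, minimising the Euclidean (Frobenius) distance between a data matrix V and a product W H, is commonly solved with the Lee-Seung multiplicative updates. The sketch below shows that baseline only; it is not the HCNMF or L2,1-NMF variant, and the function name, iteration count, and synthetic data are assumptions made for illustration.

```python
import numpy as np


def nmf(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Factorise non-negative V (n x m) as W (n x rank) @ H (rank x m).

    Uses Lee-Seung multiplicative updates for the squared Euclidean loss.
    `eps` avoids division by zero; it is a numerical convenience.
    """
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    for _ in range(n_iter):
        # Each update keeps the factors non-negative by construction.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


# Synthetic non-negative data with an exact rank-2 structure.
rng = np.random.default_rng(1)
V = rng.random((6, 2)) @ rng.random((2, 5))
W, H = nmf(V, rank=2)
error = np.linalg.norm(V - W @ H)
print("reconstruction error:", error)
```

Because every residual enters the loss squared, a few heavily corrupted pixels can dominate these updates, which is the outlier sensitivity the abstract points to and the motivation for the robust variants it compares.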
-
TOE: A Grid-Tagging Discontinuous NER Model Enhanced by Embedding Tag/Word Relations and More Fine-Grained Tags
Authors:
Jiang Liu,
Donghong Ji,
Jingye Li,
Dongdong Xie,
Chong Teng,
Liang Zhao,
Fei Li
Abstract:
Discontinuous named entity recognition (NER) has received increasing research attention, and many related methods have emerged, such as hypergraph-based, span-based, and sequence-to-sequence (Seq2Seq) methods. However, these methods suffer to varying degrees from problems such as decoding ambiguity and low efficiency, which limit their performance. Recently, grid-tagging methods, which benefit from the flexible design of tagging systems and model architectures, have proven well suited to various information extraction tasks. In this paper, we follow this line of work and propose a competitive grid-tagging model for discontinuous NER. We call our model TOE because we incorporate two kinds of Tag-Oriented Enhancement mechanisms into a state-of-the-art (SOTA) grid-tagging model that casts the NER problem as word-word relationship prediction. First, we design a Tag Representation Embedding Module (TREM) to force our model to consider not only word-word relationships but also word-tag and tag-tag relationships. Concretely, we construct tag representations and embed them into TREM, so that TREM can treat tag and word representations as queries/keys/values and utilize self-attention to model their relationships. Second, motivated by the Next-Neighboring-Word (NNW) and Tail-Head-Word (THW) tags in the SOTA model, we add two new symmetric tags, namely Previous-Neighboring-Word (PNW) and Head-Tail-Word (HTW), to model more fine-grained word-word relationships and alleviate error propagation from tag prediction. In experiments on three benchmark datasets, namely CADEC, ShARe13 and ShARe14, our TOE model improves on the SOTA results by about 0.83%, 0.05% and 0.66% in F1, respectively, demonstrating its effectiveness.
Submitted 1 November, 2022;
originally announced November 2022.
-
Multimodal-GuideNet: Gaze-Probe Bidirectional Guidance in Obstetric Ultrasound Scanning
Authors:
Qianhui Men,
Clare Teng,
Lior Drukker,
Aris T. Papageorghiou,
J. Alison Noble
Abstract:
Eye trackers can provide visual guidance to sonographers during ultrasound (US) scanning. Such guidance is potentially valuable for less experienced operators to improve their scanning skills on how to manipulate the probe to achieve the desired plane. In this paper, a multimodal guidance approach (Multimodal-GuideNet) is proposed to capture the stepwise dependency between a real-world US video signal, synchronized gaze, and probe motion within a unified framework. To understand the causal relationship between gaze movement and probe motion, our model exploits multitask learning to jointly learn two related tasks: predicting gaze movements and probe signals that an experienced sonographer would perform in routine obstetric scanning. The two tasks are associated by a modality-aware spatial graph to detect the co-occurrence among the multi-modality inputs and share useful cross-modal information. Instead of a deterministic scanning path, Multimodal-GuideNet allows for scanning diversity by estimating the probability distribution of real scans. Experiments performed with three typical obstetric scanning examinations show that the new approach outperforms single-task learning for both probe motion guidance and gaze movement prediction. Multimodal-GuideNet also provides a visual guidance signal with an error rate of less than 10 pixels for a 224x288 US image.
Submitted 26 July, 2022;
originally announced July 2022.
-
Learnable Mixed-precision and Dimension Reduction Co-design for Low-storage Activation
Authors:
Yu-Shan Tai,
Cheng-Yang Chang,
Chieh-Fang Teng,
An-Yeu Wu
Abstract:
Recently, deep convolutional neural networks (CNNs) have achieved many eye-catching results. However, deploying CNNs on resource-constrained edge devices is hampered by the limited memory bandwidth available for transmitting large intermediate data during inference, i.e., activations. Existing research utilizes mixed-precision and dimension reduction to reduce computational complexity but pays less attention to their application to activation compression. To further exploit the redundancy in activations, we propose a learnable mixed-precision and dimension reduction co-design system, which separates channels into groups and allocates specific compression policies according to their importance. In addition, the proposed dynamic searching technique enlarges the search space and finds the optimal bit-width allocation automatically. Our experimental results show that the proposed methods improve accuracy by 3.54%/1.27% and save 0.18/2.02 bits per value over existing mixed-precision methods on ResNet18 and MobileNetv2, respectively.
Submitted 18 July, 2022; v1 submitted 16 July, 2022;
originally announced July 2022.
-
Unified Named Entity Recognition as Word-Word Relation Classification
Authors:
Jingye Li,
Hao Fei,
Jiang Liu,
Shengqiong Wu,
Meishan Zhang,
Chong Teng,
Donghong Ji,
Fei Li
Abstract:
Named entity recognition (NER) has so far involved three major types, namely flat, overlapped (a.k.a. nested), and discontinuous NER, which have mostly been studied individually. Recently, interest has grown in unified NER, which tackles all three tasks concurrently with a single model. Current best-performing methods are mainly span-based and sequence-to-sequence models; unfortunately, the former focus merely on boundary identification and the latter may suffer from exposure bias. In this work, we present a novel alternative by modeling unified NER as word-word relation classification, namely W^2NER. The architecture resolves the kernel bottleneck of unified NER by effectively modeling the neighboring relations between entity words with Next-Neighboring-Word (NNW) and Tail-Head-Word-* (THW-*) relations. Based on the W^2NER scheme, we develop a neural framework in which unified NER is modeled as a 2D grid of word pairs. We then propose multi-granularity 2D convolutions to better refine the grid representations. Finally, a co-predictor is used to reason sufficiently about the word-word relations. We perform extensive experiments on 14 widely used benchmark datasets for flat, overlapped, and discontinuous NER (8 English and 6 Chinese datasets), where our model beats all the current top-performing baselines, advancing the state of the art in unified NER.
Submitted 19 December, 2021;
originally announced December 2021.
-
Compression-aware Projection with Greedy Dimension Reduction for Convolutional Neural Network Activations
Authors:
Yu-Shan Tai,
Chieh-Fang Teng,
Cheng-Yang Chang,
An-Yeu Wu
Abstract:
Convolutional neural networks (CNNs) achieve remarkable performance in a wide range of fields. However, intensive memory access of activations introduces considerable energy consumption, impeding deployment of CNNs on resource-constrained edge devices. Existing works on activation compression propose transforming feature maps for higher compressibility, thus enabling dimension reduction. Nevertheless, under aggressive dimension reduction, these methods lead to a severe accuracy drop. To improve the trade-off between classification accuracy and compression ratio, we propose a compression-aware projection system, which employs a learnable projection to compensate for the reconstruction loss. In addition, a greedy selection metric is introduced to optimize the layer-wise compression ratio allocation by considering both accuracy and #bits reduction simultaneously. Our test results show that the proposed methods effectively reduce memory access by 2.91x~5.97x with negligible accuracy drop on MobileNetV2/ResNet18/VGG16.
Submitted 17 October, 2021;
originally announced October 2021.
-
Mastering the Explicit Opinion-role Interaction: Syntax-aided Neural Transition System for Unified Opinion Role Labeling
Authors:
Shengqiong Wu,
Hao Fei,
Fei Li,
Donghong Ji,
Meishan Zhang,
Yijiang Liu,
Chong Teng
Abstract:
Unified opinion role labeling (ORL) aims to detect all possible opinion structures of 'opinion-holder-target' in one shot, given a text. The existing transition-based unified method, unfortunately, struggles with longer opinion terms and fails to solve the term overlap issue. The current top performance has been achieved by a span-based graph model, which however still suffers from both high model complexity and insufficient interaction among opinions and roles. In this work, we investigate a novel solution by revisiting the transition architecture and augmenting it with a pointer network (PointNet). The framework parses out all opinion structures in linear time complexity, while PointNet removes the limit on term length. To achieve explicit opinion-role interactions, we further propose a unified dependency-opinion graph (UDOG), co-modeling the syntactic dependency structure and the partial opinion-role structure. We then devise a relation-centered graph aggregator (RCGA) to encode the multi-relational UDOG, where the resulting high-order representations are used to promote the predictions in the vanilla transition system. Our model achieves new state-of-the-art results on the MPQA benchmark. Analyses further demonstrate the superiority of our methods in both efficacy and efficiency.
Submitted 13 December, 2021; v1 submitted 5 October, 2021;
originally announced October 2021.
-
Domain Adaptation for Learning Generator from Paired Few-Shot Data
Authors:
Chun-Chih Teng,
Pin-Yu Chen,
Wei-Chen Chiu
Abstract:
We propose a Paired Few-shot GAN (PFS-GAN) model for learning generators with sufficient source data and only a few target data. While generative model learning typically needs large-scale training data, our PFS-GAN uses not only the concept of few-shot learning but also domain shift to transfer knowledge across domains, which alleviates the problem of obtaining a low-quality generator when training only on target-domain data. The cross-domain datasets are assumed to have two properties: (1) each target-domain sample has its source-domain correspondence and (2) the two domains share similar content information but differ in appearance. Our PFS-GAN aims to learn a disentangled representation from images, which is composed of domain-invariant content features and domain-specific appearance features. Furthermore, a relation loss is introduced on the content features while shifting the appearance features to increase the structural diversity. Extensive experiments show that our method achieves better quantitative and qualitative results on the generated target-domain data, with higher diversity, in comparison to several baselines.
Submitted 25 February, 2021;
originally announced February 2021.
-
A Broad-Coverage Deep Semantic Lexicon for Verbs
Authors:
James Allen,
Hannah An,
Ritwik Bose,
Will de Beaumont,
Choh Man Teng
Abstract:
Progress on deep language understanding is inhibited by the lack of a broad coverage lexicon that connects linguistic behavior to ontological concepts and axioms. We have developed COLLIE-V, a deep lexical resource for verbs, with the coverage of WordNet and syntactic and semantic details that meet or exceed existing resources. Bootstrapping from a hand-built lexicon and ontology, new ontological concepts and lexical entries, together with semantic role preferences and entailment axioms, are automatically derived by combining multiple constraints from parsing dictionary definitions and examples. We evaluated the accuracy of the technique along a number of different dimensions and were able to obtain high accuracy in deriving new concepts and lexical entries. COLLIE-V is publicly available.
Submitted 6 July, 2020;
originally announced July 2020.
-
Neural Network-Aided BCJR Algorithm for Joint Symbol Detection and Channel Decoding
Authors:
Wen-Chiao Tsai,
Chieh-Fang Teng,
Han-Mo Ou,
An-Yeu Wu
Abstract:
Recently, deep learning-assisted communication systems have achieved many eye-catching results and attracted more and more researchers to this emerging field. Instead of completely replacing the functional blocks of communication systems with neural networks, the hybrid BCJRNet symbol detector has been proposed to combine the advantages of the BCJR algorithm and neural networks. However, its separate block design not only degrades the system performance but also results in additional hardware complexity. In this work, we propose a BCJR receiver for joint symbol detection and channel decoding. It can simultaneously utilize the trellis diagram and channel state information (CSI) for a more accurate calculation of branch probability and thus achieve the global optimum with 2.3 dB gain over the separate block design. Furthermore, a dedicated neural network model is proposed to replace the channel-model-based computation of the BCJR receiver, which avoids the requirement of perfect CSI and is more robust under CSI uncertainty, with 1.0 dB gain.
Submitted 21 July, 2020; v1 submitted 30 May, 2020;
originally announced June 2020.
-
Syndrome-Enabled Unsupervised Learning for Neural Network-Based Polar Decoder and Jointly Optimized Blind Equalizer
Authors:
Chieh-Fang Teng,
Yen-Liang Chen
Abstract:
Recently, the syndrome loss has been proposed to achieve "unsupervised learning" for neural network-based BCH/LDPC decoders. However, the design approach cannot be applied to polar codes directly and has not been evaluated under varying channels. In this work, we propose two modified syndrome losses to facilitate unsupervised learning in the receiver. First, we apply the approach to a neural network-based belief propagation (BP) polar decoder. With the aid of the CRC-enabled syndrome loss, the BP decoder can even outperform conventional supervised learning methods in terms of block error rate. Second, we propose a jointly optimized syndrome-enabled blind equalizer, which avoids the transmission of training sequences and achieves the global optimum with 1.3 dB gain over a non-blind minimum mean square error (MMSE) equalizer.
Submitted 16 June, 2020; v1 submitted 6 January, 2020;
originally announced January 2020.
-
Accumulated Polar Feature-based Deep Learning for Efficient and Lightweight Automatic Modulation Classification with Channel Compensation Mechanism
Authors:
Chieh-Fang Teng,
Ching-Yao Chou,
Chun-Hsiang Chen,
An-Yeu Wu
Abstract:
In next-generation communications, massive machine-type communications (mMTC) impose a severe burden on base stations. To address this issue, automatic modulation classification (AMC) can help to reduce signaling overhead by blindly recognizing the modulation types without handshaking. Thus, it plays an important role in future intelligent modems. The emerging deep learning (DL) technique stores intelligence in the network, resulting in superior performance over traditional approaches. However, conventional DL-based approaches suffer from heavy training overhead, memory overhead, and computational complexity, which severely hinder practical applications in resource-limited scenarios, such as Vehicle-to-Everything (V2X) applications. Furthermore, the overhead of online retraining under time-varying fading channels has not been studied in prior art. In this work, an accumulated polar feature-based DL approach with a channel compensation mechanism is proposed to cope with the aforementioned issues. Firstly, the simulation results show that learning features from the polar domain with historical data information can approach near-optimal performance while reducing training overhead by 99.8 times. Secondly, the proposed neural network-based channel estimator (NN-CE) can learn the channel response and compensate for the distorted channel with 13% improvement. Moreover, in applying this lightweight NN-CE in a time-varying fading channel, two efficient mechanisms of online retraining are proposed, which can reduce transmission overhead and retraining overhead by 90% and 76%, respectively. Finally, the performance of the proposed approach is evaluated and compared with prior art on a public dataset to demonstrate its efficiency and light weight.
Submitted 7 February, 2020; v1 submitted 5 January, 2020;
originally announced January 2020.
-
Low-Complexity LSTM-Assisted Bit-Flipping Algorithm for Successive Cancellation List Polar Decoder
Authors:
Chun-Hsiang Chen,
Chieh-Fang Teng,
An-Yeu Wu
Abstract:
Polar codes have attracted much attention in the past decade due to their capacity-achieving performance. Higher decoding capacity is required for 5G and beyond 5G (B5G). Although cyclic redundancy check (CRC)-assisted successive cancellation list bit-flipping (CA-SCLF) decoders have been developed to obtain better performance, the solution to the error bit correction (bit-flipping) problem is still imperfect and hard to design. In this work, we leverage expert knowledge in communication systems and adopt deep learning (DL) techniques to obtain a better solution. A low-complexity long short-term memory network (LSTM)-assisted CA-SCLF decoder is proposed to further improve the performance of conventional CA-SCLF and avoid complexity and memory overhead. Our test results show that we can effectively improve the BLER performance by 0.11 dB compared to prior work and reduce the complexity and memory overhead of the network by over 30%.
Submitted 11 December, 2019;
originally announced December 2019.
-
Unsupervised Learning for Neural Network-based Polar Decoder via Syndrome Loss
Authors:
Chieh-Fang Teng,
An-Yeu Wu
Abstract:
With the rapid growth of deep learning in many fields, machine learning-assisted communication systems have attracted considerable research, with many eye-catching initial results. At the present stage, most methods still demand massive labeled data for supervised learning. However, obtaining labeled data in practical applications is not feasible, which may result in severe performance degradation due to channel variations. To overcome this constraint, the syndrome loss has been proposed to penalize non-valid decoded codewords and achieve unsupervised learning for neural network-based decoders. However, it cannot be applied to polar decoders directly. In this work, by exploiting the nature of polar codes, we propose a modified syndrome loss. Simulation results demonstrate that domain-specific knowledge and know-how in code structure can enable unsupervised learning for neural network-based polar decoders.
Submitted 5 November, 2019;
originally announced November 2019.
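To make the notion of "penalizing non-valid decoded codewords" in the abstract above concrete, here is a small sketch of a hard syndrome check and a soft, hinge-style syndrome penalty computed from log-likelihood ratios (LLRs). The parity-check matrix is that of a (7,4) Hamming code, chosen only as a toy example; the min-sum soft syndrome, the hinge margin of 1, the LLR sign convention, and all function names are assumptions. This is the general idea for linear block codes, not the modified loss the paper proposes for polar codes.

```python
import numpy as np

# Parity-check matrix of a (7,4) Hamming code (toy example).
H = np.array([[1, 0, 1, 0, 1, 0, 1],
              [0, 1, 1, 0, 0, 1, 1],
              [0, 0, 0, 1, 1, 1, 1]])

def hard_syndrome(bits, H):
    """Return H @ bits mod 2; all zeros means the word is a valid codeword."""
    return (H @ np.asarray(bits)) % 2

def soft_syndrome_loss(llr, H):
    """Hinge penalty on a min-sum soft syndrome (assumed formulation).

    Convention assumed here: positive LLR means bit 0, negative means bit 1.
    A satisfied parity check yields a positive soft syndrome and little or no loss.
    """
    llr = np.asarray(llr, dtype=float)
    losses = []
    for row in H:
        idx = np.flatnonzero(row)  # bits taking part in this check
        soft = np.prod(np.sign(llr[idx])) * np.min(np.abs(llr[idx]))
        losses.append(max(1.0 - soft, 0.0))
    return float(np.mean(losses))

valid = np.array([1, 1, 1, 0, 0, 0, 0])    # satisfies all three checks
invalid = np.array([0, 1, 1, 0, 0, 0, 0])  # first bit flipped
llr_valid = 4.0 * (1 - 2 * valid)
llr_invalid = 4.0 * (1 - 2 * invalid)
```

No transmitted labels are needed to evaluate either quantity, which is what makes a syndrome-based penalty usable for unsupervised training.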
-
Convolutional Neural Network-aided Bit-flipping for Belief Propagation Decoding of Polar Codes
Authors:
Chieh-Fang Teng,
Kuan-Shiuan Ho,
Chen-Hsi Wu,
Sin-Sheng Wong,
An-Yeu Wu
Abstract:
Known for their capacity-achieving abilities, polar codes have been selected as the control channel coding scheme for 5G communications. To satisfy the needs of high throughput and low latency, belief propagation (BP) is chosen as the decoding algorithm. However, in general, the error performance of BP is worse than that of enhanced successive cancellation (SC). Recently, critical-set bit-flipping (CS-BF) has been applied to BP decoding to lower the error rate. However, its trial-and-error process results in even longer latency. In this work, we propose a convolutional neural network-assisted bit-flipping (CNN-BF) mechanism to further enhance BP decoding of polar codes. With carefully designed input data and model architecture, our proposed CNN-BF can achieve much higher prediction accuracy and better error correction capability than CS-BF with only half the latency. It also achieves a lower block error rate (BLER) than SC list (CA-SCL).
Submitted 5 February, 2020; v1 submitted 5 November, 2019;
originally announced November 2019.
-
Neural Network-based Equalizer by Utilizing Coding Gain in Advance
Authors:
Chieh-Fang Teng,
Han-Mo Ou,
An-Yeu Wu
Abstract:
Recently, deep learning has been exploited in many fields with revolutionary breakthroughs. In light of this, deep learning-assisted communication systems have also attracted much attention in recent years and have the potential to break down the conventional design rules for communication systems. In this work, we propose two kinds of neural network-based equalizers to exploit the different characteristics of convolutional neural networks and recurrent neural networks. The equalizer in a conventional block-based design may destroy the code structure and degrade the coding gain available to the decoder. In contrast, our proposed approach not only eliminates channel fading but also exploits the code structure by utilizing coding gain in advance, which effectively increases the overall utilization of coding gain, with more than 1.5 dB gain.
Submitted 31 August, 2019; v1 submitted 10 July, 2019;
originally announced July 2019.
-
High-Performance Ultrasonic Levitation with FPGA-based Phased Arrays
Authors:
William Beasley,
Brenda Gatusch,
Daniel Connolly-Taylor,
Chenyuan Teng,
Asier Marzo,
Jose Nunez-Yanez
Abstract:
We present a flexible and self-contained platform for acoustic levitation research based on the Xilinx Zynq SoC using an array of ultrasonic emitters. The platform employs an inexpensive ZedBoard and provides fast movement of the levitated objects as well as object detection based on the produced echo. Several features available in the Zynq device are of benefit for this platform: hardware acceleration for the phase calculations, a large number of parallel I/Os connected through the FPGA Mezzanine connector (FMC), integrated ADC capabilities to capture echo signals, and ease of programmability due to a C-based design flow for both CPU and FPGA. Planar and spherical-cap phased arrays are created, and we investigate the capabilities and limitations of the different designs to improve the stability of the levitation process.
Submitted 24 January, 2019; v1 submitted 18 January, 2019;
originally announced January 2019.
-
Low-complexity Recurrent Neural Network-based Polar Decoder with Weight Quantization Mechanism
Authors:
Chieh-Fang Teng,
Chen-Hsi Wu,
Kuan-Shiuan Ho,
An-Yeu Wu
Abstract:
Polar codes have drawn much attention and been adopted in 5G New Radio (NR) due to their capacity-achieving performance. Recently, as the emerging deep learning (DL) technique has achieved breakthroughs in many fields, neural network decoders have been proposed to obtain faster convergence and better performance than belief propagation (BP) decoding. However, neural networks are memory-intensive, which hinders the deployment of DL in communication systems. In this work, a low-complexity recurrent neural network (RNN) polar decoder with codebook-based weight quantization is proposed. Our test results show that we can effectively reduce the memory overhead by 98% and alleviate computational complexity with slight performance loss.
Submitted 1 February, 2019; v1 submitted 29 October, 2018;
originally announced October 2018.
-
Polar Feature Based Deep Architectures for Automatic Modulation Classification Considering Channel Fading
Authors:
Chieh-Fang Teng,
Ching-Chun Liao,
Chun-Hsiang Chen,
An-Yeu Wu
Abstract:
To develop intelligent receivers, automatic modulation classification (AMC) plays an important role for better spectrum utilization. The emerging deep learning (DL) technique has received much attention in AMC due to its superior performance in classifying data with deep structure. In this work, a novel polar-based deep learning architecture with channel compensation network (CCN) is proposed. Our test results show that learning features from polar domain (r-theta) can improve recognition accuracy by 5% and reduce training overhead by 48%. Besides, the proposed CCN is also robust to channel fading, such as amplitude and phase offsets, and can improve the recognition accuracy by 14% under practical channel environments.
Submitted 7 October, 2018; v1 submitted 3 October, 2018;
originally announced October 2018.
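The polar-domain (r-theta) representation mentioned in the abstract above can be sketched as a simple change of coordinates on complex I/Q samples. The function name `iq_to_polar` and the sample values are assumptions for illustration; how the authors bin or feed these features into their network is not described in the listing.

```python
import numpy as np

def iq_to_polar(iq):
    """Convert complex I/Q samples to (r, theta) polar features.

    r is the amplitude and theta the phase in radians, in (-pi, pi].
    """
    iq = np.asarray(iq, dtype=np.complex128)
    return np.abs(iq), np.angle(iq)

# Three unit-amplitude samples (hypothetical values).
samples = np.array([1 + 0j, 0 + 1j, -1 + 0j])
r, theta = iq_to_polar(samples)
```

In this representation an amplitude offset rescales only `r` and a phase offset shifts only `theta`, so the two channel impairments named in the abstract act on separate coordinates.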
-
FFTPL: An Analytic Placement Algorithm Using Fast Fourier Transform for Density Equalization
Authors:
Jingwei Lu,
Pengwen Chen,
Chin-Chih Chang,
Lu Sha,
Dennis Jen-Hsin Huang,
Chin-Chi Teng,
Chung-Kuan Cheng
Abstract:
We propose a flat nonlinear placement algorithm FFTPL using fast Fourier transform for density equalization. The placement instance is modeled as an electrostatic system with the analogy of density cost to the potential energy. A well-defined Poisson's equation is proposed for gradient and cost computation. Our placer outperforms state-of-the-art placers with better solution quality and efficiency.
Submitted 16 December, 2013;
originally announced December 2013.
-
Possible World Partition Sequences: A Unifying Framework for Uncertain Reasoning
Authors:
Choh Man Teng
Abstract:
When we work with information from multiple sources, the formalism each employs to handle uncertainty may not be uniform. In order to be able to combine these knowledge bases of different formats, we need to first establish a common basis for characterizing and evaluating the different formalisms, and provide a semantics for the combined mechanism. A common framework can provide an infrastructure for building an integrated system, and is essential if we are to understand its behavior. We present a unifying framework based on an ordered partition of possible worlds called partition sequences, which corresponds to our intuitive notion of biasing towards certain possible scenarios when we are uncertain of the actual situation. We show that some of the existing formalisms, namely, default logic, autoepistemic logic, probabilistic conditioning and thresholding (generalized conditioning), and possibility theory can be incorporated into this general framework.
Submitted 13 February, 2013;
originally announced February 2013.
-
Sequential Thresholds: Context Sensitive Default Extensions
Authors:
Choh Man Teng
Abstract:
Default logic encounters some conceptual difficulties in representing common sense reasoning tasks. We argue that we should not try to formulate modular default rules that are presumed to work in all or most circumstances. We need to take into account the importance of the context which is continuously evolving during the reasoning process. Sequential thresholding is a quantitative counterpart of default logic which makes explicit the role context plays in the construction of a non-monotonic extension. We present a semantic characterization of generic non-monotonic reasoning, as well as the instantiations pertaining to default logic and sequential thresholding. This provides a link between the two mechanisms as well as a way to integrate the two that can be beneficial to both.
Submitted 6 February, 2013;
originally announced February 2013.
-
Choosing Among Interpretations of Probability
Authors:
Henry E. Kyburg Jr.,
Choh Man Teng
Abstract:
There is available an ever-increasing variety of procedures for managing uncertainty. These methods are discussed in the literature of artificial intelligence, as well as in the literature of philosophy of science. Heretofore these methods have been evaluated by intuition, discussion, and the general philosophical method of argument and counterexample. Almost any method of uncertainty management will have the property that in the long run it will deliver numbers approaching the relative frequency of the kinds of events at issue. To find a measure that will provide a meaningful evaluation of these treatments of uncertainty, we must look, not at the long run, but at the short or intermediate run. Our project attempts to develop such a measure in terms of short or intermediate length performance. We represent the effects of practical choices by the outcomes of bets offered to agents characterized by two uncertainty management approaches: the subjective Bayesian approach and the Classical confidence interval approach. Experimental evaluation suggests that the confidence interval approach can outperform the subjective approach in the relatively short run.
Submitted 23 January, 2013;
originally announced January 2013.
-
Recipe recommendation using ingredient networks
Authors:
Chun-Yuen Teng,
Yu-Ru Lin,
Lada A. Adamic
Abstract:
The recording and sharing of cooking recipes, a human activity dating back thousands of years, naturally became an early and prominent social use of the web. The resulting online recipe collections are repositories of ingredient combinations and cooking methods whose large scale and variety yield interesting insights about both the fundamentals of cooking and user preferences. At the level of an individual ingredient we measure whether it tends to be essential or can be dropped or added, and whether its quantity can be modified. We also construct two types of networks to capture the relationships between ingredients. The complement network captures which ingredients tend to co-occur frequently, and is composed of two large communities: one savory, the other sweet. The substitute network, derived from user-generated suggestions for modifications, can be decomposed into many communities of functionally equivalent ingredients, and captures users' preference for healthier variants of a recipe. Our experiments reveal that recipe ratings can be well predicted with features derived from combinations of ingredient networks and nutrition information.
Submitted 21 May, 2012; v1 submitted 16 November, 2011;
originally announced November 2011.
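
A complement network of the kind this abstract describes can be sketched by counting how often pairs of ingredients appear in the same recipe. The abstract does not say how edges are weighted, so the raw co-occurrence counts below are an assumption, and the function name `cooccurrence_counts` and the toy recipes are hypothetical.

```python
from collections import Counter
from itertools import combinations

def cooccurrence_counts(recipes):
    """Count, for each unordered ingredient pair, the recipes containing both."""
    counts = Counter()
    for ingredients in recipes:
        # Deduplicate and sort so each pair has one canonical (a, b) key.
        for a, b in combinations(sorted(set(ingredients)), 2):
            counts[(a, b)] += 1
    return counts

# Hypothetical toy data: two sweet recipes and one savory recipe.
recipes = [
    ["butter", "flour", "sugar"],
    ["flour", "sugar", "egg"],
    ["garlic", "onion", "tomato"],
]
edges = cooccurrence_counts(recipes)

for pair, weight in edges.most_common(3):
    print(pair, weight)
```

Even in this toy data the sweet and savory ingredients share no edges, which hints at how community structure could emerge from co-occurrence alone.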
-
Coevolution of Network Structure and Content
Authors:
Chun-Yuen Teng,
Liuling Gong,
Avishay Livne,
Celso Brunetti,
Lada A. Adamic
Abstract:
As individuals communicate, their exchanges form a dynamic network. We demonstrate, using time series analysis of communication in three online settings, that network structure alone can be highly revealing of the diversity and novelty of the information being communicated. Our approach uses both standard and novel network metrics to characterize how unexpected a network configuration is, and to capture a network's ability to conduct information. We find that networks with a higher conductance in link structure exhibit higher information entropy, while unexpected network configurations can be tied to information novelty. We use a simulation model to explain the observed correspondence between the evolution of a network's structure and the information it carries.
Submitted 21 May, 2012; v1 submitted 27 July, 2011;
originally announced July 2011.
-
Evaluating Defaults
Authors:
Henry E. Kyburg Jr.,
Choh Man Teng
Abstract:
We seek to find normative criteria of adequacy for nonmonotonic logic similar to the criterion of validity for deductive logic. Rather than stipulating that the conclusion of an inference be true in all models in which the premises are true, we require that the conclusion of a nonmonotonic inference be true in "almost all" models of a certain sort in which the premises are true. This "certain sort" specification picks out the models that are relevant to the inference, taking into account factors such as specificity and vagueness, and previous inferences. The frequencies characterizing the relevant models reflect known frequencies in our actual world. The criteria of adequacy for a default inference can be extended by thresholding to criteria of adequacy for an extension. We show that this avoids the implausibilities that might otherwise result from the chaining of default inferences. The model proportions, when construed in terms of frequencies, provide a verifiable grounding of default rules, and can become the basis for generating default rules from statistics.
Submitted 24 July, 2002;
originally announced July 2002.