-
EmT: A Novel Transformer for Generalized Cross-subject EEG Emotion Recognition
Authors:
Yi Ding,
Chengxuan Tong,
Shuailei Zhang,
Muyun Jiang,
Yong Li,
Kevin Lim Jun Liang,
Cuntai Guan
Abstract:
Integrating prior knowledge of neurophysiology into neural network architecture enhances the performance of emotion decoding. While numerous techniques emphasize learning spatial and short-term temporal patterns, there has been limited emphasis on capturing the vital long-term contextual information associated with emotional cognitive processes. In order to address this discrepancy, we introduce a novel transformer model called emotion transformer (EmT). EmT is designed to excel in both generalized cross-subject EEG emotion classification and regression tasks. In EmT, EEG signals are transformed into a temporal graph format, creating a sequence of EEG feature graphs using a temporal graph construction module (TGC). A novel residual multi-view pyramid GCN module (RMPG) is then proposed to learn dynamic graph representations for each EEG feature graph within the series, and the learned representations of each graph are fused into one token. Furthermore, we design a temporal contextual transformer module (TCT) with two types of token mixers to learn the temporal contextual information. Finally, the task-specific output module (TSO) generates the desired outputs. Experiments on four publicly available datasets show that EmT achieves higher results than the baseline methods for both EEG emotion classification and regression tasks. The code is available at https://github.com/yi-ding-cs/EmT.
Submitted 26 June, 2024;
originally announced June 2024.
-
Discursive objection strategies in online comments: Developing a classification schema and validating its training
Authors:
Ashley L. Shea,
Aspen K. B. Omapang,
Ji Yong Cho,
Miryam Y. Ginsparg,
Natalie Bazarova,
Winice Hui,
René F. Kizilcec,
Chau Tong,
Drew Margolin
Abstract:
Most Americans agree that misinformation, hate speech and harassment are harmful and inadequately curbed on social media through current moderation practices. In this paper, we aim to understand the discursive strategies employed by people in response to harmful speech in news comments. We conducted a content analysis of more than 6500 comment replies to trending news videos on YouTube and Twitter and identified seven distinct discursive objection strategies (Study 1). We examined the frequency of each strategy's occurrence from the 6500 comment replies, as well as from a second sample of 2004 replies (Study 2). Together, these studies show that people deploy a diversity of discursive strategies when objecting to speech, and reputational attacks are the most common. The resulting classification scheme accounts for different theoretical approaches for expressing objections and offers a comprehensive perspective on grassroots efforts aimed at stopping offensive or problematic speech on campus.
Submitted 13 May, 2024;
originally announced May 2024.
-
EEG-Deformer: A Dense Convolutional Transformer for Brain-computer Interfaces
Authors:
Yi Ding,
Yong Li,
Hao Sun,
Rui Liu,
Chengxuan Tong,
Cuntai Guan
Abstract:
Effectively learning the temporal dynamics in electroencephalogram (EEG) signals is challenging yet essential for decoding brain activities using brain-computer interfaces (BCIs). Although Transformers are popular for their long-term sequential learning ability in the BCI field, most methods combining Transformers with convolutional neural networks (CNNs) fail to capture the coarse-to-fine temporal dynamics of EEG signals. To overcome this limitation, we introduce EEG-Deformer, which incorporates two main novel components into a CNN-Transformer: (1) a Hierarchical Coarse-to-Fine Transformer (HCT) block that integrates a Fine-grained Temporal Learning (FTL) branch into Transformers, effectively discerning coarse-to-fine temporal patterns; and (2) a Dense Information Purification (DIP) module, which utilizes multi-level, purified temporal information to enhance decoding accuracy. Comprehensive experiments on three representative cognitive tasks consistently verify the generalizability of our proposed EEG-Deformer, demonstrating that it either outperforms existing state-of-the-art methods or is comparable to them. Visualization results show that EEG-Deformer learns from neurophysiologically meaningful brain regions for the corresponding cognitive tasks. The source code can be found at https://github.com/yi-ding-cs/EEG-Deformer.
Submitted 25 April, 2024;
originally announced May 2024.
-
CAPTURE-24: A large dataset of wrist-worn activity tracker data collected in the wild for human activity recognition
Authors:
Shing Chan,
Hang Yuan,
Catherine Tong,
Aidan Acquah,
Abram Schonfeldt,
Jonathan Gershuny,
Aiden Doherty
Abstract:
Existing activity tracker datasets for human activity recognition are typically obtained by having participants perform predefined activities in an enclosed environment under supervision. This results in small datasets with a limited number of activities and heterogeneity, lacking the mixed and nuanced movements normally found in free-living scenarios. As such, models trained on laboratory-style datasets may not generalise out of sample. To address this problem, we introduce a new dataset involving wrist-worn accelerometers, wearable cameras, and sleep diaries, enabling data collection for over 24 hours in a free-living setting. The result is CAPTURE-24, a large activity tracker dataset collected in the wild from 151 participants, amounting to 3883 hours of accelerometer data, of which 2562 hours are annotated. CAPTURE-24 is two to three orders of magnitude larger than existing publicly available datasets, which is critical to developing accurate human activity recognition models.
Submitted 29 February, 2024;
originally announced February 2024.
-
Autonomous Vehicle Patrolling Through Deep Reinforcement Learning: Learning to Communicate and Cooperate
Authors:
Chenhao Tong,
Maria A. Rodriguez,
Richard O. Sinnott
Abstract:
Autonomous vehicles are suited for continuous area patrolling problems. Finding an optimal patrolling strategy can be challenging due to unknown environmental factors, such as wind or landscape; or autonomous vehicles' constraints, such as limited battery life or hardware failures. Importantly, patrolling large areas often requires multiple agents to collectively coordinate their actions. However, an optimal coordination strategy is often non-trivial to be manually defined due to the complex nature of patrolling environments. In this paper, we consider a patrolling problem with environmental factors, agent limitations, and three typical cooperation problems -- collision avoidance, congestion avoidance, and patrolling target negotiation. We propose a multi-agent reinforcement learning solution based on a reinforced inter-agent learning (RIAL) method. With this approach, agents are trained to develop their own communication protocol to cooperate during patrolling where faults can and do occur. The solution is validated through simulation experiments and is compared with several state-of-the-art patrolling solutions from different perspectives, including the overall patrol performance, the collision avoidance performance, the efficiency of battery recharging strategies, and the overall fault tolerance.
Submitted 28 January, 2024;
originally announced February 2024.
-
An Efficient Intelligent Semi-Automated Warehouse Inventory Stocktaking System
Authors:
Chunan Tong
Abstract:
In the context of evolving supply chain management, the significance of efficient inventory management has grown substantially for businesses. However, conventional manual and experience-based approaches often struggle to meet the complexities of modern market demands. This research introduces an intelligent inventory management system to address challenges related to inaccurate data, delayed monitoring, and overreliance on subjective experience in forecasting. The proposed system integrates bar code and distributed flutter application technologies for intelligent perception, alongside comprehensive big data analytics to enable data-driven decision-making. Through meticulous analysis, system design, critical technology exploration, and simulation validation, the effectiveness of the proposed system is successfully demonstrated. The intelligent system facilitates second-level monitoring, high-frequency checks, and artificial intelligence-driven forecasting, consequently enhancing the automation, precision, and intelligence of inventory management. This system contributes to cost reduction and optimized inventory sizes through accurate predictions and informed decisions, ultimately achieving a mutually beneficial scenario. The outcomes of this research offer
Submitted 12 September, 2023;
originally announced September 2023.
-
Detect Depression from Social Networks with Sentiment Knowledge Sharing
Authors:
Yan Shi,
Yao Tian,
Chengwei Tong,
Chunyan Zhu,
Qianqian Li,
Mengzhu Zhang,
Wei Zhao,
Yong Liao,
Pengyuan Zhou
Abstract:
Social networks play an important role in propagating people's viewpoints, emotions, thoughts, and fears. Notably, following lockdown periods during the COVID-19 pandemic, the issue of depression has garnered increasing attention, with a significant portion of individuals resorting to social networks as an outlet for expressing emotions. Using deep learning techniques to discern potential signs of depression from social network messages facilitates the early identification of mental health conditions. Current efforts in detecting depression through social networks typically rely solely on analyzing the textual content, overlooking other potential information. In this work, we conduct a thorough investigation that unveils a strong correlation between depression and negative emotional states. The integration of such associations as external knowledge can provide valuable insights for detecting depression. Accordingly, we propose a multi-task training framework, DeSK, which utilizes shared sentiment knowledge to enhance the efficacy of depression detection. Experiments conducted on both Chinese and English datasets demonstrate the cross-lingual effectiveness of DeSK.
Submitted 13 June, 2023;
originally announced June 2023.
-
A Perspective Study on Chinese Social Media regarding LLM for Education and Beyond
Authors:
Yao Tian,
Chengwei Tong,
Lik-Hang Lee,
Reza Hadi Mogavi,
Yong Liao,
Pengyuan Zhou
Abstract:
The application of AI-powered tools has piqued the interest of many fields, particularly in the academic community. This study uses ChatGPT, currently the most powerful and popular AI tool, as a representative example to analyze how the Chinese public perceives the potential of large language models (LLMs) for educational and general purposes. Although facing accessibility challenges, we found that the number of discussions on ChatGPT per month is 16 times that of Ernie Bot developed by Baidu, the most popular alternative product to ChatGPT in the mainland, making ChatGPT a more suitable subject for our analysis. The study also serves as the first effort to investigate the changes in public opinion as AI technologies become more advanced and intelligent. The analysis reveals that, upon first encounters with advanced AI that was not yet highly capable, some social media users believed that AI advancements would benefit education and society, while others feared that advanced AI, like ChatGPT, would make humans feel inferior and lead to problems such as cheating and a decline in moral principles. The majority of users remained neutral. Interestingly, with the rapid development and improvement of AI capabilities, public attitudes have tended to shift in a positive direction. We present a thorough analysis of the trending shift and a roadmap to ensure the ethical application of ChatGPT-like models in education and beyond.
Submitted 31 May, 2024; v1 submitted 7 June, 2023;
originally announced June 2023.
-
An Energy-aware and Fault-tolerant Deep Reinforcement Learning based approach for Multi-agent Patrolling Problems
Authors:
Chenhao Tong,
Aaron Harwood,
Maria A. Rodriguez,
Richard O. Sinnott
Abstract:
Autonomous vehicles are suited for continuous area patrolling problems. However, finding an optimal patrolling strategy can be challenging for many reasons. Firstly, patrolling environments are often complex and can include unknown environmental factors, such as wind or landscape. Secondly, autonomous vehicles can have failures or hardware constraints, such as limited battery life. Importantly, patrolling large areas often requires multiple agents that need to collectively coordinate their actions. In this work, we consider these limitations and propose an approach based on model-free, deep multi-agent reinforcement learning. In this approach, the agents are trained to patrol an environment with various unknown dynamics and factors. They can automatically recharge themselves to support continuous collective patrolling. A distributed homogeneous multi-agent architecture is proposed, where all patrolling agents execute identical policies locally based on their local observations and shared location information. This architecture provides a patrolling system that can tolerate agent failures and allow supplementary agents to be added to replace failed agents or to increase the overall patrol performance. The solution is validated through simulation experiments from multiple perspectives, including the overall patrol performance, the efficiency of battery recharging strategies, the overall fault tolerance, and the ability to cooperate with supplementary agents.
Submitted 8 June, 2023; v1 submitted 15 December, 2022;
originally announced December 2022.
-
Learning Prototype via Placeholder for Zero-shot Recognition
Authors:
Zaiquan Yang,
Yang Liu,
Wenjia Xu,
Chong Huang,
Lei Zhou,
Chao Tong
Abstract:
Zero-shot learning (ZSL) aims to recognize unseen classes by exploiting semantic descriptions shared between seen classes and unseen classes. Current methods show that it is effective to learn visual-semantic alignment by projecting semantic embeddings into the visual space as class prototypes. However, such a projection function is only concerned with seen classes. When applied to unseen classes, the prototypes often perform suboptimally due to domain shift. In this paper, we propose to learn prototypes via placeholders, termed LPL, to eliminate the domain shift between seen and unseen classes. Specifically, we combine seen classes to hallucinate new classes, which act as placeholders for the unseen classes in the visual and semantic space. Placed between seen classes, the placeholders encourage prototypes of seen classes to be highly dispersed, leaving more space for the insertion of well-separated unseen ones. Empirically, well-separated prototypes help counteract visual-semantic misalignment caused by domain shift. Furthermore, we exploit a novel semantic-oriented fine-tuning to guarantee the semantic reliability of placeholders. Extensive experiments on five benchmark datasets demonstrate the significant performance gain of LPL over the state-of-the-art methods. Code is available at https://github.com/zaiquanyang/LPL.
Submitted 29 July, 2022;
originally announced July 2022.
-
Adaptive Weighted Nonnegative Matrix Factorization for Robust Feature Representation
Authors:
Tingting Shen,
Junhang Li,
Can Tong,
Qiang He,
Chen Li,
Yudong Yao,
Yueyang Teng
Abstract:
Nonnegative matrix factorization (NMF) has been widely used for dimensionality reduction in machine learning. However, traditional NMF does not properly handle outliers and is therefore sensitive to noise. To improve the robustness of NMF, this paper proposes an adaptive weighted NMF, which introduces weights to emphasize the different importance of each data point, thereby decreasing the algorithm's sensitivity to noisy data. This differs from existing robust NMFs, which use a slow-growth similarity measure. Specifically, two strategies are proposed to achieve this: a fuzzier weighted technique and an entropy weighted regularized technique, both of which lead to an iterative solution with a simple form. Experimental results show that the new methods yield a more robust feature representation than existing methods on several real datasets with noise.
Submitted 7 June, 2022;
originally announced June 2022.
-
Self-supervised Learning for Human Activity Recognition Using 700,000 Person-days of Wearable Data
Authors:
Hang Yuan,
Shing Chan,
Andrew P. Creagh,
Catherine Tong,
Aidan Acquah,
David A. Clifton,
Aiden Doherty
Abstract:
Advances in deep learning for human activity recognition have been relatively limited due to the lack of large labelled datasets. In this study, we leverage self-supervised learning techniques on the UK-Biobank activity tracker dataset--the largest of its kind to date--containing more than 700,000 person-days of unlabelled wearable sensor data. Our resulting activity recognition model consistently outperformed strong baselines across seven benchmark datasets, with an F1 relative improvement of 2.5%-100% (median 18.4%), the largest improvements occurring in the smaller datasets. In contrast to previous studies, our results generalise across external datasets, devices, and environments. Our open-source model will help researchers and developers to build customisable and generalisable activity classifiers with high performance.
Submitted 20 June, 2024; v1 submitted 6 June, 2022;
originally announced June 2022.
-
Subspace Nonnegative Matrix Factorization for Feature Representation
Authors:
Junhang Li,
Jiao Wei,
Can Tong,
Tingting Shen,
Yuchen Liu,
Chen Li,
Shouliang Qi,
Yudong Yao,
Yueyang Teng
Abstract:
Traditional nonnegative matrix factorization (NMF) learns a new feature representation on the whole data space, which means treating all features equally. However, a subspace is often sufficient for accurate representation in practical applications, and redundant features can be invalid or even harmful. For example, if a camera has some sensors destroyed, then the corresponding pixels in the photos from this camera are not helpful for identifying the content, which means only the subspace consisting of the remaining pixels is worthy of attention. This paper proposes a new NMF method that introduces adaptive weights to identify key features in the original space, so that only a subspace is involved in generating the new representation. Two strategies are proposed to achieve this: the fuzzier weighted technique and the entropy regularized weighted technique, both of which result in an iterative solution with a simple form. Experimental results on several real-world datasets demonstrate that the proposed methods can generate a more accurate feature representation than existing methods. The code developed in this study is available at https://github.com/WNMF1/FWNMF-ERWNMF.
Submitted 18 April, 2022;
originally announced April 2022.
-
"Vironment": An Art of Wearable Social Distancing
Authors:
Steve Mann,
Cayden Pierce,
Christopher Tong,
Christina Mann
Abstract:
"Vironment" is a series of art pieces, social commentary, technology, etc., based on wearable health technologies of social-distancing, culminating in a social-distancing device that takes the familiar world of security and surveillance technologies that surround us and re-situates it on the body of the wearer (technologies that become part of us). This piece also introduces a conceptual framework for (1) the sensing of the self together with (2) sensing of others and (3) sensing of the environment around us.
Submitted 2 December, 2021; v1 submitted 30 November, 2021;
originally announced December 2021.
-
An Entropy Weighted Nonnegative Matrix Factorization Algorithm for Feature Representation
Authors:
Jiao Wei,
Can Tong,
Bingxue Wu,
Qiang He,
Shouliang Qi,
Yudong Yao,
Yueyang Teng
Abstract:
Nonnegative matrix factorization (NMF) has been widely used to learn low-dimensional representations of data. However, NMF pays the same attention to all attributes of a data point, which inevitably leads to inaccurate representation. For example, in a human-face data set, if an image contains a hat on the head, the hat should be removed or the importance of its corresponding attributes should be decreased during matrix factorizing. This paper proposes a new type of NMF called entropy weighted NMF (EWNMF), which uses an optimizable weight for each attribute of each data point to emphasize their importance. This process is achieved by adding an entropy regularizer to the cost function and then using the Lagrange multiplier method to solve the problem. Experimental results with several data sets demonstrate the feasibility and effectiveness of the proposed method. We make our code available at https://github.com/Poisson-EM/Entropy-weighted-NMF.
Submitted 27 November, 2021;
originally announced November 2021.
-
Tele-Operated Oropharyngeal Swab (TOOS) Robot Enabled by TSS Soft Hand for Safe and Effective COVID-19 OP Sampling
Authors:
Wei Chen,
Jianshu Zhou,
Shing Shin Cheng,
Yiang Lu,
Fangxun Zhong,
Yuan Gao,
Yaqing Wang,
Lingbin Xue,
Michael C. F. Tong,
Yun-Hui Liu
Abstract:
The COVID-19 pandemic has imposed serious challenges on many aspects of human life. To diagnose COVID-19, oropharyngeal swab (OP swab) sampling is generally applied for viral nucleic acid (VNA) specimen collection. However, manual sampling exposes medical staff to a high risk of infection. Robotic sampling is a promising way to minimize this risk, but traditional robots suffer from safety, cost, and control complexity issues that hinder wide-scale deployment. In this work, we show that soft robotic technology is a promising way to achieve robotic OP swab sampling, with excellent swab manipulability in a confined oral space and dexterity comparable to the existing manual approach. This is enabled by a novel Tstone soft (TSS) hand, consisting of a soft wrist and a soft gripper, designed from observation of human sampling and bio-inspiration. The TSS hand is compact, offers a larger workspace, and achieves dexterity comparable to that of the human hand. The soft wrist is capable of agile omnidirectional bending with adjustable stiffness. The terminal soft gripper is effective for pinching and replacing disposable swabs. The OP sampling force is easily maintained in a safe and comfortable range (throat sampling comfortable region) under a hybrid motion and stiffness virtual fixture-based controller. A dedicated 3-DOF RCM platform is used for global positioning of the TSS hand. Design, modeling, and control of the TSS hand are discussed in detail with dedicated experimental validations. A sampling test based on human tele-operation is performed on an oral cavity model with an excellent success rate. The proposed TOOS robot is a highly promising solution for tele-operated, safe, cost-effective, and quickly deployable COVID-19 OP swab sampling.
Submitted 20 September, 2021;
originally announced September 2021.
-
Similarity Embedding Networks for Robust Human Activity Recognition
Authors:
Chenglin Li,
Carrie Lu Tong,
Di Niu,
Bei Jiang,
Xiao Zuo,
Lei Cheng,
Jian Xiong,
Jianming Yang
Abstract:
Deep learning models for human activity recognition (HAR) based on sensor data have been heavily studied recently. However, the generalization ability of deep models on complex real-world HAR data is limited by the availability of high-quality labeled activity data, which are hard to obtain. In this paper, we design a similarity embedding neural network that maps input sensor signals onto real vectors through carefully designed convolutional and LSTM layers. The embedding network is trained with a pairwise similarity loss, encouraging the clustering of samples from the same class in the embedded real space, and can be effectively trained on a small dataset and even on a noisy dataset with mislabeled samples. Based on the learned embeddings, we further propose both nonparametric and parametric approaches for activity recognition. Extensive evaluation based on two public datasets has shown that the proposed similarity embedding network significantly outperforms state-of-the-art deep models on HAR classification tasks, is robust to mislabeled samples in the training set, and can also be used to effectively denoise a noisy dataset.
Submitted 31 May, 2021;
originally announced June 2021.
-
LGGNet: Learning from Local-Global-Graph Representations for Brain-Computer Interface
Authors:
Yi Ding,
Neethu Robinson,
Chengxuan Tong,
Qiuhao Zeng,
Cuntai Guan
Abstract:
Neuropsychological studies suggest that co-operative activities among different brain functional areas drive high-level cognitive processes. To learn the brain activities within and among different functional areas of the brain, we propose LGGNet, a novel neurologically inspired graph neural network, to learn local-global-graph representations of electroencephalography (EEG) for Brain-Computer Interface (BCI). The input layer of LGGNet comprises a series of temporal convolutions with multi-scale 1D convolutional kernels and kernel-level attentive fusion. It captures temporal dynamics of EEG which then serves as input to the proposed local and global graph-filtering layers. Using a defined neurophysiologically meaningful set of local and global graphs, LGGNet models the complex relations within and among functional areas of the brain. Under the robust nested cross-validation settings, the proposed method is evaluated on three publicly available datasets for four types of cognitive classification tasks, namely, the attention, fatigue, emotion, and preference classification tasks. LGGNet is compared with state-of-the-art methods, such as DeepConvNet, EEGNet, R2G-STNN, TSception, RGNN, AMCNN-DGCN, HRNN and GraphNet. The results show that LGGNet outperforms these methods, and the improvements are statistically significant (p<0.05) in most cases. The results show that bringing neuroscience prior knowledge into neural network design yields an improvement of classification performance. The source code can be found at https://github.com/yi-ding-cs/LGG
Submitted 5 December, 2022; v1 submitted 5 May, 2021;
originally announced May 2021.
-
Predicting Patient Outcomes with Graph Representation Learning
Authors:
Emma Rocheteau,
Catherine Tong,
Petar Veličković,
Nicholas Lane,
Pietro Liò
Abstract:
Recent work on predicting patient outcomes in the Intensive Care Unit (ICU) has focused heavily on the physiological time series data, largely ignoring sparse data such as diagnoses and medications. When they are included, they are usually concatenated in the late stages of a model, which may struggle to learn from rarer disease patterns. Instead, we propose a strategy to exploit diagnoses as relational information by connecting similar patients in a graph. To this end, we propose LSTM-GNN for patient outcome prediction tasks: a hybrid model combining Long Short-Term Memory networks (LSTMs) for extracting temporal features and Graph Neural Networks (GNNs) for extracting the patient neighbourhood information. We demonstrate that LSTM-GNNs outperform the LSTM-only baseline on length of stay prediction tasks on the eICU database. More generally, our results indicate that exploiting information from neighbouring patient cases using graph neural networks is a promising research direction, yielding tangible returns in supervised learning performance on Electronic Health Records.
Submitted 11 January, 2021;
originally announced January 2021.
-
RainBench: Towards Global Precipitation Forecasting from Satellite Imagery
Authors:
Christian Schroeder de Witt,
Catherine Tong,
Valentina Zantedeschi,
Daniele De Martini,
Freddie Kalaitzis,
Matthew Chantry,
Duncan Watson-Parris,
Piotr Bilinski
Abstract:
Extreme precipitation events, such as violent rainfall and hail storms, routinely ravage economies and livelihoods around the developing world. Climate change further aggravates this issue. Data-driven deep learning approaches could widen the access to accurate multi-day forecasts, to mitigate against such events. However, there is currently no benchmark dataset dedicated to the study of global precipitation forecasts. In this paper, we introduce RainBench, a new multi-modal benchmark dataset for data-driven precipitation forecasting. It includes simulated satellite data, a selection of relevant meteorological data from the ERA5 reanalysis product, and IMERG precipitation data. We also release PyRain, a library to process large precipitation datasets efficiently. We present an extensive analysis of our novel dataset and establish baseline results for two benchmark medium-range precipitation forecasting tasks. Finally, we discuss existing data-driven weather forecasting methodologies and suggest future research avenues.
Submitted 17 December, 2020;
originally announced December 2020.
-
Small Private Online Judge: A New Tool for Empirical Education Research
Authors:
Yunchi Zhu,
Zuohan Zhao,
Chengda Tong,
Xiaojun Xia
Abstract:
This paper puts forward the concept of Small Private Online Judge (SPOJ). Compared with Massive Open Online Judge (MOOJ), SPOJ has advantages in structured data acquisition of students' virtual behavior for its specific function and tight coupling with the classroom. SPOJ-based empirical education research can be conducted within "Acquisition-Analysis-Application" (3A) Framework. The case study of a SPOJ program clarifies the standard pattern of SPOJ-based 3A research and highlights the emergence of education-intelligence concept. The challenges of SPOJ-based empirical education research and implications of SPOJ are also discussed.
Submitted 25 September, 2020;
originally announced October 2020.
-
CUCHILD: A Large-Scale Cantonese Corpus of Child Speech for Phonology and Articulation Assessment
Authors:
Si-Ioi Ng,
Cymie Wing-Yee Ng,
Jiarui Wang,
Tan Lee,
Kathy Yuet-Sheung Lee,
Michael Chi-Fai Tong
Abstract:
This paper describes the design and development of CUCHILD, a large-scale Cantonese corpus of child speech. The corpus contains spoken words collected from 1,986 child speakers aged from 3 to 6 years old. The speech materials include 130 words of 1 to 4 syllables in length. The speakers cover both typically developing (TD) children and children with speech disorders. The intended use of the corpus is to support scientific and clinical research, as well as technology development related to child speech assessment. The design of the corpus, including selection of words, participant recruitment, the data acquisition process, and data pre-processing, is described in detail. The results of acoustical analysis are presented to illustrate the properties of child speech. Potential applications of the corpus in automatic speech recognition, phonological error detection and speaker diarization are also discussed.
Submitted 7 August, 2020;
originally announced August 2020.
-
IMUTube: Automatic Extraction of Virtual on-body Accelerometry from Video for Human Activity Recognition
Authors:
Hyeokhyen Kwon,
Catherine Tong,
Harish Haresamudram,
Yan Gao,
Gregory D. Abowd,
Nicholas D. Lane,
Thomas Ploetz
Abstract:
The lack of large-scale, labeled data sets impedes progress in developing robust and generalized predictive models for on-body sensor-based human activity recognition (HAR). Labeled data in human activity recognition is scarce and hard to come by, as sensor data collection is expensive, and the annotation is time-consuming and error-prone. To address this problem, we introduce IMUTube, an automated processing pipeline that integrates existing computer vision and signal processing techniques to convert videos of human activity into virtual streams of IMU data. These virtual IMU streams represent accelerometry at a wide variety of locations on the human body. We show how the virtually-generated IMU data improves the performance of a variety of models on known HAR datasets. Our initial results are very promising, but the greater promise of this work lies in a collective approach by the computer vision, signal processing, and activity recognition communities to extend this work in ways that we outline. This should lead to on-body, sensor-based HAR becoming yet another success story in large-dataset breakthroughs in recognition.
Submitted 4 August, 2020; v1 submitted 29 May, 2020;
originally announced June 2020.
-
DISIR: Deep Image Segmentation with Interactive Refinement
Authors:
Gaston Lenczner,
Bertrand Le Saux,
Nicola Luminari,
Adrien Chan Hon Tong,
Guy Le Besnerais
Abstract:
This paper presents an interactive approach for multi-class segmentation of aerial images. Precisely, it is based on a deep neural network which exploits both RGB images and annotations. Starting from an initial output based on the image only, our network then interactively refines this segmentation map using a concatenation of the image and user annotations. Importantly, user annotations modify the inputs of the network - not its weights - enabling a fast and smooth process. Through experiments on two public aerial datasets, we show that user annotations are extremely rewarding: each click corrects roughly 5000 pixels. We analyze the impact of different aspects of our framework such as the representation of the annotations, the volume of training data or the network architecture. Code is available at https://github.com/delair-ai/DISIR.
Submitted 20 August, 2020; v1 submitted 31 March, 2020;
originally announced March 2020.
-
Are Accelerometers for Activity Recognition a Dead-end?
Authors:
Catherine Tong,
Shyam A. Tailor,
Nicholas D. Lane
Abstract:
Accelerometer-based (and by extension other inertial sensors) research for Human Activity Recognition (HAR) is a dead-end. This sensor does not offer enough information for us to progress in the core domain of HAR - to recognize everyday activities from sensor data. Despite continued and prolonged efforts in improving feature engineering and machine learning models, the activities that we can recognize reliably have only expanded slightly and many of the same flaws of early models are still present today. Instead of relying on acceleration data, we should consider modalities with much richer information - a logical choice is images. With the rapid advance in image sensing hardware and modelling techniques, we believe that a widespread adoption of image sensors will open many opportunities for accurate and robust inference across a wide spectrum of human activities.
In this paper, we make the case for imagers in place of accelerometers as the default sensor for human activity recognition. Our review of past works has led to the observation that progress in HAR has stalled because of our reliance on accelerometers. We further argue for the suitability of images for activity recognition by illustrating their richness of information and the marked progress in computer vision. Through a feasibility analysis, we find that deploying imagers and CNNs on device poses no substantial burden on modern mobile hardware. Overall, our work highlights the need to move away from accelerometers and calls for further exploration of using imagers for activity recognition.
Submitted 30 January, 2020; v1 submitted 22 January, 2020;
originally announced January 2020.
-
Fast Prototyping a Dialogue Comprehension System for Nurse-Patient Conversations on Symptom Monitoring
Authors:
Zhengyuan Liu,
Hazel Lim,
Nur Farah Ain Binte Suhaimi,
Shao Chuen Tong,
Sharon Ong,
Angela Ng,
Sheldon Lee,
Michael R. Macdonald,
Savitha Ramasamy,
Pavitra Krishnaswamy,
Wai Leng Chow,
Nancy F. Chen
Abstract:
Data for human-human spoken dialogues for research and development are currently very limited in quantity, variety, and sources; such data are even scarcer in healthcare. In this work, we investigate fast prototyping of a dialogue comprehension system by leveraging minimal nurse-to-patient conversations. We propose a framework inspired by nurse-initiated clinical symptom monitoring conversations to construct a simulated human-human dialogue dataset, embodying linguistic characteristics of spoken interactions like thinking aloud, self-contradiction, and topic drift. We then adopt an established bidirectional attention pointer network on this simulated dataset, achieving more than 80% F1 score on a held-out test set from real-world nurse-to-patient conversations. The ability to automatically comprehend conversations in the healthcare domain by exploiting only limited data has implications for improving clinical workflows through red flag symptom detection and triaging capabilities. We demonstrate the feasibility of efficient and effective extraction, retrieval and comprehension of symptom checking information discussed in multi-turn human-human spoken conversations.
Submitted 5 April, 2019; v1 submitted 8 March, 2019;
originally announced March 2019.
-
Effects of Some Lattice Reductions on the Success Probability of the Zero-Forcing Decoder
Authors:
Jinming Wen,
Chao Tong,
Shi Bai
Abstract:
The zero-forcing (ZF) decoder is a commonly used approximate solution of the integer least squares problem, which arises in communications and many other applications. Numerical simulations have shown that the LLL reduction can usually improve the success probability $P_{ZF}$ of the ZF decoder. In this paper, we first rigorously show that both SQRD and V-BLAST, two commonly used lattice reductions, have no effect on $P_{ZF}$. Then, we show that the LLL reduction can improve $P_{ZF}$ when $n=2$, and we analyze how the parameter $\delta$ in the LLL reduction affects the enhancement of $P_{ZF}$. Finally, an example is given which shows that the LLL reduction can decrease $P_{ZF}$ when $n\geq3$.
Submitted 23 July, 2018; v1 submitted 10 July, 2018;
originally announced July 2018.
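For readers unfamiliar with the zero-forcing decoder mentioned in the abstract above, the following is a minimal sketch under stated assumptions: the integer least squares problem is taken as minimising ||y - A x|| over integer vectors x, and the ZF estimate is obtained by solving the real-valued least-squares problem and rounding each component. The helper names `solve_linear` and `zf_decode` and the toy matrix are invented for this illustration; no lattice reduction (LLL, SQRD, V-BLAST) is applied.

```python
def solve_linear(M, b):
    """Solve the square system M x = b by Gaussian elimination with partial pivoting."""
    n = len(M)
    aug = [list(M[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular to working precision")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def zf_decode(A, y):
    """Zero-forcing estimate: round the real least-squares solution of min ||y - A x||."""
    m, n = len(A), len(A[0])
    # Normal equations (A^T A) x = A^T y; adequate for a small, well-conditioned toy case.
    AtA = [[sum(A[k][i] * A[k][j] for k in range(m)) for j in range(n)] for i in range(n)]
    Aty = [sum(A[k][i] * y[k] for k in range(m)) for i in range(n)]
    x_real = solve_linear(AtA, Aty)
    return [round(v) for v in x_real]


# Toy example: y = A @ [3, -2] plus small noise.
A = [[2.0, 1.0], [0.5, 3.0], [1.0, -1.0]]
y = [4.05, -4.53, 5.02]
x_hat = zf_decode(A, y)
```

The success probability discussed in the abstract is the probability that this rounded estimate equals the transmitted integer vector; it depends on the noise level and on the basis A, which is why lattice reduction applied to A beforehand can change it.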
-
Vote3Deep: Fast Object Detection in 3D Point Clouds Using Efficient Convolutional Neural Networks
Authors:
Martin Engelcke,
Dushyant Rao,
Dominic Zeng Wang,
Chi Hay Tong,
Ingmar Posner
Abstract:
This paper proposes a computationally efficient approach to detecting objects natively in 3D point clouds using convolutional neural networks (CNNs). In particular, this is achieved by leveraging a feature-centric voting scheme to implement novel convolutional layers which explicitly exploit the sparsity encountered in the input. To this end, we examine the trade-off between accuracy and speed for different architectures and additionally propose to use an L1 penalty on the filter activations to further encourage sparsity in the intermediate representations. To the best of our knowledge, this is the first work to propose sparse convolutional layers and L1 regularisation for efficient large-scale processing of 3D data. We demonstrate the efficacy of our approach on the KITTI object detection benchmark and show that Vote3Deep models with as few as three layers outperform the previous state of the art in both laser and laser-vision based approaches by margins of up to 40% while remaining highly competitive in terms of processing time.
Submitted 5 March, 2017; v1 submitted 21 September, 2016;
originally announced September 2016.
-
Batch Nonlinear Continuous-Time Trajectory Estimation as Exactly Sparse Gaussian Process Regression
Authors:
Sean Anderson,
Timothy D. Barfoot,
Chi Hay Tong,
Simo Särkkä
Abstract:
In this paper, we revisit batch state estimation through the lens of Gaussian process (GP) regression. We consider continuous-discrete estimation problems wherein a trajectory is viewed as a one-dimensional GP, with time as the independent variable. Our continuous-time prior can be defined by any nonlinear, time-varying stochastic differential equation driven by white noise; this allows the possibility of smoothing our trajectory estimates using a variety of vehicle dynamics models (e.g., 'constant-velocity'). We show that this class of prior results in an inverse kernel matrix (i.e., covariance matrix between all pairs of measurement times) that is exactly sparse (block-tridiagonal) and that this can be exploited to carry out GP regression (and interpolation) very efficiently. When the prior is based on a linear, time-varying stochastic differential equation and the measurement model is also linear, this GP approach is equivalent to classical, discrete-time smoothing (at the measurement times); when a nonlinearity is present, we iterate over the whole trajectory to maximize accuracy. We test the approach experimentally on a simultaneous trajectory estimation and mapping problem using a mobile robot dataset.
Submitted 1 December, 2014;
originally announced December 2014.
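The sparsity claim in the abstract above can be checked numerically on the simplest prior of this kind. The sketch below is a toy instance chosen for illustration, not the paper's vehicle model: a scalar Wiener process (white noise driving the state directly, starting from zero at time 0) has kernel k(t, t') = q * min(t, t'), and the inverse of its kernel matrix at the measurement times is tridiagonal, the scalar case of block-tridiagonal. The function names, the noise intensity `q` and the time stamps are assumptions.

```python
def invert(M):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(M)
    aug = [list(M[i]) + [1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular to working precision")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def wiener_kernel(times, q=1.0):
    """Dense kernel matrix of a scalar Wiener process: K[i][j] = q * min(t_i, t_j)."""
    return [[q * min(ti, tj) for tj in times] for ti in times]


# Irregularly spaced, strictly positive measurement times (made up for the example).
times = [0.5, 1.0, 2.0, 3.5, 4.0]
K = wiener_kernel(times)   # dense: every entry is nonzero
K_inv = invert(K)          # sparse: only the main and adjacent diagonals survive

n = len(times)
# Largest magnitude among entries more than one step away from the diagonal.
off_band = max(abs(K_inv[i][j]) for i in range(n) for j in range(n) if abs(i - j) > 1)
```

Here `off_band` is zero up to rounding error, while the first off-diagonal entries equal -1/(q * (t_{i+1} - t_i)). A solver that builds the tridiagonal inverse directly, instead of inverting the dense kernel as done here for checking, can do regression in time linear in the number of measurement times.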
-
Process nano scale mechanical properties measurement of thin metal films using a novel paddle cantilever test structure
Authors:
Chi-Jia Tong,
Ming-Tzer Lin
Abstract:
A new technique for studying the mechanical behavior of nano-scale thin metal films on a substrate is presented. The test structure is a novel "paddle" cantilever beam specimen with dimensions from a few hundred nanometers down to less than 10 nanometers. The beam is triangular in shape in order to provide a uniform plane strain distribution. Standard clean room processing was used to prepare the paddle sample. The experiment is operated by electrostatically deflecting the uniformly stressed paddle cantilever beam and then measuring the thin metal film deposited on top of it. A capacitance technique was used on the other side of the deflected plate to measure its deflection with respect to the force. The measured strain was obtained from the capacitance measurement of the cantilever deflection. System performance on the residual stress measurement of thin films is calculated with three different forces on the "paddle" cantilever beam: the force due to the film, the compliance force and the electrostatic force.
Submitted 7 May, 2008;
originally announced May 2008.
-
Monotonic and fatigue testing of spring-bridged freestanding microbeams application for MEMS
Authors:
Ming-Tzer Lin,
K. -S. Shiu,
Chi-Jia Tong
Abstract:
Microelectromechanical systems (MEMS) technologies are developing rapidly with increasing study of the design, fabrication and commercialization of microscale systems and devices. Accurate knowledge of the mechanical behavior of thin film materials used for MEMS is important for successful design and development of MEMS. Here, a novel electroplated spring-bridge micro-tensile specimen that integrates pin-pin alignment holes, a misalignment-compensating spring, a load sensor beam and a freestanding thin film is demonstrated and fabricated. The specimen fits into a specially designed micro-mechanical apparatus to carry out a series of monotonic tensile tests on sub-micron freestanding thin films. Certain thin films applicable as structures or motion gears in MEMS were tested, including sputtered gold, copper and tantalum nitride thin films. Metal specimens were fabricated by sputtering; for tantalum nitride film samples, nitrogen gas was introduced into the chamber while sputtering tantalum films onto the silicon wafer. The sample fabrication method involves three lithography steps and two copper electroplating steps to hold a dog-bone freestanding thin film. Using standard wet etching or lift-off techniques, a series of microtensile specimens were patterned in metal thin films, holes, and a seed layer for the spring and frame structure on the underlying silicon-oxide-coated silicon substrate. Two electroplating steps were used to form the distinct spring and frame portions of the test chip. Finally, the silicon oxide was chemically etched away to separate the electroplated specimen from the silicon substrate.
Submitted 21 February, 2008;
originally announced February 2008.
-
Design and Development of Novel Electroplating Spring Frame Mems Structure Specimens for the Microtensile Testing of Thin Film Materials
Authors:
Ming-Tzer Lin,
Chi-Jia Tong,
Chung-Hsun Chiang
Abstract:
Microelectromechanical systems (MEMS) technologies are developing rapidly with increasing study of the design, fabrication and commercialization of microscale systems and devices. Accurate mechanical properties are important for successful design and development of MEMS. We demonstrate here a novel electroplated spring frame MEMS structure specimen that integrates pin-pin alignment holes, a misalignment-compensating spring structure frame, a load sensor beam and a freestanding thin film. The specimen fits into a specially designed microtensile apparatus capable of carrying out a series of tests on sub-micron-scale freestanding thin films.
Submitted 21 November, 2007;
originally announced November 2007.