Improving Intention Detection in Single-Trial Classification through Fusion of EEG and Eye-tracker Data

Authors: Xianliang Ge, Yunxian Pan, Sujie Wang, Linze Qian, Jingjia Yuan, Jie Xu, Nitish Thakor, Yu Sun

Abstract: Intention decoding is an indispensable procedure in hands-free human-computer interaction (HCI). A conventional eye-tracking system that relies on single-modal fixation duration may issue commands that ignore the user's real expectation. In the current study, an eye-brain hybrid brain-computer interface (BCI) interaction system was introduced for intention detection through fusion of multi-modal eye-track and ERP (a measurement derived from EEG) features. Eye-track and EEG data were recorded from 64 healthy participants as they performed a 40-min customized free search task of a fixed target icon among 25 icons. The corresponding fixation duration of eye-tracking and ERP were extracted. Five previously validated LDA-based classifiers (RLDA, SWLDA, BLDA, SKLDA, and STDA) and the widely used CNN method were adopted to verify the efficacy of feature fusion in both offline and pseudo-online analysis, and the optimal approach was evaluated by varying the training set and the system response duration. Our study demonstrated that the input of multi-modal eye-track and ERP features achieved superior intention-detection performance in single-trial classification of the active search task. Compared with the single-modal ERP feature, this new strategy also yielded congruent accuracy across different classifiers. Moreover, in comparison with the other classification methods, SKLDA exhibited the best performance when fusing features in the offline test (ACC=0.8783, AUC=0.9004) and in online simulation with different sample amounts and duration lengths.
In sum, the current study revealed a novel and effective approach for intention classification using an eye-brain hybrid BCI, and further supported the real-life application of hands-free HCI in a more precise and stable manner.

Submitted 5 December, 2021; originally announced December 2021.

arXiv:2010.14184 [pdf, other]

doi 10.1109/JSEN.2021.3087511

Spatio-temporal encoding improves neuromorphic tactile texture classification

Authors: Anupam K. Gupta, Andrei Nakagawa, Nathan F. Lepora, Nitish V. Thakor

Abstract: With the increase in interest in deployment of robots in unstructured environments to work alongside humans, the development of a human-like sense of touch for robots becomes important. In this work, we implement a multi-channel neuromorphic tactile system that encodes contact events as discrete spike events that mimic the behavior of slowly adapting mechanoreceptors. We study the impact of information pooling across artificial mechanoreceptors on classification performance of spatially non-uniform naturalistic textures. We encoded the spatio-temporal activation patterns of mechanoreceptors through a gray-level co-occurrence matrix computed from a time-varying mean spiking rate-based tactile response volume. We found that this approach greatly improved texture classification in comparison to use of individual mechanoreceptor responses alone. In addition, the performance was also more robust to changes in sliding velocity. The importance of exploiting precise spatial and temporal correlations between sensory channels is evident from the fact that a significant performance drop was observed on either removal of precise temporal information or alteration of the spatial structure of the response pattern. This study thus demonstrates the superiority of population coding approaches that can exploit the precise spatio-temporal information encoded in activation patterns of mechanoreceptor populations. It therefore makes an advance in the direction of development of bio-inspired tactile systems required for realistic touch applications in robotics and prostheses.

Submitted 5 June, 2021; v1 submitted 27 October, 2020; originally announced October 2020.

Comments: 8 pages, 8 figures, accepted for publication to IEEE Sensors

Journal ref: IEEE Sensors Journal, vol. 21, no. 17, pp. 19038-19046, 2021
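
The gray-level co-occurrence matrix named in the abstract above can be illustrated with a short sketch. This is not the authors' pipeline: the function names `quantize` and `glcm`, the three quantization levels, the 100 Hz ceiling, and the toy 3x3 rate map are all assumptions made for illustration, and the paper computes the matrix over a time-varying response volume rather than a single 2D map.

```python
def quantize(rates, num_levels, max_rate):
    """Map mean spiking rates onto integer gray levels 0..num_levels-1.

    Hypothetical helper: the linear binning and max_rate ceiling are assumptions.
    """
    return [
        [min(int(rate / max_rate * num_levels), num_levels - 1) for rate in row]
        for row in rates
    ]


def glcm(levels_grid, num_levels, offset=(0, 1)):
    """Count how often level i has level j at the given (row, col) offset."""
    d_row, d_col = offset
    rows, cols = len(levels_grid), len(levels_grid[0])
    matrix = [[0] * num_levels for _ in range(num_levels)]
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + d_row, c + d_col
            # Skip pairs whose second element falls outside the grid.
            if 0 <= r2 < rows and 0 <= c2 < cols:
                matrix[levels_grid[r][c]][levels_grid[r2][c2]] += 1
    return matrix


if __name__ == "__main__":
    # Toy mean spiking rates (Hz) for a 3x3 patch of artificial mechanoreceptors.
    rates = [[0.0, 10.0, 50.0], [10.0, 50.0, 90.0], [50.0, 90.0, 100.0]]
    levels = quantize(rates, num_levels=3, max_rate=100.0)
    for row in glcm(levels, num_levels=3):
        print(row)
```

The matrix counts pairs of neighbouring channels rather than single-channel statistics, which is one way to see why altering the spatial layout of the response changes the resulting features.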

arXiv:1903.07873 [pdf, other]

doi 10.1007/978-3-319-44781-0_54

Pose-Invariant Object Recognition for Event-Based Vision with Slow-ELM

Authors: Rohan Ghosh, Siyi Tang, Mahdi Rasouli, Nitish Thakor, Sunil Kukreja

Abstract: Neuromorphic image sensors produce activity-driven spiking output at every pixel. These low-power consuming imagers which encode visual change information in the form of spikes help reduce computational overhead and realize complex real-time systems; object recognition and pose-estimation to name a few. However, there exists a lack of algorithms in event-based vision aimed towards capturing invariance to transformations. In this work, we propose a methodology for recognizing objects invariant to their pose with the Dynamic Vision Sensor (DVS). A novel slow-ELM architecture is proposed which combines the effectiveness of Extreme Learning Machines and Slow Feature Analysis. The system, tested on an Intel Core i5-4590 CPU, can perform 10,000 classifications per second and achieves 1% classification error for 8 objects with views accumulated over 90 degrees of 2D pose.

Submitted 19 March, 2019; originally announced March 2019.

Comments: Appeared in 25th International Conference on Artificial Neural Networks (ICANN), Barcelona, Spain

arXiv:1903.07067 [pdf, other]

Spatiotemporal Filtering for Event-Based Action Recognition

Authors: Rohan Ghosh, Anupam Gupta, Andrei Nakagawa, Alcimar Soares, Nitish Thakor

Abstract: In this paper, we address the challenging problem of action recognition using event-based cameras. To recognise most gestural actions, higher temporal precision is often required for sampling visual information. Actions are defined by motion, and therefore, when using event-based cameras it is often unnecessary to re-sample the entire scene. Neuromorphic, event-based cameras have presented an alternative to visual information acquisition by asynchronously time-encoding pixel intensity changes through temporally precise spikes (10 microsecond resolution), making them well equipped for action recognition. However, other challenges exist which are intrinsic to event-based imagers, such as higher signal-to-noise ratio and spatiotemporally sparse information. One option is to convert event data into frames, but this could result in significant temporal precision loss. In this work we introduce spatiotemporal filtering in the spike-event domain as an alternative way of channeling spatiotemporal information through to a convolutional neural network. The filters are local spatiotemporal weight matrices, learned from the spike-event data in an unsupervised manner. We find that appropriate spatiotemporal filtering significantly improves CNN performance beyond state-of-the-art on the event-based DVS Gesture dataset. On our newly recorded action recognition dataset, our method shows significant improvement when compared with other, standard ways of generating the spatiotemporal filters.

Submitted 17 March, 2019; originally announced March 2019.

Comments: Submitted to IEEE Transactions in Pattern Analysis and Machine Intelligence

arXiv:1903.06923 [pdf, other]

Spatiotemporal Feature Learning for Event-Based Vision

Authors: Rohan Ghosh, Anupam Gupta, Siyi Tang, Alcimar Soares, Nitish Thakor

Abstract: Unlike conventional frame-based sensors, event-based visual sensors output information through spikes at a high temporal resolution. By only encoding changes in pixel intensity, they showcase a low-power consuming, low-latency approach to visual information sensing. To use this information for higher sensory tasks like object recognition and tracking, an essential simplification step is the extraction and learning of features. An ideal feature descriptor must be robust to changes involving (i) local transformations and (ii) re-appearances of a local event pattern. To that end, we propose a novel spatiotemporal feature representation learning algorithm based on slow feature analysis (SFA). Using SFA, smoothly changing linear projections are learnt which are robust to local visual transformations. In order to determine if the features can learn to be invariant to various visual transformations, feature point tracking tasks are used for evaluation. Extensive experiments across two datasets demonstrate the adaptability of the spatiotemporal feature learner to translation, scaling and rotational transformations of the feature points. More importantly, we find that the obtained feature representations are able to exploit the high temporal resolution of such event-based cameras in generating better feature tracks.

Submitted 16 March, 2019; originally announced March 2019.

Comments: Submitted to IEEE Transactions in Neural Networks and Learning Systems

arXiv:1903.01968 [pdf, other]

Augmented Reality Prosthesis Training Setup for Motor Skill Enhancement

Authors: Avinash Sharma, Wally Niu, Christopher L. Hunt, George Levay, Rahul Kaliki, Nitish V. Thakor

Abstract: Adjusting to amputation can often be difficult for the body. Post-surgery, amputees have to wait for up to several months before receiving a properly fitted prosthesis. In recent years, there has been a trend toward quantitative outcome measures. In this paper, we developed the augmented reality (AR) version of one such measure, the Prosthetic Hand Assessment Measure (PHAM). The AR version of the PHAM, HoloPHAM, offers amputees the advantage of training with pattern recognition at their own time and convenience, pre- and post-prosthesis fitting. We provide a rigorous analysis of our system, focusing on its ability to simulate reach, grasp, and touch in AR. Similarity of motion joint dynamics for reach in physical and AR space was compared, with experiments conducted to illustrate how depth in AR is perceived. To show the effectiveness and validity of our system for prosthesis training, we conducted a 10-day study with able-bodied subjects (N = 3) to see the effect that training on the HoloPHAM had on other established functional outcome measures. A washout phase of 5 days was incorporated to observe the effect without training. Comparisons were made with standardized outcome metrics, along with the progression of kinematic variability over time. Statistically significant (p < 0.05) improvements were observed between pre- and post-training stages.
Our results show that AR can be an effective tool for prosthesis training with pattern recognition systems, fostering motor learning for reaching movement tasks and opening the possibility of replacing physical training.

Submitted 5 March, 2019; originally announced March 2019.

arXiv:1901.02442 [pdf, other]

doi 10.1109/NER.2019.8717169

Stable Electromyographic Sequence Prediction During Movement Transitions using Temporal Convolutional Networks

Authors: Joseph L. Betthauser, John T. Krall, Rahul R. Kaliki, Matthew S. Fifer, Nitish V. Thakor

Abstract: Transient muscle movements influence the temporal structure of myoelectric signal patterns, often leading to unstable prediction behavior from movement-pattern classification methods. We show that temporal convolutional network sequential models leverage the myoelectric signal's history to discover contextual temporal features that aid in correctly predicting movement intentions, especially during interclass transitions. We demonstrate myoelectric classification using temporal convolutional networks to effect 3 simultaneous hand and wrist degrees of freedom in an experiment involving nine human subjects. Temporal convolutional networks yield significant (p < 0.001) performance improvements over other state-of-the-art methods in terms of both classification accuracy and stability.

Submitted 8 January, 2019; originally announced January 2019.

Comments: 4 pages, 5 figures, accepted for Neural Engineering (NER) 2019 Conference

arXiv:1508.01176 [pdf, ps, other]

doi 10.1109/TPAMI.2015.2392947

HFirst: A Temporal Approach to Object Recognition

Authors: Garrick Orchard, Cedric Meyer, Ralph Etienne-Cummings, Christoph Posch, Nitish Thakor, Ryad Benosman

Abstract: This paper introduces a spiking hierarchical model for object recognition which utilizes the precise timing information inherently present in the output of biologically inspired asynchronous Address Event Representation (AER) vision sensors. The asynchronous nature of these systems frees computation and communication from the rigid predetermined timing enforced by system clocks in conventional systems. Freedom from rigid timing constraints opens the possibility of using true timing to our advantage in computation. We show not only how timing can be used in object recognition, but also how it can in fact simplify computation. Specifically, we rely on a simple temporal-winner-take-all rather than more computationally intensive synchronous operations typically used in biologically inspired neural networks for object recognition. This approach to visual computation represents a major paradigm shift from conventional clocked systems and can find application in other sensory modalities and computational tasks. We showcase the effectiveness of the approach by achieving the highest reported accuracy to date (97.5% ± 3.5%) for a previously published four-class card pip recognition task and an accuracy of 84.9% ± 1.9% for a new, more difficult 36-class character recognition task.

Submitted 5 August, 2015; originally announced August 2015.

Comments: 13 pages, 10 figures

Journal ref: Pattern Analysis and Machine Intelligence, IEEE Transactions on, vol.37, no.10, pp.2028-2040, Oct 2015
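
The temporal winner-take-all idea mentioned in the HFirst abstract above can be sketched in a few lines: the unit that spikes first wins, so no synchronous comparison of response magnitudes is needed. This is a minimal illustration only, not the HFirst implementation; the function name `temporal_winner_take_all`, the dictionary layout, the class labels, and the millisecond spike times are assumptions.

```python
def temporal_winner_take_all(spike_times):
    """Return the label of the unit with the earliest spike, or None if no unit spiked.

    spike_times maps a (hypothetical) unit label to a list of spike times.
    Ties are resolved in favour of the unit encountered first.
    """
    winner, earliest = None, float("inf")
    for label, times in spike_times.items():
        if times and min(times) < earliest:
            winner, earliest = label, min(times)
    return winner


if __name__ == "__main__":
    # Assumed spike times in milliseconds for three output units.
    responses = {"heart": [4.2, 7.0], "spade": [3.1], "club": []}
    print(temporal_winner_take_all(responses))  # prints "spade"
```

The decision depends only on the ordering of spike times, which is the sense in which timing can simplify the computation.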

arXiv:1507.07629 [pdf, other]

Converting Static Image Datasets to Spiking Neuromorphic Datasets Using Saccades

Authors: Garrick Orchard, Ajinkya Jayawant, Gregory Cohen, Nitish Thakor

Abstract: Creating datasets for Neuromorphic Vision is a challenging task. A lack of available recordings from Neuromorphic Vision sensors means that data must typically be recorded specifically for dataset creation rather than collecting and labelling existing data. The task is further complicated by a desire to simultaneously provide traditional frame-based recordings to allow for direct comparison with traditional Computer Vision algorithms. Here we propose a method for converting existing Computer Vision static image datasets into Neuromorphic Vision datasets using an actuated pan-tilt camera platform. Moving the sensor rather than the scene or image is a more biologically realistic approach to sensing and eliminates timing artifacts introduced by monitor updates when simulating motion on a computer monitor. We present conversion of two popular image datasets (MNIST and Caltech101) which have played important roles in the development of Computer Vision, and we provide performance metrics on these datasets using spike-based recognition algorithms. This work contributes datasets for future use in the field, as well as results from spike-based algorithms against which future works can compare. Furthermore, by converting datasets already popular in Computer Vision, we enable more direct comparison with frame-based approaches.

Submitted 27 July, 2015; originally announced July 2015.

Comments: 10 pages, 6 figures in Frontiers in Neuromorphic Engineering, special topic on Benchmarks and Challenges for Neuromorphic Engineering, 2015 (under review)

Showing 1–9 of 9 results for author: Thakor, N