Search | arXiv e-print repository

Insights from the Use of Previously Unseen Neural Architecture Search Datasets

Authors: Rob Geada, David Towers, Matthew Forshaw, Amir Atapour-Abarghouei, A. Stephen McGough

Abstract: The boundless possibility of neural networks which can be used to solve a problem -- each with different performance -- leads to a situation where a Deep Learning expert is required to identify the best neural network. This goes against the hope of removing the need for experts. Neural Architecture Search (NAS) offers a solution to this by automatically identifying the best architecture. However,… ▽ More The boundless possibility of neural networks which can be used to solve a problem -- each with different performance -- leads to a situation where a Deep Learning expert is required to identify the best neural network. This goes against the hope of removing the need for experts. Neural Architecture Search (NAS) offers a solution to this by automatically identifying the best architecture. However, to date, NAS work has focused on a small set of datasets which we argue are not representative of real-world problems. We introduce eight new datasets created for a series of NAS Challenges: AddNIST, Language, MultNIST, CIFARTile, Gutenberg, Isabella, GeoClassing, and Chesseract. These datasets and challenges are developed to direct attention to issues in NAS development and to encourage authors to consider how their models will perform on datasets unknown to them at development time. We present experimentation using standard Deep Learning methods as well as the best results from challenge participants. △ Less

Submitted 2 April, 2024; originally announced April 2024.

arXiv:2402.14185 [pdf, other]

HINT: High-quality INPainting Transformer with Mask-Aware Encoding and Enhanced Attention

Authors: Shuang Chen, Amir Atapour-Abarghouei, Hubert P. H. Shum

Abstract: Existing image inpainting methods leverage convolution-based downsampling approaches to reduce spatial dimensions. This may result in information loss from corrupted images where the available information is inherently sparse, especially for the scenario of large missing regions. Recent advances in self-attention mechanisms within transformers have led to significant improvements in many computer… ▽ More Existing image inpainting methods leverage convolution-based downsampling approaches to reduce spatial dimensions. This may result in information loss from corrupted images where the available information is inherently sparse, especially for the scenario of large missing regions. Recent advances in self-attention mechanisms within transformers have led to significant improvements in many computer vision tasks including inpainting. However, limited by the computational costs, existing methods cannot fully exploit the efficacy of long-range modelling capabilities of such models. In this paper, we propose an end-to-end High-quality INpainting Transformer, abbreviated as HINT, which consists of a novel mask-aware pixel-shuffle downsampling module (MPD) to preserve the visible information extracted from the corrupted image while maintaining the integrity of the information available for high-level inferences made within the model. Moreover, we propose a Spatially-activated Channel Attention Layer (SCAL), an efficient self-attention mechanism interpreting spatial awareness to model the corrupted image at multiple scales. To further enhance the effectiveness of SCAL, motivated by recent advanced in speech recognition, we introduce a sandwich structure that places feed-forward networks before and after the SCAL module. We demonstrate the superior performance of HINT compared to contemporary state-of-the-art models on four datasets, CelebA, CelebA-HQ, Places2, and Dunhuang. △ Less

Submitted 21 February, 2024; originally announced February 2024.

arXiv:2305.10589 [pdf, other]

INCLG: Inpainting for Non-Cleft Lip Generation with a Multi-Task Image Processing Network

Authors: Shuang Chen, Amir Atapour-Abarghouei, Edmond S. L. Ho, Hubert P. H. Shum

Abstract: We present a software that predicts non-cleft facial images for patients with cleft lip, thereby facilitating the understanding, awareness and discussion of cleft lip surgeries. To protect patients privacy, we design a software framework using image inpainting, which does not require cleft lip images for training, thereby mitigating the risk of model leakage. We implement a novel multi-task archit… ▽ More We present a software that predicts non-cleft facial images for patients with cleft lip, thereby facilitating the understanding, awareness and discussion of cleft lip surgeries. To protect patients privacy, we design a software framework using image inpainting, which does not require cleft lip images for training, thereby mitigating the risk of model leakage. We implement a novel multi-task architecture that predicts both the non-cleft facial image and facial landmarks, resulting in better performance as evaluated by surgeons. The software is implemented with PyTorch and is usable with consumer-level color images with a fast prediction speed, enabling effective deployment. △ Less

Submitted 17 May, 2023; originally announced May 2023.

arXiv:2302.13638 [pdf, other]

doi 10.1145/3578244.3583731

Predicting the Performance of a Computing System with Deep Networks

Authors: Mehmet Cengiz, Matthew Forshaw, Amir Atapour-Abarghouei, Andrew Stephen McGough

Abstract: Predicting the performance and energy consumption of computing hardware is critical for many modern applications. This will inform procurement decisions, deployment decisions, and autonomic scaling. Existing approaches to understanding the performance of hardware largely focus around benchmarking -- leveraging standardised workloads which seek to be representative of an end-user's needs. Two key c… ▽ More Predicting the performance and energy consumption of computing hardware is critical for many modern applications. This will inform procurement decisions, deployment decisions, and autonomic scaling. Existing approaches to understanding the performance of hardware largely focus around benchmarking -- leveraging standardised workloads which seek to be representative of an end-user's needs. Two key challenges are present; benchmark workloads may not be representative of an end-user's workload, and benchmark scores are not easily obtained for all hardware. Within this paper, we demonstrate the potential to build Deep Learning models to predict benchmark scores for unseen hardware. We undertake our evaluation with the openly available SPEC 2017 benchmark results. We evaluate three different networks, one fully-connected network along with two Convolutional Neural Networks (one bespoke and one ResNet inspired) and demonstrate impressive $R^2$ scores of 0.96, 0.98 and 0.94 respectively. △ Less

Submitted 27 February, 2023; originally announced February 2023.

Comments: 8 pages, 9 figures, 4 tables, ICPE2023

arXiv:2212.06130 [pdf, other]

Siamese Neural Networks for Skin Cancer Classification and New Class Detection using Clinical and Dermoscopic Image Datasets

Authors: Michael Luke Battle, Amir Atapour-Abarghouei, Andrew Stephen McGough

Abstract: Skin cancer is the most common malignancy in the world. Automated skin cancer detection would significantly improve early detection rates and prevent deaths. To help with this aim, a number of datasets have been released which can be used to train Deep Learning systems - these have produced impressive results for classification. However, this only works for the classes they are trained on whilst t… ▽ More Skin cancer is the most common malignancy in the world. Automated skin cancer detection would significantly improve early detection rates and prevent deaths. To help with this aim, a number of datasets have been released which can be used to train Deep Learning systems - these have produced impressive results for classification. However, this only works for the classes they are trained on whilst they are incapable of identifying skin lesions from previously unseen classes, making them unconducive for clinical use. We could look to massively increase the datasets by including all possible skin lesions, though this would always leave out some classes. Instead, we evaluate Siamese Neural Networks (SNNs), which not only allows us to classify images of skin lesions, but also allow us to identify those images which are different from the trained classes - allowing us to determine that an image is not an example of our training classes. We evaluate SNNs on both dermoscopic and clinical images of skin lesions. We obtain top-1 classification accuracy levels of 74.33% and 85.61% on clinical and dermoscopic datasets, respectively. Although this is slightly lower than the state-of-the-art results, the SNN approach has the advantage that it can detect out-of-class examples. Our results highlight the potential of an SNN approach as well as pathways towards future clinical deployment. △ Less

Submitted 12 December, 2022; originally announced December 2022.

Comments: 10 pages, 5 figures, 5 tables

arXiv:2208.01149 [pdf, other]

A Feasibility Study on Image Inpainting for Non-cleft Lip Generation from Patients with Cleft Lip

Authors: Shuang Chen, Amir Atapour-Abarghouei, Jane Kerby, Edmond S. L. Ho, David C. G. Sainsbury, Sophie Butterworth, Hubert P. H. Shum

Abstract: A Cleft lip is a congenital abnormality requiring surgical repair by a specialist. The surgeon must have extensive experience and theoretical knowledge to perform surgery, and Artificial Intelligence (AI) method has been proposed to guide surgeons in improving surgical outcomes. If AI can be used to predict what a repaired cleft lip would look like, surgeons could use it as an adjunct to adjust th… ▽ More A Cleft lip is a congenital abnormality requiring surgical repair by a specialist. The surgeon must have extensive experience and theoretical knowledge to perform surgery, and Artificial Intelligence (AI) method has been proposed to guide surgeons in improving surgical outcomes. If AI can be used to predict what a repaired cleft lip would look like, surgeons could use it as an adjunct to adjust their surgical technique and improve results. To explore the feasibility of this idea while protecting patient privacy, we propose a deep learning-based image inpainting method that is capable of covering a cleft lip and generating a lip and nose without a cleft. Our experiments are conducted on two real-world cleft lip datasets and are assessed by expert cleft lip surgeons to demonstrate the feasibility of the proposed method. △ Less

Submitted 1 August, 2022; originally announced August 2022.

Comments: 4 pages, 2 figures, BHI 2022

arXiv:2207.04821 [pdf, ps, other]

Long-term Reproducibility for Neural Architecture Search

Authors: David Towers, Matthew Forshaw, Amir Atapour-Abarghouei, Andrew Stephen McGough

Abstract: It is a sad reflection of modern academia that code is often ignored after publication -- there is no academic 'kudos' for bug fixes / maintenance. Code is often unavailable or, if available, contains bugs, is incomplete, or relies on out-of-date / unavailable libraries. This has a significant impact on reproducibility and general scientific progress. Neural Architecture Search (NAS) is no excepti… ▽ More It is a sad reflection of modern academia that code is often ignored after publication -- there is no academic 'kudos' for bug fixes / maintenance. Code is often unavailable or, if available, contains bugs, is incomplete, or relies on out-of-date / unavailable libraries. This has a significant impact on reproducibility and general scientific progress. Neural Architecture Search (NAS) is no exception to this, with some prior work in reproducibility. However, we argue that these do not consider long-term reproducibility issues. We therefore propose a checklist for long-term NAS reproducibility. We evaluate our checklist against common NAS approaches along with proposing how we can retrospectively make these approaches more long-term reproducible. △ Less

Submitted 18 July, 2022; v1 submitted 11 July, 2022; originally announced July 2022.

Comments: 4 pages, LaTeX, Typos corrected

arXiv:2202.02832 [pdf, other]

Detecting Melanoma Fairly: Skin Tone Detection and Debiasing for Skin Lesion Classification

Authors: Peter J. Bevan, Amir Atapour-Abarghouei

Abstract: Convolutional Neural Networks have demonstrated human-level performance in the classification of melanoma and other skin lesions, but evident performance disparities between differing skin tones should be addressed before widespread deployment. In this work, we propose an efficient yet effective algorithm for automatically labelling the skin tone of lesion images, and use this to annotate the benc… ▽ More Convolutional Neural Networks have demonstrated human-level performance in the classification of melanoma and other skin lesions, but evident performance disparities between differing skin tones should be addressed before widespread deployment. In this work, we propose an efficient yet effective algorithm for automatically labelling the skin tone of lesion images, and use this to annotate the benchmark ISIC dataset. We subsequently use these automated labels as the target for two leading bias unlearning techniques towards mitigating skin tone bias. Our experimental results provide evidence that our skin tone detection algorithm outperforms existing solutions and that unlearning skin tone may improve generalisation and can reduce the performance disparity between melanoma detection in lighter and darker skin tones. △ Less

Submitted 29 July, 2022; v1 submitted 6 February, 2022; originally announced February 2022.

Comments: 8 pages

arXiv:2112.01121 [pdf, other]

"Just Drive": Colour Bias Mitigation for Semantic Segmentation in the Context of Urban Driving

Authors: Jack Stelling, Amir Atapour-Abarghouei

Abstract: Biases can filter into AI technology without our knowledge. Oftentimes, seminal deep learning networks champion increased accuracy above all else. In this paper, we attempt to alleviate biases encountered by semantic segmentation models in urban driving scenes, via an iteratively trained unlearning algorithm. Convolutional neural networks have been shown to rely on colour and texture rather than g… ▽ More Biases can filter into AI technology without our knowledge. Oftentimes, seminal deep learning networks champion increased accuracy above all else. In this paper, we attempt to alleviate biases encountered by semantic segmentation models in urban driving scenes, via an iteratively trained unlearning algorithm. Convolutional neural networks have been shown to rely on colour and texture rather than geometry. This raises issues when safety-critical applications, such as self-driving cars, encounter images with covariate shift at test time - induced by variations such as lighting changes or seasonality. Conceptual proof of bias unlearning has been shown on simple datasets such as MNIST. However, the strategy has never been applied to the safety-critical domain of pixel-wise semantic segmentation of highly variable training data - such as urban scenes. Trained models for both the baseline and bias unlearning scheme have been tested for performance on colour-manipulated validation sets showing a disparity of up to 85.50% in mIoU from the original RGB images - confirming segmentation networks strongly depend on the colour information in the training data to make their classification. The bias unlearning scheme shows improvements of handling this covariate shift of up to 61% in the best observed case - and performs consistently better at classifying the "human" and "vehicle" classes compared to the baseline model. △ Less

Submitted 2 December, 2021; originally announced December 2021.

Comments: 2021 IEEE International Conference on Big Data (IEEE BigData 2021)

arXiv:2109.09818 [pdf, other]

Skin Deep Unlearning: Artefact and Instrument Debiasing in the Context of Melanoma Classification

Authors: Peter J. Bevan, Amir Atapour-Abarghouei

Abstract: Convolutional Neural Networks have demonstrated dermatologist-level performance in the classification of melanoma from skin lesion images, but prediction irregularities due to biases seen within the training data are an issue that should be addressed before widespread deployment is possible. In this work, we robustly remove bias and spurious variation from an automated melanoma classification pipe… ▽ More Convolutional Neural Networks have demonstrated dermatologist-level performance in the classification of melanoma from skin lesion images, but prediction irregularities due to biases seen within the training data are an issue that should be addressed before widespread deployment is possible. In this work, we robustly remove bias and spurious variation from an automated melanoma classification pipeline using two leading bias unlearning techniques. We show that the biases introduced by surgical markings and rulers presented in previous studies can be reasonably mitigated using these bias removal methods. We also demonstrate the generalisation benefits of unlearning spurious variation relating to the imaging instrument used to capture lesion images. Our experimental results provide evidence that the effects of each of the aforementioned biases are notably reduced, with different debiasing techniques excelling at different tasks. △ Less

Submitted 27 April, 2023; v1 submitted 20 September, 2021; originally announced September 2021.

Comments: 9 pages, ICML 2022

Journal ref: Proceedings of the 39th International Conference on Machine Learning, PMLR 162:1874-1892, 2022

arXiv:2109.09796 [pdf, other]

Transforming Fake News: Robust Generalisable News Classification Using Transformers

Authors: Ciara Blackledge, Amir Atapour-Abarghouei

Abstract: As online news has become increasingly popular and fake news increasingly prevalent, the ability to audit the veracity of online news content has become more important than ever. Such a task represents a binary classification challenge, for which transformers have achieved state-of-the-art results. Using the publicly available ISOT and Combined Corpus datasets, this study explores transformers' ab… ▽ More As online news has become increasingly popular and fake news increasingly prevalent, the ability to audit the veracity of online news content has become more important than ever. Such a task represents a binary classification challenge, for which transformers have achieved state-of-the-art results. Using the publicly available ISOT and Combined Corpus datasets, this study explores transformers' abilities to identify fake news, with particular attention given to investigating generalisation to unseen datasets with varying styles, topics and class distributions. Moreover, we explore the idea that opinion-based news articles cannot be classified as real or fake due to their subjective nature and often sensationalised language, and propose a novel two-step classification pipeline to remove such articles from both model training and the final deployed inference system. Experiments over the ISOT and Combined Corpus datasets show that transformers achieve an increase in F1 scores of up to 4.9% for out of distribution generalisation compared to baseline approaches, with a further increase of 10.1% following the implementation of our two-step classification pipeline. To the best of our knowledge, this study is the first to investigate generalisation of transformers in this context. △ Less

Submitted 3 December, 2021; v1 submitted 20 September, 2021; originally announced September 2021.

Comments: 2021 IEEE International Conference on Big Data (IEEE BigData 2021)

arXiv:2109.02119 [pdf, other]

Identification of Driver Phone Usage Violations via State-of-the-Art Object Detection with Tracking

Authors: Steven Carrell, Amir Atapour-Abarghouei

Abstract: The use of mobiles phones when driving have been a major factor when it comes to road traffic incidents and the process of capturing such violations can be a laborious task. Advancements in both modern object detection frameworks and high-performance hardware has paved the way for a more automated approach when it comes to video surveillance. In this work, we propose a custom-trained state-of-the-… ▽ More The use of mobiles phones when driving have been a major factor when it comes to road traffic incidents and the process of capturing such violations can be a laborious task. Advancements in both modern object detection frameworks and high-performance hardware has paved the way for a more automated approach when it comes to video surveillance. In this work, we propose a custom-trained state-of-the-art object detector to work with roadside cameras to capture driver phone usage without the need for human intervention. The proposed approach also addresses the issues caused by windscreen glare and introduces the steps required to remedy this. Twelve pre-trained models are fine-tuned with our custom dataset using four popular object detection methods: YOLO, SSD, Faster R-CNN, and CenterNet. Out of all the object detectors tested, the YOLO yields the highest accuracy levels of up to 96% (AP10) and frame rates of up to ~30 FPS. DeepSort object tracking algorithm is also integrated into the best-performing model to collect records of only the unique violations, and enable the proposed approach to count the number of vehicles. The proposed automated system will collect the output images of the identified violations, timestamps of each violation, and total vehicle count. Data can be accessed via a purpose-built user interface. △ Less

Submitted 8 October, 2021; v1 submitted 5 September, 2021; originally announced September 2021.

Comments: 10 pages

arXiv:2011.12709 [pdf, other]

Resolving the cybersecurity Data Sharing Paradox to scale up cybersecurity via a co-production approach towards data sharing

Authors: Amir Atapour-Abarghouei, Andrew Stephen McGough, David Stanley Wall

Abstract: As cybercriminals scale up their operations to increase their profits or inflict greater harm, we argue that there is an equal need to respond to their threats by scaling up cybersecurity. To achieve this goal, we have to develop a co-productive approach towards data collection and sharing by overcoming the cybersecurity data sharing paradox. This is where we all agree on the definition of the pro… ▽ More As cybercriminals scale up their operations to increase their profits or inflict greater harm, we argue that there is an equal need to respond to their threats by scaling up cybersecurity. To achieve this goal, we have to develop a co-productive approach towards data collection and sharing by overcoming the cybersecurity data sharing paradox. This is where we all agree on the definition of the problem and end goal (improving cybersecurity and getting rid of cybercrime), but we disagree about how to achieve it and fail to work together efficiently. At the core of this paradox is the observation that public interests differ from private interests. As a result, industry and law enforcement take different approaches to the cybersecurity problem as they seek to resolve incidents in their own interests, which manifests in different data sharing practices between both and also other interested parties, such as cybersecurity researchers. The big question we ask is can these interests be reconciled to develop an interdisciplinary approach towards co-operation and sharing data. In essence, all three will have to co-own the problem in order to co-produce a solution. We argue that a few operational models with good practices exist that provide guides to a possible solution, especially multiple third-party ownership organisations which consolidate, anonymise and analyse data. To take this forward, we suggest the practical solution of organising co-productive data collection on a sectoral basis, but acknowledge that common standards for data collection will also have to be developed and agreed upon. We propose an initial set of best practices for building collaborations and sharing data and argue that these best practices need to be developed and standardised in order to mitigate the paradox. △ Less

Submitted 20 November, 2020; originally announced November 2020.

Comments: 10 pages

arXiv:2010.12635 [pdf, other]

Not Half Bad: Exploring Half-Precision in Graph Convolutional Neural Networks

Authors: John Brennan, Stephen Bonner, Amir Atapour-Abarghouei, Philip T Jackson, Boguslaw Obara, Andrew Stephen McGough

Abstract: With the growing significance of graphs as an effective representation of data in numerous applications, efficient graph analysis using modern machine learning is receiving a growing level of attention. Deep learning approaches often operate over the entire adjacency matrix -- as the input and intermediate network layers are all designed in proportion to the size of the adjacency matrix -- leading… ▽ More With the growing significance of graphs as an effective representation of data in numerous applications, efficient graph analysis using modern machine learning is receiving a growing level of attention. Deep learning approaches often operate over the entire adjacency matrix -- as the input and intermediate network layers are all designed in proportion to the size of the adjacency matrix -- leading to intensive computation and large memory requirements as the graph size increases. It is therefore desirable to identify efficient measures to reduce both run-time and memory requirements allowing for the analysis of the largest graphs possible. The use of reduced precision operations within the forward and backward passes of a deep neural network along with novel specialised hardware in modern GPUs can offer promising avenues towards efficiency. In this paper, we provide an in-depth exploration of the use of reduced-precision operations, easily integrable into the highly popular PyTorch framework, and an analysis of the effects of Tensor Cores on graph convolutional neural networks. We perform an extensive experimental evaluation of three GPU architectures and two widely-used graph analysis tasks (vertex classification and link prediction) using well-known benchmark and synthetically generated datasets. Thus allowing us to make important observations on the effects of reduced-precision operations and Tensor Cores on computational and memory usage of graph convolutional neural networks -- often neglected in the literature. △ Less

Submitted 23 October, 2020; originally announced October 2020.

arXiv:2009.05160 [pdf, other]

Rank over Class: The Untapped Potential of Ranking in Natural Language Processing

Authors: Amir Atapour-Abarghouei, Stephen Bonner, Andrew Stephen McGough

Abstract: Text classification has long been a staple within Natural Language Processing (NLP) with applications spanning across diverse areas such as sentiment analysis, recommender systems and spam detection. With such a powerful solution, it is often tempting to use it as the go-to tool for all NLP problems since when you are holding a hammer, everything looks like a nail. However, we argue here that many… ▽ More Text classification has long been a staple within Natural Language Processing (NLP) with applications spanning across diverse areas such as sentiment analysis, recommender systems and spam detection. With such a powerful solution, it is often tempting to use it as the go-to tool for all NLP problems since when you are holding a hammer, everything looks like a nail. However, we argue here that many tasks which are currently addressed using classification are in fact being shoehorned into a classification mould and that if we instead address them as a ranking problem, we not only improve the model, but we achieve better performance. We propose a novel end-to-end ranking approach consisting of a Transformer network responsible for producing representations for a pair of text sequences, which are in turn passed into a context aggregating network outputting ranking scores used to determine an ordering to the sequences based on some notion of relevance. We perform numerous experiments on publicly-available datasets and investigate the applications of ranking in problems often solved using classification. In an experiment on a heavily-skewed sentiment analysis dataset, converting ranking results to classification labels yields an approximately 22% improvement over state-of-the-art text classification, demonstrating the efficacy of text ranking over text classification in certain scenarios. △ Less

Submitted 3 December, 2021; v1 submitted 10 September, 2020; originally announced September 2020.

Comments: 2021 IEEE International Conference on Big Data (IEEE BigData 2021)

arXiv:2007.14314 [pdf, other]

On the Impact of Lossy Image and Video Compression on the Performance of Deep Convolutional Neural Network Architectures

Authors: Matt Poyser, Amir Atapour-Abarghouei, Toby P. Breckon

Abstract: Recent advances in generalized image understanding have seen a surge in the use of deep convolutional neural networks (CNN) across a broad range of image-based detection, classification and prediction tasks. Whilst the reported performance of these approaches is impressive, this study investigates the hitherto unapproached question of the impact of commonplace image and video compression technique… ▽ More Recent advances in generalized image understanding have seen a surge in the use of deep convolutional neural networks (CNN) across a broad range of image-based detection, classification and prediction tasks. Whilst the reported performance of these approaches is impressive, this study investigates the hitherto unapproached question of the impact of commonplace image and video compression techniques on the performance of such deep learning architectures. Focusing on the JPEG and H.264 (MPEG-4 AVC) as a representative proxy for contemporary lossy image/video compression techniques that are in common use within network-connected image/video devices and infrastructure, we examine the impact on performance across five discrete tasks: human pose estimation, semantic segmentation, object detection, action recognition, and monocular depth estimation. As such, within this study we include a variety of network architectures and domains spanning end-to-end convolution, encoder-decoder, region-based CNN (R-CNN), dual-stream, and generative adversarial networks (GAN). Our results show a non-linear and non-uniform relationship between network performance and the level of lossy compression applied. Notably, performance decreases significantly below a JPEG quality (quantization) level of 15% and a H.264 Constant Rate Factor (CRF) of 40. However, retraining said architectures on pre-compressed imagery conversely recovers network performance by up to 78.4% in some cases. Furthermore, there is a correlation between architectures employing an encoder-decoder pipeline and those that demonstrate resilience to lossy image compression. The characteristics of the relationship between input compression to output task performance can be used to inform design decisions within future image/video devices and infrastructure. △ Less

Submitted 28 July, 2020; originally announced July 2020.

Comments: 8 pages, 21 figures, to be published in ICPR 2020 conference

arXiv:2007.11544 [pdf, other]

Leveraging Synthetic Subject Invariant EEG Signals for Zero Calibration BCI

Authors: Nik Khadijah Nik Aznan, Amir Atapour-Abarghouei, Stephen Bonner, Jason D. Connolly, Toby P. Breckon

Abstract: Recently, substantial progress has been made in the area of Brain-Computer Interface (BCI) using modern machine learning techniques to decode and interpret brain signals. While Electroencephalography (EEG) has provided a non-invasive method of interfacing with a human brain, the acquired data is often heavily subject and session dependent. This makes seamless incorporation of such data into real-w… ▽ More Recently, substantial progress has been made in the area of Brain-Computer Interface (BCI) using modern machine learning techniques to decode and interpret brain signals. While Electroencephalography (EEG) has provided a non-invasive method of interfacing with a human brain, the acquired data is often heavily subject and session dependent. This makes seamless incorporation of such data into real-world applications intractable as the subject and session data variance can lead to long and tedious calibration requirements and cross-subject generalisation issues. Focusing on a Steady State Visual Evoked Potential (SSVEP) classification systems, we propose a novel means of generating highly-realistic synthetic EEG data invariant to any subject, session or other environmental conditions. Our approach, entitled the Subject Invariant SSVEP Generative Adversarial Network (SIS-GAN), produces synthetic EEG data from multiple SSVEP classes using a single network. Additionally, by taking advantage of a fixed-weight pre-trained subject classification network, we ensure that our generative model remains agnostic to subject-specific features and thus produces subject-invariant data that can be applied to new previously unseen subjects. Our extensive experimental evaluation demonstrates the efficacy of our synthetic data, leading to superior performance, with improvements of up to 16 percentage points in zero-calibration classification tasks when trained using our subject-invariant synthetic EEG signals. △ Less

Submitted 23 October, 2020; v1 submitted 16 July, 2020; originally announced July 2020.

Comments: Accepted as a full paper at the 25th International Conference on Pattern Recognition (ICPR2020)

arXiv:1912.05684 [pdf, other]

Online Deep Reinforcement Learning for Autonomous UAV Navigation and Exploration of Outdoor Environments

Authors: Bruna G. Maciel-Pearson, Letizia Marchegiani, Samet Akcay, Amir Atapour-Abarghouei, James Garforth, Toby P. Breckon

Abstract: With the rapidly growing expansion in the use of UAVs, the ability to autonomously navigate in varying environments and weather conditions remains a highly desirable but as-of-yet unsolved challenge. In this work, we use Deep Reinforcement Learning to continuously improve the learning and understanding of a UAV agent while exploring a partially observable environment, which simulates the challenge… ▽ More With the rapidly growing expansion in the use of UAVs, the ability to autonomously navigate in varying environments and weather conditions remains a highly desirable but as-of-yet unsolved challenge. In this work, we use Deep Reinforcement Learning to continuously improve the learning and understanding of a UAV agent while exploring a partially observable environment, which simulates the challenges faced in a real-life scenario. Our innovative approach uses a double state-input strategy that combines the acquired knowledge from the raw image and a map containing positional information. This positional data aids the network understanding of where the UAV has been and how far it is from the target position, while the feature map from the current scene highlights cluttered areas that are to be avoided. Our approach is extensively tested using variants of Deep Q-Network adapted to cope with double state input data. Further, we demonstrate that by altering the reward and the Q-value function, the agent is capable of consistently outperforming the adapted Deep Q-Network, Double Deep Q- Network and Deep Recurrent Q-Network. Our results demonstrate that our proposed Extended Double Deep Q-Network (EDDQN) approach is capable of navigating through multiple unseen environments and under severe weather conditions. △ Less

Submitted 11 December, 2019; originally announced December 2019.

Comments: Journal Submission

arXiv:1911.08364 [pdf, other]

Volenti non fit injuria: Ransomware and its Victims

Authors: Amir Atapour-Abarghouei, Stephen Bonner, Andrew Stephen McGough

Abstract: With the recent growth in the number of malicious activities on the internet, cybersecurity research has seen a boost in the past few years. However, as certain variants of malware can provide highly lucrative opportunities for bad actors, significant resources are dedicated to innovations and improvements by vast criminal organisations. Among these forms of malware, ransomware has experienced a s… ▽ More With the recent growth in the number of malicious activities on the internet, cybersecurity research has seen a boost in the past few years. However, as certain variants of malware can provide highly lucrative opportunities for bad actors, significant resources are dedicated to innovations and improvements by vast criminal organisations. Among these forms of malware, ransomware has experienced a significant recent rise as it offers the perpetrators great financial incentive. Ransomware variants operate by removing system access from the user by either locking the system or encrypting some or all of the data, and subsequently demanding payment or ransom in exchange for returning system access or providing a decryption key to the victim. Due to the ubiquity of sensitive data in many aspects of modern life, many victims of such attacks, be they an individual home user or operators of a business, are forced to pay the ransom to regain access to their data, which in many cases does not happen as renormalisation of system operations is never guaranteed. As the problem of ransomware does not seem to be subsiding, it is very important to investigate the underlying forces driving and facilitating such attacks in order to create preventative measures. As such, in this paper, we discuss and provide further insight into variants of ransomware and their victims in order to understand how and why they have been targeted and what can be done to prevent or mitigate the effects of such attacks. △ Less

Submitted 19 November, 2019; originally announced November 2019.

Comments: 2019 IEEE International Conference on Big Data 2019

arXiv:1908.08402 [pdf, other]

Temporal Neighbourhood Aggregation: Predicting Future Links in Temporal Graphs via Recurrent Variational Graph Convolutions

Authors: Stephen Bonner, Amir Atapour-Abarghouei, Philip T Jackson, John Brennan, Ibad Kureshi, Georgios Theodoropoulos, Andrew Stephen McGough, Boguslaw Obara

Abstract: Graphs have become a crucial way to represent large, complex and often temporal datasets across a wide range of scientific disciplines. However, when graphs are used as input to machine learning models, this rich temporal information is frequently disregarded during the learning process, resulting in suboptimal performance on certain temporal infernce tasks. To combat this, we introduce Temporal N… ▽ More Graphs have become a crucial way to represent large, complex and often temporal datasets across a wide range of scientific disciplines. However, when graphs are used as input to machine learning models, this rich temporal information is frequently disregarded during the learning process, resulting in suboptimal performance on certain temporal infernce tasks. To combat this, we introduce Temporal Neighbourhood Aggregation (TNA), a novel vertex representation model architecture designed to capture both topological and temporal information to directly predict future graph states. Our model exploits hierarchical recurrence at different depths within the graph to enable exploration of changes in temporal neighbourhoods, whilst requiring no additional features or labels to be present. The final vertex representations are created using variational sampling and are optimised to directly predict the next graph in the sequence. Our claims are reinforced by extensive experimental evaluation on both real and synthetic benchmark datasets, where our approach demonstrates superior performance compared to competing methods, out-performing them at predicting new temporal edges by as much as 23% on real-world datasets, whilst also requiring fewer overall model parameters. △ Less

Submitted 21 November, 2019; v1 submitted 21 August, 2019; originally announced August 2019.

Comments: IEEE International Conference on Big Data 2019

arXiv:1908.06750 [pdf, other]

A Kings Ransom for Encryption: Ransomware Classification using Augmented One-Shot Learning and Bayesian Approximation

Authors: Amir Atapour-Abarghouei, Stephen Bonner, Andrew Stephen McGough

Abstract: Newly emerging variants of ransomware pose an ever-growing threat to computer systems governing every aspect of modern life through the handling and analysis of big data. While various recent security-based approaches have focused on detecting and classifying ransomware at the network or system level, easy-to-use post-infection ransomware classification for the lay user has not been attempted befo… ▽ More Newly emerging variants of ransomware pose an ever-growing threat to computer systems governing every aspect of modern life through the handling and analysis of big data. While various recent security-based approaches have focused on detecting and classifying ransomware at the network or system level, easy-to-use post-infection ransomware classification for the lay user has not been attempted before. In this paper, we investigate the possibility of classifying the ransomware a system is infected with simply based on a screenshot of the splash screen or the ransom note captured using a consumer camera commonly found in any modern mobile device. To train and evaluate our system, we create a sample dataset of the splash screens of 50 well-known ransomware variants. In our dataset, only a single training image is available per ransomware. Instead of creating a large training dataset of ransomware screenshots, we simulate screenshot capture conditions via carefully designed data augmentation techniques, enabling simple and efficient one-shot learning. Moreover, using model uncertainty obtained via Bayesian approximation, we ensure special input cases such as unrelated non-ransomware images and previously-unseen ransomware variants are correctly identified for special handling and not mis-classified. Extensive experimental evaluation demonstrates the efficacy of our work, with accuracy levels of up to 93.6% for ransomware classification. △ Less

Submitted 19 August, 2019; originally announced August 2019.

Comments: Submitted to 2019 IEEE International Conference on Big Data

arXiv:1908.05540 [pdf, other]

To complete or to estimate, that is the question: A Multi-Task Approach to Depth Completion and Monocular Depth Estimation

Authors: Amir Atapour-Abarghouei, Toby P. Breckon

Abstract: Robust three-dimensional scene understanding is now an ever-growing area of research highly relevant in many real-world applications such as autonomous driving and robotic navigation. In this paper, we propose a multi-task learning-based model capable of performing two tasks:- sparse depth completion (i.e. generating complete dense scene depth given a sparse depth image as the input) and monocular… ▽ More Robust three-dimensional scene understanding is now an ever-growing area of research highly relevant in many real-world applications such as autonomous driving and robotic navigation. In this paper, we propose a multi-task learning-based model capable of performing two tasks:- sparse depth completion (i.e. generating complete dense scene depth given a sparse depth image as the input) and monocular depth estimation (i.e. predicting scene depth from a single RGB image) via two sub-networks jointly trained end to end using data randomly sampled from a publicly available corpus of synthetic and real-world images. The first sub-network generates a sparse depth image by learning lower level features from the scene and the second predicts a full dense depth image of the entire scene, leading to a better geometric and contextual understanding of the scene and, as a result, superior performance of the approach. The entire model can be used to infer complete scene depth from a single RGB image or the second network can be used alone to perform depth completion given a sparse depth input. Using adversarial training, a robust objective function, a deep architecture relying on skip connections and a blend of synthetic and real-world training data, our approach is capable of producing superior high quality scene depth. Extensive experimental evaluation demonstrates the efficacy of our approach compared to contemporary state-of-the-art techniques across both problem domains. △ Less

Submitted 15 August, 2019; originally announced August 2019.

Comments: International Conference on 3D Vision (3DV) 2019

arXiv:1907.08320 [pdf, other]

Multi-Task Regression-based Learning for Autonomous Unmanned Aerial Vehicle Flight Control within Unstructured Outdoor Environments

Authors: Bruna G. Maciel-Pearson, Samet Akcay, Amir Atapour-Abarghouei, Christopher Holder, Toby P. Breckon

Abstract: Increased growth in the global Unmanned Aerial Vehicles (UAV) (drone) industry has expanded possibilities for fully autonomous UAV applications. A particular application which has in part motivated this research is the use of UAV in wide area search and surveillance operations in unstructured outdoor environments. The critical issue with such environments is the lack of structured features that co… ▽ More Increased growth in the global Unmanned Aerial Vehicles (UAV) (drone) industry has expanded possibilities for fully autonomous UAV applications. A particular application which has in part motivated this research is the use of UAV in wide area search and surveillance operations in unstructured outdoor environments. The critical issue with such environments is the lack of structured features that could aid in autonomous flight, such as road lines or paths. In this paper, we propose an End-to-End Multi-Task Regression-based Learning approach capable of defining flight commands for navigation and exploration under the forest canopy, regardless of the presence of trails or additional sensors (i.e. GPS). Training and testing are performed using a software in the loop pipeline which allows for a detailed evaluation against state-of-the-art pose estimation techniques. Our extensive experiments demonstrate that our approach excels in performing dense exploration within the required search perimeter, is capable of covering wider search regions, generalises to previously unseen and unexplored environments and outperforms contemporary state-of-the-art techniques. △ Less

Submitted 18 July, 2019; originally announced July 2019.

arXiv:1903.10764 [pdf, other]

Veritatem Dies Aperit- Temporally Consistent Depth Prediction Enabled by a Multi-Task Geometric and Semantic Scene Understanding Approach

Authors: Amir Atapour-Abarghouei, Toby P. Breckon

Abstract: Robust geometric and semantic scene understanding is ever more important in many real-world applications such as autonomous driving and robotic navigation. In this paper, we propose a multi-task learning-based approach capable of jointly performing geometric and semantic scene understanding, namely depth prediction (monocular depth estimation and depth completion) and semantic scene segmentation.… ▽ More Robust geometric and semantic scene understanding is ever more important in many real-world applications such as autonomous driving and robotic navigation. In this paper, we propose a multi-task learning-based approach capable of jointly performing geometric and semantic scene understanding, namely depth prediction (monocular depth estimation and depth completion) and semantic scene segmentation. Within a single temporally constrained recurrent network, our approach uniquely takes advantage of a complex series of skip connections, adversarial training and the temporal constraint of sequential frame recurrence to produce consistent depth and semantic class labels simultaneously. Extensive experimental evaluation demonstrates the efficacy of our approach compared to other contemporary state-of-the-art techniques. △ Less

Submitted 19 July, 2019; v1 submitted 26 March, 2019; originally announced March 2019.

Comments: CVPR 2019

arXiv:1901.08954 [pdf, other]

Skip-GANomaly: Skip Connected and Adversarially Trained Encoder-Decoder Anomaly Detection

Authors: Samet Akçay, Amir Atapour-Abarghouei, Toby P. Breckon

Abstract: Despite inherent ill-definition, anomaly detection is a research endeavor of great interest within machine learning and visual scene understanding alike. Most commonly, anomaly detection is considered as the detection of outliers within a given data distribution based on some measure of normality. The most significant challenge in real-world anomaly detection problems is that available data is hig… ▽ More Despite inherent ill-definition, anomaly detection is a research endeavor of great interest within machine learning and visual scene understanding alike. Most commonly, anomaly detection is considered as the detection of outliers within a given data distribution based on some measure of normality. The most significant challenge in real-world anomaly detection problems is that available data is highly imbalanced towards normality (i.e. non-anomalous) and contains a most a subset of all possible anomalous samples - hence limiting the use of well-established supervised learning methods. By contrast, we introduce an unsupervised anomaly detection model, trained only on the normal (non-anomalous, plentiful) samples in order to learn the normality distribution of the domain and hence detect abnormality based on deviation from this model. Our proposed approach employs an encoder-decoder convolutional neural network with skip connections to thoroughly capture the multi-scale distribution of the normal data distribution in high-dimensional image space. Furthermore, utilizing an adversarial training scheme for this chosen architecture provides superior reconstruction both within high-dimensional image space and a lower-dimensional latent vector space encoding. Minimizing the reconstruction error metric within both the image and hidden vector spaces during training aids the model to learn the distribution of normality as required. Higher reconstruction metrics during subsequent test and deployment are thus indicative of a deviation from this normal distribution, hence indicative of an anomaly. Experimentation over established anomaly detection benchmarks and challenging real-world datasets, within the context of X-ray security screening, shows the unique promise of such a proposed approach. △ Less

Submitted 25 January, 2019; originally announced January 2019.

Comments: Conference Submission. 8 pages, 9 figures

arXiv:1901.07429 [pdf, other]

doi 10.1109/IJCNN.2019.8852227

Simulating Brain Signals: Creating Synthetic EEG Data via Neural-Based Generative Models for Improved SSVEP Classification

Authors: Nik Khadijah Nik Aznan, Amir Atapour-Abarghouei, Stephen Bonner, Jason Connolly, Noura Al Moubayed, Toby Breckon

Abstract: Despite significant recent progress in the area of Brain-Computer Interface (BCI), there are numerous shortcomings associated with collecting Electroencephalography (EEG) signals in real-world environments. These include, but are not limited to, subject and session data variance, long and arduous calibration processes and predictive generalisation issues across different subjects or sessions. This… ▽ More Despite significant recent progress in the area of Brain-Computer Interface (BCI), there are numerous shortcomings associated with collecting Electroencephalography (EEG) signals in real-world environments. These include, but are not limited to, subject and session data variance, long and arduous calibration processes and predictive generalisation issues across different subjects or sessions. This implies that many downstream applications, including Steady State Visual Evoked Potential (SSVEP) based classification systems, can suffer from a shortage of reliable data. Generating meaningful and realistic synthetic data can therefore be of significant value in circumventing this problem. We explore the use of modern neural-based generative models trained on a limited quantity of EEG data collected from different subjects to generate supplementary synthetic EEG signal vectors, subsequently utilised to train an SSVEP classifier. Extensive experimental analysis demonstrates the efficacy of our generated data, leading to improvements across a variety of evaluations, with the crucial task of cross-subject generalisation improving by over 35% with the use of such synthetic data. △ Less

Submitted 3 April, 2019; v1 submitted 15 January, 2019; originally announced January 2019.

Comments: Accepted as a full paper at International Joint Conference on Neural Network (IJCNN) 2019

arXiv:1809.05375 [pdf, other]

Style Augmentation: Data Augmentation via Style Randomization

Authors: Philip T. Jackson, Amir Atapour-Abarghouei, Stephen Bonner, Toby Breckon, Boguslaw Obara

Abstract: We introduce style augmentation, a new form of data augmentation based on random style transfer, for improving the robustness of convolutional neural networks (CNN) over both classification and regression based tasks. During training, our style augmentation randomizes texture, contrast and color, while preserving shape and semantic content. This is accomplished by adapting an arbitrary style trans… ▽ More We introduce style augmentation, a new form of data augmentation based on random style transfer, for improving the robustness of convolutional neural networks (CNN) over both classification and regression based tasks. During training, our style augmentation randomizes texture, contrast and color, while preserving shape and semantic content. This is accomplished by adapting an arbitrary style transfer network to perform style randomization, by sampling input style embeddings from a multivariate normal distribution instead of inferring them from a style image. In addition to standard classification experiments, we investigate the effect of style augmentation (and data augmentation generally) on domain transfer tasks. We find that data augmentation significantly improves robustness to domain shift, and can be used as a simple, domain agnostic alternative to domain adaptation. Comparing style augmentation against a mix of seven traditional augmentation techniques, we find that it can be readily combined with them to improve network performance. We validate the efficacy of our technique with domain transfer experiments in classification and monocular depth estimation, illustrating consistent improvements in generalization. △ Less

Submitted 12 April, 2019; v1 submitted 14 September, 2018; originally announced September 2018.

arXiv:1805.06725 [pdf, other]

GANomaly: Semi-Supervised Anomaly Detection via Adversarial Training

Authors: Samet Akcay, Amir Atapour-Abarghouei, Toby P. Breckon

Abstract: Anomaly detection is a classical problem in computer vision, namely the determination of the normal from the abnormal when datasets are highly biased towards one class (normal) due to the insufficient sample size of the other class (abnormal). While this can be addressed as a supervised learning problem, a significantly more challenging problem is that of detecting the unknown/unseen anomaly case… ▽ More Anomaly detection is a classical problem in computer vision, namely the determination of the normal from the abnormal when datasets are highly biased towards one class (normal) due to the insufficient sample size of the other class (abnormal). While this can be addressed as a supervised learning problem, a significantly more challenging problem is that of detecting the unknown/unseen anomaly case that takes us instead into the space of a one-class, semi-supervised learning paradigm. We introduce such a novel anomaly detection model, by using a conditional generative adversarial network that jointly learns the generation of high-dimensional image space and the inference of latent space. Employing encoder-decoder-encoder sub-networks in the generator network enables the model to map the input image to a lower dimension vector, which is then used to reconstruct the generated output image. The use of the additional encoder network maps this generated image to its latent representation. Minimizing the distance between these images and the latent vectors during training aids in learning the data distribution for the normal samples. As a result, a larger distance metric from this learned data distribution at inference time is indicative of an outlier from that distribution - an anomaly. Experimentation over several benchmark datasets, from varying domains, shows the model efficacy and superiority over previous state-of-the-art approaches. △ Less

Submitted 13 November, 2018; v1 submitted 17 May, 2018; originally announced May 2018.

Comments: ACCV 2018

Showing 1–28 of 28 results for author: Atapour-Abarghouei, A