-
Defense against Joint Poison and Evasion Attacks: A Case Study of DERMS
Authors:
Zain ul Abdeen,
Padmaksha Roy,
Ahmad Al-Tawaha,
Rouxi Jia,
Laura Freeman,
Peter Beling,
Chen-Ching Liu,
Alberto Sangiovanni-Vincentelli,
Ming **
Abstract:
There is an upward trend of deploying distributed energy resource management systems (DERMS) to control modern power grids. However, DERMS controller communication lines are vulnerable to cyberattacks that could potentially impact operational reliability. While a data-driven intrusion detection system (IDS) can potentially thwart attacks during deployment, also known as the evasion attack, the training of the detection algorithm may be corrupted by adversarial data injected into the database, also known as the poisoning attack. In this paper, we propose the first framework of IDS that is robust against joint poisoning and evasion attacks. We formulate the defense mechanism as a bilevel optimization, where the inner and outer levels deal with attacks that occur during training time and testing time, respectively. We verify the robustness of our method on the IEEE-13 bus feeder model against a diverse set of poisoning and evasion attack scenarios. The results indicate that our proposed method outperforms the baseline technique in terms of accuracy, precision, and recall for intrusion detection.
Submitted 5 May, 2024;
originally announced May 2024.
-
OffRAMPS: An FPGA-based Intermediary for Analysis and Modification of Additive Manufacturing Control Systems
Authors:
Jason Blocklove,
Md Raz,
Prithwish Basu Roy,
Hammond Pearce,
Prashanth Krishnamurthy,
Farshad Khorrami,
Ramesh Karri
Abstract:
Cybersecurity threats in Additive Manufacturing (AM) are an increasing concern as AM adoption continues to grow. AM is now being used for parts in the aerospace, transportation, and medical domains. Threat vectors which allow for part compromise are particularly concerning, as any failure in these domains would have life-threatening consequences. A major challenge to investigation of AM part-compromises comes from the difficulty in evaluating and benchmarking both identified threat vectors as well as methods for detecting adversarial actions. In this work, we introduce a generalized platform for systematic analysis of attacks against and defenses for 3D printers. Our "OFFRAMPS" platform is based on the open-source 3D printer control board "RAMPS." OFFRAMPS allows analysis, recording, and modification of all control signals and I/O for a 3D printer. We show the efficacy of OFFRAMPS by presenting a series of case studies based on several Trojans, including ones identified in the literature, and show that OFFRAMPS can both emulate and detect these attacks, i.e., it can both change and detect arbitrary changes to the g-code print commands.
Submitted 23 April, 2024;
originally announced April 2024.
-
Feature Reweighting for EEG-based Motor Imagery Classification
Authors:
Taveena Lotey,
Prateek Keserwani,
Debi Prosad Dogra,
Partha Pratim Roy
Abstract:
Classification of motor imagery (MI) using non-invasive electroencephalographic (EEG) signals is a critical objective as it is used to predict the intention of limb movements of a subject. In recent research, convolutional neural network (CNN) based methods have been widely utilized for MI-EEG classification. The challenges of training neural networks for MI-EEG signals classification include low signal-to-noise ratio, non-stationarity, non-linearity, and high complexity of EEG signals. The features computed by CNN-based networks on the highly noisy MI-EEG signals contain irrelevant information. Subsequently, the feature maps of the CNN-based network computed from the noisy and irrelevant features contain irrelevant information. Thus, many non-contributing features often mislead the neural network training and degrade the classification performance. Hence, a novel feature reweighting approach is proposed to address this issue. The proposed method gives a noise reduction mechanism named feature reweighting module that suppresses irrelevant temporal and channel feature maps. The feature reweighting module of the proposed method generates scores that reweight the feature maps to reduce the impact of irrelevant information. Experimental results show that the proposed method significantly improved the classification of MI-EEG signals of Physionet EEG-MMIDB and BCI Competition IV 2a datasets by a margin of 9.34% and 3.82%, respectively, compared to the state-of-the-art methods.
Submitted 29 July, 2023;
originally announced August 2023.
-
Hybrid Transformer Network for Different Horizons-based Enriched Wind Speed Forecasting
Authors:
Dr. M. Madhiarasan,
Prof. Partha Pratim Roy
Abstract:
Highly accurate wind speed forecasting over different horizons facilitates a better modern power system. This paper proposes a novel astute hybrid wind speed forecasting model and applies it to different horizons. The proposed hybrid forecasting model decomposes the original wind speed data into IMFs (Intrinsic Mode Functions) using Improved Complete Ensemble Empirical Mode Decomposition with Adaptive Noise (ICEEMDAN). The subseries obtained from ICEEMDAN are fed to transformer networks. Each transformer network computes a forecast subseries and passes it to the fusion phase. The primary wind speed forecast is obtained by fusing the individual transformer network forecast subseries. Residual error values are then estimated, and the errors are predicted using a multilayer perceptron neural network. The forecast error is added to the primary wind speed forecast to improve forecasting accuracy. Comparative analysis on a real-time wind farm dataset from Kethanur, India shows the superior performance of the proposed ICEEMDAN-TNF-MLPN-RECS hybrid model over state-of-the-art methods, with MAE=1.7096*10^-07, MAPE=2.8416*10^-06, MRE=2.8416*10^-08, MSE=5.0206*10^-14, and RMSE=2.2407*10^-07 for case study 1 and MAE=6.1565*10^-07, MAPE=9.5005*10^-06, MRE=9.5005*10^-08, MSE=8.9289*10^-13, and RMSE=9.4493*10^-07 for case study 2, which reduces the burden on the power system engineer.
Submitted 7 April, 2022;
originally announced April 2022.
-
Efficacy of Transformer Networks for Classification of Raw EEG Data
Authors:
Gourav Siddhad,
Anmol Gupta,
Debi Prosad Dogra,
Partha Pratim Roy
Abstract:
With the unprecedented success of transformer networks in natural language processing (NLP), recently, they have been successfully adapted to areas like computer vision, generative adversarial networks (GAN), and reinforcement learning. Classifying electroencephalogram (EEG) data has been challenging and researchers have been overly dependent on pre-processing and hand-crafted feature extraction. Despite having achieved automated feature extraction in several other domains, deep learning has not yet been accomplished for EEG. In this paper, the efficacy of the transformer network for the classification of raw EEG data (cleaned and pre-processed) is explored. The performance of transformer networks was evaluated on a local (age and gender data) and a public dataset (STEW). First, a classifier using a transformer network is built to classify the age and gender of a person with raw resting-state EEG data. Second, the classifier is tuned for mental workload classification with open access raw multi-tasking mental workload EEG data (STEW). The network achieves an accuracy comparable to state-of-the-art accuracy on both the local (Age and Gender dataset; 94.53% (gender) and 87.79% (age)) and the public (STEW dataset; 95.28% (two workload levels) and 88.72% (three workload levels)) dataset. The accuracy values have been achieved using raw EEG data without feature extraction. Results indicate that the transformer-based deep learning models can successfully abate the need for heavy feature-extraction of EEG data for successful classification.
Submitted 8 February, 2022;
originally announced February 2022.
-
TorchAudio: Building Blocks for Audio and Speech Processing
Authors:
Yao-Yuan Yang,
Moto Hira,
Zhaoheng Ni,
Anjali Chourdia,
Artyom Astafurov,
Caroline Chen,
Ching-Feng Yeh,
Christian Puhrsch,
David Pollack,
Dmitriy Genzel,
Donny Greenberg,
Edward Z. Yang,
Jason Lian,
Jay Mahadeokar,
Jeff Hwang,
Ji Chen,
Peter Goldsborough,
Prabhat Roy,
Sean Narenthiran,
Shinji Watanabe,
Soumith Chintala,
Vincent Quenneville-Bélair,
Yangyang Shi
Abstract:
This document describes version 0.10 of TorchAudio: building blocks for machine learning applications in the audio and speech processing domain. The objective of TorchAudio is to accelerate the development and deployment of machine learning applications for researchers and engineers by providing off-the-shelf building blocks. The building blocks are designed to be GPU-compatible, automatically differentiable, and production-ready. TorchAudio can be easily installed from Python Package Index repository and the source code is publicly available under a BSD-2-Clause License (as of September 2021) at https://github.com/pytorch/audio. In this document, we provide an overview of the design principles, functionalities, and benchmarks of TorchAudio. We also benchmark our implementation of several audio and speech operations and models. We verify through the benchmarks that our implementations of various operations and models are valid and perform similarly to other publicly available implementations.
Submitted 16 February, 2022; v1 submitted 28 October, 2021;
originally announced October 2021.
-
End-to-end Triplet Loss based Emotion Embedding System for Speech Emotion Recognition
Authors:
Puneet Kumar,
Sidharth Jain,
Balasubramanian Raman,
Partha Pratim Roy,
Masakazu Iwamura
Abstract:
In this paper, an end-to-end neural embedding system based on triplet loss and residual learning has been proposed for speech emotion recognition. The proposed system learns the embeddings from the emotional information of the speech utterances. The learned embeddings are used to recognize the emotions portrayed by given speech samples of various lengths. The proposed system implements Residual Neural Network architecture. It is trained using softmax pre-training and triplet loss function. The weights between the fully connected and embedding layers of the trained network are used to calculate the embedding values. The embedding representations of various emotions are mapped onto a hyperplane, and the angles among them are computed using the cosine similarity. These angles are utilized to classify a new speech sample into its appropriate emotion class. The proposed system has demonstrated 91.67% and 64.44% accuracy while recognizing emotions for RAVDESS and IEMOCAP dataset, respectively.
Submitted 13 October, 2020;
originally announced October 2020.
-
Fast Griffin Lim based Waveform Generation Strategy for Text-to-Speech Synthesis
Authors:
Ankit Sharma,
Puneet Kumar,
Vikas Maddukuri,
Nagasai Madamshettib,
Kishore KG,
Sahit Sai Sriram Kavurub,
Balasubramanian Raman,
Partha Pratim Roy
Abstract:
The performance of text-to-speech (TTS) systems heavily depends on spectrogram to waveform generation, also known as the speech reconstruction phase. The time required for this phase is known as synthesis delay. In this paper, an approach to reduce speech synthesis delay has been proposed. It aims to enhance the TTS systems for real-time applications such as digital assistants, mobile phones, embedded devices, etc. The proposed approach applies the Fast Griffin Lim Algorithm (FGLA) instead of the Griffin Lim algorithm (GLA) as the vocoder in the speech synthesis phase. GLA and FGLA are both iterative, but FGLA converges faster than GLA. The proposed approach is tested on the LJSpeech, Blizzard and Tatoeba datasets, and the results for FGLA are compared against GLA and a neural Generative Adversarial Network (GAN) based vocoder. The performance is evaluated based on synthesis delay and speech quality. A 36.58% reduction in speech synthesis delay has been observed. The quality of the output speech has improved, as indicated by higher mean opinion scores (MOS) and faster convergence with FGLA as opposed to GLA.
Submitted 11 July, 2020;
originally announced July 2020.
-
Robust Tracking and Model Following Controller Based on Higher Order Sliding Mode Control and Observation: With an Application to MagLev System
Authors:
Siddhartha Ganguly,
Manas Kumar Bera,
Prasanta Roy
Abstract:
This paper deals with the design of a robust tracking and model following (RTMF) controller for linear time-invariant (LTI) systems with uncertainties. The controller is based on the second order sliding mode (SOSM) algorithm (super twisting), which is the most effective and popular in the family of higher order sliding modes (HOSM). The use of the super twisting algorithm (STA) eliminates the chattering problem encountered in traditional sliding mode control while retaining its robustness properties. The proposed robust tracking controller can guarantee the asymptotic stability of the tracking error in the presence of time-varying uncertain parameters and exogenous disturbances. Finally, this strategy is implemented on a magnetic levitation system (MagLev), which is inherently unstable and nonlinear. While implementing the proposed RTMF controller for the MagLev system, a super twisting observer (STO) is used to estimate the unknown state, i.e., the velocity of the ball, which is not directly available for measurement. It has been observed that the RTMF controller based on the STA-STO pair is not good enough to achieve SOSM for a chosen sliding surface using continuous control. As a remedy, a continuous RTMF controller based on STA is implemented with a higher order sliding mode observer (HOSMO). Simulated as well as experimental results are provided to illustrate the effectiveness of the proposed controller-observer pair.
Submitted 11 July, 2020;
originally announced July 2020.
-
Assisted music creation with Flow Machines: towards new categories of new
Authors:
François Pachet,
Pierre Roy,
Benoit Carré
Abstract:
This chapter reflects on about 10 years of research in AI-assisted music composition, in particular during the Flow Machines project. We reflect on the motivations for such a project, its background, its main results and impact, both technological and musical, several years after its completion. We conclude with a proposal for new categories of new, created by the many uses of AI techniques to generate novel material.
Submitted 4 January, 2021; v1 submitted 16 June, 2020;
originally announced June 2020.
-
A New Chaos and Permutation Based Algorithm for Image and Video Encryption
Authors:
Chinmaya Patnayak,
Pradipta Roy,
Bibekanand Patnaik
Abstract:
Images and video sequences carry large volumes of highly correlated and redundant data. Applications such as military and telecommunication systems require encryption methods to protect the data from unwanted access. In most cases, this requirement needs to be met in real time. In this paper, we propose a fast new Feistel-structured approach for image and video encryption based on a chaotic random sequence generator and a Permutation-Inverse Permutation (PIP) pixel transform. This approach utilizes mathematical functions and transforms with low complexity. At the same time, the algorithm incurs no drastic penalty in terms of encryption quality. This gives the algorithm promising scope for real-time applications and easy hardware implementation. MATLAB simulation of the algorithm establishes its high quality of encryption in terms of elevated entropy values and negligible correlation of the encrypted data with the original. Simulation results also show high sensitivity to slight variations in keys, ensuring high security.
Submitted 2 June, 2020;
originally announced June 2020.
-
Artificial Intelligence Assistance Significantly Improves Gleason Grading of Prostate Biopsies by Pathologists
Authors:
Wouter Bulten,
Maschenka Balkenhol,
Jean-Joël Awoumou Belinga,
Américo Brilhante,
Aslı Çakır,
Xavier Farré,
Katerina Geronatsiou,
Vincent Molinié,
Guilherme Pereira,
Paromita Roy,
Günter Saile,
Paulo Salles,
Ewout Schaafsma,
Joëlle Tschui,
Anne-Marie Vos,
Hester van Boven,
Robert Vink,
Jeroen van der Laak,
Christina Hulsbergen-van de Kaa,
Geert Litjens
Abstract:
While the Gleason score is the most important prognostic marker for prostate cancer patients, it suffers from significant observer variability. Artificial Intelligence (AI) systems, based on deep learning, have proven to achieve pathologist-level performance at Gleason grading. However, the performance of such systems can degrade in the presence of artifacts, foreign tissue, or other anomalies. Pathologists integrating their expertise with feedback from an AI system could result in a synergy that outperforms both the individual pathologist and the system. Despite the hype around AI assistance, existing literature on this topic within the pathology domain is limited. We investigated the value of AI assistance for grading prostate biopsies. A panel of fourteen observers graded 160 biopsies with and without AI assistance. Using AI, the agreement of the panel with an expert reference standard significantly increased (quadratically weighted Cohen's kappa, 0.799 vs 0.872; p=0.018). Our results show the added value of AI systems for Gleason grading, but more importantly, show the benefits of pathologist-AI synergy.
Submitted 11 February, 2020;
originally announced February 2020.
-
Smart Edition of MIDI Files
Authors:
Pierre Roy,
Francois Pachet
Abstract:
We address the issue of editing musical performance data, in particular MIDI files representing human musical performances. Editing such sequences raises specific issues due to the ambiguous nature of musical objects. The first source of ambiguity is that musicians naturally produce many deviations from the metrical frame. These deviations may be intentional or subconscious, but they play an important role in conveying the groove or feeling of a performance. Relations between musical elements are also usually implicit, creating even more ambiguity. A note is in relation with the surrounding notes in many possible ways: it can be part of a melodic pattern, it can also play a harmonic role with the simultaneous notes, or be a pedal-tone. All these aspects play an essential role that should be preserved, as much as possible, when editing musical sequences.
In this paper, we contribute specifically to the problem of editing non-quantized, metrical musical sequences represented as MIDI files. We first list a number of problems caused by the use of naive edit operations applied to performance data, using a motivating example. We then introduce a model, called Dancing MIDI, based on 1) two desirable, well-defined properties for edit operations and 2) two well-defined operations, Split and Concat, with an implementation. We show that our model formally satisfies the two properties, and that it prevents most of the problems that occur with naive edit operations on our motivating example, as well as on a real-world example using an automatic harmonizer.
Submitted 20 March, 2019;
originally announced March 2019.
-
The Skipping Behavior of Users of Music Streaming Services and its Relation to Musical Structure
Authors:
Nicola Montecchio,
Pierre Roy,
François Pachet
Abstract:
The behavior of users of music streaming services is investigated from the point of view of the temporal dimension of individual songs; specifically, the main object of the analysis is the point in time within a song at which users stop listening and start streaming another song ("skip"). The main contribution of this study is the ascertainment of a correlation between the distribution in time of skipping events and the musical structure of songs. It is also shown that such distribution is not only specific to the individual songs, but also independent of the cohort of users and, under stationary conditions, date of observation. Finally, user behavioral data is used to train a predictor of the musical structure of a song solely from its acoustic content; it is shown that the use of such data, available in large quantities to music streaming services, yields significant improvements in accuracy over the customary fashion of training this class of algorithms, in which only smaller amounts of hand-labeled data are available.
Submitted 12 March, 2019;
originally announced March 2019.
-
Effects of Degradations on Deep Neural Network Architectures
Authors:
Prasun Roy,
Subhankar Ghosh,
Saumik Bhattacharya,
Umapada Pal
Abstract:
Deep convolutional neural networks (CNN) have massively influenced recent advances in large-scale image classification. More recently, a dynamic routing algorithm with capsules (groups of neurons) has shown state-of-the-art recognition performance. However, the behavior of such networks in the presence of a degrading signal (noise) is mostly unexplored. An analytical study on different network architectures toward noise robustness is essential for selecting the appropriate model in a specific application scenario. This paper presents an extensive performance analysis of six deep architectures for image classification on six most common image degradation models. In this study, we have compared VGG-16, VGG-19, ResNet-50, Inception-v3, MobileNet and CapsuleNet architectures on Gaussian white, Gaussian color, salt-and-pepper, Gaussian blur, motion blur and JPEG compression noise models.
Submitted 29 March, 2023; v1 submitted 26 July, 2018;
originally announced July 2018.
-
Queuing Theory Guided Intelligent Traffic Scheduling through Video Analysis using Dirichlet Process Mixture Model
Authors:
Santhosh Kelathodi Kumaran,
Debi Prosad Dogra,
Partha Pratim Roy
Abstract:
Accurate prediction of traffic signal duration for roadway junctions is a challenging problem due to the dynamic nature of traffic flows. Though supervised learning can be used, parameters may vary across roadway junctions. In this paper, we present a computer vision guided expert system that can learn the departure rate of a given traffic junction modeled using traditional queuing theory. First, we temporally group the optical flow of the moving vehicles using a Dirichlet Process Mixture Model (DPMM). These groups are referred to as tracklets or temporal clusters. Tracklet features are then used to learn the dynamic behavior of a traffic junction, especially during on/off cycles of a signal. The proposed queuing theory based approach can predict the signal open duration for the next cycle with higher accuracy when compared with other popular features used for tracking. The hypothesis has been verified on two publicly available video datasets. The results reveal that the DPMM based features are better than existing tracking frameworks to estimate $μ$. Thus, signal duration prediction is more accurate when tested on these datasets. The method can be used for designing intelligent operator-independent traffic control systems for roadway junctions in cities and on highways.
Submitted 17 March, 2018;
originally announced March 2018.