-
The Brain Tumor Segmentation (BraTS) Challenge 2023: Brain MR Image Synthesis for Tumor Segmentation (BraSyn)
Authors:
Hongwei Bran Li,
Gian Marco Conte,
Syed Muhammad Anwar,
Florian Kofler,
Ivan Ezhov,
Koen van Leemput,
Marie Piraud,
Maria Diaz,
Byrone Cole,
Evan Calabrese,
Jeff Rudie,
Felix Meissen,
Maruf Adewole,
Anastasia Janas,
Anahita Fathi Kazerooni,
Dominic LaBella,
Ahmed W. Moawad,
Keyvan Farahani,
James Eddy,
Timothy Bergquist,
Verena Chung,
Russell Takeshi Shinohara,
Farouk Dako,
Walter Wiggins,
Zachary Reitman
, et al. (43 additional authors not shown)
Abstract:
Automated brain tumor segmentation methods have become well-established and reached performance levels offering clear clinical utility. These methods typically rely on four input magnetic resonance imaging (MRI) modalities: T1-weighted images with and without contrast enhancement, T2-weighted images, and FLAIR images. However, some sequences are often missing in clinical practice due to time constraints or image artifacts, such as patient motion. Consequently, the ability to substitute missing modalities and gain segmentation performance is highly desirable and necessary for the broader adoption of these algorithms in the clinical routine. In this work, we present the establishment of the Brain MR Image Synthesis Benchmark (BraSyn) in conjunction with the Medical Image Computing and Computer-Assisted Intervention (MICCAI) 2023. The primary objective of this challenge is to evaluate image synthesis methods that can realistically generate missing MRI modalities when multiple available images are provided. The ultimate aim is to facilitate automated brain tumor segmentation pipelines. The image dataset used in the benchmark is diverse and multi-modal, created through collaboration with various hospitals and research institutions.
Submitted 28 June, 2023; v1 submitted 15 May, 2023;
originally announced May 2023.
-
The Brain Tumor Segmentation (BraTS) Challenge 2023: Local Synthesis of Healthy Brain Tissue via Inpainting
Authors:
Florian Kofler,
Felix Meissen,
Felix Steinbauer,
Robert Graf,
Eva Oswald,
Ezequiel de la Rosa,
Hongwei Bran Li,
Ujjwal Baid,
Florian Hoelzl,
Oezguen Turgut,
Izabela Horvath,
Diana Waldmannstetter,
Christina Bukas,
Maruf Adewole,
Syed Muhammad Anwar,
Anastasia Janas,
Anahita Fathi Kazerooni,
Dominic LaBella,
Ahmed W Moawad,
Keyvan Farahani,
James Eddy,
Timothy Bergquist,
Verena Chung,
Russell Takeshi Shinohara,
Farouk Dako
, et al. (43 additional authors not shown)
Abstract:
A myriad of algorithms for the automatic analysis of brain MR images is available to support clinicians in their decision-making. For brain tumor patients, the image acquisition time series typically starts with a scan that is already pathological. This poses problems, as many algorithms are designed to analyze healthy brains and provide no guarantees for images featuring lesions. Examples include but are not limited to algorithms for brain anatomy parcellation, tissue segmentation, and brain extraction. To solve this dilemma, we introduce the BraTS 2023 inpainting challenge. Here, the participants' task is to explore inpainting techniques to synthesize healthy brain scans from lesioned ones. The following manuscript contains the task formulation, dataset, and submission procedure. Later it will be updated to summarize the findings of the challenge. The challenge is organized as part of the BraTS 2023 challenge hosted at the MICCAI 2023 conference in Vancouver, Canada.
Submitted 9 August, 2023; v1 submitted 15 May, 2023;
originally announced May 2023.
-
Ensemble prosody prediction for expressive speech synthesis
Authors:
Tian Huey Teh,
Vivian Hu,
Devang S Ram Mohan,
Zack Hodari,
Christopher G. R. Wallis,
Tomás Gomez Ibarrondo,
Alexandra Torresquintero,
James Leoni,
Mark Gales,
Simon King
Abstract:
Generating expressive speech with rich and varied prosody continues to be a challenge for Text-to-Speech. Most efforts have focused on sophisticated neural architectures intended to better model the data distribution. Yet, in evaluations it is generally found that no single model is preferred for all input texts. This suggests an approach that has rarely been used before for Text-to-Speech: an ensemble of models. We apply ensemble learning to prosody prediction. We construct simple ensembles of prosody predictors by varying either model architecture or model parameter values. To automatically select amongst the models in the ensemble when performing Text-to-Speech, we propose a novel, and computationally trivial, variance-based criterion. We demonstrate that even a small ensemble of prosody predictors yields useful diversity, which, combined with the proposed selection criterion, outperforms any individual model from the ensemble.
Submitted 3 April, 2023;
originally announced April 2023.
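The variance-based selection step mentioned in the abstract above can be illustrated with a minimal sketch. The abstract does not say which quantity the variance is taken over or in which direction the choice is made, so everything here is an assumption: the model names are hypothetical, each ensemble member is assumed to output an F0 contour as a list of floats, and the member with the largest variance is picked. This is not the authors' implementation.

```python
from statistics import pvariance


def select_by_variance(candidates):
    """Return the name of the candidate whose contour has the largest variance.

    `candidates` maps a (hypothetical) model name to its predicted F0 contour.
    Choosing the *largest* variance is an assumption made for illustration.
    """
    return max(candidates, key=lambda name: pvariance(candidates[name]))


# Toy predictions from three hypothetical ensemble members for one utterance.
predictions = {
    "model_a": [120.0, 121.0, 119.0, 120.0],  # nearly flat contour
    "model_b": [110.0, 140.0, 95.0, 130.0],   # widely varying contour
    "model_c": [118.0, 125.0, 115.0, 122.0],  # moderately varying contour
}

chosen = select_by_variance(predictions)
print(chosen)
```

The criterion costs one variance computation per ensemble member, which is consistent with the abstract's description of it as computationally trivial.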
-
Controllable Prosody Generation With Partial Inputs
Authors:
Dan Andrei Iliescu,
Devang Savita Ram Mohan,
Tian Huey Teh,
Zack Hodari
Abstract:
We address the problem of human-in-the-loop control for generating prosody in the context of text-to-speech synthesis. Controlling prosody is challenging because existing generative models lack an efficient interface through which users can modify the output quickly and precisely. To solve this, we introduce a novel framework whereby the user provides partial inputs and the generative model generates the missing features. We propose a model that is specifically designed to encode partial prosodic features and output complete audio. We show empirically that our model displays two essential qualities of a human-in-the-loop control mechanism: efficiency and robustness. With even a very small number of input values (~4), our model enables users to improve the quality of the output significantly in terms of listener preference (4:1).
Submitted 15 April, 2024; v1 submitted 14 March, 2023;
originally announced March 2023.
-
A note on power allocation for optimal capacity
Authors:
Shravan Mohan
Abstract:
The problems of determining the optimal power allocation, within maximum power bounds, to (i) maximize the minimum Shannon capacity, and (ii) minimize the weighted latency are considered. In the first case, the global optimum can be attained in polynomial time by solving a sequence of linear programs (LPs). In the second case, the original non-convex problem is replaced by a convex surrogate (a geometric program), using a functional approximation. Since the approximation error is relatively low, the optimum of the surrogate is close to the global optimum of the original problem. In either case, no assumption is made on the SINR range. The use of LPs and geometric programming makes the proposed algorithms numerically efficient. Computations are provided for corroboration.
Submitted 13 November, 2022;
originally announced November 2022.
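To see why a max-min capacity problem of the kind described in the abstract above can reduce to a sequence of LPs, here is one standard formulation. The notation is assumed, not taken from the paper: $n$ links, transmit powers $p_i$ with bounds $P_i^{\max}$, link gains $G_{ij}$, and noise powers $\sigma_i^2$. The paper's exact sequence of LPs may differ.

```latex
% Assumed notation: p_i = power of link i, G_{ij} = gain from transmitter j
% to receiver i, \sigma_i^2 = noise power at receiver i.
\[
  \mathrm{SINR}_i(p) = \frac{G_{ii}\,p_i}{\sum_{j \neq i} G_{ij}\,p_j + \sigma_i^2},
  \qquad
  C_i(p) = \log_2\bigl(1 + \mathrm{SINR}_i(p)\bigr).
\]
% Because \log_2(1+x) is increasing in x, maximizing \min_i C_i is
% equivalent to maximizing \min_i \mathrm{SINR}_i.
% For a fixed target \gamma \ge 0, the condition \mathrm{SINR}_i(p) \ge \gamma
% for all i is a set of linear inequalities in p:
\[
  G_{ii}\,p_i - \gamma \sum_{j \neq i} G_{ij}\,p_j \;\ge\; \gamma\,\sigma_i^2,
  \qquad
  0 \le p_i \le P_i^{\max},
  \qquad i = 1, \dots, n.
\]
```

Each fixed $\gamma$ therefore gives an LP feasibility problem, and bisection on $\gamma$ locates the largest feasible target, with no restriction on the SINR range.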
-
HeartSpot: Privatized and Explainable Data Compression for Cardiomegaly Detection
Authors:
Elvin Johnson,
Shreshta Mohan,
Alex Gaudio,
Asim Smailagic,
Christos Faloutsos,
Aurélio Campilho
Abstract:
Advances in data-driven deep learning for chest X-ray image analysis underscore the need for explainability, privacy, large datasets and significant computational resources. We frame privacy and explainability as a lossy single-image compression problem to reduce both computational and data requirements without training. For Cardiomegaly detection in chest X-ray images, we propose HeartSpot and four spatial bias priors. HeartSpot priors define how to sample pixels based on domain knowledge from medical literature and from machines. HeartSpot privatizes chest X-ray images by discarding up to 97% of pixels, such as those that reveal the shape of the thoracic cage, bones, small lesions and other sensitive features. HeartSpot priors are ante-hoc explainable and give a human-interpretable image of the preserved spatial features that clearly outlines the heart. HeartSpot offers strong compression, with up to 32x fewer pixels and 11x smaller filesize. Cardiomegaly detectors using HeartSpot are up to 9x faster to train or at least as accurate (up to +.01 AUC ROC) when compared to a baseline DenseNet121. HeartSpot is post-hoc explainable by re-using existing attribution methods without requiring access to the original non-privatized image. In summary, HeartSpot improves speed and accuracy, reduces image size, improves privacy and ensures explainability.
Source code: https://www.github.com/adgaudio/HeartSpot
Submitted 5 October, 2022;
originally announced October 2022.
-
A note on load balancing in DC microgrids
Authors:
Shravan Mohan,
Bharath Bhikkaji
Abstract:
A problem of load balancing in isolated DC microgrids is considered in this paper. Here, a DC load is fed by multiple heterogeneous DC sources, each of which is connected to the load via a boost converter. The gains of the DC-DC converters (DCCs) provide a means to control the division of load current amongst the DC sources. The primary objective of the control scheme is to minimise the total losses in the network, while maintaining the output voltage within a desired range, serving the load current demand and adhering to the VI-characteristics of the power sources. Under assumptions of concavity, monotonicity and piecewise linearity of the VI-characteristics, the problem is solved using a convex relaxation. It is shown that the relaxation is tight. Thus, the resulting algorithm is guaranteed to reach global optimality in a numerically efficient manner. Simulations are provided for corroboration.
Submitted 26 April, 2022;
originally announced April 2022.
-
Federated Learning Enables Big Data for Rare Cancer Boundary Detection
Authors:
Sarthak Pati,
Ujjwal Baid,
Brandon Edwards,
Micah Sheller,
Shih-Han Wang,
G Anthony Reina,
Patrick Foley,
Alexey Gruzdev,
Deepthi Karkada,
Christos Davatzikos,
Chiharu Sako,
Satyam Ghodasara,
Michel Bilello,
Suyash Mohan,
Philipp Vollmuth,
Gianluca Brugnara,
Chandrakanth J Preetha,
Felix Sahm,
Klaus Maier-Hein,
Maximilian Zenk,
Martin Bendszus,
Wolfgang Wick,
Evan Calabrese,
Jeffrey Rudie,
Javier Villanueva-Meyer
, et al. (254 additional authors not shown)
Abstract:
Although machine learning (ML) has shown promise in numerous domains, there are concerns about generalizability to out-of-sample data. This is currently addressed by centrally sharing ample, and importantly diverse, data from multiple sites. However, such centralization is challenging to scale (or even infeasible) due to various limitations. Federated ML (FL) provides an alternative to train accurate and generalizable ML models, by only sharing numerical model updates. Here we present findings from the largest FL study to date, involving data from 71 healthcare institutions across 6 continents, to generate an automatic tumor boundary detector for the rare disease of glioblastoma, utilizing the largest dataset of such patients ever used in the literature (25,256 MRI scans from 6,314 patients). We demonstrate a 33% improvement over a publicly trained model to delineate the surgically targetable tumor, and 23% improvement over the tumor's entire extent. We anticipate our study to: 1) enable more studies in healthcare informed by large and diverse data, ensuring meaningful results for rare diseases and underrepresented populations, 2) facilitate further quantitative analyses for glioblastoma via performance optimization of our consensus model for eventual public release, and 3) demonstrate the effectiveness of FL at such scale and task complexity as a paradigm shift for multi-site collaborations, alleviating the need for data sharing.
Submitted 25 April, 2022; v1 submitted 22 April, 2022;
originally announced April 2022.
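The idea of training by "only sharing numerical model updates", as the abstract above puts it, can be sketched with sample-weighted federated averaging. This is a generic illustration and not necessarily the aggregation scheme used in the study: the function name, the two sites, and their parameter vectors are all hypothetical.

```python
def federated_average(updates):
    """Combine per-site parameters into one global parameter vector.

    `updates` is a list of (num_samples, weights) pairs, one per site.
    Each site's weights are scaled by its share of the total sample count,
    so only these numbers, never the underlying data, leave a site.
    """
    total = sum(n for n, _ in updates)
    dim = len(updates[0][1])
    return [sum(n * w[i] for n, w in updates) / total for i in range(dim)]


# Two hypothetical sites with different amounts of local data.
site_updates = [
    (100, [0.0, 2.0]),
    (300, [4.0, 6.0]),
]

global_weights = federated_average(site_updates)
print(global_weights)
```

In a full training loop the aggregated vector would be sent back to the sites for another round of local training, and the cycle would repeat.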
-
Attention W-Net: Improved Skip Connections for better Representations
Authors:
Shikhar Mohan,
Saumik Bhattacharya,
Sayantari Ghosh
Abstract:
Segmentation of macro- and microvascular structures in fundoscopic retinal images plays a crucial role in the detection of multiple retinal and systemic diseases, yet it is a difficult problem to solve. Most neural network approaches face several issues, such as too few parameters, overfitting and/or incompatibility between internal feature spaces. We propose Attention W-Net, a new U-Net based architecture for retinal vessel segmentation, to address these problems. This architecture has two main contributions: an Attention Block and regularisation measures. Our Attention Block uses attention between encoder and decoder features, resulting in higher compatibility upon addition. Our regularisation measures include augmentation and modifications to the ResNet Block used, which greatly reduce overfitting. We observe an F1 and AUC of 0.8407 and 0.9833 on the DRIVE dataset and of 0.8174 and 0.9865 on the CHASE-DB1 dataset, a sizeable improvement over the backbone as well as competitive performance among contemporary state-of-the-art methods.
Submitted 29 June, 2022; v1 submitted 17 October, 2021;
originally announced October 2021.
-
Comparative assessment of typical control realizations of grid forming converters based on their voltage source behaviour
Authors:
Kanakesh Vatta Kkuni,
Sibin Mohan,
Guangya Yang,
Wilsun Xu
Abstract:
Converters whose control functions provide capabilities similar to those of synchronous generators are referred to as grid forming converters (GFC). Like a synchronous machine, a grid forming converter is expected to behave as a voltage source behind an impedance beyond the control bandwidth. However, GFC realizations have differed, with some utilizing inner current and voltage controllers while others do not. This paper studies the impact of the inner loop on the grid forming converter's ability to behave as a voltage source behind an impedance. Three of the most popular GFC structures, 1) GFC with cascaded voltage and current control, 2) with inner current control, 3) with no inner loop, are chosen for the comparison. The analysis reveals that a MW-level GFC with inner loops could potentially become unstable in a weak power system. Additionally, the GFC with cascaded control can only operate stably within a narrow range of network impedances. Furthermore, it is shown that the slow response of the cascaded inner loop can affect dynamic reactive and active power sharing.
Submitted 18 June, 2021;
originally announced June 2021.
-
Ctrl-P: Temporal Control of Prosodic Variation for Speech Synthesis
Authors:
Devang S Ram Mohan,
Vivian Hu,
Tian Huey Teh,
Alexandra Torresquintero,
Christopher G. R. Wallis,
Marlene Staib,
Lorenzo Foglianti,
Jiameng Gao,
Simon King
Abstract:
Text does not fully specify the spoken form, so text-to-speech models must be able to learn from speech data that vary in ways not explained by the corresponding text. One way to reduce the amount of unexplained variation in training data is to provide acoustic information as an additional learning signal. When generating speech, modifying this acoustic information enables multiple distinct renditions of a text to be produced.
Since much of the unexplained variation is in the prosody, we propose a model that generates speech explicitly conditioned on the three primary acoustic correlates of prosody: $F_{0}$, energy and duration. The model is flexible about how the values of these features are specified: they can be externally provided, or predicted from text, or predicted then subsequently modified.
Compared to a model that employs a variational auto-encoder to learn unsupervised latent features, our model provides more interpretable, temporally-precise, and disentangled control. When automatically predicting the acoustic features from text, it generates speech that is more natural than that from a Tacotron 2 model with reference encoder. Subsequent human-in-the-loop modification of the predicted acoustic features can significantly further increase naturalness.
Submitted 15 June, 2021;
originally announced June 2021.
-
ADEPT: A Dataset for Evaluating Prosody Transfer
Authors:
Alexandra Torresquintero,
Tian Huey Teh,
Christopher G. R. Wallis,
Marlene Staib,
Devang S Ram Mohan,
Vivian Hu,
Lorenzo Foglianti,
Jiameng Gao,
Simon King
Abstract:
Text-to-speech is now able to achieve near-human naturalness and research focus has shifted to increasing expressivity. One popular method is to transfer the prosody from a reference speech sample. There have been considerable advances in using prosody transfer to generate more expressive speech, but the field lacks a clear definition of what successful prosody transfer means and a method for measuring it.
We introduce a dataset of prosodically-varied reference natural speech samples for evaluating prosody transfer. The samples include global variations reflecting emotion and interpersonal attitude, and local variations reflecting topical emphasis, propositional attitude, syntactic phrasing and marked tonicity. The corpus only includes prosodic variations that listeners are able to distinguish with reasonable accuracy, and we report these figures as a benchmark against which text-to-speech prosody transfer can be compared.
We conclude the paper with a demonstration of our proposed evaluation methodology, using the corpus to evaluate two text-to-speech models that perform prosody transfer.
Submitted 21 July, 2021; v1 submitted 15 June, 2021;
originally announced June 2021.
-
Developing and Evaluating Deep Neural Network-based Denoising for Nanoparticle TEM Images with Ultra-low Signal-to-Noise
Authors:
Joshua L. Vincent,
Ramon Manzorro,
Sreyas Mohan,
Binh Tang,
Dev Y. Sheth,
Eero P. Simoncelli,
David S. Matteson,
Carlos Fernandez-Granda,
Peter A. Crozier
Abstract:
A deep convolutional neural network has been developed to denoise atomic-resolution TEM image datasets of nanoparticles acquired using direct electron counting detectors, for applications where the image signal is severely limited by shot noise. The network was applied to a model system of CeO2-supported Pt nanoparticles. We leverage multislice image simulations to generate a large and flexible dataset for training and testing the network. The proposed network outperforms state-of-the-art denoising methods by a significant margin both on simulated and experimental test data. Factors contributing to the performance are identified, including most importantly (a) the geometry of the images used during training and (b) the size of the network's receptive field. Through a gradient-based analysis, we investigate the mechanisms learned by the network to denoise experimental images. This shows that the network exploits global and local information in the noisy measurements, for example, by adapting its filtering approach when it encounters atomic-level defects at the nanoparticle surface. Extensive analysis has been done to characterize the network's ability to correctly predict the exact atomic structure at the nanoparticle surface. Finally, we develop an approach based on the log-likelihood ratio test that provides a quantitative measure of the agreement between the noisy observation and the atomic-level structure in the network-denoised image.
Submitted 17 March, 2021; v1 submitted 19 January, 2021;
originally announced January 2021.
-
Unsupervised Deep Video Denoising
Authors:
Dev Yashpal Sheth,
Sreyas Mohan,
Joshua L. Vincent,
Ramon Manzorro,
Peter A. Crozier,
Mitesh M. Khapra,
Eero P. Simoncelli,
Carlos Fernandez-Granda
Abstract:
Deep convolutional neural networks (CNNs) for video denoising are typically trained with supervision, assuming the availability of clean videos. However, in many applications, such as microscopy, noiseless videos are not available. To address this, we propose an Unsupervised Deep Video Denoiser (UDVD), a CNN architecture designed to be trained exclusively with noisy data. The performance of UDVD is comparable to the supervised state-of-the-art, even when trained only on a single short noisy video. We demonstrate the promise of our approach in real-world imaging applications by denoising raw video, fluorescence-microscopy and electron-microscopy data. In contrast to many current approaches to video denoising, UDVD does not require explicit motion compensation. This is advantageous because motion compensation is computationally expensive, and can be unreliable when the input data are noisy. A gradient-based analysis reveals that UDVD automatically adapts to local motion in the input noisy videos. Thus, the network learns to perform implicit motion compensation, even though it is only trained for denoising.
Submitted 19 August, 2021; v1 submitted 30 November, 2020;
originally announced November 2020.
-
Deep Denoising For Scientific Discovery: A Case Study In Electron Microscopy
Authors:
Sreyas Mohan,
Ramon Manzorro,
Joshua L. Vincent,
Binh Tang,
Dev Yashpal Sheth,
Eero P. Simoncelli,
David S. Matteson,
Peter A. Crozier,
Carlos Fernandez-Granda
Abstract:
Denoising is a fundamental challenge in scientific imaging. Deep convolutional neural networks (CNNs) provide the current state of the art in denoising natural images, where they produce impressive results. However, their potential has barely been explored in the context of scientific imaging. Denoising CNNs are typically trained on real natural images artificially corrupted with simulated noise. In contrast, in scientific applications, noiseless ground-truth images are usually not available. To address this issue, we propose a simulation-based denoising (SBD) framework, in which CNNs are trained on simulated images. We test the framework on data obtained from transmission electron microscopy (TEM), an imaging technique with widespread applications in material science, biology, and medicine. SBD outperforms existing techniques by a wide margin on a simulated benchmark dataset, as well as on real data. Apart from the denoised images, SBD generates likelihood maps to visualize the agreement between the structure of the denoised image and the observed data. Our results reveal shortcomings of state-of-the-art denoising architectures, such as their small field-of-view: substantially increasing the field-of-view of the CNNs allows them to exploit non-local periodic patterns in the data, which is crucial at high noise levels. In addition, we analyze the generalization capability of SBD, demonstrating that the trained networks are robust to variations of imaging parameters and of the underlying signal structure. Finally, we release the first publicly available benchmark dataset of TEM images, containing 18,000 examples.
Submitted 13 July, 2021; v1 submitted 24 October, 2020;
originally announced October 2020.
-
Phonological Features for 0-shot Multilingual Speech Synthesis
Authors:
Marlene Staib,
Tian Huey Teh,
Alexandra Torresquintero,
Devang S Ram Mohan,
Lorenzo Foglianti,
Raphael Lenain,
Jiameng Gao
Abstract:
Code-switching, the intra-utterance use of multiple languages, is prevalent across the world. Within text-to-speech (TTS), multilingual models have been found to enable code-switching. By modifying the linguistic input to sequence-to-sequence TTS, we show that code-switching is possible for languages unseen during training, even within monolingual models. We use a small set of phonological features derived from the International Phonetic Alphabet (IPA), such as vowel height and frontness, consonant place and manner. This allows the model topology to stay unchanged for different languages, and enables new, previously unseen feature combinations to be interpreted by the model. We show that this allows us to generate intelligible, code-switched speech in a new language at test time, including the approximation of sounds never seen in training.
Submitted 6 August, 2020;
originally announced August 2020.
-
Incremental Text to Speech for Neural Sequence-to-Sequence Models using Reinforcement Learning
Authors:
Devang S Ram Mohan,
Raphael Lenain,
Lorenzo Foglianti,
Tian Huey Teh,
Marlene Staib,
Alexandra Torresquintero,
Jiameng Gao
Abstract:
Modern approaches to text to speech require the entire input character sequence to be processed before any audio is synthesised. This latency limits the suitability of such models for time-sensitive tasks like simultaneous interpretation. Interleaving the action of reading a character with that of synthesising audio reduces this latency. However, the order of this sequence of interleaved actions varies across sentences, which raises the question of how the actions should be chosen. We propose a reinforcement learning based framework to train an agent to make this decision. We compare our performance against that of deterministic, rule-based systems. Our results demonstrate that our agent successfully balances the trade-off between the latency of audio generation and the quality of synthesised audio. More broadly, we show that neural sequence-to-sequence models can be adapted to run in an incremental manner.
Submitted 7 August, 2020;
originally announced August 2020.
-
Optimal Switching of Controlled Rectifiers
Authors:
Shravan Mohan
Abstract:
This paper discusses a linear programming approach for designing switching signals for controlled rectifiers to achieve low total harmonic distortion in the input current and output voltage. The focus here is on fully controlled rectifiers made with four-quadrant MOSFET-based switches. This topology, unlike thyristor-based rectifiers, can be turned ON or OFF at any time. A further assumption made here is that the current drawn by the load is constant. The basic idea for designing the waveform is to first discretize one period of it in time. This discretization, along with Parseval's identity, leads to a linear programming formulation for minimizing a weighted sum of the total harmonic distortions of the input current and the output voltages. The LPs so obtained can be solved efficiently using standard solvers to obtain the switching instants. The method can be used for both single-phase and three-phase rectifiers. Simulations are provided for corroboration.
Submitted 25 February, 2020;
originally announced February 2020.
-
Towards Label-Free 3D Segmentation of Optical Coherence Tomography Images of the Optic Nerve Head Using Deep Learning
Authors:
Sripad Krishna Devalla,
Tan Hung Pham,
Satish Kumar Panda,
Liang Zhang,
Giridhar Subramanian,
Anirudh Swaminathan,
Chin Zhi Yun,
Mohan Rajan,
Sujatha Mohan,
Ramaswami Krishnadas,
Vijayalakshmi Senthil,
John Mark S. de Leon,
Tin A. Tun,
Ching-Yu Cheng,
Leopold Schmetterer,
Shamira Perera,
Tin Aung,
Alexandre H. Thiery,
Michael J. A. Girard
Abstract:
Since the introduction of optical coherence tomography (OCT), it has been possible to study the complex 3D morphological changes of the optic nerve head (ONH) tissues that occur along with the progression of glaucoma. Although several deep learning (DL) techniques have been recently proposed for the automated extraction (segmentation) and quantification of these morphological changes, the device specific nature and the difficulty in preparing manual segmentations (training data) limit their clinical adoption. With several new manufacturers and next-generation OCT devices entering the market, the complexity in deploying DL algorithms clinically is only increasing. To address this, we propose a DL based 3D segmentation framework that is easily translatable across OCT devices in a label-free manner (i.e. without the need to manually re-segment data for each device). Specifically, we developed 2 sets of DL networks. The first (referred to as the enhancer) was able to enhance OCT image quality from 3 OCT devices, and harmonized image-characteristics across these devices. The second performed 3D segmentation of 6 important ONH tissue layers. We found that the use of the enhancer was critical for our segmentation network to achieve device independency. In other words, our 3D segmentation network trained on any of 3 devices successfully segmented ONH tissue layers from the other two devices with high performance (Dice coefficients > 0.92). With such an approach, we could automatically segment images from new OCT devices without ever needing manual segmentation data from such devices.
Submitted 22 February, 2020;
originally announced February 2020.
-
Knee Cartilage Segmentation Using Diffusion-Weighted MRI
Authors:
Alejandra Duarte,
Chaitra V. Hegde,
Aakash Kaku,
Sreyas Mohan,
José G. Raya
Abstract:
The integrity of articular cartilage is a crucial aspect in the early diagnosis of osteoarthritis (OA). Many novel MRI techniques have the potential to assess compositional changes of the cartilage extracellular matrix. Among these techniques, diffusion tensor imaging (DTI) of cartilage provides a simultaneous assessment of the two principal components of the solid matrix: collagen structure and proteoglycan concentration. DTI, like any other compositional MRI technique, requires a human expert to perform segmentation manually. Manual segmentation is error-prone and time-consuming (roughly a few hours per subject). We use an ensemble of modified U-Nets to automate this segmentation task. We benchmark our model against a human expert test-retest segmentation and conclude that our model is superior for patellar and tibial cartilage using the Dice score as the comparison metric. Finally, we perform a perturbation analysis to understand the sensitivity of our model to the different components of our input. We also provide confidence maps for the predictions so that radiologists can tweak the model predictions as required. The model has been deployed in practice. In conclusion, cartilage segmentation on DW-MRI images with modified U-Nets achieves accuracy that outperforms the human segmenter. Code is available at https://github.com/aakashrkaku/knee-cartilage-segmentation
Submitted 4 December, 2019;
originally announced December 2019.
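The Dice score used as the comparison metric above measures the overlap between two segmentations: Dice = 2|A ∩ B| / (|A| + |B|). A minimal sketch follows, assuming binary masks flattened into 0/1 sequences. The function name, the toy masks, and the convention of returning 1.0 for two empty masks are illustrative choices and are not taken from the paper or its repository.

```python
def dice_score(pred, target):
    """Dice overlap between two binary masks given as flat 0/1 sequences."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of elements")
    # Voxels labelled foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    # Sum of the two foreground sizes.
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    if total == 0:
        # Both masks are empty; treating this as perfect agreement is an assumption.
        return 1.0
    return 2.0 * intersection / total


# Toy example: each mask has 3 foreground voxels and they share 2 of them.
pred = [0, 1, 1, 1, 0, 0]
target = [0, 1, 1, 0, 1, 0]
score = dice_score(pred, target)  # 2 * 2 / (3 + 3)
print(round(score, 3))
```

A score of 1.0 means the masks coincide and 0.0 means they share no foreground voxels, which is why a model-versus-expert score can be compared directly with an expert test-retest score.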
-
Control of Permanent Magnet Motors with Actuation Bounds using Convex Optimization
Authors:
Shravan Mohan
Abstract:
This paper presents a nonlinear control algorithm for speed control of a permanent magnet motor. The idea relies on a feedback linearization technique which also ensures adherence to current and voltage bounds. These bounds arise from practical limitations of the power source. The feedback linearization law is computed using a convex optimization routine that also minimizes response time. The use of convex optimization leads to computational efficiency. Moreover, the mathematical tractability of the approach aids analysis of the system performance under model uncertainty and feedback measurement noise. Simulations and computations corroborate the proposed idea.
Submitted 26 November, 2019;
originally announced November 2019.
-
REVAMP$^2$T: Real-time Edge Video Analytics for Multi-camera Privacy-aware Pedestrian Tracking
Authors:
Christopher Neff,
Matías Mendieta,
Shrey Mohan,
Mohammadreza Baharani,
Samuel Rogers,
Hamed Tabkhi
Abstract:
This article presents REVAMP$^2$T, Real-time Edge Video Analytics for Multi-camera Privacy-aware Pedestrian Tracking, as an integrated end-to-end IoT system for privacy-built-in decentralized situational awareness. REVAMP$^2$T presents novel algorithmic and system constructs to push deep learning and video analytics next to IoT devices (i.e. video cameras). On the algorithm side, REVAMP$^2$T proposes a unified integrated computer vision pipeline for detection, re-identification, and tracking across multiple cameras without the need for storing the streaming data. At the same time, it avoids facial recognition, and tracks and re-identifies pedestrians based on their key features at runtime. On the IoT system side, REVAMP$^2$T provides infrastructure to maximize hardware utilization on the edge, orchestrates global communications, and provides system-wide re-identification, without the use of personally identifiable information, for a distributed IoT network. For the results and evaluation, this article also proposes a new metric, Accuracy$\cdot$Efficiency (Æ), for holistic evaluation of IoT systems for real-time video analytics based on accuracy, performance, and power efficiency. REVAMP$^2$T outperforms the current state of the art by as much as a thirteen-fold improvement in Æ.
Submitted 25 November, 2019; v1 submitted 20 November, 2019;
originally announced November 2019.
-
Toward an Automatic System for Computer-Aided Assessment in Facial Palsy
Authors:
Diego L. Guarin,
Yana Yunusova,
Babak Taati,
Joseph R Dusseldorp,
Suresh Mohan,
Joana Tavares,
Martinus M. van Veen,
Emily Fortier,
Tessa A. Hadlock,
Nate Jowett
Abstract:
Importance: Machine learning (ML) approaches to facial landmark localization carry great clinical potential for quantitative assessment of facial function as they enable high-throughput automated quantification of relevant facial metrics from photographs. However, translation from research settings to clinical applications requires important improvements. Objective: To develop an ML algorithm for accurate facial landmark localization in photographs of facial palsy patients, and use it as part of an automated computer-aided diagnosis system. Design, Setting, and Participants: Facial landmarks were manually localized in portrait photographs of eight expressions obtained from 200 facial palsy patients and 10 controls. A novel ML model for automated facial landmark localization was trained using this disease-specific database. Model output was compared to manual annotations and the output of a model trained using a larger database consisting only of healthy subjects. Model accuracy was evaluated by the normalized root mean square error (NRMSE) between the algorithms' predictions and manual annotations. Results: Publicly available algorithms provide poor results when applied to patients compared to healthy controls (NRMSE, 8.56 +/- 2.16 vs. 7.09 +/- 2.34, p << 0.01). We found significant improvement in facial landmark localization accuracy for the clinical population when using a model trained with a relatively small number of patients' photographs (1440) compared to a model trained using several thousand more images of healthy faces (NRMSE, 6.03 +/- 2.43 vs. 8.56 +/- 2.16, p << 0.01). Conclusions: Retraining a landmark detection model with a small number of clinical images significantly improved landmark detection performance in frontal view photographs of the clinical population. These results represent the first steps towards an automatic system for computer-aided assessment in facial palsy.
Submitted 24 October, 2019;
originally announced October 2019.
-
Protecting Actuators in Safety-Critical IoT Systems from Control Spoofing Attacks
Authors:
Monowar Hasan,
Sibin Mohan
Abstract:
In this paper, we propose a framework called Contego-TEE to secure Internet-of-Things (IoT) edge devices with timing requirements from control spoofing attacks where an adversary sends malicious control signals to the actuators. We use a trusted computing base available in commodity processors (such as ARM TrustZone) and propose an invariant checking mechanism to ensure the security and safety of the physical system. A working prototype of Contego-TEE was developed using an embedded Linux kernel. We demonstrate the feasibility of our approach for a robotic vehicle running on an ARM-based platform.
Submitted 25 August, 2019;
originally announced August 2019.
-
Robust and interpretable blind image denoising via bias-free convolutional neural networks
Authors:
Sreyas Mohan,
Zahra Kadkhodaie,
Eero P. Simoncelli,
Carlos Fernandez-Granda
Abstract:
Deep convolutional networks often append additive constant ("bias") terms to their convolution operations, enabling a richer repertoire of functional mappings. Biases are also used to facilitate training, by subtracting mean response over batches of training images (a component of "batch normalization"). Recent state-of-the-art blind denoising methods (e.g., DnCNN) seem to require these terms for their success. Here, however, we show that these networks systematically overfit the noise levels for which they are trained: when deployed at noise levels outside the training range, performance degrades dramatically. In contrast, a bias-free architecture -- obtained by removing the constant terms in every layer of the network, including those used for batch normalization -- generalizes robustly across noise levels, while preserving state-of-the-art performance within the training range. Locally, the bias-free network acts linearly on the noisy image, enabling direct analysis of network behavior via standard linear-algebraic tools. These analyses provide interpretations of network functionality in terms of nonlinear adaptive filtering, and projection onto a union of low-dimensional subspaces, connecting the learning-based method to more traditional denoising methodology.
Submitted 8 February, 2020; v1 submitted 13 June, 2019;
originally announced June 2019.
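One consequence of removing every additive constant can be checked directly: a ReLU network without biases is positively homogeneous, so scaling the input by a non-negative factor scales the output by the same factor. The sketch below demonstrates this on a tiny fully connected network in plain Python. The weights, the input, and the function names are arbitrary illustrations, and the fully connected layers are only a stand-in for the convolutional denoisers studied in the paper.

```python
def relu(values):
    """Elementwise rectified linear unit."""
    return [max(0.0, v) for v in values]


def matvec(weights, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in weights]


def bias_free_net(x, w1, w2):
    """Two-layer ReLU network with no additive constants."""
    return matvec(w2, relu(matvec(w1, x)))


def biased_net(x, w1, b1, w2):
    """Same network with a bias added before the nonlinearity."""
    hidden = [h + b for h, b in zip(matvec(w1, x), b1)]
    return matvec(w2, relu(hidden))


# Arbitrary illustrative weights and input (not from the paper).
w1 = [[1.0, -2.0], [0.5, 1.5], [-1.0, 0.25]]
w2 = [[1.0, -1.0, 2.0]]
b1 = [0.5, -0.5, 1.0]
x = [0.8, -0.3]
alpha = 4.0
scaled_x = [alpha * v for v in x]

# Bias-free: the output scales exactly with the input.
out = bias_free_net(x, w1, w2)[0]
out_scaled = bias_free_net(scaled_x, w1, w2)[0]

# With biases: the scaling relation breaks.
out_b = biased_net(x, w1, b1, w2)[0]
out_b_scaled = biased_net(scaled_x, w1, b1, w2)[0]

print(out, out_scaled, out_b, out_b_scaled)
```

The biased network has no such scaling relation, which is in line with the abstract's observation that networks with constant terms behave differently outside the noise range they were trained on.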
-
Data-driven Estimation of Sinusoid Frequencies
Authors:
Gautier Izacard,
Sreyas Mohan,
Carlos Fernandez-Granda
Abstract:
Frequency estimation is a fundamental problem in signal processing, with applications in radar imaging, underwater acoustics, seismic imaging, and spectroscopy. The goal is to estimate the frequency of each component in a multisinusoidal signal from a finite number of noisy samples. A recent machine-learning approach uses a neural network to output a learned representation with local maxima at the position of the frequency estimates. In this work, we propose a novel neural-network architecture that produces a significantly more accurate representation, and combine it with an additional neural-network module trained to detect the number of frequencies. This yields a fast, fully-automatic method for frequency estimation that achieves state-of-the-art results. In particular, it outperforms existing techniques by a substantial margin at medium-to-high noise levels.
Submitted 3 February, 2021; v1 submitted 3 June, 2019;
originally announced June 2019.
-
Secure Integration of Electric Vehicles with the Power Grid
Authors:
Chaitra Niddodi,
Shanny Lin,
Sibin Mohan,
Hao Zhu
Abstract:
This paper focuses on the secure integration of distributed energy resources (DERs), especially pluggable electric vehicles (EVs), with the power grid. We consider the vehicle-to-grid (V2G) system where EVs are connected to the power grid through an aggregator. In this paper, we propose a novel Cyber-Physical Anomaly Detection Engine that monitors system behavior and detects anomalies almost instantaneously. This detection engine ensures that the critical power grid component (viz., the aggregator) remains secure by monitoring (a) cyber messages for various state changes and data constraints along with (b) power data on the V2G cyber network using power measurements from sensors on the physical/power distribution network. Since the V2G system is time-sensitive, the anomaly detection engine also monitors the timing requirements of the protocol messages to enhance the safety of the aggregator. To the best of our knowledge, this is the first piece of work that combines (a) the EV charging/discharging protocols, (b) the cyber network and (c) power measurements from the physical network to detect intrusions in the EV to power grid system.
Submitted 4 August, 2019; v1 submitted 3 May, 2019;
originally announced May 2019.
-
A note on rank constrained solutions to linear matrix equations
Authors:
Shravan Mohan
Abstract:
This preliminary note presents a heuristic for determining rank constrained solutions to linear matrix equations (LME). The method proposed here is based on minimizing a non-convex quadratic functional, which will henceforth be termed the Low-Rank-Functional (LRF). Although this method lacks a formal proof or comprehensive analysis, for example in terms of a probabilistic guarantee for converging to a solution, the proposed idea is intuitive and has been seen to perform well in simulations. To that end, many numerical examples are provided to corroborate the idea.
Submitted 6 September, 2018;
originally announced September 2018.
-
On the primal-dual dynamics of Support Vector Machines
Authors:
Krishna Chaitanya Kosaraju,
Shravan Mohan,
Ramkrishna Pasumarthy
Abstract:
The aim of this paper is to study the convergence of the primal-dual dynamics pertaining to Support Vector Machines (SVM). The optimization routine, used for determining an SVM for classification, is first formulated as a dynamical system. The dynamical system is constructed such that its equilibrium point is the solution to the SVM optimization problem. It is then shown, using passivity theory, that the dynamical system is globally asymptotically stable. In other words, the dynamical system converges onto the optimal solution asymptotically, irrespective of the initial condition. Simulations and computations are provided for corroboration.
Submitted 2 May, 2018;
originally announced May 2018.
-
A linear programming approach for designing multilevel PWM waveforms
Authors:
Shravan Mohan,
Bharath Bhikkaji
Abstract:
This paper considers the problem of designing a multilevel pulse width modulated waveform (PWM) with a prescribed harmonic content. Multilevel PWM design plays a major role in many diverse engineering disciplines. In power electronics, multilevel PWM design corresponds to determining the inverter switching times and levels for selective harmonic elimination and harmonic compensation. In mechatronics, the same design corresponds to shaping input signals to damp residual vibrations in flexible structures. More generally, in most applications, the aim of PWM design is to minimize the total harmonic distortion while adhering to a prescribed harmonic content. The solution approach presented in this paper is based on linear programming with the objective of minimizing the total harmonic distortion. This objective is achieved within an arbitrarily small bound of the optimal solution. In addition, the linear programming formulation makes the design of such switching waveforms computationally tractable and efficient. Simulations are provided for corroboration.
Submitted 2 January, 2018; v1 submitted 28 December, 2017;
originally announced December 2017.
-
Optimal input design for system identification using spectral decomposition
Authors:
Shravan Mohan,
Mithun Im,
Bharath Bhikkaji
Abstract:
The aim of this paper is to design a band-limited optimal input with power constraints for identifying a linear multi-input multi-output system. It is assumed that the nominal system parameters are specified. The key idea is to use the spectral decomposition theorem and write the power spectrum as $φ_{u}(jω)=\frac{1}{2}H(jω)H^*(jω)$. The matrix $H(jω)$ is expressed in terms of a truncated basis for $\mathcal{L}^2\left(\left[-ω_{\mbox{cut-off}},ω_{\mbox{cut-off}}\right]\right)$. With this parameterization, the elements of the Fisher Information Matrix and the power constraints turn out to be homogeneous quadratics in the basis coefficients. The optimality criteria used are the well-known $\mathcal{D}-$optimality, $\mathcal{A}-$optimality, $\mathcal{T}-$optimality and $\mathcal{E}-$optimality. The resulting optimization problem is non-convex in general. A lower bound on the optimum is obtained through a bi-linear formulation of the problem, while an upper bound is obtained through a convex relaxation. These bounds can be computed efficiently as the associated problems are convex. The lower bound is used as a sub-optimal solution, the sub-optimality of which is determined by the difference in the bounds. Interestingly, the bounds match in many instances and thus, the global optimum is achieved. A discussion on the non-convexity of the optimization problem is also presented. Simulations are provided for corroboration.
Submitted 13 June, 2017;
originally announced June 2017.
-
Convex Computation of the Reachable Set for Hybrid Systems with Parametric Uncertainty
Authors:
Shankar Mohan,
Victor Shia,
Ram Vasudevan
Abstract:
To verify the correct operation of systems, engineers need to determine the set of configurations of a dynamical model that are able to safely reach a specified configuration under a control law. Unfortunately, constructing models for systems interacting in highly dynamic environments is difficult. This paper addresses this challenge by presenting a convex optimization method to efficiently compute the set of configurations of a polynomial hybrid dynamical system that are able to safely reach a user defined target set despite parametric uncertainty in the model. This class of models describes, for example, legged robots moving over uncertain terrains. The presented approach utilizes the notion of occupation measures to describe the evolution of trajectories of a nonlinear hybrid dynamical system with parametric uncertainty as a linear equation over measures whose supports coincide with the trajectories under investigation. This linear equation with user defined support constraints is approximated with vanishing conservatism using a hierarchy of semidefinite programs that are each proven to compute an inner/outer approximation to the set of initial conditions that can reach the user defined target set safely in spite of uncertainty. The efficacy of this method is illustrated on a collection of six representative examples.
Submitted 5 January, 2016;
originally announced January 2016.
-
S3A: Secure System Simplex Architecture for Enhanced Security of Cyber-Physical Systems
Authors:
Sibin Mohan,
Stanley Bak,
Emiliano Betti,
Heechul Yun,
Lui Sha,
Marco Caccamo
Abstract:
Until recently, cyber-physical systems, especially those with safety-critical properties that manage critical infrastructure (e.g. power generation plants, water treatment facilities, etc.) were considered to be invulnerable against software security breaches. The recently discovered 'W32.Stuxnet' worm has drastically changed this perception by demonstrating that such systems are susceptible to external attacks. Here we present an architecture that enhances the security of safety-critical cyber-physical systems despite the presence of such malware. Our architecture uses the property that control systems have deterministic execution behavior, to detect an intrusion within 0.6 μs while still guaranteeing the safety of the plant. We also show that even if an attack is successful, the overall state of the physical system will still remain safe. Even if the operating system's administrative privileges have been compromised, our architecture will still be able to protect the physical system from coming to harm.
Submitted 25 February, 2012;
originally announced February 2012.