Search | arXiv e-print repository

arXiv:2312.02658 [pdf]

Do AI models produce better weather forecasts than physics-based models? A quantitative evaluation case study of Storm Ciarán

Authors: Andrew J. Charlton-Perez, Helen F. Dacre, Simon Driscoll, Suzanne L. Gray, Ben Harvey, Natalie J. Harvey, Kieran M. R. Hunt, Robert W. Lee, Ranjini Swaminathan, Remy Vandaele, Ambrogio Volonté

Abstract: There has been huge recent interest in the potential of making operational weather forecasts using machine learning techniques. As they become a part of the weather forecasting toolbox, there is a pressing need to understand how well current machine learning models can simulate high-impact weather events. We compare forecasts of Storm Ciarán, a European windstorm that caused sixteen deaths and extensive damage in Northern Europe, made by machine learning and numerical weather prediction models. The four machine learning models considered (FourCastNet, Pangu-Weather, GraphCast and FourCastNet-v2) produce forecasts that accurately capture the synoptic-scale structure of the cyclone including the position of the cloud head, shape of the warm sector and location of warm conveyor belt jet, and the large-scale dynamical drivers important for the rapid storm development such as the position of the storm relative to the upper-level jet exit. However, their ability to resolve the more detailed structures important for issuing weather warnings is more mixed. All of the machine learning models underestimate the peak amplitude of winds associated with the storm, only some machine learning models resolve the warm core seclusion and none of the machine learning models capture the sharp bent-back warm frontal gradient. Our study shows there is a great deal about the performance and properties of machine learning weather forecasts that can be derived from case studies of high-impact weather events such as Storm Ciarán.

Submitted 19 February, 2024; v1 submitted 5 December, 2023; originally announced December 2023.

arXiv:2305.07778 [pdf, other]

doi 10.1109/SLT54892.2023.10022592

Accelerator-Aware Training for Transducer-Based Speech Recognition

Authors: Suhaila M. Shakiah, Rupak Vignesh Swaminathan, Hieu Duy Nguyen, Raviteja Chinta, Tariq Afzal, Nathan Susanj, Athanasios Mouchtaris, Grant P. Strimel, Ariya Rastrow

Abstract: Machine learning model weights and activations are represented in full-precision during training. This leads to performance degradation in runtime when deployed on neural network accelerator (NNA) chips, which leverage highly parallelized fixed-point arithmetic to improve runtime memory and latency. In this work, we replicate the NNA operators during the training phase, accounting for the degradation due to low-precision inference on the NNA in back-propagation. Our proposed method efficiently emulates NNA operations, thus foregoing the need to transfer quantization error-prone data to the Central Processing Unit (CPU), ultimately reducing the user perceived latency (UPL). We apply our approach to Recurrent Neural Network-Transducer (RNN-T), an attractive architecture for on-device streaming speech recognition tasks. We train and evaluate models on 270K hours of English data and show a 5-7% improvement in engine latency while saving up to 10% relative degradation in WER.

Submitted 12 May, 2023; originally announced May 2023.

Comments: Accepted to SLT 2022

Journal ref: IEEE Spoken Language Technology Workshop (SLT), Doha, Qatar, 2023, pp. 100-107

arXiv:2301.08360 [pdf, other]

Domain-adapted Learning and Imitation: DRL for Power Arbitrage

Authors: Yuanrong Wang, Vignesh Raja Swaminathan, Nikita P. Granger, Carlos Ros Perez, Christian Michler

Abstract: In this paper, we discuss the Dutch power market, which is comprised of a day-ahead market and an intraday balancing market that operates like an auction. Due to fluctuations in power supply and demand, there is often an imbalance that leads to different prices in the two markets, providing an opportunity for arbitrage. To address this issue, we restructure the problem and propose a collaborative dual-agent reinforcement learning approach for this bi-level simulation and optimization of European power arbitrage trading. We also introduce two new implementations designed to incorporate domain-specific knowledge by imitating the trading behaviours of power traders. By utilizing reward engineering to imitate domain expertise, we are able to reform the reward system for the RL agent, which improves convergence during training and enhances overall performance. Additionally, the tranching of orders increases bidding success rates and significantly boosts profit and loss (P&L). Our study demonstrates that by leveraging domain expertise in a general learning problem, the performance can be improved substantially, and the final integrated approach leads to a three-fold improvement in cumulative P&L compared to the original agent. Furthermore, our methodology outperforms the highest benchmark policy by around 50% while maintaining efficient computational performance.

Submitted 10 September, 2023; v1 submitted 19 January, 2023; originally announced January 2023.

arXiv:2210.16238 [pdf, ps, other]

Contextual-Utterance Training for Automatic Speech Recognition

Authors: Alejandro Gomez-Alanis, Lukas Drude, Andreas Schwarz, Rupak Vignesh Swaminathan, Simon Wiesler

Abstract: Recent studies of streaming automatic speech recognition (ASR) recurrent neural network transducer (RNN-T)-based systems have fed the encoder with past contextual information in order to improve its word error rate (WER) performance. In this paper, we first propose a contextual-utterance training technique which makes use of the previous and future contextual utterances in order to do an implicit adaptation to the speaker, topic and acoustic environment. Also, we propose a dual-mode contextual-utterance training technique for streaming ASR systems. This proposed approach makes better use of the available acoustic context in streaming models by distilling "in-place" the knowledge of a teacher, which is able to see both past and future contextual utterances, to the student, which can only see the current and past contextual utterances. The experimental results show that a conformer-transducer system trained with the proposed techniques outperforms the same system trained with the classical RNN-T loss. Specifically, the proposed technique is able to reduce both the WER and the average last token emission latency by more than 6% and 40ms relative, respectively.

Submitted 27 October, 2022; originally announced October 2022.

arXiv:2209.14868 [pdf, other]

ConvRNN-T: Convolutional Augmented Recurrent Neural Network Transducers for Streaming Speech Recognition

Authors: Martin Radfar, Rohit Barnwal, Rupak Vignesh Swaminathan, Feng-Ju Chang, Grant P. Strimel, Nathan Susanj, Athanasios Mouchtaris

Abstract: The recurrent neural network transducer (RNN-T) is a prominent streaming end-to-end (E2E) ASR technology. In RNN-T, the acoustic encoder commonly consists of stacks of LSTMs. Very recently, as an alternative to LSTM layers, the Conformer architecture was introduced where the encoder of RNN-T is replaced with a modified Transformer encoder composed of convolutional layers at the frontend and between attention layers. In this paper, we introduce a new streaming ASR model, Convolutional Augmented Recurrent Neural Network Transducers (ConvRNN-T) in which we augment the LSTM-based RNN-T with a novel convolutional frontend consisting of local and global context CNN encoders. ConvRNN-T takes advantage of causal 1-D convolutional layers, squeeze-and-excitation, dilation, and residual blocks to provide both global and local audio context representation to LSTM layers. We show ConvRNN-T outperforms RNN-T, Conformer, and ContextNet on Librispeech and in-house data. In addition, ConvRNN-T offers less computational complexity compared to Conformer. ConvRNN-T's superior accuracy along with its low footprint make it a promising candidate for on-device streaming ASR technologies.

Submitted 29 September, 2022; originally announced September 2022.

Comments: This paper was presented in Interspeech 2022

arXiv:2106.07734 [pdf, other]

CoDERT: Distilling Encoder Representations with Co-learning for Transducer-based Speech Recognition

Authors: Rupak Vignesh Swaminathan, Brian King, Grant P. Strimel, Jasha Droppo, Athanasios Mouchtaris

Abstract: We propose a simple yet effective method to compress an RNN-Transducer (RNN-T) through the well-known knowledge distillation paradigm. We show that the transducer's encoder outputs naturally have a high entropy and contain rich information about acoustically similar word-piece confusions. This rich information is suppressed when combined with the lower entropy decoder outputs to produce the joint network logits. Consequently, we introduce an auxiliary loss to distill the encoder logits from a teacher transducer's encoder, and explore training strategies where this encoder distillation works effectively. We find that tandem training of teacher and student encoders with an inplace encoder distillation outperforms the use of a pre-trained and static teacher transducer. We also report an interesting phenomenon we refer to as implicit distillation, that occurs when the teacher and student encoders share the same decoder. Our experiments show 5.37-8.4% relative word error rate reductions (WERR) on in-house test sets, and 5.05-6.18% relative WERRs on LibriSpeech test sets.

Submitted 14 June, 2021; originally announced June 2021.

Comments: Accepted at InterSpeech 2021
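
Illustrative note on the entry above: the abstract describes an auxiliary loss that distills a teacher encoder's logits into a student encoder, but the listing does not give its exact form. The sketch below assumes one common choice, a frame-averaged KL divergence between teacher and student softmax distributions, added to the main transducer loss with a weight. The function names, the temperature, and the weight are all hypothetical and are not taken from the paper.

```python
import math

def softmax(logits, temperature=1.0):
    # Numerically stable softmax over one frame's logits.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    # KL(p || q) for two discrete distributions of equal length.
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def encoder_distillation_loss(teacher_logits, student_logits, temperature=1.0):
    # Assumed form: mean over frames of KL(teacher || student).
    per_frame = [
        kl_divergence(softmax(t, temperature), softmax(s, temperature))
        for t, s in zip(teacher_logits, student_logits)
    ]
    return sum(per_frame) / len(per_frame)

def total_loss(transducer_loss, distill_loss, weight=0.5):
    # Hypothetical weighting of the auxiliary term against the main loss.
    return transducer_loss + weight * distill_loss
```

For example, `encoder_distillation_loss([[2.0, 0.0, -1.0]], [[0.0, 0.0, 0.0]])` is positive, and it drops to zero when the student's logits match the teacher's.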

arXiv:2106.06126 [pdf, other]

Exploiting Large-scale Teacher-Student Training for On-device Acoustic Models

Authors: Jing Liu, Rupak Vignesh Swaminathan, Sree Hari Krishnan Parthasarathi, Chunchuan Lyu, Athanasios Mouchtaris, Siegfried Kunzmann

Abstract: We present results from Alexa speech teams on semi-supervised learning (SSL) of acoustic models (AM) with experiments spanning over 3000 hours of GPU time, making our study one of the largest of its kind. We discuss SSL for AMs in a small footprint setting, showing that a smaller capacity model trained with 1 million hours of unsupervised data can outperform a baseline supervised system by 14.3% word error rate reduction (WERR). When increasing the supervised data to seven-fold, our gains diminish to 7.1% WERR; to improve SSL efficiency at larger supervised data regimes, we employ a step-wise distillation into a smaller model, obtaining a WERR of 14.4%. We then switch to SSL using larger student models in low data regimes; while learning efficiency with unsupervised data is higher, student models may outperform teacher models in such a setting. We develop a theoretical sketch to explain this behavior.

Submitted 10 June, 2021; originally announced June 2021.

Comments: TSD2021

arXiv:2103.02162 [pdf, other]

Predicting Driver Fatigue in Automated Driving with Explainability

Authors: Feng Zhou, Areen Alsaid, Mike Blommer, Reates Curry, Radhakrishnan Swaminathan, Dev Kochhar, Walter Talamonti, Louis Tijerina

Abstract: Research indicates that monotonous automated driving increases the incidence of fatigued driving. Although many prediction models based on advanced machine learning techniques were proposed to monitor driver fatigue, especially in manual driving, little is known about how these black-box machine learning models work. In this paper, we proposed a combination of eXtreme Gradient Boosting (XGBoost) and SHAP (SHapley Additive exPlanations) to predict driver fatigue with explanations due to their efficiency and accuracy. First, in order to obtain the ground truth of driver fatigue, PERCLOS (percentage of eyelid closure over the pupil over time) between 0 and 100 was used as the response variable. Second, we built a driver fatigue regression model using both physiological and behavioral measures with XGBoost and it outperformed other selected machine learning models with 3.847 root-mean-squared error (RMSE), 1.768 mean absolute error (MAE) and 0.996 adjusted $R^2$. Third, we employed SHAP to identify the most important predictor variables and uncovered the black-box XGBoost model by showing the main effects of most important predictor variables globally and explaining individual predictions locally. Such an explainable driver fatigue prediction model offered insights into how to intervene in automated driving when necessary, such as during the takeover transition period from automated driving to manual driving.

Submitted 2 March, 2021; originally announced March 2021.
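
Illustrative note on the entry above: the abstract reports RMSE, MAE and adjusted $R^2$ for a regression model. The sketch below shows how these three metrics are conventionally defined, using the standard adjusted $R^2$ formula $1 - (1 - R^2)(n - 1)/(n - p - 1)$ for $n$ samples and $p$ predictors. The function names and the sample values are hypothetical and do not come from the paper's data.

```python
import math

def rmse(y_true, y_pred):
    # Root-mean-squared error.
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

def mae(y_true, y_pred):
    # Mean absolute error.
    n = len(y_true)
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n

def r_squared(y_true, y_pred):
    # Coefficient of determination: 1 - SS_res / SS_tot.
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def adjusted_r_squared(y_true, y_pred, num_predictors):
    # Penalizes R^2 for the number of predictors used by the model.
    n = len(y_true)
    r2 = r_squared(y_true, y_pred)
    return 1.0 - (1.0 - r2) * (n - 1) / (n - num_predictors - 1)

# Made-up PERCLOS-like values on a 0-100 scale, for illustration only.
observed = [10.0, 20.0, 30.0, 40.0, 50.0]
predicted = [12.0, 18.0, 33.0, 39.0, 50.0]
print(rmse(observed, predicted), mae(observed, predicted),
      adjusted_r_squared(observed, predicted, num_predictors=2))
```

Adjusted $R^2$ is only defined when the number of samples exceeds the number of predictors plus one, so the helper should not be called with fewer samples than that.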

arXiv:1805.04907 [pdf, other]

A Computational Framework for Modelling and Analyzing Ice Storms

Authors: Ranjini Swaminathan, Mohan Sridharan, Katharine Hayhoe

Abstract: Ice storms are extreme weather events that can have devastating implications for the sustainability of natural ecosystems as well as man-made infrastructure. Ice storms are caused by a complex mix of atmospheric conditions and are among the least understood of severe weather events. Our ability to model ice storms and characterize storm features will go a long way towards both enabling support systems that offset storm impacts and increasing our understanding of ice storms. In this paper, we present a holistic computational framework to answer key questions of interest about ice storms. We model ice storms as a function of relevant surface and atmospheric variables. We learn these models by adapting and applying supervised and unsupervised machine learning algorithms on data with missing or incorrect labels. We also include a knowledge representation module that reasons with domain knowledge to revise the output of the learned models. Our models are trained using reanalysis data and historical records of storm events. We evaluate these models on reanalysis data as well as Global Climate Model (GCM) data for historical and future climate change scenarios. Furthermore, we discuss the use of appropriate bias correction approaches to run such modeling frameworks with GCM data.

Submitted 13 May, 2018; originally announced May 2018.

Comments: 7 pages including bibliography

arXiv:1307.3396 [pdf]

doi 10.5121/cseij.2013.3301

Software as a Service - Common Service Bus (SAAS-CSB)

Authors: R. Swaminathan, K. Karnavel

Abstract: Software-as-a-Service (SaaS) is a form of cloud computing that relieves the user from the concern of hardware, software installation and management. It is an emerging business model that delivers software applications to the users through Web-based technology. Software vendors have varying requirements and SaaS applications most typically support such requirements. The various applications used by unique customers in a single instance are known as Multi-Tenancy. There would be a delay in service when the user sends the data from multiple applications to multiple destinations and from multiple applications to single destination due to the use of single CSB. This problem can be overcome by using multiple CSB concepts and hence multiple senders can efficiently send their data to multiple receivers at the same time. The multiple clouds are monitored and managed by the SaaS-CSB portal. The idea of SaaS-CSB Portal is to provide a single pane of glass for the user to consume and govern any service from any cloud. Thus, SaaS-CSB application allows companies to save their IT cost and valuable time.

Submitted 12 July, 2013; originally announced July 2013.

Showing 1–10 of 10 results for author: Swaminathan, R