-
N-gram Boosting: Improving Contextual Biasing with Normalized N-gram Targets
Authors:
Wang Yau Li,
Shreekantha Nadig,
Karol Chang,
Zafarullah Mahmood,
Riqiang Wang,
Simon Vandieken,
Jonas Robertson,
Fred Mailhot
Abstract:
Accurate transcription of proper names and technical terms is particularly important in speech-to-text applications for business conversations. These words, which are essential to understanding the conversation, are often rare and therefore likely to be under-represented in text and audio training data, creating a significant challenge in this domain. We present a two-step keyword boosting mechanism that works on normalized unigrams and n-grams rather than just single tokens, which eliminates the missed hits that occur when boosting raw targets. In addition, we show how adjusting the boosting weight logic avoids over-boosting multi-token keywords. This improves our keyword recognition rate by 26% relative on our proprietary in-domain dataset and 2% on LibriSpeech. This method is particularly useful for targets that involve non-alphabetic characters or have non-standard pronunciations.
Submitted 3 August, 2023;
originally announced August 2023.
-
Stochastic simulation of residential building occupant-driven energy use in a bottom-up model of the U.S. housing stock
Authors:
Jianli Chen,
Rajendra Adhikari,
Eric Wilson,
Joseph Robertson,
Anthony Fontanini,
Ben Polly,
Opeoluwa Olawale
Abstract:
The residential buildings sector is one of the largest electricity consumers worldwide and contributes disproportionally to peak electricity demand in many regions. Strongly driven by occupant activities at home, household energy consumption is stochastic and heterogeneous in nature. However, most residential building energy models applied by industry use homogeneous, deterministic occupant activity schedules, which work well for predictions of annual energy consumption, but can result in unrealistic hourly or sub-hourly electric load profiles, with exaggerated or muted peaks. This mattered less in the past, but the increasing proportion of variable renewable energy generators in power systems means that representing the heterogeneity and stochasticity of occupant behavior is crucial for reliable energy planning. This is particularly true for systems that include distributed energy resources, such as grid-interactive efficient buildings, solar photovoltaics, and battery storage. This work presents a stochastic occupant behavior simulator that models the energy use behavior of individual household members. It also presents an integration with a building stock model to simulate residential building loads more accurately at community, city, state, and national scales. More specifically, we first employ clustering techniques to identify distinct patterns of occupant behavior. Then, we combine time-inhomogeneous Markov chain simulations with probabilistic sampling of event durations to realistically simulate occupant behaviors. This stochastic simulator is integrated with ResStock, a large-scale residential building stock simulation tool, to demonstrate the capability of stochastic residential building load modeling at scale. The simulation results were validated against both American Time Use Survey data and measured end-use electricity data for accuracy and reliability.
Submitted 3 November, 2021; v1 submitted 2 November, 2021;
originally announced November 2021.
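The abstract above combines a time-inhomogeneous Markov chain with sampled event durations. The following is a minimal sketch of that general idea, not the authors' simulator: the state names, transition probabilities, 15-minute time step, and exponential duration distribution are all illustrative assumptions.

```python
import random

# Hypothetical activity states; the abstract does not list the actual ones.
STATES = ["sleep", "home_active", "away"]


def transition_probs(state, hour):
    """Made-up transition probabilities that change with the hour of day.

    The dependence on `hour` is what makes the chain time-inhomogeneous.
    """
    night = hour < 6 or hour >= 22
    if state == "sleep":
        return [0.9, 0.1, 0.0] if night else [0.2, 0.6, 0.2]
    if state == "home_active":
        return [0.6, 0.4, 0.0] if night else [0.05, 0.6, 0.35]
    return [0.1, 0.8, 0.1] if night else [0.0, 0.3, 0.7]  # state == "away"


def sample_duration(rng, state):
    """Sample an event duration in 15-minute steps (assumed exponential)."""
    mean_steps = {"sleep": 8, "home_active": 4, "away": 6}[state]
    return max(1, round(rng.expovariate(1.0 / mean_steps)))


def simulate_day(seed=0, steps_per_day=96):
    """Return one occupant's activity label for each 15-minute step of a day."""
    rng = random.Random(seed)
    schedule = []
    state = "sleep"
    while len(schedule) < steps_per_day:
        # Hold the current activity for a randomly sampled duration.
        schedule.extend([state] * sample_duration(rng, state))
        # Pick the next activity using the probabilities for the current hour.
        hour = (len(schedule) * 15 // 60) % 24
        state = rng.choices(STATES, weights=transition_probs(state, hour))[0]
    return schedule[:steps_per_day]
```

Calling `simulate_day(seed=k)` with different seeds yields different schedules per occupant, which is the kind of heterogeneity a fixed deterministic schedule cannot provide.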
-
Comparison of SVD and factorized TDNN approaches for speech to text
Authors:
Jeffrey Josanne Michael,
Nagendra Kumar Goel,
Navneeth K,
Jonas Robertson,
Shravan Mishra
Abstract:
This work concentrates on reducing the real-time factor (RTF) and word error rate (WER) of a hybrid HMM-DNN. Our baseline system uses an architecture with TDNN and LSTM layers. We find this architecture particularly useful for lightly reverberated environments. However, these models tend to demand more computation than is desirable. In this work, we explore alternate architectures in which singular value decomposition (SVD) is applied to the TDNN layers to reduce the RTF, as well as to the affine transforms of every LSTM cell. We compare this approach with specifying bottleneck layers similar to those introduced by SVD before training. Additionally, we reduced the search space of the decoding graph to make it a better fit for real-time applications. We report a 61.57% relative reduction in RTF and almost 1% relative decrease in WER for our architecture trained on Fisher data along with reverberated versions of this dataset in order to match one of our target test distributions.
Submitted 13 October, 2021;
originally announced October 2021.
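To see why a truncated SVD (or an equivalent bottleneck layer) can lower computation, it helps to count weights. A dense layer with a `rows x cols` weight matrix is replaced by two factors of shape `rows x rank` and `rank x cols`. The sketch below only does this arithmetic; the layer sizes are hypothetical and are not taken from the paper.

```python
def dense_cost(rows, cols):
    """Weights (and multiply-adds per input frame) of a dense rows x cols layer."""
    return rows * cols


def factored_cost(rows, cols, rank):
    """Cost after replacing the layer with rows x rank and rank x cols factors."""
    return rank * (rows + cols)


def max_rank_that_saves(rows, cols):
    """Largest rank for which the factored form is strictly cheaper.

    Solves rank * (rows + cols) < rows * cols for integer rank.
    """
    return (rows * cols - 1) // (rows + cols)
```

For a hypothetical 1024 x 1024 layer, a rank of 256 halves the cost, and any rank up to 511 still saves computation. Whether accuracy survives a given rank is an empirical question that this arithmetic does not answer.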
-
Efficient approximations of the multi-sensor labelled multi-Bernoulli filter
Authors:
S. C. J. Robertson,
C. E. van Daalen,
J. A. du Preez
Abstract:
In this paper, we propose two efficient, approximate formulations of the multi-sensor labelled multi-Bernoulli (LMB) filter, which both allow the sensors' measurement updates to be computed in parallel. Our first filter is based on the direct mathematical manipulation of the multi-sensor, multi-object Bayes filter's posterior distribution. Unfortunately, it requires the division of probability distributions and its extension beyond linear Gaussian applications is not obvious. Our second filter is based on geometric average fusion and it approximates the multi-sensor, multi-object Bayes filter's posterior distribution using the geometric average of each sensor's measurement-updated distribution. This filter can be used under non-linear conditions; however, it is not as accurate as our first filter. In both cases, we approximate the LMB filter's measurement update using an existing loopy belief propagation algorithm. Both filters have a constant complexity in the number of sensors, and linear complexity in both number of measurements and objects. This is an improvement on an iterated-corrector LMB (IC-LMB) filter, which has linear complexity in the number of sensors. The proposed filters are of interest when tracking many objects using several sensors, where filter run-time is more important than filter accuracy. We evaluate both filters' performances on simulated data and the results indicate that the filters' loss of accuracy compared to the IC-LMB filter is not significant.
Submitted 10 July, 2022; v1 submitted 18 March, 2021;
originally announced March 2021.
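Geometric average fusion, as mentioned in the abstract above, has a simple closed form in the special case of scalar Gaussian densities: the weighted geometric mean of Gaussians is again Gaussian, with a precision equal to the weighted sum of the input precisions. The sketch below illustrates only that single-object, one-dimensional case. It is not the LMB filter, and the function name and equal default weights are assumptions made for illustration.

```python
def geometric_average_fusion(means, variances, weights=None):
    """Fuse 1-D Gaussian estimates by their normalized weighted geometric mean.

    Each sensor i contributes N(means[i], variances[i]). Weights should sum
    to 1 and default to equal weighting. Returns (fused_mean, fused_variance).
    """
    n = len(means)
    if weights is None:
        weights = [1.0 / n] * n
    # Fused precision is the weighted sum of the individual precisions.
    precision = sum(w / v for w, v in zip(weights, variances))
    fused_var = 1.0 / precision
    # Fused mean is the precision-weighted combination of the input means.
    fused_mean = fused_var * sum(
        w * m / v for w, m, v in zip(weights, means, variances)
    )
    return fused_mean, fused_var
```

Because each sensor's term is computed independently before the final sums, the per-sensor work can run in parallel, which matches the motivation given in the abstract.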
-
Convolutional Image Edge Detection Using Ultrafast Photonic Spiking VCSEL Neurons
Authors:
Joshua Robertson,
Yahui Zhang,
Matej Hejda,
Andrew Adair,
Julian Bueno,
Shuiying Xiang,
Antonio Hurtado
Abstract:
We report experimentally and in theory on the detection of edge information in digital images using ultrafast spiking optical artificial neurons towards convolutional neural networks (CNNs). In tandem with traditional convolution techniques, a photonic neuron model based on a Vertical-Cavity Surface Emitting Laser (VCSEL) is implemented experimentally to threshold and activate fast spiking responses upon the detection of target edge features in digital images. Edges of different directionalities are detected using individual kernel operators and complete image edge detection is achieved using gradient magnitude. Importantly, the neuromorphic (brain-like) image edge detection system of this work uses commercially sourced VCSELs exhibiting spiking responses at sub-nanosecond rates (many orders of magnitude faster than biological neurons) and operating at the telecom wavelength of 1300 nm, making our approach compatible with optical communication and data-center technologies. These results therefore have exciting prospects for ultrafast photonic implementations of neural networks towards computer vision and decision-making systems for future artificial intelligence applications.
Submitted 2 July, 2020;
originally announced July 2020.
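The software side of the pipeline described above (directional kernels, gradient magnitude, then a threshold) can be sketched in a few lines. This is an illustration under assumptions: the abstract does not name its kernels, so Sobel operators are used here, and a plain numeric threshold stands in for the spiking VCSEL neuron, which this code does not model.

```python
import math

# Sobel kernels are an assumed choice; the abstract only says "kernel operators".
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def apply_kernel(image, kernel):
    """Slide a 3x3 kernel over a 2-D list image (valid region, no padding).

    The kernel is not flipped, so this is strictly a cross-correlation; for
    Sobel kernels that only changes the sign, which the magnitude discards.
    """
    height, width = len(image), len(image[0])
    out = [[0.0] * (width - 2) for _ in range(height - 2)]
    for r in range(height - 2):
        for c in range(width - 2):
            out[r][c] = sum(
                kernel[i][j] * image[r + i][c + j]
                for i in range(3)
                for j in range(3)
            )
    return out


def edge_map(image, threshold):
    """Return 1 where the gradient magnitude exceeds the threshold, else 0."""
    gx = apply_kernel(image, SOBEL_X)  # responds to vertical edges
    gy = apply_kernel(image, SOBEL_Y)  # responds to horizontal edges
    return [
        [1 if math.hypot(a, b) > threshold else 0 for a, b in zip(row_x, row_y)]
        for row_x, row_y in zip(gx, gy)
    ]
```

In this sketch a 1 in the output plays the role of a neuron firing for that pixel; the threshold value is a free parameter chosen by the user.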