Search | arXiv e-print repository

Multi-modal anticipation of stochastic trajectories in a dynamic environment with Conditional Variational Autoencoders

Abstract: Forecasting the short-term motion of nearby vehicles is an inherently challenging problem, as the space of their possible future movements is not strictly limited to a single trajectory. Recently proposed techniques that demonstrate plausible results concentrate primarily on forecasting a fixed number of deterministic predictions, or on classifying over a wide variety of trajectories that were previously generated using, e.g., a dynamic model. This paper addresses the uncertainty associated with this task by utilising the stochastic nature of generative models to produce a diverse set of plausible paths for tracked vehicles. More specifically, we propose to account for the multi-modality of the problem with a Conditional Variational Autoencoder (C-VAE) conditioned on an agent's past motion as well as a rasterised scene context encoded with a Capsule Network (CapsNet). In addition, we demonstrate the advantages of employing the Minimum over N (MoN) cost function, which measures the distance between the ground truth and N generated samples and minimises the loss with respect to the closest sample, effectively leading to more diverse predictions. We evaluate our network on a publicly available dataset against recent state-of-the-art methods and show that our approach outperforms these techniques in numerous scenarios whilst significantly reducing the number of trainable parameters and allowing an arbitrary number of diverse trajectories to be sampled.
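The Minimum over N idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state which distance is used, so the mean L2 displacement over time steps is an assumption here, and the function name `mon_loss` and the array shapes are hypothetical.

```python
import numpy as np

def mon_loss(samples, ground_truth):
    """Minimum-over-N loss (illustrative sketch, not the paper's code).

    samples:      array of shape (N, T, 2), N sampled trajectories of T (x, y) points
    ground_truth: array of shape (T, 2), the observed future trajectory
    Returns (loss, index of the closest sample).
    """
    samples = np.asarray(samples, dtype=float)
    ground_truth = np.asarray(ground_truth, dtype=float)
    # Assumed distance: L2 displacement per time step, averaged over the horizon.
    per_sample = np.linalg.norm(samples - ground_truth[None], axis=-1).mean(axis=-1)
    # Only the closest sample contributes to the loss.
    best = int(np.argmin(per_sample))
    return float(per_sample[best]), best

# Three candidate trajectories, offset from the ground truth by 3, 1 and 4 units.
gt = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]])
candidates = np.stack([gt + [0.0, 3.0], gt + [0.0, 1.0], gt + [4.0, 0.0]])
loss, best = mon_loss(candidates, gt)
```

Because only the closest sample is penalised, the remaining samples are free to cover other plausible futures, which is the mechanism the abstract credits for more diverse predictions.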

Submitted 5 March, 2021; originally announced March 2021.

arXiv:2103.01644 [pdf, other]

Exploiting latent representation of sparse semantic layers for improved short-term motion prediction with Capsule Networks

Authors: Albert Dulian, John C. Murray

Abstract: As urban environments manifest high levels of complexity, it is of vital importance that safety systems embedded within autonomous vehicles (AVs) are able to accurately anticipate the short-term future motion of nearby agents. This problem can be further understood as generating a sequence of coordinates describing the future motion of the tracked agent. Various proposed approaches demonstrate significant benefits of using a rasterised top-down image of the road, in combination with Convolutional Neural Networks (CNNs), for extraction of relevant features that define the road structure (e.g. driveable areas, lanes, walkways). In contrast, this paper explores the use of Capsule Networks (CapsNets) for learning a hierarchical representation of sparse semantic layers corresponding to small regions of the High-Definition (HD) map. Each region of the map is decomposed into separate geometrical layers that are extracted with respect to the agent's current position. By using an architecture based on CapsNets, the model is able to retain hierarchical relationships between detected features within images whilst also preventing the loss of spatial data often caused by the pooling operation. We train and evaluate our model on the publicly available nuTonomy scenes dataset and compare it to recently published methods. We show that our model achieves a significant improvement over recently published works on deterministic prediction, whilst drastically reducing the overall size of the network.

Submitted 25 March, 2021; v1 submitted 2 March, 2021; originally announced March 2021.

arXiv:2006.05159 [pdf, other]

Physically constrained short-term vehicle trajectory forecasting with naive semantic maps

Authors: Albert Dulian, John C. Murray

Abstract: Urban environments manifest a high level of complexity, and therefore it is of vital importance for safety systems embedded within autonomous vehicles (AVs) to be able to accurately predict the short-term future motion of nearby agents. This problem can be further understood as generating a sequence of future coordinates for a given agent based on its past motion data (e.g. position, velocity, acceleration). Whilst current approaches demonstrate plausible results, they have a propensity to neglect a scene's physical constraints. In this paper we propose a model based on a combination of CNN and LSTM encoder-decoder architectures that learns to extract relevant road features from semantic maps as well as the general motion of agents, and uses this learned representation to predict their short-term future trajectories. We train and validate the model on a publicly available dataset that provides data from urban areas, allowing us to examine it in challenging and uncertain scenarios. We show that our model is not only capable of anticipating future motion whilst taking road boundaries into consideration, but can also effectively and precisely predict trajectories for a longer time horizon than it was initially trained for.

Submitted 9 June, 2020; originally announced June 2020.

Showing 1–3 of 3 results for author: Dulian, A