Showing 1–2 of 2 results for author: Martin, S R

Search v0.5.6 released 2020-02-24

arXiv:2405.08429 [pdf, other]

cs.CV

doi 10.1093/jigpal/jzae048

TEDNet: Twin Encoder Decoder Neural Network for 2D Camera and LiDAR Road Detection

Authors: Martín Bayón-Gutiérrez, María Teresa García-Ordás, Héctor Alaiz Moretón, Jose Aveleira-Mata, Sergio Rubio Martín, José Alberto Benítez-Andrades

Abstract: Robust road surface estimation is required for autonomous ground vehicles to navigate safely. Despite it becoming one of the main targets for autonomous mobility researchers in recent years, it is still an open problem in which cameras and LiDAR sensors have demonstrated to be adequate to predict the position, size and shape of the road a vehicle is driving on in different environments. In this work, a novel Convolutional Neural Network model is proposed for the accurate estimation of the roadway surface. Furthermore, an ablation study has been conducted to investigate how different encoding strategies affect model performance, testing 6 slightly different neural network architectures. Our model is based on the use of a Twin Encoder-Decoder Neural Network (TEDNet) for independent camera and LiDAR feature extraction, and has been trained and evaluated on the Kitti-Road dataset. Bird's Eye View projections of the camera and LiDAR data are used in this model to perform semantic segmentation on whether each pixel belongs to the road surface. The proposed method performs among other state-of-the-art methods and operates at the same frame-rate as the LiDAR and cameras, so it is adequate for its use in real-time applications.

Submitted 14 May, 2024; originally announced May 2024.

Comments: Source code: https://github.com/martin-bayon/TEDNet

Journal ref: M Bayón-Gutiérrez, MT García-Ordás, H Alaiz Moretón, J Aveleira-Mata, S Rubio-Martín, JA Benítez-Andrades. TEDNet: Twin Encoder Decoder Neural Network for 2D Camera and LiDAR Road Detection. Logic Journal of the IGPL. 2024

arXiv:2404.07341 [pdf, other]

eess.AS cs.CL cs.LG cs.SD

Conformer-1: Robust ASR via Large-Scale Semisupervised Bootstrapping

Authors: Kevin Zhang, Luka Chkhetiani, Francis McCann Ramirez, Yash Khare, Andrea Vanzo, Michael Liang, Sergio Ramirez Martin, Gabriel Oexle, Ruben Bousbib, Taufiquzzaman Peyash, Michael Nguyen, Dillon Pulliam, Domenic Donato

Abstract: This paper presents Conformer-1, an end-to-end Automatic Speech Recognition (ASR) model trained on an extensive dataset of 570k hours of speech audio data, 91% of which was acquired from publicly available sources. To achieve this, we perform Noisy Student Training after generating pseudo-labels for the unlabeled public data using a strong Conformer RNN-T baseline model. The addition of these pseudo-labeled data results in remarkable improvements in relative Word Error Rate (WER) by 11.5% and 24.3% for our asynchronous and realtime models, respectively. Additionally, the model is more robust to background noise owing to the addition of these data. The results obtained in this study demonstrate that the incorporation of pseudo-labeled publicly available data is a highly effective strategy for improving ASR accuracy and noise robustness.

Submitted 12 April, 2024; v1 submitted 10 April, 2024; originally announced April 2024.
