Search | arXiv e-print repository

Sentence-Level Sign Language Recognition Framework

Abstract: We present two solutions to sentence-level sign language recognition (SLR). Sentence-level SLR requires mapping videos of sign language sentences to sequences of gloss labels. Connectionist Temporal Classification (CTC) is used as the classification layer of both models, which avoids pre-segmenting the sentences into individual words. The first model is an LRCN-based model, and the second is a Multi-Cue Network. LRCN is a model in which a CNN feature extractor is applied to each frame before the resulting features are fed into an LSTM. In the first approach, no prior knowledge is leveraged: raw frames are fed into an 18-layer LRCN with a CTC layer on top. In the second approach, three main characteristics associated with each sign (hand shape, hand position, and hand movement) are extracted using Mediapipe. 2D landmarks of hand shape are used to create a skeleton of the hands, which is then fed to a CONV-LSTM model. Hand locations and hand positions, expressed as relative distance to the head, are fed to separate LSTMs. All three sources of information are then integrated into a Multi-Cue network with a CTC classification layer. We evaluated the proposed models on RWTH-PHOENIX-Weather. After an extensive search over model hyper-parameters such as the number of feature maps, input size, batch size, sequence length, LSTM memory cell, regularization, and dropout, we were able to achieve a Word Error Rate (WER) of 35.

Submitted 12 November, 2022; originally announced November 2022.
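
Note on the entry above: the abstract relies on two standard ideas, CTC output decoding and the Word Error Rate metric. The following is a minimal Python sketch of both, not the authors' code. The function names, the blank index 0, and the example glosses are assumptions chosen for illustration. Greedy CTC decoding merges repeated frame labels and drops blanks, and WER is the word-level edit distance divided by the reference length, reported here as a percentage.

```python
def ctc_greedy_collapse(frame_labels, blank=0):
    """Collapse per-frame labels: merge consecutive repeats, then drop blanks.

    `blank=0` is an assumed convention for the CTC blank symbol.
    """
    decoded = []
    previous = None
    for label in frame_labels:
        if label != previous and label != blank:
            decoded.append(label)
        previous = label
    return decoded


def word_error_rate(reference, hypothesis):
    """Return WER in percent: 100 * edit_distance(ref, hyp) / len(ref)."""
    ref, hyp = list(reference), list(hypothesis)
    if not ref:
        raise ValueError("reference must contain at least one token")
    # prev[j] holds the edit distance between the reference prefix seen
    # so far and the first j tokens of the hypothesis.
    prev = list(range(len(hyp) + 1))
    for i, ref_tok in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_tok in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_tok == hyp_tok else 1)
            deletion = prev[j] + 1       # reference token missing from hypothesis
            insertion = curr[j - 1] + 1  # extra token in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return 100.0 * prev[-1] / len(ref)


# Hypothetical gloss sequences, for illustration only.
reference = ["MORGEN", "REGEN", "NORD", "WIND"]
hypothesis = ["MORGEN", "SONNE", "NORD"]
print(ctc_greedy_collapse([0, 1, 1, 0, 2, 2, 2, 0, 0, 1]))  # [1, 2, 1]
print(word_error_rate(reference, hypothesis))               # 50.0
```

In the example, one substitution and one deletion against a four-gloss reference give a WER of 50. Because repeats are merged only when no blank separates them, the same label can legitimately appear twice in the decoded output.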

arXiv:1901.06401 [pdf, other]

Slim LSTM networks: LSTM_6 and LSTM_C6

Authors: Atra Akandeh, Fathi M. Salem

Abstract: We have shown previously that our parameter-reduced variants of Long Short-Term Memory (LSTM) Recurrent Neural Networks (RNN) are comparable in performance to the standard LSTM RNN on the MNIST dataset. In this study, we show that this is also the case for two diverse benchmark datasets, namely, the review sentiment IMDB and the 20 Newsgroup datasets. Specifically, we focus on two of the simplest variants, namely LSTM_6 (i.e., standard LSTM with three constant fixed gates) and LSTM_C6 (i.e., LSTM_6 with a further reduced cell body input block). We demonstrate that these two aggressively reduced-parameter variants are competitive with the standard LSTM when hyper-parameters, e.g., learning parameter, number of hidden units, and gate constants, are set properly. These architectures speed up training computations, and hence these networks would be more suitable for online training and inference on portable devices with relatively limited computational resources.

Submitted 18 January, 2019; originally announced January 2019.

Comments: 6 pages, 12 figures, 5 tables
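
Note on the entry above: a rough sense of the savings from fixing the three gates can be had by counting trainable parameters. The sketch below is an illustration under stated assumptions, not the authors' formulation. It uses the usual count for a standard LSTM layer (four blocks, each with an input matrix, a recurrent matrix, and a bias) and assumes that a constant gate contributes no trainable parameters, so only the cell-input block remains. The function names and the sizes 28 and 100 are hypothetical. LSTM_C6 is not counted because the abstract does not specify how its cell input block is reduced.

```python
def block_params(input_dim, hidden_dim):
    """Parameters of one LSTM block: W (hidden x input), U (hidden x hidden), bias."""
    return hidden_dim * input_dim + hidden_dim * hidden_dim + hidden_dim


def standard_lstm_params(input_dim, hidden_dim):
    """Standard LSTM layer: input, forget, and output gates plus the cell-input block."""
    return 4 * block_params(input_dim, hidden_dim)


def constant_gate_lstm_params(input_dim, hidden_dim):
    """Assumption: all three gates are fixed constants, leaving only the cell-input block."""
    return block_params(input_dim, hidden_dim)


# Hypothetical sizes chosen for illustration.
input_dim, hidden_dim = 28, 100
full = standard_lstm_params(input_dim, hidden_dim)
reduced = constant_gate_lstm_params(input_dim, hidden_dim)
print(full, reduced, full / reduced)  # 51600 12900 4.0
```

Under these assumptions the layer keeps one quarter of the standard LSTM's parameters, independent of the chosen sizes.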

arXiv:1707.04626 [pdf, other]

Simplified Long Short-term Memory Recurrent Neural Networks: part III

Authors: Atra Akandeh, Fathi M. Salem

Abstract: This is part III of a three-part work. In parts I and II, we presented eight variants of simplified Long Short-Term Memory (LSTM) recurrent neural networks (RNNs). Fast computation, especially with constrained computing resources, is an important factor in processing big time-sequence data. In this part III paper, we present and evaluate two new LSTM model variants which dramatically reduce the computational load while retaining performance comparable to the base (standard) LSTM RNNs. In these new variants, we impose (Hadamard) pointwise state multiplications in the cell-memory network in addition to the gating signal networks.

Submitted 14 July, 2017; originally announced July 2017.

Comments: Here 5 pages (in the conference 4 pages), 10 figures, 5 tables; this is part III of a three part work, all will appear in the IKE'17 - The 16th Int'l Conference on Information & Knowledge Engineering. The 2017 World Congress in Computer Science Computer Engineering & Applied Computing | CSCE'17, July 17-20, 2017, Las Vegas, Nevada, USA

arXiv:1707.04623 [pdf, other]

Simplified Long Short-term Memory Recurrent Neural Networks: part II

Authors: Atra Akandeh, Fathi M. Salem

Abstract: This is part II of a three-part work. Here, we present a second set of five inter-related variants of simplified Long Short-term Memory (LSTM) recurrent neural networks, obtained by further reducing adaptive parameters. Two of these models were introduced in part I of this work. We evaluate and verify our model variants on the benchmark MNIST dataset and assert that these models are comparable to the base LSTM model while using progressively fewer parameters. Moreover, we observe that when using the ReLU activation, the test accuracy of the standard LSTM drops after a number of epochs when the learning parameter becomes larger. However, all of the new model variants sustain their performance.

Submitted 14 July, 2017; originally announced July 2017.

Comments: 4 pages, 6 figures, 5 tables; this is part II of three-part work, all to appear in IKE'17- The 16th Int'l Conference on Information & Knowledge Engineering, in The 2017 World Congress in Computer Science Computer Engineering & Applied Computing | CSCE'17 July 17-20, 2017, Las Vegas, Nevada, USA

arXiv:1707.04619 [pdf, other]

Simplified Long Short-term Memory Recurrent Neural Networks: part I

Authors: Atra Akandeh, Fathi M. Salem

Abstract: We present five variants of the standard Long Short-term Memory (LSTM) recurrent neural networks by uniformly reducing blocks of adaptive parameters in the gating mechanisms. For simplicity, we refer to these models as LSTM1, LSTM2, LSTM3, LSTM4, and LSTM5, respectively. Such parameter-reduced variants speed up training computations and would be more suitable for implementation on constrained embedded platforms. We comparatively evaluate and verify our five variant models on the classical MNIST dataset and demonstrate that these variant models are comparable to a standard implementation of the LSTM model while using fewer parameters. Moreover, we observe that in some cases the standard LSTM's accuracy drops after a number of epochs when using the ReLU nonlinearity; in contrast, LSTM3, LSTM4, and LSTM5 retain their performance.

Submitted 14 July, 2017; originally announced July 2017.

Comments: 4 pages, 6 figures, 5 tables. Part I of a three-part publication that will appear in IKE'17 - The 16th Int'l Conference on Information & Knowledge Engineering, The 2017 World Congress in Computer Science, Computer Engineering & Applied Computing | CSCE'17, July 17-20, 2017, Las Vegas, Nevada, USA

Showing 1–5 of 5 results for author: Akandeh, A