-
Intelligent energy management of steam generators
Authors:
Ahmed S. Hussein,
Noha H. El-Amary,
Loai Saad El-din Nasrat,
Ali Selim
Abstract:
This paper introduces a smart model for intelligent energy management of steam generators, which controls the air-to-fuel ratio of the steam generator over the entire firing curve and during transient-mode operation. The environment currently faces severe pollution and global warming. With the spread of electrical devices, electric cars supplied by conventional electrical generation sources, and rising electrical consumption, pollution levels are worsening rather than falling. Steam generators offer advantages that cannot be neglected: high efficiency, reliable operation, low emissions (with regular maintenance), and a wide variety of fuel sources. However, regular maintenance overlooks some parameters, especially the air-to-fuel ratio that achieves a green environment, high efficiency, and low fuel consumption. The steam generator system is simulated in MATLAB/Simulink. The system is operated at different loading and generation conditions to determine how the air-to-fuel ratio varies with power. A Neural Network (NN) unit is added at different locations and in different scenarios. It is effective in controlling the main bus of air, fuel, auxiliary and inverter speed. Testing the NN on the simulated system gives satisfactory results.
Submitted 3 June, 2024;
originally announced June 2024.
-
Enhancing End-to-End Conversational Speech Translation Through Target Language Context Utilization
Authors:
Amir Hussein,
Brian Yan,
Antonios Anastasopoulos,
Shinji Watanabe,
Sanjeev Khudanpur
Abstract:
Incorporating longer context has been shown to benefit machine translation, but the inclusion of context in end-to-end speech translation (E2E-ST) remains under-studied. To bridge this gap, we introduce target language context in E2E-ST, enhancing coherence and overcoming memory constraints of extended audio segments. Additionally, we propose context dropout to ensure robustness to the absence of context, and further improve performance by adding speaker information. Our proposed contextual E2E-ST outperforms the isolated utterance-based E2E-ST approach. Lastly, we demonstrate that in conversational speech, contextual information primarily contributes to capturing context style, as well as resolving anaphora and named entities.
Submitted 27 September, 2023;
originally announced September 2023.
-
Speech collage: code-switched audio generation by collaging monolingual corpora
Authors:
Amir Hussein,
Dorsa Zeinali,
Ondřej Klejch,
Matthew Wiesner,
Brian Yan,
Shammur Chowdhury,
Ahmed Ali,
Shinji Watanabe,
Sanjeev Khudanpur
Abstract:
Designing effective automatic speech recognition (ASR) systems for Code-Switching (CS) often depends on the availability of the transcribed CS resources. To address data scarcity, this paper introduces Speech Collage, a method that synthesizes CS data from monolingual corpora by splicing audio segments. We further improve the smoothness quality of audio generation using an overlap-add approach. We investigate the impact of generated data on speech recognition in two scenarios: using in-domain CS text and a zero-shot approach with synthesized CS text. Empirical results highlight up to 34.4% and 16.2% relative reductions in Mixed-Error Rate and Word-Error Rate for in-domain and zero-shot scenarios, respectively. Lastly, we demonstrate that CS augmentation bolsters the model's code-switching inclination and reduces its monolingual bias.
Submitted 27 September, 2023;
originally announced September 2023.
-
Benchmarking Evaluation Metrics for Code-Switching Automatic Speech Recognition
Authors:
Injy Hamed,
Amir Hussein,
Oumnia Chellah,
Shammur Chowdhury,
Hamdy Mubarak,
Sunayana Sitaram,
Nizar Habash,
Ahmed Ali
Abstract:
Code-switching poses a number of challenges and opportunities for multilingual automatic speech recognition. In this paper, we focus on the question of robust and fair evaluation metrics. To that end, we develop a reference benchmark data set of code-switching speech recognition hypotheses with human judgments. We define clear guidelines for minimal editing of automatic hypotheses. We validate the guidelines using 4-way inter-annotator agreement. We evaluate a large number of metrics in terms of correlation with human judgments. The metrics we consider vary in terms of representation (orthographic, phonological, semantic), directness (intrinsic vs extrinsic), granularity (e.g. word, character), and similarity computation method. The highest correlation to human judgment is achieved using transliteration followed by text normalization. We release the first corpus for human acceptance of code-switching speech recognition results in dialectal Arabic/English conversation speech.
Submitted 22 November, 2022;
originally announced November 2022.
-
Progress towards machine learning methodologies for laser-induced breakdown spectroscopy with an emphasis on soil analysis
Authors:
Yingchao Huang,
Sivanandan S. Harilal,
Abdul Bais,
Amina E. Hussein
Abstract:
Optical emission spectroscopy of laser-produced plasmas, commonly known as laser-induced breakdown spectroscopy (LIBS), is an emerging analytical tool for rapid soil analysis. However, specific challenges with LIBS exist, such as matrix effects and quantification issues, that require further study in the application of LIBS, particularly for analysis of heterogeneous samples such as soils. Advancements in the applications of Machine Learning (ML) methods can address some of these issues, advancing the potential for LIBS in soil analysis. This article aims to review the progress of LIBS application combined with ML methods, focusing on methodological approaches used in reducing matrix effect, feature selection, quantification analysis, soil classification, and self-absorption. The performance of various adopted ML approaches is discussed, including their shortcomings and advantages, to provide researchers with a clear picture of the current status of ML applications in LIBS for improving its analytical capability. The challenges and prospects of LIBS development in soil analysis are proposed, offering a path toward future research. This review article emphasizes ML tools for LIBS soil analysis that are broadly relevant for other LIBS applications.
Submitted 15 August, 2022;
originally announced August 2022.
-
The Makerere Radio Speech Corpus: A Luganda Radio Corpus for Automatic Speech Recognition
Authors:
Jonathan Mukiibi,
Andrew Katumba,
Joyce Nakatumba-Nabende,
Ali Hussein,
Josh Meyer
Abstract:
Building a usable radio monitoring automatic speech recognition (ASR) system is a challenging task for under-resourced languages, and yet this is paramount in societies where radio is the main medium of public communication and discussions. Initial efforts by the United Nations in Uganda have shown how important it is for national planning to understand the perceptions of rural people who are excluded from social media. However, these efforts are being challenged by the absence of transcribed speech datasets. In this paper, the Makerere Artificial Intelligence research lab releases a Luganda radio speech corpus of 155 hours. To our knowledge, this is the first publicly available radio dataset in sub-Saharan Africa. The paper describes the development of the voice corpus and presents baseline Luganda ASR performance results using the Coqui STT toolkit, an open source speech recognition toolkit.
Submitted 20 June, 2022;
originally announced June 2022.
-
Textual Data Augmentation for Arabic-English Code-Switching Speech Recognition
Authors:
Amir Hussein,
Shammur Absar Chowdhury,
Ahmed Abdelali,
Najim Dehak,
Ahmed Ali,
Sanjeev Khudanpur
Abstract:
The pervasiveness of intra-utterance code-switching (CS) in spoken content requires that speech recognition (ASR) systems handle mixed language. Designing a CS-ASR system has many challenges, mainly due to data scarcity, grammatical structure complexity, and domain mismatch. The most common method for addressing CS is to train an ASR system with the available transcribed CS speech, along with monolingual data. In this work, we propose a zero-shot learning methodology for CS-ASR by augmenting the monolingual data with artificially generated CS text. We base our approach on random lexical replacements and the Equivalence Constraint (EC) while exploiting aligned translation pairs to generate random and grammatically valid CS content. Our empirical results show a 65.5% relative reduction in language model perplexity, and 7.7% in ASR WER on two ecologically valid CS test sets. The human evaluation of the generated text using EC suggests that more than 80% is of adequate quality.
Submitted 11 January, 2023; v1 submitted 7 January, 2022;
originally announced January 2022.
-
Arabic Code-Switching Speech Recognition using Monolingual Data
Authors:
Ahmed Ali,
Shammur Chowdhury,
Amir Hussein,
Yasser Hifny
Abstract:
Code-switching in automatic speech recognition (ASR) is an important challenge due to globalization. Recent research in multilingual ASR shows potential improvement over monolingual systems. We study key issues related to multilingual modeling for ASR through a series of large-scale ASR experiments. Our innovative framework deploys a multi-graph approach in the weighted finite state transducers (WFST) framework. We compare our WFST decoding strategies with a transformer sequence-to-sequence system trained on the same data. Given a code-switching scenario between Arabic and English, our results show that the WFST decoding approaches were more suitable for the intersentential code-switching datasets. In addition, the transformer system performed better on the intrasentential code-switching task. With this study, we release artificially generated development and test sets, along with an ecological code-switching test set, to benchmark ASR performance.
Submitted 4 July, 2021;
originally announced July 2021.
-
QASR: QCRI Aljazeera Speech Resource -- A Large Scale Annotated Arabic Speech Corpus
Authors:
Hamdy Mubarak,
Amir Hussein,
Shammur Absar Chowdhury,
Ahmed Ali
Abstract:
We introduce the largest transcribed Arabic speech corpus, QASR, collected from the broadcast domain. This multi-dialect speech dataset contains 2,000 hours of speech sampled at 16 kHz crawled from the Aljazeera news channel. The dataset is released with lightly supervised transcriptions, aligned with the audio segments. Unlike previous datasets, QASR contains linguistically motivated segmentation, punctuation, and speaker information, among others. QASR is suitable for training and evaluating speech recognition systems, acoustics- and/or linguistics-based Arabic dialect identification, punctuation restoration, speaker identification, speaker linking, and potentially other NLP modules for spoken data. In addition to QASR transcription, we release a dataset of 130M words to aid in designing and training a better language model. We show that end-to-end automatic speech recognition trained on QASR reports a competitive word error rate compared to the previous MGB-2 corpus. We report baseline results for downstream natural language processing tasks such as named entity recognition using speech transcripts. We also report the first baseline for Arabic punctuation restoration. We make the corpus available for the research community.
Submitted 24 June, 2021;
originally announced June 2021.
-
Balanced End-to-End Monolingual pre-training for Low-Resourced Indic Languages Code-Switching Speech Recognition
Authors:
Amir Hussein,
Shammur Chowdhury,
Najim Dehak,
Ahmed Ali
Abstract:
The success in designing Code-Switching (CS) ASR often depends on the availability of the transcribed CS resources. Such dependency harms the development of ASR in low-resourced languages such as Bengali and Hindi. In this paper, we exploit the transfer learning approach to design End-to-End (E2E) CS ASR systems for the two low-resourced language pairs using different monolingual speech data and a small set of noisy CS data. We trained the CS-ASR in two steps: (i) building a robust bilingual ASR system using a convolution-augmented transformer (Conformer) based acoustic model and n-gram language model, and (ii) fine-tuning the entire E2E ASR with limited noisy CS data. We tested the proposed method using noisy CS data released for Hindi-English and Bengali-English pairs in Multilingual and Code-Switching ASR Challenges for Low Resource Indian Languages (MUCS 2021) and achieved 3rd place in the CS track. Unlike the leading two systems, which benefited from crawling YouTube and learning transliteration pairs, our proposed transfer learning approach focused on using only the limited CS data with no data-cleaning or data re-segmentation. Our approach achieved 14.1% relative gain in word error rate (WER) in Hindi-English and 27.1% in Bengali-English. We provide detailed guidelines on the steps to fine-tune the self-attention based model for limited data for ASR. Moreover, we release the code and recipe used in this paper.
Submitted 15 February, 2022; v1 submitted 10 June, 2021;
originally announced June 2021.
-
Towards One Model to Rule All: Multilingual Strategy for Dialectal Code-Switching Arabic ASR
Authors:
Shammur Absar Chowdhury,
Amir Hussein,
Ahmed Abdelali,
Ahmed Ali
Abstract:
With the advent of globalization, there is an increasing demand for multilingual automatic speech recognition (ASR), handling language and dialectal variation of spoken content. Recent studies show its efficacy over monolingual systems. In this study, we design a large multilingual end-to-end ASR using self-attention based conformer architecture. We trained the system using Arabic (Ar), English (En) and French (Fr) languages. We evaluate the system performance handling: (i) monolingual (Ar, En and Fr); (ii) multi-dialectal (Modern Standard Arabic, along with dialectal variation such as Egyptian and Moroccan); (iii) code-switching -- cross-lingual (Ar-En/Fr) and dialectal (MSA-Egyptian dialect) test cases, and compare with current state-of-the-art systems. Furthermore, we investigate the influence of different embedding/character representations including character vs word-piece; shared vs distinct input symbol per language. Our findings demonstrate the strength of such a model by outperforming state-of-the-art monolingual dialectal Arabic and code-switching Arabic ASR.
Submitted 5 July, 2021; v1 submitted 31 May, 2021;
originally announced May 2021.
-
Arabic Speech Recognition by End-to-End, Modular Systems and Human
Authors:
Amir Hussein,
Shinji Watanabe,
Ahmed Ali
Abstract:
Recent advances in automatic speech recognition (ASR) have achieved accuracy levels comparable to human transcribers, which led researchers to debate if the machine has reached human performance. Previous work focused on the English language and modular hidden Markov model-deep neural network (HMM-DNN) systems. In this paper, we perform a comprehensive benchmarking for end-to-end transformer ASR, modular HMM-DNN ASR, and human speech recognition (HSR) on the Arabic language and its dialects. For the HSR, we evaluate linguist performance and lay-native speaker performance on a new dataset collected as a part of this study. For ASR the end-to-end work led to 12.5%, 27.5%, 33.8% WER; a new performance milestone for the MGB2, MGB3, and MGB5 challenges respectively. Our results suggest that human performance in the Arabic language is still considerably better than the machine with an absolute WER gap of 3.5% on average.
Submitted 29 June, 2021; v1 submitted 21 January, 2021;
originally announced January 2021.
-
Vehicle Platooning Impact on Drag Coefficients and Energy/Fuel Saving Implications
Authors:
Ahmed A. Hussein,
Hesham A. Rakha
Abstract:
In this paper, empirical data from the literature are used to develop general power models that capture the impact of a vehicle position, in a platoon of homogeneous vehicles, and the distance gap to its lead (and following) vehicle on its drag coefficient. These models are developed for light duty vehicles, buses, and heavy duty trucks. The models were fit using a constrained optimization framework to fit a general power function using either direct drag force or fuel measurements. The model is then used to extrapolate the empirical measurements to a wide range of vehicle distance gaps within a platoon. Using these models we estimate the potential fuel reduction associated with homogeneous platoons of light duty vehicles, buses, and heavy duty trucks. The results show a significant reduction in the vehicle fuel consumption when compared with those based on a constant drag coefficient assumption. Specifically, considering a minimum time gap between vehicles of $0.5 \; secs$ (which is typical considering state-of-practice communication and mechanical system latencies) running at a speed of $100 \; km/hr$, the optimum fuel reduction that is achieved is $4.5 \%$, $15.5 \%$, and $7.0 \%$ for light duty vehicle, bus, and heavy duty truck platoons, respectively. For longer time gaps, the bus and heavy duty truck platoons still produce fuel reductions in the order of $9.0 \%$ and $4.5 \%$, whereas light duty vehicles produce negligible fuel savings.
Submitted 2 March, 2020; v1 submitted 2 January, 2020;
originally announced January 2020.
-
Correction Filter for Single Image Super-Resolution: Robustifying Off-the-Shelf Deep Super-Resolvers
Authors:
Shady Abu Hussein,
Tom Tirer,
Raja Giryes
Abstract:
The single image super-resolution task is one of the most examined inverse problems in the past decade. In recent years, Deep Neural Networks (DNNs) have shown superior performance over alternative methods when the acquisition process uses a fixed known downsampling kernel, typically a bicubic kernel. However, several recent works have shown that in practical scenarios, where the test data mismatch the training data (e.g. when the downsampling kernel is not the bicubic kernel or is not available at training), the leading DNN methods suffer from a huge performance drop. Inspired by the literature on generalized sampling, in this work we propose a method for improving the performance of DNNs that have been trained with a fixed kernel on observations acquired by other kernels. For a known kernel, we design a closed-form correction filter that modifies the low-resolution image to match one which is obtained by another kernel (e.g. bicubic), and thus improves the results of existing pre-trained DNNs. For an unknown kernel, we extend this idea and propose an algorithm for blind estimation of the required correction filter. We show that our approach outperforms other super-resolution methods, which are designed for general downsampling kernels.
Submitted 24 May, 2020; v1 submitted 30 November, 2019;
originally announced December 2019.
-
Latent Function Decomposition for Forecasting Li-ion Battery Cells Capacity: A Multi-Output Convolved Gaussian Process Approach
Authors:
Abdallah A. Chehade,
Ala A. Hussein
Abstract:
A latent function decomposition method is proposed for forecasting the capacity of lithium-ion battery cells. The method uses the Multi-Output Convolved Gaussian Process (MCGP), a generative machine learning framework for multi-task and transfer learning. The MCGP decomposes the available capacity trends from multiple battery cells into latent functions. The latent functions are then convolved over kernel smoothers to reconstruct and/or forecast capacity trends of the battery cells. Besides its high prediction accuracy, the proposed method provides uncertainty information for the predictions and captures nontrivial cross-correlations between capacity trends of different battery cells. These two merits make the proposed MCGP a very reliable and practical solution for applications that use battery cell packs. The MCGP is derived and compared to benchmark methods on experimental lithium-ion battery cell data. The results show the effectiveness of the proposed method.
Submitted 19 July, 2019;
originally announced July 2019.
-
Image-Adaptive GAN based Reconstruction
Authors:
Shady Abu Hussein,
Tom Tirer,
Raja Giryes
Abstract:
In recent years, there has been a significant improvement in the quality of samples produced by (deep) generative models such as variational auto-encoders and generative adversarial networks. However, the representation capabilities of these methods still do not capture the full distribution for complex classes of images, such as human faces. This deficiency has been clearly observed in previous works that use pre-trained generative models to solve imaging inverse problems. In this paper, we suggest mitigating the limited representation capabilities of generators by making them image-adaptive and enforcing compliance of the restoration with the observations via back-projections. We empirically demonstrate the advantages of our proposed approach for image super-resolution and compressed sensing.
Submitted 25 November, 2019; v1 submitted 12 June, 2019;
originally announced June 2019.
-
Artificial Neural Network for LiDAL Systems
Authors:
Aubida A. Al-Hameed,
Safwan Hafeedh Younus,
Ahmed Taha Hussein,
Mohammed T. Alresheedi,
Jaafar M. H. Elmirghani
Abstract:
In this paper, we introduce an intelligent light detection and localization (LiDAL) system that uses artificial neural networks (ANN). The LiDAL systems of interest are MIMO LiDAL and MISO IMG LiDAL systems. A trained ANN with the LiDAL system of interest is used to distinguish a human (target) from the background obstacles (furniture) in a realistic indoor environment. In the LiDAL systems, the received reflected signals in the time domain have different patterns corresponding to the number of targets and their locations in an indoor environment. The indoor environment with background obstacles (furniture) appears as a set of patterns in the time domain when the transmitted optical signals are reflected from objects in LiDAL systems. Hence, a trained neural network that has the ability to classify and recognize the received signal patterns can distinguish the targets from the background obstacles in a realistic environment. The LiDAL systems with ANN are evaluated in a realistic indoor environment through computer simulation.
Submitted 26 April, 2019;
originally announced April 2019.
-
LiDAL: Light Detection and Localization
Authors:
Aubida A. Al-Hameed,
Safwan Hafeedh Younus,
Ahmed Taha Hussein,
Mohammed T. Alresheedi,
Jaafar M. H. Elmirghani
Abstract:
In this paper, we present the first indoor light-based detection and localization system that builds on concepts from radio detection and ranging (radar), making use of the expected growth in the use and adoption of visible light communication (VLC), which can provide the infrastructure for our LiDAL system. Our system enables active detection, counting and localization of people, in addition to being fully compatible with existing VLC systems. To detect humans (targets), LiDAL uses the visible light spectrum: it sends pulses using a VLC transmitter and analyses the reflected signal collected by a photodetector receiver. Although we examine the use of the visible spectrum here, LiDAL can be used in the infrared spectrum and other parts of the light spectrum. We introduce LiDAL with different transmitter-receiver configurations and optimum detectors considering the fluctuation of the received reflected signal from the target in the presence of Gaussian noise. We design an efficient multiple input multiple output (MIMO) LiDAL system with a wide field of view (FOV) single photodetector receiver, and also design a multiple input single output (MISO) LiDAL system with an imaging receiver to eliminate the ambiguity in target detection and localization. We develop models for the human body and its reflections and consider the impact of the colour and texture of the cloth used as well as the impact of target mobility. A number of detection and localization methods are developed for our LiDAL system, including cross-correlation, a background subtraction method and a neural network based method. These methods are considered to distinguish a mobile target from the ambient reflections due to background obstacles (furniture) in a realistic indoor environment.
Submitted 26 April, 2019; v1 submitted 23 March, 2019;
originally announced March 2019.
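The pulse-and-echo idea in the abstract above can be illustrated with a minimal sketch of cross-correlation ranging. This is not the authors' detector: it assumes a co-located transmitter and receiver, a rectangular pulse, additive Gaussian noise, and a hypothetical 1 GHz sampling rate. The names `cross_correlate`, `estimate_range` and `simulate_echo` are illustrative only.

```python
import random

C = 299_792_458.0  # speed of light in m/s


def cross_correlate(received, pulse):
    """Slide `pulse` over `received` and return the correlation at each lag."""
    n, m = len(received), len(pulse)
    return [
        sum(received[lag + k] * pulse[k] for k in range(m))
        for lag in range(n - m + 1)
    ]


def estimate_range(received, pulse, sample_rate_hz):
    """Return (peak lag in samples, one-way distance in metres).

    Assumes the transmitter and receiver are co-located, so the measured
    delay is a round trip and the distance is half the path length.
    """
    corr = cross_correlate(received, pulse)
    best_lag = max(range(len(corr)), key=corr.__getitem__)
    round_trip_s = best_lag / sample_rate_hz
    return best_lag, C * round_trip_s / 2.0


def simulate_echo(pulse, delay_samples, length, gain=0.2, noise_std=0.01, seed=0):
    """Build a noisy received trace containing one attenuated, delayed echo."""
    rng = random.Random(seed)
    trace = [rng.gauss(0.0, noise_std) for _ in range(length)]
    for k, p in enumerate(pulse):
        trace[delay_samples + k] += gain * p
    return trace


# Hypothetical parameters chosen only for illustration.
PULSE = [1.0] * 8
SAMPLE_RATE_HZ = 1e9
echo = simulate_echo(PULSE, delay_samples=20, length=128)
lag, distance_m = estimate_range(echo, PULSE, SAMPLE_RATE_HZ)
print(f"peak lag = {lag} samples, estimated distance = {distance_m:.3f} m")
```

In a real room the trace would also contain reflections from furniture and walls, which is why the abstract pairs cross-correlation with background subtraction and a neural network based method.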
-
Optical Wireless Communication Systems, A Survey
Authors:
Osama Alsulami,
Ahmed Taha Hussein,
Mohammed T. Alresheedi,
Jaafar M. H. Elmirghani
Abstract:
In the past few years, the demand for high data rate services has increased dramatically. The congestion in the radio frequency (RF) spectrum (3 kHz to 300 GHz) is expected to limit the growth of future wireless systems unless new parts of the spectrum are opened. Even with the use of advanced engineering, such as signal processing and advanced modulation schemes, it will be very challenging to meet the demands of users in the coming decades using the existing carrier frequencies. On the other hand, there is a potential band of the spectrum available that can provide tens of Gbps to Tbps for users in the near future. Optical wireless communication (OWC) systems are among the promising solutions to the bandwidth limitation problem faced by radio systems. In this paper, we give a tutorial survey of the most significant issues in OWC systems that operate at short ranges such as indoor systems. We consider the challenging issues facing these systems such as (i) link design and system requirements, (ii) transmitter structures, (iii) receiver structures, (iv) challenges and possible techniques to mitigate the impairments in these systems, (v) the main applications and (vi) open research issues. In indoor OWC systems we describe channel modelling, mobility and dispersion mitigation techniques. Infrared communication (IRC) and visible light communication (VLC) are presented as potential implementation approaches for OWC systems and are comprehensively discussed. Moreover, open research issues in OWC systems are discussed.
Submitted 30 December, 2018;
originally announced December 2018.
-
VLC Systems with CGHs
Authors:
Safwan Hafeedh Younus,
Ahmed Taha Hussein,
Mohammed T. Alresheedi,
Jaafar M. H. Elmirghani
Abstract:
The achievable data rate in indoor wireless systems that employ visible light communication (VLC) can be limited by multipath propagation. Here, we use computer generated holograms (CGHs) in VLC system design to improve the achievable system data rate. The CGHs are utilized to produce a fixed broad beam from the light source, selecting the light source that offers the best performance. The CGHs direct this beam to a specific zone on the room's communication floor where the receiver is located. This reduces the effect of diffuse reflections, which in turn decreases the intersymbol interference (ISI) and enables the VLC indoor channel to support higher data rates. We consider two settings to examine our proposed VLC system and take lighting constraints into account. We evaluate the performance in idealistic and realistic room settings in a diffuse environment with up to second order reflections and also under mobility. The results show that using the CGHs enhances the 3 dB bandwidth of the VLC channel and improves the received optical power.
Submitted 12 November, 2018;
originally announced December 2018.
-
WDM for Multi-user Indoor VLC Systems with SCM
Authors:
Safwan Hafeedh Younus,
Aubida A. Al-Hameed,
Ahmed Taha Hussein,
Mohammed T. Alresheedi,
Jaafar M. H. Elmirghani
Abstract:
A system that employs wavelength division multiplexing (WDM) in conjunction with subcarrier multiplexing (SCM) tones is proposed to realize high data rate multi-user indoor visible light communication (VLC). The SCM tones, which are unmodulated signals, are used to identify each light unit, to find the optimum light unit for each user and to calculate the level of the co-channel interference (CCI). WDM is utilized to attain a high data rate for each user. In this paper, multicolour (four colours) laser diodes (LDs) are utilized as sources of lighting and data communication. One of the WDM colours is used to convey the SCM tones at the beginning of the connection to set up the connection between receivers and light units (to find the optimum light unit for each user). To evaluate the performance of our VLC system, we propose two types of receivers: an array of non-imaging receivers (NI-R) and an array of non-imaging angle diversity receivers (NI-ADR). We also consider the effects of diffuse reflections, CCI and mobility on the system performance.
Submitted 4 November, 2018;
originally announced November 2018.
-
Subcarrier Multiplexing for Parallel Data Transmission in Indoor Visible Light Communication Systems
Authors:
Safwan Hafeedh Younus,
Aubida A. Al-Hameed,
Ahmed Taha Hussein,
Mohammed T. Alresheedi,
Jaafar M. H. Elmirghani
Abstract:
This paper presents an indoor visible light communication (VLC) system in conjunction with an imaging receiver with parallel data transmission (spatial multiplexing) to decrease the effects of inter-symbol interference (ISI). To distinguish between light units (transmitters) and to match the light units used to convey the data with the pixels of the imaging receiver, we propose the use of subcarrier multiplexing (SCM) tones. Each light unit transmission is multiplexed with a unique tone. At the receiver, an SCM tone decision system is utilized to measure the power level of each SCM tone and consequently associate each pixel with a light unit. In addition, the level of co-channel interference (CCI) between light units is estimated using the SCM tones. Our proposed system is examined in two indoor environments taking into account reflective components (first and second order reflections). The results show that this system has the potential to achieve an aggregate data rate of 8 Gb/s with a bit error rate (BER) of 10⁻⁶ for each light unit, using simple on-off-keying (OOK).
Submitted 4 November, 2018;
originally announced November 2018.
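The SCM tone decision step described in the abstract above (measure the power of each tone, then associate the pixel with a light unit) might be sketched as follows. This is a hedged illustration, not the authors' receiver design: the tone frequencies, the 1 MHz sampling rate, the unit names, and the choice to report CCI as the summed power of the non-selected tones are all assumptions, and `tone_power` and `associate_pixel` are hypothetical helper names.

```python
import math


def tone_power(samples, tone_hz, sample_rate_hz):
    """Estimate the power of one tone with a single-bin DFT."""
    n = len(samples)
    re = sum(s * math.cos(2 * math.pi * tone_hz * i / sample_rate_hz)
             for i, s in enumerate(samples))
    im = sum(s * math.sin(2 * math.pi * tone_hz * i / sample_rate_hz)
             for i, s in enumerate(samples))
    amplitude = 2.0 * math.hypot(re, im) / n
    return amplitude ** 2 / 2.0  # mean power of a sinusoid with this amplitude


def associate_pixel(samples, tone_table, sample_rate_hz):
    """Pick the light unit whose tone is strongest at this pixel.

    Returns (best unit, per-unit tone powers, summed power of the other
    tones). Treating that sum as the CCI level is an assumption made here.
    """
    powers = {unit: tone_power(samples, f, sample_rate_hz)
              for unit, f in tone_table.items()}
    best = max(powers, key=powers.get)
    interference = sum(p for unit, p in powers.items() if unit != best)
    return best, powers, interference


# Hypothetical tone plan: one unmodulated tone per light unit.
SAMPLE_RATE_HZ = 1_000_000
TONES_HZ = {"unit_A": 100_000, "unit_B": 150_000, "unit_C": 200_000}

# Synthetic pixel signal: strong tone from unit_B, weak leakage from unit_A.
pixel = [
    0.8 * math.cos(2 * math.pi * TONES_HZ["unit_B"] * i / SAMPLE_RATE_HZ)
    + 0.1 * math.cos(2 * math.pi * TONES_HZ["unit_A"] * i / SAMPLE_RATE_HZ)
    for i in range(1000)
]

best, powers, interference = associate_pixel(pixel, TONES_HZ, SAMPLE_RATE_HZ)
print(best, powers, interference)
```

The tones are chosen so that each completes a whole number of cycles in the 1000-sample window, which keeps the single-bin estimates from leaking into one another; a practical receiver would need windowing or filtering when that does not hold.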