Search | arXiv e-print repository

Generating Clear Images From Images With Distortions Caused by Adverse Weather Using Generative Adversarial Networks

Abstract: We presented a method for improving computer vision tasks on images affected by adverse weather conditions, including distortions caused by adherent raindrops. Overcoming the challenge of applying computer vision to images affected by adverse weather conditions is essential for autonomous vehicles utilizing RGB cameras. For this purpose, we trained an appropriate generative adversarial network and showed that it was effective at removing the effect of the distortions, in the context of image reconstruction and computer vision tasks. We showed that object recognition, a vital task for autonomous driving vehicles, is completely impaired by the distortions and occlusions caused by adherent raindrops and that performance can be restored by our de-raining model. The approach described in this paper could be applied to all adverse weather conditions.

Submitted 1 November, 2022; originally announced November 2022.

Comments: 14 pages, 8 figures. arXiv admin note: text overlap with arXiv:2112.11245

arXiv:2112.11245 [pdf, other]

Generating Photo-realistic Images from LiDAR Point Clouds with Generative Adversarial Networks

Authors: Nuriel Shalom Mor

Abstract: We examined the feasibility of generative adversarial networks (GANs) to generate photo-realistic images from LiDAR point clouds. For this purpose, we created a dataset of point cloud image pairs and trained the GAN to predict photorealistic images from LiDAR point clouds containing reflectance and distance information. Our models learned how to predict realistically looking images from just point cloud data, even images with black cars. Black cars are difficult to detect directly from point clouds because of their low level of reflectivity. This approach might be used in the future to perform visual object recognition on photorealistic images generated from LiDAR point clouds. In addition to the conventional LiDAR system, a second system that generates photorealistic images from LiDAR point clouds would run simultaneously for visual object recognition in real-time. In this way, we might preserve the supremacy of LiDAR and benefit from using photo-realistic images for visual object recognition without the usage of any camera. In addition, this approach could be used to colorize point clouds without the usage of any camera images.

Submitted 20 December, 2021; originally announced December 2021.

Comments: 11 pages, 4 figures

arXiv:2008.13525 [pdf]

Applying Deep Learning to Specific Learning Disorder Screening

Authors: Nuriel S. Mor, Kathryn L. Dardeck

Abstract: Early detection is key for treating those diagnosed with specific learning disorder, which includes problems with spelling, grammar, punctuation, clarity and organization of written expression. Intervening early can prevent potential negative consequences from this disorder. Deep convolutional neural networks (CNNs) perform better than human beings in many visual tasks such as making a medical diagnosis from visual data. The purpose of this study was to evaluate the ability of a deep CNN to detect students with a diagnosis of specific learning disorder from their handwriting. The MobileNetV2 deep CNN architecture was used by applying transfer learning. The model was trained using a data set of 497 images of handwriting samples from students with a diagnosis of specific learning disorder, as well as those without this diagnosis. The detection of a specific learning disorder yielded on the validation set a mean area under the receiver operating characteristics curve of 0.89. This is a novel attempt to detect students with the diagnosis of specific learning disorder using deep learning. Such a system as was built for this study, may potentially provide fast initial screening of students who may meet the criteria for a diagnosis of specific learning disorder.

Submitted 22 August, 2020; originally announced August 2020.

Comments: 17 pages, 3 figures, 1 table

arXiv:1903.09589 [pdf, other]

A Fog Robotics Approach to Deep Robot Learning: Application to Object Recognition and Grasp Planning in Surface Decluttering

Authors: Ajay Kumar Tanwani, Nitesh Mor, John Kubiatowicz, Joseph E. Gonzalez, Ken Goldberg

Abstract: The growing demand of industrial, automotive and service robots presents a challenge to the centralized Cloud Robotics model in terms of privacy, security, latency, bandwidth, and reliability. In this paper, we present a 'Fog Robotics' approach to deep robot learning that distributes compute, storage and networking resources between the Cloud and the Edge in a federated manner. Deep models are trained on non-private (public) synthetic images in the Cloud; the models are adapted to the private real images of the environment at the Edge within a trusted network and subsequently, deployed as a service for low-latency and secure inference/prediction for other robots in the network. We apply this approach to surface decluttering, where a mobile robot picks and sorts objects from a cluttered floor by learning a deep object recognition and a grasp planning model. Experiments suggest that Fog Robotics can improve performance by sim-to-real domain adaptation in comparison to exclusively using Cloud or Edge resources, while reducing the inference cycle time by 4× to successfully declutter 86% of objects over 213 attempts.

Submitted 22 March, 2019; originally announced March 2019.

Comments: IEEE International Conference on Robotics and Automation, ICRA, 2019

arXiv:1805.11161 [pdf, other]

doi 10.1109/WACV.2018.00030

Confidence Prediction for Lexicon-Free OCR

Authors: Noam Mor, Lior Wolf

Abstract: Having a reliable accuracy score is crucial for real world applications of OCR, since such systems are judged by the number of false readings. Lexicon-based OCR systems, which deal with what is essentially a multi-class classification problem, often employ methods explicitly taking into account the lexicon, in order to improve accuracy. However, in lexicon-free scenarios, filtering errors requires an explicit confidence calculation. In this work we show two explicit confidence measurement techniques, and show that they are able to achieve a significant reduction in misreads on both standard benchmarks and a proprietary dataset.

Submitted 28 May, 2018; originally announced May 2018.

arXiv:1805.07848 [pdf, other]

A Universal Music Translation Network

Authors: Noam Mor, Lior Wolf, Adam Polyak, Yaniv Taigman

Abstract: We present a method for translating music across musical instruments, genres, and styles. This method is based on a multi-domain wavenet autoencoder, with a shared encoder and a disentangled latent space that is trained end-to-end on waveforms. Employing a diverse training dataset and large net capacity, the domain-independent encoder allows us to translate even from musical domains that were not seen during training. The method is unsupervised and does not rely on supervision in the form of matched samples between domains or musical transcriptions. We evaluate our method on NSynth, as well as on a dataset collected from professional musicians, and achieve convincing translations, even when translating from whistling, potentially enabling the creation of instrumental music by untrained humans.

Submitted 23 May, 2018; v1 submitted 20 May, 2018; originally announced May 2018.

arXiv:1803.09337 [pdf, other]

Text Segmentation as a Supervised Learning Task

Authors: Omri Koshorek, Adir Cohen, Noam Mor, Michael Rotman, Jonathan Berant

Abstract: Text segmentation, the task of dividing a document into contiguous segments based on its semantic structure, is a longstanding challenge in language understanding. Previous work on text segmentation focused on unsupervised methods such as clustering or graph search, due to the paucity in labeled data. In this work, we formulate text segmentation as a supervised learning problem, and present a large new dataset for text segmentation that is automatically extracted and labeled from Wikipedia. Moreover, we develop a segmentation model based on this dataset and show that it generalizes well to unseen natural text.

Submitted 25 March, 2018; originally announced March 2018.

Comments: 5 pages, 1 figure, NAACL 2018

Showing 1–7 of 7 results for author: Mor, N