Search | arXiv e-print repository

Finding A Taxi with Illegal Driver Substitution Activity via Behavior Modelings

Authors: Junbiao Pang, Muhammad Ayub Sabir, Zhuyun Wang, An**g Hu, Xue Yang, Haitao Yu, Qingming Huang

Abstract: In our urban life, Illegal Driver Substitution (IDS) activity for a taxi is a grave unlawful activity in the taxi industry, possibly causing severe traffic accidents and painful social repercussions. Currently, the IDS activity is manually supervised by law enforcers, i.e., law enforcers empirically choose a taxi and inspect it. The pressing problem of this scheme is the dilemma between the limite… ▽ More In our urban life, Illegal Driver Substitution (IDS) activity for a taxi is a grave unlawful activity in the taxi industry, possibly causing severe traffic accidents and painful social repercussions. Currently, the IDS activity is manually supervised by law enforcers, i.e., law enforcers empirically choose a taxi and inspect it. The pressing problem of this scheme is the dilemma between the limited number of law-enforcers and the large volume of taxis. In this paper, motivated by this problem, we propose a computational method that helps law enforcers efficiently find the taxis which tend to have the IDS activity. Firstly, our method converts the identification of the IDS activity to a supervised learning task. Secondly, two kinds of taxi driver behaviors, i.e., the Slee** Time and Location (STL) behavior and the Pick-Up (PU) behavior are proposed. Thirdly, the multiple scale pooling on self-similarity is proposed to encode the individual behaviors into the universal features for all taxis. Finally, a Multiple Component- Multiple Instance Learning (MC-MIL) method is proposed to handle the deficiency of the behavior features and to align the behavior features simultaneously. Extensive experiments on a real-world data set shows that the proposed behavior features have a good generalization ability across different classifiers, and the proposed MC-MIL method suppresses the baseline methods. △ Less

Submitted 17 April, 2024; originally announced April 2024.

arXiv:2403.05499 [pdf, other]

Enabling Developers, Protecting Users: Investigating Harassment and Safety in VR

Authors: Abhinaya S. B., Aafaq Sabir, Anupam Das

Abstract: Virtual Reality (VR) has witnessed a rising issue of harassment, prompting the integration of safety controls like muting and blocking in VR applications. However, the lack of standardized safety measures across VR applications hinders their universal effectiveness, especially across contexts like socializing, gaming, and streaming. While prior research has studied safety controls in social VR app… ▽ More Virtual Reality (VR) has witnessed a rising issue of harassment, prompting the integration of safety controls like muting and blocking in VR applications. However, the lack of standardized safety measures across VR applications hinders their universal effectiveness, especially across contexts like socializing, gaming, and streaming. While prior research has studied safety controls in social VR applications, our user study (n = 27) takes a multi-perspective approach, examining both users' perceptions of safety control usability and effectiveness as well as the challenges that developers face in designing and deploying VR safety controls. We identify challenges VR users face while employing safety controls, such as finding users in crowded virtual spaces to block them. VR users also find controls ineffective in addressing harassment; for instance, they fail to eliminate the harassers' presence from the environment. Further, VR users find the current methods of submitting evidence for reports time-consuming and cumbersome. Improvements desired by users include live moderation and behavior tracking across VR apps; however, developers cite technological, financial, and legal obstacles to implementing such solutions, often due to a lack of awareness and high development costs. We emphasize the importance of establishing technical and legal guidelines to enhance user safety in virtual environments. △ Less

Submitted 8 March, 2024; originally announced March 2024.

Comments: Accepted at USENIX Security 2024

arXiv:2310.19130 [pdf, other]

Women Wearing Lipstick: Measuring the Bias Between an Object and Its Related Gender

Authors: Ahmed Sabir, Lluís Padró

Abstract: In this paper, we investigate the impact of objects on gender bias in image captioning systems. Our results show that only gender-specific objects have a strong gender bias (e.g., women-lipstick). In addition, we propose a visual semantic-based gender score that measures the degree of bias and can be used as a plug-in for any image captioning system. Our experiments demonstrate the utility of the… ▽ More In this paper, we investigate the impact of objects on gender bias in image captioning systems. Our results show that only gender-specific objects have a strong gender bias (e.g., women-lipstick). In addition, we propose a visual semantic-based gender score that measures the degree of bias and can be used as a plug-in for any image captioning system. Our experiments demonstrate the utility of the gender score, since we observe that our score can measure the bias relation between a caption and its related gender; therefore, our score can be used as an additional metric to the existing Object Gender Co-Occ approach. Code and data are publicly available at \url{https://github.com/ahmedssabir/GenderScore}. △ Less

Submitted 20 November, 2023; v1 submitted 29 October, 2023; originally announced October 2023.

Comments: EMNLP Findings 2023

arXiv:2302.07153 [pdf]

A Novel Poisoned Water Detection Method Using Smartphone Embedded Wi-Fi Technology and Machine Learning Algorithms

Authors: Halgurd S. Maghdid, Sheerko R. Hma Salah, Akar T. Hawre, Hassan M. Bayram, Azhin T. Sabir, Kosrat N. Kaka, Salam Ghafour Taher, Ladeh S. Abdulrahman, Abdulbasit K. Al-Talabani, Safar M. Asaad, Aras Asaad

Abstract: Water is a necessary fluid to the human body and automatic checking of its quality and cleanness is an ongoing area of research. One such approach is to present the liquid to various types of signals and make the amount of signal attenuation an indication of the liquid category. In this article, we have utilized the Wi-Fi signal to distinguish clean water from poisoned water via training differen… ▽ More Water is a necessary fluid to the human body and automatic checking of its quality and cleanness is an ongoing area of research. One such approach is to present the liquid to various types of signals and make the amount of signal attenuation an indication of the liquid category. In this article, we have utilized the Wi-Fi signal to distinguish clean water from poisoned water via training different machine learning algorithms. The Wi-Fi access points (WAPs) signal is acquired via equivalent smartphone-embedded Wi-Fi chipsets, and then Channel-State-Information CSI measures are extracted and converted into feature vectors to be used as input for machine learning classification algorithms. The measured amplitude and phase of the CSI data are selected as input features into four classifiers k-NN, SVM, LSTM, and Ensemble. The experimental results show that the model is adequate to differentiate poison water from clean water with a classification accuracy of 89% when LSTM is applied, while 92% classification accuracy is achieved when the AdaBoost-Ensemble classifier is applied. △ Less

Submitted 13 February, 2023; originally announced February 2023.

arXiv:2301.08784 [pdf, other]

Visual Semantic Relatedness Dataset for Image Captioning

Authors: Ahmed Sabir, Francesc Moreno-Noguer, Lluís Padró

Abstract: Modern image captioning system relies heavily on extracting knowledge from images to capture the concept of a static story. In this paper, we propose a textual visual context dataset for captioning, in which the publicly available dataset COCO Captions (Lin et al., 2014) has been extended with information about the scene (such as objects in the image). Since this information has a textual form, it… ▽ More Modern image captioning system relies heavily on extracting knowledge from images to capture the concept of a static story. In this paper, we propose a textual visual context dataset for captioning, in which the publicly available dataset COCO Captions (Lin et al., 2014) has been extended with information about the scene (such as objects in the image). Since this information has a textual form, it can be used to leverage any NLP task, such as text similarity or semantic relation methods, into captioning systems, either as an end-to-end training strategy or a post-processing based approach. △ Less

Submitted 30 April, 2023; v1 submitted 20 January, 2023; originally announced January 2023.

Comments: Project Page: bit.ly/project-page-paper

arXiv:2209.12817 [pdf, other]

Word to Sentence Visual Semantic Similarity for Caption Generation: Lessons Learned

Authors: Ahmed Sabir

Abstract: This paper focuses on enhancing the captions generated by image-caption generation systems. We propose an approach for improving caption generation systems by choosing the most closely related output to the image rather than the most likely output produced by the model. Our model revises the language generation output beam search from a visual context perspective. We employ a visual semantic measu… ▽ More This paper focuses on enhancing the captions generated by image-caption generation systems. We propose an approach for improving caption generation systems by choosing the most closely related output to the image rather than the most likely output produced by the model. Our model revises the language generation output beam search from a visual context perspective. We employ a visual semantic measure in a word and sentence level manner to match the proper caption to the related information in the image. The proposed approach can be applied to any caption system as a post-processing based method. △ Less

Submitted 6 July, 2023; v1 submitted 26 September, 2022; originally announced September 2022.

Comments: project page: https://bit.ly/project_page_MVA2023

arXiv:2209.08163 [pdf, other]

Belief Revision based Caption Re-ranker with Visual Semantic Information

Authors: Ahmed Sabir, Francesc Moreno-Noguer, Pranava Madhyastha, Lluís Padró

Abstract: In this work, we focus on improving the captions generated by image-caption generation systems. We propose a novel re-ranking approach that leverages visual-semantic measures to identify the ideal caption that maximally captures the visual information in the image. Our re-ranker utilizes the Belief Revision framework (Blok et al., 2003) to calibrate the original likelihood of the top-n captions by… ▽ More In this work, we focus on improving the captions generated by image-caption generation systems. We propose a novel re-ranking approach that leverages visual-semantic measures to identify the ideal caption that maximally captures the visual information in the image. Our re-ranker utilizes the Belief Revision framework (Blok et al., 2003) to calibrate the original likelihood of the top-n captions by explicitly exploiting the semantic relatedness between the depicted caption and the visual context. Our experiments demonstrate the utility of our approach, where we observe that our re-ranker can enhance the performance of a typical image-captioning system without the necessity of any additional training or fine-tuning. △ Less

Submitted 16 September, 2022; originally announced September 2022.

Comments: COLING 2022

arXiv:2004.10349 [pdf, other]

Textual Visual Semantic Dataset for Text Spotting

Authors: Ahmed Sabir, Francesc Moreno-Noguer, Lluís Padró

Abstract: Text Spotting in the wild consists of detecting and recognizing text appearing in images (e.g. signboards, traffic signals or brands in clothing or objects). This is a challenging problem due to the complexity of the context where texts appear (uneven backgrounds, shading, occlusions, perspective distortions, etc.). Only a few approaches try to exploit the relation between text and its surrounding… ▽ More Text Spotting in the wild consists of detecting and recognizing text appearing in images (e.g. signboards, traffic signals or brands in clothing or objects). This is a challenging problem due to the complexity of the context where texts appear (uneven backgrounds, shading, occlusions, perspective distortions, etc.). Only a few approaches try to exploit the relation between text and its surrounding environment to better recognize text in the scene. In this paper, we propose a visual context dataset for Text Spotting in the wild, where the publicly available dataset COCO-text [Veit et al. 2016] has been extended with information about the scene (such as objects and places appearing in the image) to enable researchers to include semantic relations between texts and scene in their Text Spotting systems, and to offer a common framework for such approaches. For each text in an image, we extract three kinds of context information: objects in the scene, image location label and a textual image description (caption). We use state-of-the-art out-of-the-box available tools to extract this additional information. Since this information has textual form, it can be used to leverage text similarity or semantic relation methods into Text Spotting systems, either as a post-processing or in an end-to-end training strategy. Our data is publicly available at https://git.io/JeZTb. △ Less

Submitted 21 April, 2020; originally announced April 2020.

arXiv:1909.07950 [pdf, other]

Semantic Relatedness Based Re-ranker for Text Spotting

Authors: Ahmed Sabir, Francesc Moreno-Noguer, Lluís Padró

Abstract: Applications such as textual entailment, plagiarism detection or document clustering rely on the notion of semantic similarity, and are usually approached with dimension reduction techniques like LDA or with embedding-based neural approaches. We present a scenario where semantic similarity is not enough, and we devise a neural approach to learn semantic relatedness. The scenario is text spotting i… ▽ More Applications such as textual entailment, plagiarism detection or document clustering rely on the notion of semantic similarity, and are usually approached with dimension reduction techniques like LDA or with embedding-based neural approaches. We present a scenario where semantic similarity is not enough, and we devise a neural approach to learn semantic relatedness. The scenario is text spotting in the wild, where a text in an image (e.g. street sign, advertisement or bus destination) must be identified and recognized. Our goal is to improve the performance of vision systems by leveraging semantic information. Our rationale is that the text to be spotted is often related to the image context in which it appears (word pairs such as Delta-airplane, or quarters-parking are not similar, but are clearly related). We show how learning a word-to-word or word-to-sentence relatedness score can improve the performance of text spotting systems up to 2.9 points, outperforming other measures in a benchmark dataset. △ Less

Submitted 19 September, 2019; v1 submitted 17 September, 2019; originally announced September 2019.

Comments: Accepted by EMNLP 2019

arXiv:1810.12738 [pdf, other]

Visual Re-ranking with Natural Language Understanding for Text Spotting

Authors: Ahmed Sabir, Francesc Moreno-Noguer, Lluís Padró

Abstract: Many scene text recognition approaches are based on purely visual information and ignore the semantic relation between scene and text. In this paper, we tackle this problem from natural language processing perspective to fill the gap between language and vision. We propose a post-processing approach to improve scene text recognition accuracy by using occurrence probabilities of words (unigram lang… ▽ More Many scene text recognition approaches are based on purely visual information and ignore the semantic relation between scene and text. In this paper, we tackle this problem from natural language processing perspective to fill the gap between language and vision. We propose a post-processing approach to improve scene text recognition accuracy by using occurrence probabilities of words (unigram language model), and the semantic correlation between scene and text. For this, we initially rely on an off-the-shelf deep neural network, already trained with a large amount of data, which provides a series of text hypotheses per input image. These hypotheses are then re-ranked using word frequencies and semantic relatedness with objects or scenes in the image. As a result of this combination, the performance of the original network is boosted with almost no additional cost. We validate our approach on ICDAR'17 dataset. △ Less

Submitted 29 October, 2018; originally announced October 2018.

Comments: Accepted by ACCV 2018. arXiv admin note: substantial text overlap with arXiv:1810.09776

arXiv:1810.09776 [pdf, other]

Visual Semantic Re-ranker for Text Spotting

Authors: Ahmed Sabir, Francesc Moreno-Noguer, Lluís Padró

Abstract: Many current state-of-the-art methods for text recognition are based on purely local information and ignore the semantic correlation between text and its surrounding visual context. In this paper, we propose a post-processing approach to improve the accuracy of text spotting by using the semantic relation between the text and the scene. We initially rely on an off-the-shelf deep neural network tha… ▽ More Many current state-of-the-art methods for text recognition are based on purely local information and ignore the semantic correlation between text and its surrounding visual context. In this paper, we propose a post-processing approach to improve the accuracy of text spotting by using the semantic relation between the text and the scene. We initially rely on an off-the-shelf deep neural network that provides a series of text hypotheses for each input image. These text hypotheses are then re-ranked using the semantic relatedness with the object in the image. As a result of this combination, the performance of the original network is boosted with a very low computational cost. The proposed framework can be used as a drop-in complement for any text-spotting algorithm that outputs a ranking of word hypotheses. We validate our approach on ICDAR'17 shared task dataset. △ Less

Submitted 27 October, 2018; v1 submitted 23 October, 2018; originally announced October 2018.

Showing 1–11 of 11 results for author: Sabir, A