Privacy-Aware Visual Language Models

Authors: Laurens Samson, Nimrod Barazani, Sennay Ghebreab, Yuki M. Asano

Abstract: This paper aims to advance our understanding of how Visual Language Models (VLMs) handle privacy-sensitive information, a crucial concern as these technologies become integral to everyday life. To this end, we introduce a new benchmark PrivBench, which contains images from 8 sensitive categories such as passports or fingerprints. We evaluate 10 state-of-the-art VLMs on this benchmark and observe a generally limited understanding of privacy, highlighting a significant area for model improvement. Based on this, we introduce PrivTune, a new instruction-tuning dataset aimed at equipping VLMs with knowledge about visual privacy. By tuning two pretrained VLMs, TinyLLaVa and MiniGPT-v2, on this small dataset, we achieve strong gains in their ability to recognize sensitive content, outperforming even GPT4-V. At the same time, we show that privacy-tuning only minimally affects the VLMs' performance on standard benchmarks such as VQA. Overall, this paper lays out a crucial challenge for making VLMs effective in handling real-world data safely and provides a simple recipe that takes the first step towards building privacy-aware VLMs.

Submitted 27 May, 2024; originally announced May 2024.

Comments: preprint

arXiv:2109.07180 [pdf, other]

Back to Basics: Deep Reinforcement Learning in Traffic Signal Control

Authors: Sierk Kanis, Laurens Samson, Daan Bloembergen, Tim Bakker

Abstract: In this paper we revisit some of the fundamental premises for a reinforcement learning (RL) approach to self-learning traffic lights. We propose RLight, a combination of choices that offers robust performance and good generalization to unseen traffic flows. In particular, our main contributions are threefold: our lightweight and cluster-aware state representation leads to improved performance; we reformulate the Markov Decision Process (MDP) such that it skips redundant timesteps of yellow light, speeding up learning by 30%; and we investigate the action space and provide insight into the difference in performance between acyclic and cyclic phase transitions. Additionally, we provide insights into the generalisation of the methods to unseen traffic. Evaluations using the real-world Hangzhou traffic dataset show that RLight outperforms state-of-the-art rule-based and deep reinforcement learning algorithms, demonstrating the potential of RL-based methods to improve urban traffic flows.

Submitted 21 November, 2021; v1 submitted 15 September, 2021; originally announced September 2021.

Comments: 9 pages, 4 figures; minor textual improvements w.r.t. v1. Presented at the 10th Intl. Workshop on Urban Computing at ACM SIGSPATIAL 2021. Code for this paper is available at https://github.com/Amsterdam-Internships/Self-Learning-Traffic-Lights

ACM Class: I.2.6

arXiv:1912.01728 [pdf, other]

Fast Intent Classification for Spoken Language Understanding

Authors: Akshit Tyagi, Varun Sharma, Rahul Gupta, Lynn Samson, Nan Zhuang, Zihang Wang, Bill Campbell

Abstract: Spoken Language Understanding (SLU) systems consist of several machine learning components operating together (e.g. intent classification, named entity recognition and resolution). Deep learning models have obtained state-of-the-art results on several of these tasks, largely attributed to their better modeling capacity. However, an increase in modeling capacity comes with added costs of higher latency and energy usage, particularly when operating on low-complexity devices. To address the latency and computational complexity issues, we explore a BranchyNet scheme on an intent classification scheme within SLU systems. The BranchyNet scheme, when applied to a high-complexity model, adds exit points at various stages in the model, allowing early decision making for a set of queries to the SLU model. We conduct experiments on the Facebook Semantic Parsing dataset with two candidate model architectures for intent classification. Our experiments show that the BranchyNet scheme provides gains in terms of computational complexity without compromising model accuracy. We also conduct analytical studies regarding the improvements in the computational cost, distribution of utterances that egress from various exit points and the impact of adding more complexity to models with the BranchyNet scheme.

Submitted 14 February, 2020; v1 submitted 3 December, 2019; originally announced December 2019.

Comments: Accepted as a conference paper at ICASSP 20

arXiv:1908.02711 [pdf, other]

I Bet You Are Wrong: Gambling Adversarial Networks for Structured Semantic Segmentation

Authors: Laurens Samson, Nanne van Noord, Olaf Booij, Michael Hofmann, Efstratios Gavves, Mohsen Ghafoorian

Abstract: Adversarial training has been recently employed for realizing structured semantic segmentation, in which the aim is to preserve higher-level scene structural consistencies in dense predictions. However, as we show, value-based discrimination between the predictions from the segmentation network and ground-truth annotations can hinder the training process from learning to improve structural qualities as well as disabling the network from properly expressing uncertainties. In this paper, we rethink adversarial training for semantic segmentation and propose to formulate the fake/real discrimination framework with a correct/incorrect training objective. More specifically, we replace the discriminator with a "gambler" network that learns to spot and distribute its budget in areas where the predictions are clearly wrong, while the segmenter network tries to leave no clear clues for the gambler where to bet. Empirical evaluation on two road-scene semantic segmentation tasks shows that not only does the proposed method re-enable expressing uncertainties, it also improves pixel-wise and structure-based metrics.

Submitted 7 August, 2019; originally announced August 2019.

Comments: 13 pages, 8 figures

arXiv:1209.4218 [pdf, ps, other]

An Energy-Efficient Power Allocation Game with Selfish Channel State Reporting in Cellular Networks

Authors: Mériaux François, Valentin Stefan, Lasaulce Samson, Kieffer Michel

Abstract: With energy-efficient resource allocation, mobile users and base station have different objectives. While the base station strives for an energy-efficient operation of the complete cell, each user aims to maximize its own data rate. To obtain this individual benefit, users may selfishly adjust their Channel State Information (CSI) reports, reducing the cell's energy efficiency. To analyze this conflict of interest, we formalize energy-efficient power allocation as a utility maximization problem and present a simple algorithm that performs close to the optimum. By formulating selfish CSI reporting as a game, we prove the existence of a unique equilibrium and characterize energy efficiency with true and selfish CSI in closed form. Our numerical results show that, surprisingly, energy-efficient power allocation in small cells is more robust against selfish CSI than cells with large transmit powers. This and further design rules show that our paper provides valuable theoretical insight to energy-efficient networks when CSI reports cannot be trusted.

Submitted 19 September, 2012; originally announced September 2012.

Comments: In Proceedings of the 6th International Conference on Performance Evaluation Methodologies and Tools (Valuetools), Oct. 2012, Cargèse, France

Showing 1–5 of 5 results for author: Samson, L