Privacy-Aware Visual Language Models

Authors: Laurens Samson, Nimrod Barazani, Sennay Ghebreab, Yuki M. Asano

Abstract: This paper aims to advance our understanding of how Visual Language Models (VLMs) handle privacy-sensitive information, a crucial concern as these technologies become integral to everyday life. To this end, we introduce a new benchmark PrivBench, which contains images from 8 sensitive categories such as passports, or fingerprints. We evaluate 10 state-of-the-art VLMs on this benchmark and observe a generally limited understanding of privacy, highlighting a significant area for model improvement. Based on this we introduce PrivTune, a new instruction-tuning dataset aimed at equipping VLMs with knowledge about visual privacy. By tuning two pretrained VLMs, TinyLLaVa and MiniGPT-v2, on this small dataset, we achieve strong gains in their ability to recognize sensitive content, outperforming even GPT4-V. At the same time, we show that privacy-tuning only minimally affects the VLMs performance on standard benchmarks such as VQA. Overall, this paper lays out a crucial challenge for making VLMs effective in handling real-world data safely and provides a simple recipe that takes the first step towards building privacy-aware VLMs.

Submitted 27 May, 2024; originally announced May 2024.

Comments: preprint

arXiv:2109.07180 [pdf, other]

Back to Basics: Deep Reinforcement Learning in Traffic Signal Control

Authors: Sierk Kanis, Laurens Samson, Daan Bloembergen, Tim Bakker

Abstract: In this paper we revisit some of the fundamental premises for a reinforcement learning (RL) approach to self-learning traffic lights. We propose RLight, a combination of choices that offers robust performance and good generalization to unseen traffic flows. In particular, our main contributions are threefold: our lightweight and cluster-aware state representation leads to improved performance; we reformulate the Markov Decision Process (MDP) such that it skips redundant timesteps of yellow light, speeding up learning by 30%; and we investigate the action space and provide insight into the difference in performance between acyclic and cyclic phase transitions. Additionally, we provide insights into the generalisation of the methods to unseen traffic. Evaluations using the real-world Hangzhou traffic dataset show that RLight outperforms state-of-the-art rule-based and deep reinforcement learning algorithms, demonstrating the potential of RL-based methods to improve urban traffic flows.

Submitted 21 November, 2021; v1 submitted 15 September, 2021; originally announced September 2021.

Comments: 9 pages, 4 figures; minor textual improvements w.r.t. v1. Presented at the 10th Intl. Workshop on Urban Computing at ACM SIGSPATIAL 2021. Code for this paper is available at https://github.com/Amsterdam-Internships/Self-Learning-Traffic-Lights

ACM Class: I.2.6

arXiv:1912.01728 [pdf, other]

Fast Intent Classification for Spoken Language Understanding

Authors: Akshit Tyagi, Varun Sharma, Rahul Gupta, Lynn Samson, Nan Zhuang, Zihang Wang, Bill Campbell

Abstract: Spoken Language Understanding (SLU) systems consist of several machine learning components operating together (e.g. intent classification, named entity recognition and resolution). Deep learning models have obtained state of the art results on several of these tasks, largely attributed to their better modeling capacity. However, an increase in modeling capacity comes with added costs of higher latency and energy usage, particularly when operating on low complexity devices. To address the latency and computational complexity issues, we explore a BranchyNet scheme on an intent classification scheme within SLU systems. The BranchyNet scheme when applied to a high complexity model, adds exit points at various stages in the model allowing early decision making for a set of queries to the SLU model. We conduct experiments on the Facebook Semantic Parsing dataset with two candidate model architectures for intent classification. Our experiments show that the BranchyNet scheme provides gains in terms of computational complexity without compromising model accuracy. We also conduct analytical studies regarding the improvements in the computational cost, distribution of utterances that egress from various exit points and the impact of adding more complexity to models with the BranchyNet scheme.

Submitted 14 February, 2020; v1 submitted 3 December, 2019; originally announced December 2019.

Comments: Accepted as a conference paper at ICASSP 20

arXiv:1908.02711 [pdf, other]

I Bet You Are Wrong: Gambling Adversarial Networks for Structured Semantic Segmentation

Authors: Laurens Samson, Nanne van Noord, Olaf Booij, Michael Hofmann, Efstratios Gavves, Mohsen Ghafoorian

Abstract: Adversarial training has been recently employed for realizing structured semantic segmentation, in which the aim is to preserve higher-level scene structural consistencies in dense predictions. However, as we show, value-based discrimination between the predictions from the segmentation network and ground-truth annotations can hinder the training process from learning to improve structural qualities as well as disabling the network from properly expressing uncertainties. In this paper, we rethink adversarial training for semantic segmentation and propose to formulate the fake/real discrimination framework with a correct/incorrect training objective. More specifically, we replace the discriminator with a "gambler" network that learns to spot and distribute its budget in areas where the predictions are clearly wrong, while the segmenter network tries to leave no clear clues for the gambler where to bet. Empirical evaluation on two road-scene semantic segmentation tasks shows that not only does the proposed method re-enable expressing uncertainties, it also improves pixel-wise and structure-based metrics.

Submitted 7 August, 2019; originally announced August 2019.

Comments: 13 pages, 8 figures

arXiv:1209.4218 [pdf, ps, other]

An Energy-Efficient Power Allocation Game with Selfish Channel State Reporting in Cellular Networks

Authors: Mériaux François, Valentin Stefan, Lasaulce Samson, Kieffer Michel

Abstract: With energy-efficient resource allocation, mobile users and base station have different objectives. While the base station strives for an energy-efficient operation of the complete cell, each user aims to maximize its own data rate. To obtain this individual benefit, users may selfishly adjust their Channel State Information (CSI) reports, reducing the cell's energy efficiency. To analyze this conflict of interest, we formalize energy-efficient power allocation as a utility maximization problem and present a simple algorithm that performs close to the optimum. By formulating selfish CSI reporting as a game, we prove the existence of an unique equilibrium and characterize energy efficiency with true and selfish CSI in closed form. Our numerical results show that, surprisingly, energy-efficient power allocation in small cells is more robust against selfish CSI than cells with large transmit powers. This and further design rules show that our paper provides valuable theoretical insight to energy-efficient networks when CSI reports cannot be trusted.

Submitted 19 September, 2012; originally announced September 2012.

Comments: In Proceedings of the 6th International Conference on Performance Evaluation Methodologies and Tools (Valuetools), Oct. 2012, Cargèse, France

arXiv:0912.4288 [pdf, ps, other]

doi 10.1063/1.3355544

Phase-change chalcogenide glass metamaterial

Authors: Z. L. Samson, K. F. MacDonald, F. De Angelis, K. Knight, C. C. Huang, E. Di Fabrizio, D. W. Hewak, N. I. Zheludev

Abstract: Combining metamaterials with functional media brings a new dimension to their performance. Here we demonstrate substantial resonance frequency tuning in a photonic metamaterial hybridized with an electrically/optically switchable chalcogenide glass. The transition between amorphous and crystalline forms brings about a 10% shift in the near-infrared resonance wavelength of an asymmetric split-ring array, providing transmission modulation functionality with a contrast ratio of 4:1 in a device of sub-wavelength thickness.

Submitted 21 December, 2009; originally announced December 2009.

Comments: 3 pages, 3 figures

Journal ref: Appl. Phys. Lett. 96, 143105 (2010)

arXiv:0807.2542 [pdf]

doi 10.1038/NPHOTON.2008.249

Ultrafast active plasmonics: transmission and control of femtosecond plasmon signals

Authors: K. F. MacDonald, Z. L. Samson, M. I. Stockman, N. I. Zheludev

Abstract: We report that femtosecond surface plasmon polariton pulses can propagate along a metal-dielectric waveguide and that they can be modulated on the femtosecond timescale by direct ultrafast optical excitation of the metal, thereby offering unprecedented terahertz plasmonic bandwidth - a key missing component in the development of surface plasmons as information carriers for next generation nanophotonic devices.

Submitted 16 July, 2008; originally announced July 2008.

Comments: 4 pages (inc. 3 figures)

Journal ref: Nat. Photon. 3, 55 (2009)

arXiv:cond-mat/9512030 [pdf, ps, other]

doi 10.1103/PhysRevE.53.6496

Stochastic Model for the Motion of a Particle on an Inclined Rough Plane and the Onset of Viscous Friction

Authors: G. G. Batrouni, S. Dippel, L. Samson

Abstract: Experiments on the motion of a particle on an inclined rough plane have yielded some surprising results. For example, it was found that the frictional force acting on the ball is viscous, i.e. proportional to the velocity rather than the expected square of the velocity. It was also found that, for a given inclination of the plane, the velocity of the ball scales as a power of its radius. We present here a one dimensional stochastic model based on the microscopic equations of motion of the ball, which exhibits the same behaviour as the experiments. This model yields a mechanism for the origins of the viscous friction force and the scaling of the velocity with the radius. It also reproduces other aspects of the phase diagram of the motion which we will discuss.

Submitted 5 December, 1995; originally announced December 1995.

Comments: 19 pages, latex, 11 postscript figures in separate uuencoded file

Showing 1–8 of 8 results for author: Samson, L