
MELTing point: Mobile Evaluation of Language Transformers

Authors: Stefanos Laskaridis, Kleomenis Katevas, Lorenzo Minto, Hamed Haddadi

Abstract: Transformers have revolutionized the machine learning landscape, gradually making their way into everyday tasks and equipping our computers with "sparks of intelligence". However, their runtime requirements have prevented them from being broadly deployed on mobile. As personal devices become increasingly powerful and prompt privacy becomes an ever more pressing issue, we explore the current state of mobile execution of Large Language Models (LLMs). To achieve this, we have created our own automation infrastructure, MELT, which supports the headless execution and benchmarking of LLMs on device, supporting different models, devices and frameworks, including Android, iOS and Nvidia Jetson devices. We evaluate popular instruction fine-tuned LLMs and leverage different frameworks to measure their end-to-end and granular performance, tracing their memory and energy requirements along the way. Our analysis is the first systematic study of on-device LLM execution, quantifying performance, energy efficiency and accuracy across various state-of-the-art models and showcasing the state of on-device intelligence in the era of hyperscale models. Results highlight the performance heterogeneity across targets and corroborate that LLM inference is largely memory-bound. Quantization drastically reduces memory requirements and renders execution viable, but at a non-negligible accuracy cost. Drawing from its energy footprint and thermal behavior, the continuous execution of LLMs remains elusive, as both factors negatively affect user experience. Last, our experience shows that the ecosystem is still in its infancy, and algorithmic as well as hardware breakthroughs can significantly shift the execution cost. We expect NPU acceleration, and framework-hardware co-design to be the biggest bet towards efficient standalone execution, with the alternative of offloading tailored towards edge deployments.

Submitted 20 March, 2024; v1 submitted 19 March, 2024; originally announced March 2024.

Comments: Under review

arXiv:2302.13438 [pdf, other]

P4L: Privacy Preserving Peer-to-Peer Learning for Infrastructureless Setups

Authors: Ioannis Arapakis, Panagiotis Papadopoulos, Kleomenis Katevas, Diego Perino

Abstract: Distributed (or Federated) learning enables users to train machine learning models on their very own devices, while they share only the gradients of their models usually in a differentially private way (utility loss). Although such a strategy provides better privacy guarantees than the traditional centralized approach, it requires users to blindly trust a centralized infrastructure that may also become a bottleneck with the increasing number of users. In this paper, we design and implement P4L: a privacy preserving peer-to-peer learning system for users to participate in an asynchronous, collaborative learning scheme without requiring any sort of infrastructure or relying on differential privacy. Our design uses strong cryptographic primitives to preserve both the confidentiality and utility of the shared gradients, a set of peer-to-peer mechanisms for fault tolerance and user churn, proximity and cross-device communications. Extensive simulations under different network settings and ML scenarios for three real-life datasets show that P4L provides competitive performance to baselines, while it is resilient to different poisoning attacks. We implement P4L and experimental results show that the performance overhead and power consumption are minimal (less than 3mAh of discharge).

Submitted 26 February, 2023; originally announced February 2023.

arXiv:2206.10963 [pdf, other]

FLaaS: Cross-App On-device Federated Learning in Mobile Environments

Authors: Kleomenis Katevas, Diego Perino, Nicolas Kourtellis

Abstract: Federated Learning (FL) has recently emerged as a popular solution to distributedly train a model on user devices improving user privacy and system scalability. Major Internet companies have deployed FL in their applications for specific use cases (e.g., keyboard prediction or acoustic keyword trigger), and the research community has devoted significant attention to improving different aspects of FL (e.g., accuracy, privacy, efficiency). However, there is still a lack of a practical system to enable easy collaborative cross-silo FL training, in the context of mobile environments. In this work, we bridge this gap and propose FLaME, an end-to-end system (i.e., client-side framework and libraries, and central server) to enable intra- and inter-app training on mobile devices with different types of IID and NonIID data distributions, in a secure and easy to deploy fashion. Our design solves major technical challenges such as on-device training, secure and private single and cross-app model training, while being offered in an "as a service" model. We implement FLaME for Android devices and experimentally evaluate its performance in-lab and in-wild, on more than 140 users for over a month. Our results show the feasibility and benefits of the design in a realistic mobile context and provide several insights to the FL community on the practicality and usage of FL in the wild.

Submitted 16 December, 2022; v1 submitted 22 June, 2022; originally announced June 2022.

Comments: 12 pages, 6 figures, 46 references

MSC Class: 68T05 ACM Class: I.2.11

arXiv:2201.12614 [pdf, other]

BatteryLab: A Collaborative Platform for Power Monitoring

Authors: Matteo Varvello, Kleomenis Katevas, Mihai Plesa, Hamed Haddadi, Fabian Bustamante, Ben Livshits

Abstract: Advances in cloud computing have simplified the way that both software development and testing are performed. This is not true for battery testing for which state of the art test-beds simply consist of one phone attached to a power meter. These test-beds have limited resources, access, and are overall hard to maintain; for these reasons, they often sit idle with no experiment to run. In this paper, we propose to share existing battery testbeds and transform them into vantage points of BatteryLab, a power monitoring platform offering heterogeneous devices and testing conditions. We have achieved this vision with a combination of hardware and software which allows existing battery test-beds to be augmented with remote capabilities. BatteryLab currently counts three vantage points, one in Europe and two in the US, hosting three Android devices and one iPhone 7. We benchmark BatteryLab with respect to the accuracy of its battery readings, system performance, and platform heterogeneity. Next, we demonstrate how measurements can be run atop BatteryLab by developing the "Web Power Monitor" (WPM), a tool which can measure website power consumption at scale. We released WPM and used it to report on the energy consumption of Alexa's top 1,000 websites across 3 locations and 4 devices (both Android and iOS).

Submitted 29 January, 2022; originally announced January 2022.

Comments: 25 pages, 11 figures, Passive and Active Measurement Conference 2022 (PAM '22). arXiv admin note: text overlap with arXiv:1910.08951

arXiv:2110.15097 [pdf, other]

Choosing the Best of Both Worlds: Diverse and Novel Recommendations through Multi-Objective Reinforcement Learning

Authors: Dusan Stamenkovic, Alexandros Karatzoglou, Ioannis Arapakis, Xin Xin, Kleomenis Katevas

Abstract: Since the inception of Recommender Systems (RS), the accuracy of the recommendations in terms of relevance has been the golden criterion for evaluating the quality of RS algorithms. However, by focusing on item relevance, one pays a significant price in terms of other important metrics: users get stuck in a "filter bubble" and their array of options is significantly reduced, hence degrading the quality of the user experience and leading to churn. Recommendation, and in particular session-based/sequential recommendation, is a complex task with multiple - and often conflicting - objectives that existing state-of-the-art approaches fail to address. In this work, we take on the aforementioned challenge and introduce Scalarized Multi-Objective Reinforcement Learning (SMORL) for the RS setting, a novel Reinforcement Learning (RL) framework that can effectively address multi-objective recommendation tasks. The proposed SMORL agent augments standard recommendation models with additional RL layers that enforce it to simultaneously satisfy three principal objectives: accuracy, diversity, and novelty of recommendations. We integrate this framework with four state-of-the-art session-based recommendation models and compare it with a single-objective RL agent that only focuses on accuracy. Our experimental results on two real-world datasets reveal a substantial increase in aggregate diversity, a moderate increase in accuracy, reduced repetitiveness of recommendations, and demonstrate the importance of reinforcing diversity and novelty as complementary objectives.

Submitted 28 October, 2021; originally announced October 2021.

Comments: 9 pages, 4 figures, Proc. ACM WSDM, 2022 In Proceedings of the 15th ACM International Conference on Web Search and Data Mining (WSDM '22), February 21-25, 2022, Phoenix, Arizona

arXiv:2104.14380 [pdf, other]

PPFL: Privacy-preserving Federated Learning with Trusted Execution Environments

Authors: Fan Mo, Hamed Haddadi, Kleomenis Katevas, Eduard Marin, Diego Perino, Nicolas Kourtellis

Abstract: We propose and implement a Privacy-preserving Federated Learning ($PPFL$) framework for mobile systems to limit privacy leakages in federated learning. Leveraging the widespread presence of Trusted Execution Environments (TEEs) in high-end and mobile devices, we utilize TEEs on clients for local training, and on servers for secure aggregation, so that model/gradient updates are hidden from adversaries. Challenged by the limited memory size of current TEEs, we leverage greedy layer-wise training to train each model's layer inside the trusted area until its convergence. The performance evaluation of our implementation shows that $PPFL$ can significantly improve privacy while incurring small system overheads at the client-side. In particular, $PPFL$ can successfully defend the trained model against data reconstruction, property inference, and membership inference attacks. Furthermore, it can achieve comparable model utility with fewer communication rounds (0.54$\times$) and a similar amount of network traffic (1.002$\times$) compared to the standard federated learning of a complete model. This is achieved while only introducing up to ~15% CPU time, ~18% memory usage, and ~21% energy consumption overhead in $PPFL$'s client-side.

Submitted 28 June, 2021; v1 submitted 29 April, 2021; originally announced April 2021.

Comments: 15 pages, 8 figures, accepted to MobiSys 2021

arXiv:2011.09359 [pdf, other]

doi 10.1145/3426745.3431337

FLaaS: Federated Learning as a Service

Authors: Nicolas Kourtellis, Kleomenis Katevas, Diego Perino

Abstract: Federated Learning (FL) is emerging as a promising technology to build machine learning models in a decentralized, privacy-preserving fashion. Indeed, FL enables local training on user devices, avoiding the transfer of user data to centralized servers, and can be enhanced with differential privacy mechanisms. Although FL has been recently deployed in real systems, the possibility of collaborative modeling across different 3rd-party applications has not yet been explored. In this paper, we tackle this problem and present Federated Learning as a Service (FLaaS), a system enabling different scenarios of 3rd-party application collaborative model building and addressing the consequent challenges of permission and privacy management, usability, and hierarchical model training. FLaaS can be deployed in different operational environments. As a proof of concept, we implement it on a mobile phone setting and discuss practical implications of results on simulated and real devices with respect to on-device training CPU cost, memory footprint and power consumed per FL model round. Therefore, we demonstrate FLaaS's feasibility in building unique or joint FL models across applications for image object detection in a few hours, across 100 devices.

Submitted 18 November, 2020; originally announced November 2020.

Comments: 7 pages, 4 figures, 7 subfigures, 34 references

Journal ref: In 1st Workshop on Distributed Machine Learning (DistributedML'20), Dec. 1, 2020, Barcelona, Spain. ACM, New York, NY, USA, 7 pages

arXiv:2004.05703 [pdf, other]

doi 10.1145/3386901.3388946

DarkneTZ: Towards Model Privacy at the Edge using Trusted Execution Environments

Authors: Fan Mo, Ali Shahin Shamsabadi, Kleomenis Katevas, Soteris Demetriou, Ilias Leontiadis, Andrea Cavallaro, Hamed Haddadi

Abstract: We present DarkneTZ, a framework that uses an edge device's Trusted Execution Environment (TEE) in conjunction with model partitioning to limit the attack surface against Deep Neural Networks (DNNs). Increasingly, edge devices (smartphones and consumer IoT devices) are equipped with pre-trained DNNs for a variety of applications. This trend comes with privacy risks as models can leak information about their training data through effective membership inference attacks (MIAs). We evaluate the performance of DarkneTZ, including CPU execution time, memory usage, and accurate power consumption, using two small and six large image classification models. Due to the limited memory of the edge device's TEE, we partition model layers into more sensitive layers (to be executed inside the device TEE), and a set of layers to be executed in the untrusted part of the operating system. Our results show that even if a single layer is hidden, we can provide reliable model privacy and defend against state of the art MIAs, with only 3% performance overhead. When fully utilizing the TEE, DarkneTZ provides model protections with up to 10% overhead.

Submitted 12 April, 2020; originally announced April 2020.

Comments: 13 pages, 8 figures, accepted to ACM MobiSys 2020

arXiv:2003.06612 [pdf, other]

Policy-Based Federated Learning

Authors: Kleomenis Katevas, Eugene Bagdasaryan, Jason Waterman, Mohamad Mounir Safadieh, Eleanor Birrell, Hamed Haddadi, Deborah Estrin

Abstract: In this paper we present PoliFL, a decentralized, edge-based framework that supports heterogeneous privacy policies for federated learning. We evaluate our system on three use cases that train models with sensitive user data collected by mobile phones - predictive text, image classification, and notification engagement prediction - on a Raspberry Pi edge device. We find that PoliFL is able to perform accurate model training and inference within reasonable resource and time budgets while also enforcing heterogeneous privacy policies.

Submitted 18 February, 2021; v1 submitted 14 March, 2020; originally announced March 2020.

arXiv:1910.08951 [pdf, other]

doi 10.1145/3365609.3365852

BatteryLab, A Distributed Power Monitoring Platform For Mobile Devices

Authors: Matteo Varvello, Kleomenis Katevas, Mihai Plesa, Hamed Haddadi, Benjamin Livshits

Abstract: Recent advances in cloud computing have simplified the way that both software development and testing are performed. Unfortunately, this is not true for battery testing for which state of the art test-beds simply consist of one phone attached to a power meter. These test-beds have limited resources, access, and are overall hard to maintain; for these reasons, they often sit idle with no experiment to run. In this paper, we propose to share existing battery testing setups and build BatteryLab, a distributed platform for battery measurements. Our vision is to transform independent battery testing setups into vantage points of a planetary-scale measurement platform offering heterogeneous devices and testing conditions. In the paper, we design and deploy a combination of hardware and software solutions to enable BatteryLab's vision. We then preliminarily evaluate BatteryLab's accuracy of battery reporting, along with some system benchmarking. We also demonstrate how BatteryLab can be used by researchers to investigate a simple research question.

Submitted 20 October, 2019; originally announced October 2019.

Comments: 8 pages, 8 figures, HotNets 2019 paper

Journal ref: HotNets 2019

arXiv:1907.06034 [pdf, other]

Towards Characterizing and Limiting Information Exposure in DNN Layers

Authors: Fan Mo, Ali Shahin Shamsabadi, Kleomenis Katevas, Andrea Cavallaro, Hamed Haddadi

Abstract: Pre-trained Deep Neural Network (DNN) models are increasingly used in smartphones and other user devices to enable prediction services, leading to potential disclosures of (sensitive) information from training data captured inside these models. Based on the concept of generalization error, we propose a framework to measure the amount of sensitive information memorized in each layer of a DNN. Our results show that, when considered individually, the last layers encode a larger amount of information from the training data compared to the first layers. We find that, while the neurons of convolutional layers can expose more (sensitive) information than those of fully connected layers, the same DNN architecture trained with different datasets has similar exposure per layer. We evaluate an architecture to protect the most sensitive layers within the memory limits of a Trusted Execution Environment (TEE) against potential white-box membership inference attacks without significant computational overhead.

Submitted 13 July, 2019; originally announced July 2019.

Comments: 5 pages, 6 figures, CCS PPML workshop

arXiv:1809.00947 [pdf, other]

Finding Dory in the Crowd: Detecting Social Interactions using Multi-Modal Mobile Sensing

Authors: Kleomenis Katevas, Katrin Hänsel, Richard Clegg, Ilias Leontiadis, Hamed Haddadi, Laurissa Tokarchuk

Abstract: Remembering our day-to-day social interactions is challenging even if you aren't a blue memory challenged fish. The ability to automatically detect and remember these types of interactions is not only beneficial for individuals interested in their behavior in crowded situations, but also of interest to those who analyze crowd behavior. Currently, detecting social interactions is often performed using a variety of methods including ethnographic studies, computer vision techniques and manual annotation-based data analysis. However, mobile phones offer easier means for data collection that is easy to analyze and can preserve the user's privacy. In this work, we present a system for detecting stationary social interactions inside crowds, leveraging multi-modal mobile sensing data such as Bluetooth Smart (BLE), accelerometer and gyroscope. To inform the development of such a system, we conducted a study with 24 participants, where we asked them to socialize with each other for 45 minutes. We built a machine learning system based on gradient-boosted trees that predicts both 1:1 and group interactions with 77.8% precision and 86.5% recall, a 30.2% performance increase compared to a proximity-based approach. By utilizing a community detection-based method, we further detected the various group formations that exist within the crowd. Using mobile phone sensors already carried by the majority of people in a crowd makes our approach particularly well suited to real-life analysis of crowd behavior and influence strategies.

Submitted 16 November, 2018; v1 submitted 30 August, 2018; originally announced September 2018.

Comments: 21 pages, 6 figures, conference paper

arXiv:1807.02472 [pdf, other]

doi 10.1145/3229434.3229441

Typical Phone Use Habits: Intense Use Does Not Predict Negative Well-Being

Authors: Kleomenis Katevas, Ioannis Arapakis, Martin Pielot

Abstract: Not all smartphone owners use their device in the same way. In this work, we uncover broad, latent patterns of mobile phone use behavior. We conducted a study where, via a dedicated logging app, we collected daily mobile phone activity data from a sample of 340 participants for a period of four weeks. Through an unsupervised learning approach and a methodologically rigorous analysis, we reveal five generic phone use profiles which describe at least 10% of the participants each: limited use, business use, power use, and personality- & externally induced problematic use. We provide evidence that intense mobile phone use alone does not predict negative well-being. Instead, our approach automatically revealed two groups with tendencies for lower well-being, which are characterized by nightly phone use sessions.

Submitted 6 July, 2018; originally announced July 2018.

Comments: 10 pages, 6 figures, conference paper

ACM Class: H.5.m

arXiv:1802.03151 [pdf, other]

Deep Private-Feature Extraction

Authors: Seyed Ali Osia, Ali Taheri, Ali Shahin Shamsabadi, Kleomenis Katevas, Hamed Haddadi, Hamid R. Rabiee

Abstract: We present and evaluate Deep Private-Feature Extractor (DPFE), a deep model which is trained and evaluated based on information theoretic constraints. Using the selective exchange of information between a user's device and a service provider, DPFE enables the user to prevent certain sensitive information from being shared with a service provider, while allowing them to extract approved information using their model. We introduce and utilize the log-rank privacy, a novel measure to assess the effectiveness of DPFE in removing sensitive information and compare different models based on their accuracy-privacy tradeoff. We then implement and evaluate the performance of DPFE on smartphones to understand its complexity, resource demands, and efficiency tradeoffs. Our results on benchmark image datasets demonstrate that under moderate resource utilization, DPFE can achieve high accuracy for primary tasks while preserving the privacy of sensitive features.

Submitted 28 February, 2018; v1 submitted 9 February, 2018; originally announced February 2018.

arXiv:1712.07120 [pdf, other]

Continual Prediction of Notification Attendance with Classical and Deep Network Approaches

Authors: Kleomenis Katevas, Ilias Leontiadis, Martin Pielot, Joan Serrà

Abstract: We investigate to what extent mobile use patterns can predict -- at the moment it is posted -- whether a notification will be clicked within the next 10 minutes. We use a data set containing the detailed mobile phone usage logs of 279 users, who over the course of 5 weeks received 446,268 notifications from a variety of apps. Besides using classical gradient-boosted trees, we demonstrate how to make continual predictions using a recurrent neural network (RNN). The two approaches achieve a similar AUC of ca. 0.7 on unseen users, with a possible operation point of 50% sensitivity and 80% specificity considering all notification types (an increase of 40% with respect to a probabilistic baseline). These results enable automatic, intelligent handling of mobile phone notifications without the need for user feedback or personalization. Furthermore, they showcase how to forgo feature extraction by using RNNs for continual predictions directly on mobile usage logs. To the best of our knowledge, this is the first work that leverages mobile sensor data for continual, context-aware predictions of interruptibility using deep neural networks.

Submitted 19 December, 2017; originally announced December 2017.

Comments: 15 pages

arXiv:1710.01727 [pdf, ps, other]

Privacy-Preserving Deep Inference for Rich User Data on The Cloud

Authors: Seyed Ali Osia, Ali Shahin Shamsabadi, Ali Taheri, Kleomenis Katevas, Hamid R. Rabiee, Nicholas D. Lane, Hamed Haddadi

Abstract: Deep neural networks are increasingly being used in a variety of machine learning applications applied to rich user data on the cloud. However, this approach introduces a number of privacy and efficiency challenges, as the cloud operator can perform secondary inferences on the available data. Recently, advances in edge processing have paved the way for more efficient, and private, data processing at the source for simple tasks and lighter models, though they remain a challenge for larger, and more complicated models. In this paper, we present a hybrid approach for breaking down large, complex deep models for cooperative, privacy-preserving analytics. We do this by breaking down the popular deep architectures and fine-tuning them in a particular way. We then evaluate the privacy benefits of this approach based on the information exposed to the cloud service. We also assess the local inference cost of different layers on a modern handset for mobile applications. Our evaluations show that by using certain kinds of fine-tuning and embedding techniques, and at a small processing cost, we can greatly reduce the level of information available to unintended tasks applied to the data features on the cloud, and hence achieve the desired tradeoff between privacy and performance.

Submitted 11 October, 2017; v1 submitted 4 October, 2017; originally announced October 2017.

Comments: arXiv admin note: substantial text overlap with arXiv:1703.02952

arXiv:1705.06224 [pdf, other]

doi 10.1145/3089801.3089802

Practical Processing of Mobile Sensor Data for Continual Deep Learning Predictions

Authors: Kleomenis Katevas, Ilias Leontiadis, Martin Pielot, Joan Serrà

Abstract: We present a practical approach for processing mobile sensor time series data for continual deep learning predictions. The approach comprises data cleaning, normalization, capping, time-based compression, and finally classification with a recurrent neural network. We demonstrate the effectiveness of the approach in a case study with 279 participants. On the basis of sparse sensor events, the network continually predicts whether the participants would attend to a notification within 10 minutes. Compared to a random baseline, the classifier achieves a 40% performance increase (AUC of 0.702) on a withheld test set. This approach makes it possible to forgo resource-intensive, domain-specific, error-prone feature engineering, which may drastically increase the applicability of machine learning to mobile phone sensor data.

Submitted 17 May, 2017; originally announced May 2017.

Comments: 6 pages, 3 figures, 3 tables

Journal ref: DeepMobile Workshop, MobileHCI 2017

arXiv:1703.02952 [pdf, other]

doi 10.1109/JIOT.2020.2967734

A Hybrid Deep Learning Architecture for Privacy-Preserving Mobile Analytics

Authors: Seyed Ali Osia, Ali Shahin Shamsabadi, Sina Sajadmanesh, Ali Taheri, Kleomenis Katevas, Hamid R. Rabiee, Nicholas D. Lane, Hamed Haddadi

Abstract: Internet of Things (IoT) devices and applications are being deployed in our homes and workplaces. These devices often rely on continuous data collection to feed machine learning models. However, this approach introduces several privacy and efficiency challenges, as the service operator can perform unwanted inferences on the available data. Recently, advances in edge processing have paved the way for more efficient, and private, data processing at the source for simple tasks and lighter models, though they remain a challenge for larger, and more complicated models. In this paper, we present a hybrid approach for breaking down large, complex deep neural networks for cooperative, privacy-preserving analytics. To this end, instead of performing the whole operation on the cloud, we let an IoT device run the initial layers of the neural network, and then send the output to the cloud to feed the remaining layers and produce the final result. In order to ensure that the user's device contains no extra information except what is necessary for the main task, and to prevent any secondary inference on the data, we introduce Siamese fine-tuning. We evaluate the privacy benefits of this approach based on the information exposed to the cloud service. We also assess the local inference cost of different layers on a modern handset. Our evaluations show that by using Siamese fine-tuning, and at a small processing cost, we can greatly reduce the level of unnecessary, potentially sensitive information in the personal data, thus achieving the desired trade-off between utility, privacy, and performance.

Submitted 26 December, 2019; v1 submitted 8 March, 2017; originally announced March 2017.

Comments: To appear in IEEE Internet of Things Journal

Journal ref: IEEE Internet of Things Journal, May 2020

arXiv:1606.05576 [pdf, ps, other]

SensingKit: Evaluating the Sensor Power Consumption in iOS devices

Authors: Kleomenis Katevas, Hamed Haddadi, Laurissa Tokarchuk

Abstract: Today's smartphones come equipped with a range of advanced sensors capable of sensing motion, orientation, audio as well as environmental data with high accuracy. With the existence of application distribution channels such as the Apple App Store and the Google Play Store, researchers can distribute applications and collect large-scale data in ways that previously were not possible. Motivated by the lack of a universal, multi-platform sensing library, in this work we present the design and implementation of SensingKit, an open-source continuous sensing system that supports both iOS and Android mobile devices. One of the unique features of SensingKit is the support of the latest beacon technologies based on Bluetooth Smart (BLE), such as iBeacon and Eddystone. We evaluate and compare the power consumption of each supported sensor individually, using an iPhone 5S device running on iOS 9. We believe that this platform will be beneficial to all researchers and developers who plan to use mobile sensing technology in large-scale experiments.

Submitted 17 June, 2016; originally announced June 2016.

Comments: 4 pages, 2 figures, 3 tables. To be published in the 12th International Conference on Intelligent Environments (IE'16)

MSC Class: 68N01 ACM Class: C.4; D.2.13

Showing 1–19 of 19 results for author: Katevas, K