A Survey on Privacy-Preserving Caching at Network Edge: Classification, Solutions, and Challenges

Xianzhi Zhang (School of Computer Science and Engineering, Sun Yat-sen University, China), Yipeng Zhou (School of Computing, Macquarie University, Australia), Di Wu (School of Computer Science and Engineering, Sun Yat-sen University, China), Shazia Riaz, Quan Z. Sheng (School of Computing, Macquarie University, Australia), Miao Hu, and Linchang Xiao (School of Computer Science and Engineering, Sun Yat-sen University, China)
Abstract.

Caching content at the network edge is a popular and effective technique widely deployed to alleviate the burden of network backhaul, shorten service delay and improve service quality. However, there has been some controversy over privacy violations in caching content at the network edge. On the one hand, the multi-access open edge network provides an ideal surface for external attackers to obtain private data from the edge cache by extracting sensitive information. On the other hand, privacy can be infringed by curious edge caching providers who analyze caching traces to achieve better caching performance or higher profits. Therefore, an in-depth understanding of privacy issues in edge caching networks is vital and indispensable for creating a privacy-preserving caching service at the network edge. In this article, we are among the first to fill this gap by examining privacy-preserving techniques for caching content at the network edge. First, we provide an introduction to the background of Privacy-Preserving Edge Caching (PPEC). Next, we summarize the key privacy issues and present a taxonomy for caching at the network edge from the perspective of private data. Additionally, we conduct a retrospective review of the state-of-the-art countermeasures against privacy leakage from content caching at the network edge. Finally, we conclude the survey and envision challenges for future research.

Keywords: Edge cache, Privacy-preserving caching, Edge networks, Countermeasure, Caching performance
CCS Concepts: General and reference → Surveys and overviews; Security and privacy → Privacy-preserving protocols; Networks → Network privacy and anonymity
Journal year: 2023

1. Introduction

Content caching at the network edge is driven by two factors. First, the population of networked devices has become astronomical due to advances in intelligent terminals and the broad deployment of the Internet of Things (IoT) (Cui et al., 2022; Guo et al., 2022; Cui et al., 2023). Second, the Internet content market is blooming due to the proliferation of various multimedia content (Zhang et al., 2022d; Ni et al., 2021). As a result, network-based content delivery services are extremely bandwidth-consuming. It was forecasted by Cisco (others Forecast, 2019) that the consumer share of the total devices, including both fixed and mobile devices, will be more than 21 billion and account for 74% of total devices in 2023. At the same time, emerging network technologies, such as Gigabit Ethernet and 5G, are expected to provide extremely high data transmission rates and low access delays for terminal devices at the network edge to support time-sensitive services such as autonomous driving, industrial automation, high-quality video streaming, and virtual/enhanced emerging applications.

Such a vast data flow brings two main challenges to the established networks: (1) It brings a heavy communication burden to the Internet core network links. During the peak hours of network usage, a large amount of content transmission will inevitably aggravate the link burden of the core network, causing network congestion and the increase of network operating costs; (2) It will also prolong the service delay of content transmission from remote servers to end devices, which will adversely influence users’ service Quality-of-Experience (QoE) or even ruin the reliability of delay-sensitive applications.

Edge Caching is a technique that involves storing content in proximity to end users, typically at or near the point of user access or ahead of the core network (Ni et al., 2021). Its primary objective is to shorten service latency and enhance content delivery performance by bringing content closer to the users who request it. When users request content that is available in the edge cache, their requests can be directly served at the network edge with a high Quality-of-Service (QoS). However, if the requested content is not available in the edge cache, it can be redirected to a remote server, such as a data center. In a typical edge network (Ni et al., 2021), there are three main entities for edge caching: end devices, access networks and edge networks:

  (1) End devices (e.g., smartphones, laptops, intelligent vehicles, and industrial IoT devices) carried by users will generate requests for downloading content via networks (Cui et al., 2022; Guo et al., 2022). It is possible that end devices can share content with each other through Device-to-Device (D2D) communications with licensed-band or unlicensed-band protocols.

  (2) In access networks, wired and wireless communication technologies support end devices for accessing the Internet with the infrastructures in the edge network, such as 5G Base Stations (BSs), WiFi routers, intelligent television boxes, and Roadside Units (RSU) in the Internet of Vehicles (IoV). Popular content can be cached in these accessible infrastructures at the instance of end devices or third-party Internet Service Providers (ISPs) to serve user requests.

  (3) Edge cache can be located at Edge Servers (ESs) ahead of the core network, such as nodes in the Content Delivery Network (CDN) (Cui et al., 2020a), edge routers in the Information-Centric Network (ICN) (Sivaraman and Sikdar, 2021; Xue et al., 2019, 2018), and macro base stations covering a specific region (Araldo et al., 2018). We also call it in-network-edge caching. Edge servers, generally maintained by third-party suppliers, are the heart and anchor point for multi-access edge networks to enhance various content delivery applications.

Caching content at the edge network is effective in reducing the burden of network backhaul (Yang et al., 2019; Jiang et al., 2017; Qiao et al., 2022), shortening service latency (Zhang et al., 2022b; Qiao et al., 2022; Cui et al., 2020a), and diminishing resource cost (Jiang et al., 2017; Hassanpour et al., 2023). First, it is common to cache popular content at the edge network through which the edge network can offload the access of requests and hence reduce the backhaul data flow. Even though the caching capability is limited at the network edge, the edge cache can offload up to 35% of the traffic burden over backhaul links  (Ni et al., 2021). Second, the service latency can be significantly shortened by caching content on edge devices near end users. In particular, a shortened latency is critical for content delivery of latency-sensitive applications (Ni et al., 2021). Third, network edge can make the access of content inexpensive since caching content at edge devices can avoid the transmission bottleneck. For example, in wireless edge networks, spectral efficiency and energy efficiency can be improved by about 900% and 500%, respectively, by making use of network edge for caching content (Liu et al., 2016).

Despite the enormous benefits brought by caching content at the network edge, there has been some controversy over privacy violations brought by such caching. The concerns can be illustrated from two aspects. The first privacy threat comes from external attackers, such as malicious user devices (Sivaraman and Sikdar, 2021; Liang and Liu, 2019; Acs et al., 2019; Qian et al., 2020; Cui et al., 2020a; Tong et al., 2022). The multi-access open property of the edge network provides an ideal surface for external attackers to obtain the cached content from the edge cache so as to extract sensitive information of end users (Xu et al., 2020). In particular, adversaries can obtain user-sensitive information by launching cache side-channel attacks (Sivaraman and Sikdar, 2021; Liang and Liu, 2019; Acs et al., 2019) and cache tampering attacks (Qian et al., 2020; Cui et al., 2020a; Tong et al., 2022). However, it is non-trivial to embed advanced privacy protection mechanisms into the network edge due to the limited computing capacity, energy, and storage space of edge devices. Second, user privacy can be infringed by curious edge caching providers who analyze traces and management records. Due to limited caching space relative to the rapidly growing user population and the scale of content (Zhou et al., 2019), network edge providers have a strong motivation to spy on user privacy in order to improve their resource utilization. In other words, if content popularity can be accurately predicted, the right content can be cached by edge devices just before the surge of requests for this content (Zhang et al., 2022d). Hence, edge network providers are curious about users’ personal interests and confidential information to infer their request behaviours, which can be extracted from users’ historical request traces (e.g., request patterns (Zhang et al., 2022d; Cui et al., 2020c, a), identifiable information (Zhang et al., 2022c; Zhu et al., 2021; Araldo et al., 2018; Cui et al., 2020c, a)). Network edge providers can implement monitoring attacks (Xue et al., 2018; Zhang et al., 2022c) and inference attacks (Qiao et al., 2022; Liu et al., 2022) in their systems to compromise users’ privacy based on collected request information from users. Therefore, an in-depth understanding of privacy risks in Privacy-Preserving Edge Caching (PPEC) is crucial for the design of feasible solutions to achieve privacy-preserving content cache at the network edge.

Recently, significant progress has been made by existing works to enhance the privacy protection for users at the network edge (Ni et al., 2021), but it is still far from thoroughly solving all problems. In particular, the influence of privacy protection on caching performance has been largely overlooked in existing works. In view of the limitations mentioned above and the scarcity of literature reviews on the edge-assisted network cache, this article is dedicated to comprehensively examining and categorizing current works on privacy issues in edge caching networks. The main contributions of this article are summarized as follows:

  (1) We make in-depth discussions on sensitive privacy in edge caching and propose a taxonomy from a data perspective to classify existing works. To the best of our knowledge, this is the first such comprehensive exposition.

  (2) We conduct a thorough review of recent high-quality research, diving into the background of privacy attacks and mitigation methods in the realm of edge caching. Our review encompasses the latest solutions proposed for enhancing privacy in the edge cache, which have been published in influential conferences and journals in the fields of computing networks, architecture, and privacy, such as CCS, INFOCOM, ToN, JSAC, TPDS, TIFS, and TDSC, among others. Based on different kinds of privacy information and attacks towards each kind of privacy information, we respectively review countermeasures to defend against attacks for protecting each kind of infringed privacy.

  (3) Based on open problems outlined in existing works, we envision privacy-related open challenges in PPEC to provide insights for inspiring future research.

The remainder of this article is organized as follows. Section 2 provides an introduction to the taxonomy of privacy-preserving solutions that are based on the protection of sensitive privacy data in edge cache. Section 3 provides a background discussion on privacy issues in the edge caching paradigm from two plain perspectives, i.e., privacy attacks and mitigation methods. From Section 4 to Section 6, we describe the possible privacy mitigation solutions for edge caching in correspondence with three main classes of privacy, i.e., user privacy, content privacy and knowledge privacy, respectively. Section 7 provides open challenges and future research directions. Finally, we make a summary in Section 8. To facilitate readability, we have compiled a summary of commonly used abbreviations for the solutions in Table 1.

Table 1. List of Common Abbreviations in this Paper.
Abbreviation Meaning Abbreviation Meaning
CDN Content Delivery Network CP(s) Content Provider(s)
EC Edge Cache ES(s) Edge Server(s)
EN(s) Edge Node(s) D2D Device-to-Device
(L)DP (Local) Differential Privacy DL Deep Learning
FL Federated Learning HE Homomorphic Encryption
ICN Information-Centric Network IoV Internet of Vehicles
ISP(s) Internet Service Provider(s) LBS Location-Based Services
ML Machine Learning PIR Private Information Retrieval
POI(s) Point(s) of Interest PPEC Privacy-Preserving Edge Caching
RSU(s) Roadside Unit(s) (D)RL (Deep) Reinforcement Learning
(S)BS(s) (Small) Base Station(s) SS Secret Sharing
TTP Trusted Third Party TDC Trusted Distributed Computing
Figure 1. The framework for PPEC encompasses six distinct types of data concerns: request traces, personal information, location data, machine learning knowledge, private content, and content popularity. These concerns can be primarily classified into three categories of private data: user privacy, content privacy, and knowledge privacy.

2. Overview of Sensitive Privacy in Edge Cache

In this section, we overview sensitive information that should be protected to avoid privacy leakage in PPEC. In the realm of edge caching, sensitive privacy can be exposed either unconsciously by users (Acs et al., 2019) or by edge servers (Cui et al., 2020c; Araldo et al., 2018). Users can expose personal information, browsing history, location, and private content data through their request traces to the ES or other service providers. For example, in mobile social networks, user-generated content, which is sensitive and confidential, can be cached and distributed by edge servers (Wang et al., 2019b; Zhou et al., 2019). Similarly, edge servers can leak private information and knowledge extracted from the collection of users who have interacted with them. For example, edge servers may leak video content popularity (extracted from user request traces) to malicious users (Cui et al., 2020c, a). To build a privacy-preserving content caching system, the first step is to understand what private information can be exposed by users and edge servers. In Fig. 1, we outline all kinds of sensitive privacy that should be protected in PPEC. We will elaborate on each kind of private information in this section.

2.1. User Privacy

In PPEC, all information related to users but not directly related to cached content is regarded as user privacy, such as users’ historical records, age, gender, and location. For our discussion, we classify all user privacy information into three types: request trace, personal information and location.

2.1.1. Request trace

A request trace refers to a sequence of content requests and responses between an end device and the ES or service providers. This trace probably contains user-sensitive information such as the user’s browsing behaviour, preferences, and interests (Sivaraman and Sikdar, 2021; Cui et al., 2020c, a). User request traces are valuable assets to service providers and caching systems. Service or content providers can analyze these request traces to infer users’ behavior patterns, such as the type of websites or applications they frequently use and the content they prefer to consume. Edge servers can maintain and analyze request traces to improve caching performance by predicting future requests so as to prefetch and cache popular content in advance.

However, user request traces may expose users’ sensitive information such as user preferences (Cui et al., 2020c; Qian et al., 2020), request patterns (Sivaraman and Sikdar, 2021; Cui et al., 2020a; Acs et al., 2019; Liang and Liu, 2019) and movement patterns (Zhang et al., 2022a). Such information can be utilized by advertisers or malicious attackers to make profits or cause harm. There are two potential risks: interception and misuse. First, request records can be intercepted and sniffed by other users and external attackers. For example, malicious users can use timing attacks (Sivaraman and Sikdar, 2021; Acs et al., 2019) to infer historical request records of nearby users, leading to illegal advertising and cache pollution attacks (Wu et al., 2016). Second, edge servers and service providers, which are curious about user interest patterns, may misuse request traces for their own purposes. For example, request traces can be exploited to develop trace-driven content caching algorithms, and hence users may suffer privacy infringement threats from untrusted or profit-driven third-party edge servers (Araldo et al., 2018; Cui et al., 2020c; Schlegel et al., 2022; Tong et al., 2022).

However, designing methods to preserve user privacy in edge caching systems is non-trivial. Most existing privacy-enhancing approaches fail to effectively address the privacy leakage risks confronted by users in caching systems because request records cannot be arbitrarily altered or obfuscated by users, and they must be visible to service providers and edge caching servers. Attackers can obtain user request records through various methods such as timing attacks, through which attackers pretend to be the normal user who sends content requests to the server. Then, attackers may infer user request traces by exploiting the timing difference between cached and non-cached responses (Sivaraman and Sikdar, 2021; Acs et al., 2019).

2.1.2. Personal information

Personal information is a type of private data that can be mined to identify a specific end device or user in the network. Edge caching servers and service providers can obtain various types of personal information from users, depending on the specific context and implementation of the edge caching system. Typical examples of personal information that can be compromised in edge caching include: (1) Identifier information such as pseudonyms and IP addresses. In particular, through IP addresses, we can identify a user’s Internet Service Provider (ISP), approximate location, and other information, with which the edge cache can carry out sensitive operations, such as integrity verification (Tong et al., 2022) and cache admission control (Xue et al., 2019, 2018). (2) Device information such as the operating system, connection type, browser type, and version, which is also essential for edge nodes to provide high-performance edge caching and tailored content to users (Zhang et al., 2022c; Cui et al., 2022). (3) Account-related information such as email address, gender, age, payment, and social relation, which can be captured by the EC or service providers when a user logs in or creates an account to access the service, potentially revealing more personal privacy (Zhang et al., 2022c; Cui et al., 2020a).

Excessively exposing personal information by edge caching can result in annoying tracking and profiling. When personal information is collected, edge caching servers and service providers can create detailed profiles of users, including their browsing habits and interests. This information can be harnessed for making caching decisions, targeted advertising, or even more malicious purposes such as manipulation or discrimination (Zhang et al., 2022c; Cui et al., 2022). In addition, malicious nodes and attackers can take advantage of excessive disclosure of personal information to gain unauthorized access to user accounts (Xue et al., 2019, 2018) and pull off cache tampering attacks (Cui et al., 2022, 2020a; Tong et al., 2022), resulting in financial losses and other harms.

2.1.3. Location

Location information is a critical type of privacy data carrying the location, spatial coordinates, and current time of moving objects. In edge caching systems, there are two fundamental types of location information: users’ location information and Points of Interest (POIs).

When users access edge caching systems, they may unconsciously expose private location information in the following processes: (1) A user’s geographic location can be exposed to the edge cache when accessing content or services directly from the edge (Cui et al., 2020b); (2) Content Providers (CP) and Edge Caching providers can proactively collect users’ geographic location information to provide better content distribution services, such as predicting user moving patterns (Zhang et al., 2022a); (3) In Location-Based Services (LBS), users may provide their private geographic information and POIs to search for their interests at the edge cache (Amini et al., 2011; Cui et al., 2020b; GU Yi-ming, BAI Guang-wei, SHEN Hang, 2019; Nisha et al., 2022). This information can be abused, resulting in undesired tracking and profiling or even more severe consequences, such as location-based attacks.

Location information is sensitive because it can be utilized to learn an individual’s daily routine and movements. Service providers can use this information to deliver more relevant advertisements and cached content to users, potentially boosting profits. Yet, if malicious attackers obtain location information, they can track a user’s movement trajectories and put the user at risk of physical harm, particularly in the case of stalking or other criminal activities.

2.2. Content Privacy

Content privacy refers to privacy information contained by the content stored and transmitted through edge caching systems, mainly including private content and content popularity.

2.2.1. Private content

Content cached by the edge system may reveal sensitive and private information, and therefore it is essential to protect the privacy of such content, particularly when it includes confidential information, e.g., personal and financial data, confidential business information, and government secrets. We name such sensitive cached content data “private content”. For example, in mobile social networks, each user can be regarded as a content provider who produces fresh content and desires that this content be efficiently and accurately delivered to consumers (Xu et al., 2019; Zhou et al., 2019). In this case, edge computing is a feasible architecture for caching and delivering the content. Consumers in proximity (Zhang et al., 2022a; Xu et al., 2020) or with close social relations (Wang et al., 2019b) to a particular user content provider in social networks are more likely to request this content. Thereby, using an edge server to cache and deliver content in mobile social networks can diminish bandwidth costs, which however raises privacy leakage risks.

Briefly speaking, the privacy of private content can be infringed in several ways. First, edge servers are not trustworthy and can expose cached content to the public. Second, malicious and unauthorized users in the edge network can access cached content during transmission or processing between end users and the edge cache or between different edge caches. For instance, in cache side-channel attacks (Sivaraman and Sikdar, 2021; Liang and Liu, 2019), attackers attempt to access cached content by sending targeted requests, potentially allowing them to view sensitive information. As another instance, attackers can launch cache tampering attacks by injecting malicious content into the cache to exploit vulnerabilities in end-user systems or steal sensitive information (Qian et al., 2020; Cui et al., 2020a; Tong et al., 2022).

2.2.2. Content popularity

Content popularity can be defined as the relative frequency with which a particular content is requested by users. It indicates the level of popularity of content among users. The popularity information is broadly utilized in improving caching efficiency, and caching the most popular content can effectively lower the content delivery cost. However, the popularity information is sensitive, unveiling the private preference information of users (Cui et al., 2020a; Yu et al., 2022). Besides, it is possible that content popularity information can reveal sensitive information about content providers, such as their financial success and strategic direction, which should be kept confidential (Araldo et al., 2018; Cui et al., 2020a).
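As a minimal sketch of the definition above, the following Python snippet computes content popularity as the relative request frequency over a request trace and selects the most popular items for caching. The function names, the toy trace, and the tie-breaking rule are illustrative assumptions and are not taken from any surveyed work.

```python
from collections import Counter


def content_popularity(request_trace):
    """Return {content_id: relative request frequency} for a request trace."""
    counts = Counter(request_trace)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {content: n / total for content, n in counts.items()}


def most_popular(popularity, k):
    """Return the k content ids with the highest popularity (ties broken by id)."""
    ranked = sorted(popularity.items(), key=lambda item: (-item[1], item[0]))
    return [content for content, _ in ranked[:k]]


# Hypothetical trace of requests observed by one edge server.
trace = ["video_a", "video_b", "video_a", "video_c", "video_a", "video_b"]
popularity = content_popularity(trace)
cached = most_popular(popularity, k=2)
print(popularity)
print(cached)
```

Even in this toy setting, the list of cached items already exposes which content the served users prefer, which is why the popularity information is treated as sensitive.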

The popularity information is crucial for making effective edge caching decisions. As the number of records owned by a single ES is limited, content providers may need to provide supplementary information. For example, edge caching servers can mutually exchange popularity information to optimize edge caching decisions for the entire caching system (Yu et al., 2022; Cui et al., 2020a). Yet, this practice exposes the relative popularity of different content on the edge. Furthermore, when the cache is full, the ES must decide which content to remove to save space, revealing popularity information as well (Acs et al., 2019).

2.3. Knowledge Privacy

Extracted knowledge refers to the insights and patterns learned by training machine learning models on datasets collected from users. In edge caching, providers are curious about knowledge extracted from user trace records because it is valuable for improving caching performance. The extracted knowledge is widely used to predict the future request patterns of users in a dynamic system, enabling providers to make effective caching decisions (Muller et al., 2017; Yang et al., 2019). For instance, video request access patterns are driven by users’ interests in different locations (Ma et al., 2017; Dhar and Varshney, 2011), and users may move dynamically (Zhang et al., 2022a, c) with their interests changing over time (Zhang et al., 2022d). By relying on predictions based on the knowledge extracted from users’ historical request records, edge caching performance can be significantly improved. Learning-based methods provide a feasible framework for making effective edge caching decisions. Still, user privacy can be leaked during their access to the original dataset for model training and making predictions. Therefore, it is essential to carefully consider and address these privacy risks when employing learning-based methods for edge cache.

3. Overview of Attack and Defence Methods

This section is divided into two parts: an overview of attack methods targeting each type of sensitive data in edge caching systems, and a summary of defence methods against each type of attack. In Fig. 2, we present a relation map between potential privacy attacks, sensitive data and defence methods in edge caching systems. On the left-hand side of Fig. 2, we overview the types of sensitive privacy that can be attacked by attack methods. On the right-hand side of Fig. 2, we overview defence methods that can be used to protect each type of sensitive privacy. In the rest of this section, we briefly discuss each type of attack and defence method covered by Fig. 2.

3.1. Privacy Attack in Edge Cache

There are mainly four types of privacy attacks in edge caching systems, which are monitoring attacks, data mining attacks, cache side-channel attacks and cache tampering attacks. We introduce these attacks with potential risk entities in this subsection.

3.1.1. Monitoring attack

Monitoring attacks, also known as eavesdropping attacks, can be divided into two main categories:

(1) The first is sniffing attacks on network communications, i.e., an adversary sniffs on network traffic through the edge caching node to read or intercept private information in network packets (Zhang et al., 2022c). For example, the edge cache can monitor user requests during the caching service process. In other words, the edge caching operator can monitor users’ requests in order to respond to them and, through subsequent data analysis, improve the caching efficiency and reduce the transmission delay of the requested content. However, a user request may contain private information, such as personal content preference (Cui et al., 2020c; Yuan et al., 2016; Schlegel et al., 2022), location (Zhang et al., 2022a), content popularity (Cui et al., 2020c), and other personal information (Kong et al., 2019; Xue et al., 2019, 2018). Therefore, edge caching systems should take both caching efficiency and privacy preservation into account. Entities that can implement sniffing attacks in network communications include edge caching managers (e.g., content providers (Cui et al., 2020c), location service providers (Zhang et al., 2022a), Internet service providers or base stations (Yuan et al., 2016), edge devices (Zhou et al., 2019; Cui et al., 2020c; Xu et al., 2019; Schlegel et al., 2022; Tong et al., 2022)), malicious end devices (Xue et al., 2019, 2018; Cui et al., 2020c; Nikolaou et al., 2016), and external adversaries (Zhang et al., 2022c).

(2) The second type of monitoring attack is supervisory attacks on cached content, i.e., attackers conduct improper monitoring, replacement, pollution, and other privacy attack activities on cached content. By leveraging illegal cache access, adversaries can obtain private data or information such as content popularity (Araldo et al., 2018; Cui et al., 2020a; Andreoletti et al., 2019a), user preferences (Qian et al., 2020), and other private information (Cui et al., 2020a; Tong et al., 2022). If the cached content is not protected prudently, the user’s privacy can be seriously compromised by edge caches, which are often deployed by honest but curious third parties (e.g., Internet service providers (Andreoletti et al., 2019a; Araldo et al., 2018), edge servers (Cui et al., 2020a; Qian et al., 2020; Tong et al., 2022), and end devices (Cui et al., 2020a; Qian et al., 2020)).

Figure 2. The possible privacy attacks on different sensitive data and the corresponding defence methods for enhancing privacy in edge caching systems.

3.1.2. Data mining attacks

Data mining attacks usually occur when an edge caching entity applies a learning-based caching algorithm to explore sensitive data for making caching decisions. Due to the high dynamics and complicated access patterns driven by users’ interest (Muller et al., 2017; Yang et al., 2019), designing an intelligent edge caching algorithm is essential to improve the caching performance. Commonly, learning-based methods make caching decisions by exploiting historical information to train a prediction model. Model training must be fed with private and sensitive data related to users, which users may be reluctant to share. Since edge caching decisions are generated by learning algorithms, edge caching becomes a trade-off problem between caching performance and privacy protection level. As a consequence, learning-based methods in edge computing-assisted caching are usually vulnerable to two types of privacy risks: (1) exploratory, in which adversaries investigate vulnerabilities (such as the training dataset, model parameters, and gradient data) without changing the training process, and (2) causative, in which attackers manipulate and inject misleading training datasets to alter the machine learning model’s training process (Tourani et al., 2018). Additionally, previous research has shown that model parameters (Shokri et al., 2017) and gradients (Abadi et al., 2016; Zhao et al., 2020) of the machine learning model can be utilized to recover original sensitive and private data information. Learning-based methods provide a practical framework for making edge caching decisions but are susceptible to privacy risks that can compromise user privacy. The potential adversaries to launch data mining attacks include edge caching managers (e.g., content providers (Qiao et al., 2022; Cui et al., 2020c), Internet service providers (Wang et al., 2020), edge devices (Qiao et al., 2022; Wang et al., 2022a; Liu et al., 2022)) and malicious end devices (Wang et al., 2019b).
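To make the gradient-leakage risk more concrete, the following toy sketch (our own illustration, not the attack of any cited work) considers a single-sample gradient of a linear model with a bias term under squared loss. In this assumed setting, the weight gradient equals the prediction error times the input, while the bias gradient equals the prediction error alone, so an observer of both gradients can recover the private input by a simple division. All names and values are hypothetical.

```python
def linear_gradients(weights, bias, features, label):
    """Gradients of 0.5 * (w.x + b - y)^2 with respect to w and b for one sample."""
    prediction = sum(w * x for w, x in zip(weights, features)) + bias
    error = prediction - label
    grad_weights = [error * x for x in features]  # dL/dw = error * x
    grad_bias = error                             # dL/db = error
    return grad_weights, grad_bias


def recover_features(grad_weights, grad_bias):
    """Recover the private input from the shared gradients: x = (dL/dw) / (dL/db)."""
    if grad_bias == 0:
        raise ValueError("zero prediction error: the gradients reveal nothing here")
    return [g / grad_bias for g in grad_weights]


# Hypothetical private record, e.g. a user's request counts for four contents.
private_features = [3.0, 0.0, 1.0, 5.0]
grad_w, grad_b = linear_gradients(
    weights=[0.1, -0.2, 0.05, 0.3], bias=0.0, features=private_features, label=1.0
)
recovered = recover_features(grad_w, grad_b)
print(recovered)
```

Real models and batched updates make such recovery harder than in this one-sample case, but the sketch shows why gradients cannot be assumed to be harmless to share.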

3.1.3. Cache side-channel attacks

In cache side-channel attacks, attackers can learn private information about users and cached content by observing and measuring activities relevant to edge caches, such as response time, power consumption, and return faults (Sivaraman and Sikdar, 2021; Liang and Liu, 2019; Wu et al., 2016; Acs et al., 2019). Through the edge caching service, users can conveniently upload their content to edge servers or download requested content from edge servers. Due to the open accessibility of the edge cache (Ni et al., 2021), adversaries can easily access content cached by edge servers. Adversaries can target a particular victim by identifying the content requested by that victim. The attacker may know the victim’s content consumption habits or other specific characteristics that distinguish the victim from other users. One of the main types of cache side-channel attacks is the cache-timing attack, which allows attackers to determine whether specific content has been cached by comparing response times. Previous works such as (Liang and Liu, 2019; Sivaraman and Sikdar, 2021; Acs et al., 2019) have explored cache-timing attacks in edge caching systems. An attacker can take precise timing measurements to distinguish cache hits from misses and thereby identify what content is cached at the ES. A cache hit means that a nearby user has requested the content (or that the content has a high caching value), while a cache miss means that the content has not been requested (or has been evicted from the cache). A knowledgeable attacker can further determine whether the request is served by the provider or by a router somewhere along the path to the provider (Liang and Liu, 2019). The main risk entities that can launch cache side-channel attacks include malicious end devices (Liang and Liu, 2019; Acs et al., 2019; Wu et al., 2016) and external adversaries (Sivaraman and Sikdar, 2021; Zhang et al., 2022c).
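The hit/miss distinction behind a cache-timing attack can be illustrated with a minimal sketch. The latency model below (about 5 ms for an edge hit, about 40 ms for a miss, with bounded jitter), the 20 ms threshold, and the function names are all hypothetical assumptions made for illustration; they are not taken from any of the cited works.

```python
import random
import statistics

def simulated_fetch_ms(cached, rng):
    """Hypothetical latency model: about 5 ms for an edge hit, 40 ms for a miss."""
    base = 5.0 if cached else 40.0
    return base + rng.uniform(0.0, 3.0)  # bounded jitter

def infer_cached(samples_ms, threshold_ms=20.0):
    """Guess 'cache hit' when the median observed latency is below the threshold."""
    return statistics.median(samples_ms) < threshold_ms

rng = random.Random(0)
hit_samples = [simulated_fetch_ms(True, rng) for _ in range(9)]
miss_samples = [simulated_fetch_ms(False, rng) for _ in range(9)]
print(infer_cached(hit_samples), infer_cached(miss_samples))
```

In a real network the two latency distributions overlap, and the first probe of uncached content may itself cause that content to be cached, so an actual attacker would need a more careful measurement procedure than this sketch suggests.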

3.1.4. Cache tampering attacks

A cache tampering attack is a form of cyber-attack in which an adversary aims to alter content stored at an edge cache to gain unauthorized access, introduce illicit content, or disrupt the caching system’s regular operation. Within an edge network, a caching server offers a temporary storage area that holds frequently accessed content to expedite distribution. However, cache tampering attacks can occur when an attacker modifies content cached in the ES or deceives a user to obtain unauthorized content. The main risk entities that can implement cache tampering attacks include edge servers (Tong et al., 2022; Qian et al., 2020), malicious end devices (Qian et al., 2020; Cui et al., 2020a), and external adversaries (Qian et al., 2020; Cui et al., 2020a).

A typical instance of cache tampering attacks is cache poisoning, where an attacker manipulates the cache of a Content Delivery Network (CDN) or edge server to store and deliver malicious content or information (Ni et al., 2021). For example, an attacker can exploit a vulnerability of the caching system by requesting a legitimate image with a specially crafted HTTP header. This header may contain malicious code that tricks the cache into storing a different image that the attacker controls rather than the legitimate one. The next time a user requests the original image, the user will instead receive the attacker’s image, which could contain harmful content such as malware or phishing links.

A variant of the cache tampering attack is the cache deception attack, wherein an adversary gains access to private information by misleading and influencing a privileged user (Ni et al., 2021). This process consists of two primary steps: first, the attacker prompts the privileged user to request sensitive content so that it is cached in the ES; subsequently, the adversary submits an identical request to the edge cache and retrieves the sensitive content. For example, in named data networking, an attacker creates a URL request targeting a victim user’s private content by attaching a tag of a widely used image. The victim is then enticed to make that request using its privilege. Upon retrieval, the cloud server disregards the invalid suffix and returns the legitimate private content. The caching node retains the private content as if it were the popular image’s content. In this manner, the attacker can make the same request to obtain the identical private content from the edge cache, acquiring content that the attacker is not authorized to access and leaking the victim’s private content. These kinds of cache tampering attacks give rise to unacceptable privacy risks for users in edge caching systems.

Table 2. Method classification based on countermeasures and privacy types.
| Classes | Methods | Request Record | Personal Information | Location | Extracted Knowledge | Private Content | Content Popularity |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Noise-based | DP | (Zhang et al., 2018; Wang et al., 2019b; Zhou et al., 2019; Guo et al., 2022; Zhang et al., 2022a; Sivaraman and Sikdar, 2021) | (Zhu et al., 2021; Zeng et al., 2020, 2021) | (Zhang et al., 2022a) | (Yu et al., 2022; Lu et al., 2020; Jiang et al., 2023; Nair et al., 2023) | (Wang et al., 2022b) | (Yu et al., 2022) |
| Noise-based | Obfuscation | (Wu et al., 2016; Nikolaou et al., 2016; Qian et al., 2020) | (Wang et al., 2022a) | (Zhang et al., 2019; Amini et al., 2011; GU Yi-ming, BAI Guang-wei, SHEN Hang, 2019) | (Wang and Deng, 2022; Wang et al., 2022a) | / | / |
| Noise-based | Anonymity | (Cui et al., 2020b) | (Zhang et al., 2022c; Xue et al., 2019, 2018; Nguyen et al., 2023) | (Nisha et al., 2022; Cui et al., 2020b; Sen et al., 2018; Hu et al., 2018; Yang and Kong, 2016; Zhang et al., 2023) | / | / | / |
| TDC-based | FL | / | (Qiao et al., 2022; Wang et al., 2022a) | / | (Cui et al., 2022; Qiao et al., 2022; Liu et al., 2022; Yu et al., 2022; Zheng et al., 2022; Chen et al., 2022a; Wang and Deng, 2022; Wang et al., 2022a; Saputra et al., 2022; Cheng et al., 2021; Li et al., 2020; Wang et al., 2020; Yu et al., 2018; Wang et al., 2019a; Yu et al., 2021; Qi and Yang, 2020; Zheng et al., 2021; Yu et al., 2020; Lu et al., 2020; Jiang et al., 2023; Nair et al., 2023) | / | (Li et al., 2020) |
| TDC-based | SS | (Acs et al., 2019; Schlegel et al., 2022) | / | / | / | (Pu et al., 2019) | (Andreoletti et al., 2019b) |
| TDC-based | Blockchain | (Qian et al., 2020) | (Lei et al., 2020; Dai et al., 2020; Vu et al., 2019; Liu et al., 2020) | / | (Cui et al., 2022) | / | / |
| Cryptology-based | Encryption Communication | (Leguay et al., 2017; Yuan et al., 2016; Cui et al., 2020a; Jiang et al., 2020) | (Xue et al., 2019, 2018; Zhang et al., 2022c) | / | (Chen et al., 2022a) | (Xu et al., 2019; Pu et al., 2019) | (Cui et al., 2020c; Araldo et al., 2018) |
| Cryptology-based | HE | (Kong et al., 2019; Cui et al., 2020c) | (Kong et al., 2019) | / | (Saputra et al., 2022) | / | / |
| Cryptology-based | PIR | (Tong et al., 2022; Yan and Tuninetti, 2021; Kumar et al., 2019) | / | / | / | (Tong et al., 2022) | / |
| Others | Optimization | (Sivaraman and Sikdar, 2021) | / | / | / | (Xu et al., 2019, 2020; Shi et al., 2018) | (Andreoletti et al., 2019a) |
| Others | Access Control | (Cui et al., 2020a; Jiang et al., 2020) | (Lei et al., 2020; Zhang et al., 2022c; Cui et al., 2020a; Nguyen et al., 2023) | / | / | (Xue et al., 2019, 2018; Cui et al., 2020a) | / |

3.2. Mitigation Methods to Preserve Privacy in Edge Cache

In this subsection, we provide a concise introduction to a range of methods that can effectively mitigate privacy leakage in content caching systems. They can be classified into four main types: (1) noise-based methods, (2) trusted distributed computing-based methods, (3) cryptology-based methods, and (4) other approaches. The specific solutions corresponding to each privacy mitigation approach are detailed in Sections 4-6. For easy reference, Table 2 presents a classification matrix of the solutions introduced in this survey, organized by countermeasure and by type of private data in the edge cache.

3.2.1. Noise-based methods

Noise-based methods represent the most prevalent approaches for preserving privacy within edge caching systems. These methods introduce disturbances to the real and genuine information before its exposure and interaction, effectively safeguarding privacy. Within the domain of edge caching, three specific types of methods are commonly employed: Differential Privacy (DP), confusion, and anonymization.

Differential Privacy (DP) is a data-sharing technique that allows data owners to share only some statistical characteristics of a database while withholding individual-specific information (Sivaraman and Sikdar, 2021; Acs et al., 2019). As shown in Fig. 3, there are two ways to add noise in the differential privacy mechanism. The traditional one is to add noise to the public database at the time of data release. However, the data collection agency is not always reliable, and thus the Local Differential Privacy (LDP) mechanism is also leveraged by data owners to distort their original data before submitting it. In edge caching systems, DP can distort the actual user or content information during the collection or release of sensitive data. DP has been introduced to protect request traces (Zhang et al., 2018; Wang et al., 2019b; Zhou et al., 2019; Zhang et al., 2022a; Sivaraman and Sikdar, 2021), personal information (Zhu et al., 2021; Zeng et al., 2020), and machine learning models (Yu et al., 2022) in edge caching systems.

Figure 3. Two distinct forms of differential privacy exist: the LDP mechanism, which entails data owners injecting noise into their sensitive data prior to submission, and the traditional DP mechanism, which adds noise during the data release process.
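The two noise-injection points can be sketched as follows. This is a minimal illustration rather than any surveyed scheme: it assumes a Laplace mechanism for the traditional (curator-side) release, binary randomized response for the LDP side, and an L1 sensitivity of 1 (each user contributes at most one request to the counts). The function names and example counts are hypothetical.

```python
import math
import random

def laplace_noise(scale, rng):
    """Laplace(0, scale) sample as the difference of two exponential samples."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def release_counts(counts, epsilon, rng, sensitivity=1.0):
    """Traditional DP: a trusted curator perturbs per-content request counts at release."""
    scale = sensitivity / epsilon  # assumed L1 sensitivity of 1
    return {item: n + laplace_noise(scale, rng) for item, n in counts.items()}

def randomized_response(bit, epsilon, rng):
    """LDP: each user reports the true bit with probability e^eps / (1 + e^eps)."""
    p_truth = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    return bit if rng.random() < p_truth else 1 - bit

rng = random.Random(42)
noisy = release_counts({"video_a": 120, "video_b": 7}, epsilon=0.5, rng=rng)
reports = [randomized_response(1, epsilon=1.0, rng=rng) for _ in range(5)]
```

A smaller `epsilon` means a larger noise scale (or a higher flip probability) and thus stronger privacy at the cost of less accurate popularity estimates, which is the performance and privacy trade-off discussed throughout this survey.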

Confusion enhances privacy in edge caching in two main ways. The first is cache obfuscation (such as proactive cache (Qian et al., 2020; Nikolaou et al., 2016), off-path cache (Wu et al., 2016), and request hit delay (Liang and Liu, 2019)), which protects users’ requests from monitoring or timing attacks when content is retrieved in an untrusted or semi-trusted network environment. The second is spatial confusion (Amini et al., 2011; GU Yi-ming, BAI Guang-wei, SHEN Hang, 2019; Zhang et al., 2019), which protects location information when users access location-based services.

Anonymity-based methods are the last category of noise-based mitigation measures. Anonymity is the act of not being named or of using an alias, as opposed to revealing a real identity (Cui et al., 2020b). In particular, a set of public data satisfies $K$-anonymity if the information of any entity cannot be distinguished from that of at least $K-1$ other entities. The $K$-anonymity method is often used to protect geographical information (Hu et al., 2018; Nisha et al., 2022; Yang and Kong, 2016) and personal identity information (Sen et al., 2018; Cui et al., 2020b) in edge caching systems. Besides, anonymity group technology is also used to protect users’ identity information (Zhang et al., 2022c; Xue et al., 2019, 2018; Nguyen et al., 2023).
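The anonymity definition above can be checked mechanically. The sketch below is an illustration under assumptions: records are dictionaries, the quasi-identifier fields (`cell`, `age_band`) and the helper name are hypothetical, and the data set is made up.

```python
from collections import Counter

def satisfies_k_anonymity(records, quasi_identifiers, k):
    """True if every combination of quasi-identifier values occurs at least k times."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return all(count >= k for count in groups.values())

# Hypothetical released records with generalized location cells and age bands.
records = [
    {"cell": "A1", "age_band": "20-29", "request": "video_a"},
    {"cell": "A1", "age_band": "20-29", "request": "video_b"},
    {"cell": "A1", "age_band": "20-29", "request": "video_c"},
    {"cell": "B2", "age_band": "30-39", "request": "video_a"},
    {"cell": "B2", "age_band": "30-39", "request": "video_d"},
]
two_anonymous = satisfies_k_anonymity(records, ["cell", "age_band"], k=2)
three_anonymous = satisfies_k_anonymity(records, ["cell", "age_band"], k=3)
```

Here the release is 2-anonymous but not 3-anonymous, because the `("B2", "30-39")` group contains only two records.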

3.2.2. Trusted distributed computing-based methods

Trusted distributed computing (TDC) methods encompass three primary mitigation frameworks—Federated Learning (FL), Secret Sharing (SS), and blockchain technology—to safeguard privacy in the context of edge caching.

Federated Learning (FL) is a distributed machine learning technique that trains a learning-based algorithm across multiple decentralized devices or edge servers, each of which holds its data samples locally without exposing them (Brendan McMahan et al., 2017). The federated learning framework is one of the most essential methods to preserve private data during the machine learning process. A federated learning framework (Yu et al., 2018; Wang et al., 2019a, 2020; Liu et al., 2022; Yu et al., 2021, 2020; Li et al., 2020) typically trains learning models by exposing only model parameters or gradients, whereas traditional machine learning methods need to collect raw data for the learning process. However, model parameters and gradients are also private assets of users, since attackers can infer and recover users’ private information from exposed model information. In addition, model information may carry significant economic value, so exposing it directly compromises the interests of model owners. A number of works (Yu et al., 2022; Wang et al., 2022a; Cui et al., 2022; Chen et al., 2022a) have upgraded the federated learning framework by injecting noise or other interference into model information prior to exposure.
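The aggregation step of federated learning, and the idea of perturbing model information before upload, can be sketched as follows. This is a minimal illustration assuming sample-weighted federated averaging and Gaussian perturbation; the function names, the noise level `sigma`, and the toy weight vectors are hypothetical, and no specific surveyed framework is reproduced.

```python
import random

def fed_avg(client_updates):
    """Average client weight vectors, weighted by local sample counts."""
    total = sum(n for _, n in client_updates)
    dim = len(client_updates[0][0])
    return [sum(w[i] * n for w, n in client_updates) / total for i in range(dim)]

def perturb(weights, sigma, rng):
    """Optional Gaussian perturbation applied by a client before upload."""
    return [w + rng.gauss(0.0, sigma) for w in weights]

# Each entry is (local model weights, number of local samples); raw data never leaves clients.
updates = [([1.0, 2.0], 1), ([3.0, 4.0], 3)]
global_weights = fed_avg(updates)

rng = random.Random(7)
noisy_global = fed_avg([(perturb(w, 0.1, rng), n) for w, n in updates])
```

The server only ever sees weight vectors, but as noted above these can still leak information, which is why the perturbation step (or another protection mechanism) is applied on the client side before exposure.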

Secret Sharing (SS), also known as secret splitting, is a secure multi-party computation and storage method in which each party gets a part of the secret, called a secret share. The shared secret cannot be recovered unless a sufficient number of secret shares are collected, and a single share cannot restore the original secret. For example, the $(t,n)$-threshold scheme is the most straightforward secret-sharing scheme. In this scheme, there are a total of $n$ players, and each player receives only one secret share. The secret can be recovered if at least $t$ players cooperate, but not if fewer than $t$ players cooperate, where $t$ is the safety threshold parameter. SS can be introduced to protect request traces (Acs et al., 2019; Schlegel et al., 2022) and the content popularity information (Andreoletti et al., 2019b) in edge caching systems.
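One well-known way to realize a $(t,n)$-threshold scheme is Shamir's polynomial-based construction, sketched below. The text above does not name a particular construction, so this choice is an assumption, as are the field prime, the function names, and the example secret. The seeded `random.Random` is used only for reproducibility; a real deployment would need a cryptographically secure source of randomness.

```python
import random

PRIME = 2**61 - 1  # a Mersenne prime, used here as the field modulus (assumption)

def split_secret(secret, t, n, rng):
    """Split `secret` into n shares; any t of them determine the degree-(t-1) polynomial."""
    coeffs = [secret] + [rng.randrange(PRIME) for _ in range(t - 1)]
    def evaluate(x):
        acc = 0
        for c in reversed(coeffs):  # Horner's rule
            acc = (acc * x + c) % PRIME
        return acc
    return [(x, evaluate(x)) for x in range(1, n + 1)]

def recover_secret(shares):
    """Lagrange interpolation at x = 0 over the prime field."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * (-xj)) % PRIME
                den = (den * (xi - xj)) % PRIME
        secret = (secret + yi * num * pow(den, PRIME - 2, PRIME)) % PRIME
    return secret

SECRET = 123456789
shares = split_secret(SECRET, t=3, n=5, rng=random.Random(1))
recovered = recover_secret(shares[:3])
```

Any three of the five shares reconstruct the secret, while interpolating from only two shares yields an unrelated field element.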

Blockchain is a technical solution that carries out network data storage, verification, transmission, and communication through its own distributed nodes without relying on third parties. As Fig. 4 shows, the blockchain mechanism automates four steps: (1) When a new blockchain transaction occurs, all participants can compete to record that transaction as a data block. (2) Following the consensus rule, most participants on the blockchain network must vote for a recorded transaction for it to be accepted as valid. The consensus mechanism can vary with the type of network but is typically established at the start of the network. (3) Once participants have reached a consensus, transactions are written into blocks, each of which carries a cryptographic hash that links it to the preceding block to form a chain. (4) The blockchain system finally updates and broadcasts a copy of the latest ledger to all participants. Blockchain can be used to enhance the protection of user preferences (Qian et al., 2020), personal information (Lei et al., 2020; Dai et al., 2020; Vu et al., 2019; Liu et al., 2020), and machine learning data (Cui et al., 2022) in edge caching systems.

Figure 4. A simplified workflow depicting the blockchain mechanism for generating a new block and adding it to the chain.
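The hash-linking part of this workflow (step 3) can be sketched in a few lines. The sketch deliberately omits the competitive recording, consensus voting, and broadcast steps; the block layout, field names, and helper functions are hypothetical and only illustrate why tampering with a recorded transaction is detectable.

```python
import hashlib
import json

GENESIS_PREV = "0" * 64  # placeholder predecessor hash for the first block

def block_hash(index, prev_hash, transactions):
    """SHA-256 over a canonical JSON encoding of the block contents."""
    payload = json.dumps(
        {"index": index, "prev_hash": prev_hash, "transactions": transactions},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

def append_block(chain, transactions):
    """Append a block whose hash covers the previous block's hash."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_PREV
    index = len(chain)
    block = {
        "index": index,
        "prev_hash": prev_hash,
        "transactions": transactions,
        "hash": block_hash(index, prev_hash, transactions),
    }
    chain.append(block)
    return block

def chain_is_valid(chain):
    """Recompute every hash and check each link to the preceding block."""
    prev_hash = GENESIS_PREV
    for i, block in enumerate(chain):
        expected = block_hash(i, block["prev_hash"], block["transactions"])
        if block["prev_hash"] != prev_hash or block["hash"] != expected:
            return False
        prev_hash = block["hash"]
    return True

chain = []
append_block(chain, ["cache content c1 at edge node e1"])
append_block(chain, ["cache content c2 at edge node e2"])
append_block(chain, ["evict content c1 from edge node e1"])
valid_before = chain_is_valid(chain)
chain[1]["transactions"] = ["cache content c9 at edge node e2"]  # tampering
valid_after = chain_is_valid(chain)
```

Because each block's hash covers its predecessor's hash, modifying any earlier transaction invalidates the chain from that block onward unless every later block is recomputed.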

3.2.3. Cryptology-based methods

Cryptology-based methods, as a vital category of mitigation approaches, play a significant role in preserving privacy within edge caching systems. These methods employ cryptographic techniques to safeguard sensitive content or information, ensuring confidentiality, integrity, and authentication. Within the realm of edge caching, three specific types of methods are leveraged: encryption communication, Homomorphic Encryption (HE), and Private Information Retrieval (PIR).

Encryption communication protects the security and privacy of communication data in two steps. First, the sender encrypts the content with an encryption algorithm and the receiver’s public key to obtain the ciphertext. Second, the receiver, upon getting the ciphertext, decrypts it with the decryption algorithm and its private key to recover the original data. Encryption communication is commonly used to protect the security of user request records and other data in Internet communications. There are three main approaches to encryption in privacy-enhanced edge caching systems. The first is symmetric encryption, which mainly uses the Data Encryption Standard (DES), the Advanced Encryption Standard (AES) (Pu et al., 2019; Yuan et al., 2016), or Searchable Encryption (SE) (Cui et al., 2020a). The second is asymmetric encryption, which mainly includes Rivest-Shamir-Adleman (RSA) (Xu et al., 2019), Attribute-Based Encryption (ABE) (Pu et al., 2019), and Elliptic Curve Cryptography (ECC) (Zhang et al., 2022c; Cui et al., 2020c). The third is hashing algorithms (Xue et al., 2019, 2018; Xu et al., 2019), which are sometimes used in blockchain (Cui et al., 2022; Lei et al., 2020). However, there are three significant concerns with the use of cryptographic methods in edge caching systems. Firstly, because requests are encrypted, third-party edge caches often cannot directly use them to retrieve related content, which may make the edge cache unusable. Secondly, introducing encryption technology may impose computational pressure on resource-constrained edge and end devices. Lastly, encryption communication may fail to protect record privacy against content providers or service providers, who hold the key to decrypt request information. Therefore, how to introduce cryptology-based techniques into edge caching systems is still a challenging problem. In addition, as a special communication encryption method, the digital signature (Kong et al., 2019; Chen et al., 2022a; Jiang et al., 2020) is often used in the edge cache to verify user identity and data reliability.

Homomorphic Encryption (HE) is a form of encryption by which the parties jointly compute the result of a specific objective function over their private data without a Trusted Third Party (TTP). No party can unveil the private data of the other parties, even after the computation is completed. In other words, it allows a participant to perform operations such as searching and multiplying on encrypted data and to obtain correct results without decrypting the data during the calculation. HE can be used to protect user preferences (Cui et al., 2020c) and information (Kong et al., 2019) when searching the edge cache.
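Computing on ciphertexts can be illustrated with a toy additively homomorphic scheme in the style of Paillier, the algorithm that Kong et al. (2019) are described as using later in this survey. The sketch below is an assumption-laden illustration, not their protocol: the primes are tiny and insecure, the generator is fixed to $g = n + 1$, the function names are hypothetical, and the seeded random generator is for reproducibility only (requires Python 3.8+ for the modular inverse via `pow`).

```python
import math
import random

def keygen(p=293, q=433):
    """Toy key generation; p and q are tiny illustrative primes and offer no security."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # valid because the generator g = n + 1 is used below
    return n, (lam, mu)

def encrypt(n, m, rng):
    """c = (n + 1)^m * r^n mod n^2 for a random r coprime to n."""
    n2 = n * n
    r = rng.randrange(1, n)
    while math.gcd(r, n) != 1:
        r = rng.randrange(1, n)
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2

def decrypt(n, private_key, c):
    """m = L(c^lam mod n^2) * mu mod n, with L(x) = (x - 1) // n."""
    lam, mu = private_key
    l_value = (pow(c, lam, n * n) - 1) // n
    return (l_value * mu) % n

def add_encrypted(n, c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts."""
    return (c1 * c2) % (n * n)

n, private_key = keygen()
rng = random.Random(3)
c_sum = add_encrypted(n, encrypt(n, 17, rng), encrypt(n, 25, rng))
total = decrypt(n, private_key, c_sum)
```

The party that multiplies the two ciphertexts never sees 17 or 25; only the key holder can decrypt the sum.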

Private Information Retrieval (PIR) is mainly used to protect a user’s request records (Tong et al., 2022; Kumar et al., 2019) in the edge caching system. When users obtain sensitive data, their request records are likely to expose important private information. PIR helps users retrieve data from the edge cache without leaking what they are querying. In other words, PIR prevents attackers from learning the precise query and the content items involved in cache retrieval or other sensitive queries, while still letting users obtain the desired content.
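The basic idea can be sketched with a classic two-server XOR construction. This is an illustration under strong assumptions: two non-colluding caches hold identical replicas, all items have equal length, and the function names and database contents are hypothetical. The coded-caching PIR schemes cited in this survey are more elaborate and are not reproduced here.

```python
import secrets

def make_queries(db_size, index):
    """Two bit-vector queries that differ only at the wanted index."""
    q1 = [secrets.randbelow(2) for _ in range(db_size)]
    q2 = list(q1)
    q2[index] ^= 1
    return q1, q2

def answer(db, query):
    """Each server XORs together the items selected by its query vector."""
    acc = bytes(len(db[0]))
    for item, bit in zip(db, query):
        if bit:
            acc = bytes(a ^ b for a, b in zip(acc, item))
    return acc

def reconstruct(answer1, answer2):
    """XOR of both answers cancels every item except the wanted one."""
    return bytes(a ^ b for a, b in zip(answer1, answer2))

db = [b"vidA", b"vidB", b"vidC", b"vidD"]  # replicated on both servers
q1, q2 = make_queries(len(db), index=2)
item = reconstruct(answer(db, q1), answer(db, q2))
```

Each query on its own is a uniformly random bit vector, so neither server alone learns which item was requested; privacy is lost if the two servers compare their queries.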

3.2.4. Other methods

Optimization-based methods and access control are the other two main methods to enhance the effectiveness of privacy protection in edge caching systems. In optimization-based methods, metrics such as privacy exposure and credibility are mathematically modeled. Then, the quantified metrics are regarded as the objective function or constraint variables of a cache optimization problem. Ultimately, the optimization problem is solved to obtain optimal privacy protection decisions. Access control is an enforcing control method that allows or denies a user’s access to a specific network resource, e.g., private content in the edge cache, based on the user’s account or group. Without a defined authorization mechanism, access to system resources will have no restrictions, and thus illegal device operations can be easily launched. The edge cache can implement strict access control to filter out unauthorized or illegal accesses into the caching space for privacy protection. Access control methods have been applied to protect personal information (Lei et al., 2020; Cui et al., 2020a; Zhang et al., 2022c) and content privacy (Xue et al., 2019, 2018) in edge caching systems. In the next section, we dive into the details of defence methods for protecting each type of sensitive privacy.

4. Enhancing User Privacy in Edge Cache

User privacy is the most important type of privacy in edge caching systems and has attracted tremendous research effort, dominating the research on privacy preservation in edge caching systems. We discuss defence methods for three types of user privacy, i.e., request traces, personal information, and location.

4.1. Privacy of Request Traces in Edge Cache

Request traces are the most critical private information in the edge cache, from which adversaries can obtain user preferences (Wang et al., 2019b). We summarize methods to protect user request records from four aspects: noise-based methods, cryptology-based methods, trusted distributed computing-based methods, and other methods. A brief timeline of solutions for enhancing the privacy of request traces is presented in Fig. 5. The solutions for enhancing other types of user privacy, e.g., personal information and location, are also summarized in Fig. 5 for the sake of brevity.

4.1.1. Noise-based methods

The first class of methods to protect request records is noise-based methods, of which there are two main kinds. The first adds noise generated by a mechanism such as Differential Privacy (DP) to protect information (Zhang et al., 2018; Zhou et al., 2019; Wang et al., 2019b; Zhu et al., 2021; Zeng et al., 2020, 2021; Guo et al., 2022; Wang et al., 2017). The second is cache obfuscation (such as proactive cache (Qian et al., 2020; Nikolaou et al., 2016), off-path cache (Wu et al., 2016), and request hit delay (Liang and Liu, 2019)), which protects users’ requests from monitoring or timing attacks when content is retrieved in an untrusted or semi-trusted network environment. In the following, we elaborate on these methods in order.

Figure 5. A brief timeline of solutions aimed at enhancing user privacy, including request traces, personal information and location, in the edge cache. Each solution is accompanied by its main mitigation approach.
Differential Privacy (DP)

Content providers (CPs) often utilize edge caching nodes in edge networks and collect users’ private access records to predict user preference to improve delivery efficiency. However, directly collecting users’ profiles can lead to privacy breaches. Additionally, in highly dynamic scenarios, the entities of edge cache (e.g., Edge Nodes (ENs) (Zhou et al., 2019) and Edge Servers (ESs) (Zhu et al., 2021)) collect user request records in real-time and make dynamic decisions to improve the efficiency of edge caching. This real-time data collection process also poses a risk of privacy leakage, where DP-based methods can be employed to mitigate the risk.

Zhou et al. (Zhou et al., 2019) proposed a privacy-preserving and online distributed multimedia content retrieval system. Each EN in the system is modelled as an online learner that exploits user requests together with a context that includes their background information (e.g., age, gender, location, social profile, and query criteria). The ENs can collaboratively make multimedia content recommendations and cache content in the edge network. When an EN needs extra context information to make a retrieval scheme, the TTP sends noisy records to the EN by deploying differential privacy. A trust mechanism is also proposed to identify and remove malicious ENs. Zhu et al. (Zhu et al., 2021) studied the trade-off between privacy protection and caching efficiency in the edge cache. When a user generates a content rating vector, Gaussian noise is added to the original rating vector, and the distorted rating vector is then transmitted to the ES for privacy protection. In the global information aggregation stage, the ES calculates the eigenvalues and eigenvectors of the collected data based on the lightweight level calculation algorithm. Then, the ES broadcasts the results to all users.

In collaborative edge caching, managers exchange sensitive information, such as user records or preferences (Zeng et al., 2020; Zhou et al., 2019) and routing records (Zeng et al., 2020), to improve caching efficiency. However, privacy protection in collaborative edge caching often relies on a centralized TTP, which is challenging to obtain in practice and places more pressure on network bandwidth. Moreover, if the centralized TTP is attacked, it may pose a more serious risk of privacy breach. Zeng et al. (Zeng et al., 2020) proposed a distributed method to develop network caching and routing strategies for Small Base Stations (SBSs). The scheme adds DP noise to the routing information (i.e., the portion of the requested content served by each SBS) during the exchange process to protect the privacy of SBSs and Mobile Users (MUs). It defines an optimization problem that minimizes the global cost, which is solved by a distributed protocol. Guo et al. (Guo et al., 2022) introduced a blockchain and DP-based decentralized edge-thing system for privacy preservation and fair utilization of edge computing resources. The proposed system employs blockchain to deal with the tampering of transactions and smart contracts caused by a malicious auctioneer node. Moreover, an exponential mechanism-based DP is applied to the double auction scheme to tackle inference attacks on the auction results saved in the blockchain.

Hits on the user’s local cache can provide the best service experience for users. However, it is challenging for end devices to make accurate pre-fetching decisions by relying solely on a user’s personal historical information. Collaboration between users is necessary, but such information exchange is risky, and the recorded history must be protected when disclosed. Wang et al. (Wang et al., 2019b) presented a mobile video pre-fetching strategy based on differential privacy and distributed online learning algorithms. They formulated the pre-fetching problem as an online optimization problem considering user preferences, video popularity, and social connections. The problem is then decomposed into two sub-problems, which are solved and swapped at each terminal by a distributed method to obtain the optimal global solution. A differential privacy mechanism is applied to the user-sensitive information exchanged in each round of iteration to protect user privacy.

Cache Obfuscation

In an Information-Centric Network (ICN), users can directly access desired content from edge routing nodes. However, edge routing nodes are often vulnerable to cache side-channel attacks, which can expose the privacy of request records. Liang and Liu (Liang and Liu, 2019) designed a method to defend against timing attacks in Content-Centric Networks (CCN). Based on the degree of privacy protection required for the requested content and the degree of honesty of the requesting nodes, both evaluated from historical information, the caching node calculates a delay for responding to requests to defend against timing attacks. Further, Wu et al. (Wu et al., 2016) designed a multi-path caching strategy for ICN based on random linear network coding. The strategy encodes different video chunks into the same block for efficient content delivery. When the block is delivered along the path, it can serve only the routing nodes with related video chunk requests and remains unavailable to irrelevant nodes. The strategy adopts a random forwarding method that increases the diversity of routing paths, thereby increasing the size of anonymity sets and the cost of inferring user privacy.

In addition, proactive caching of redundant and obfuscated content at the edge can interfere with an attacker’s ability to access the user’s actual request records. Qian et al. (Qian et al., 2020) proposed a privacy-aware content caching architecture for Cognitive Internet of Vehicles (CIoV) networks with proactive caching and blockchain technology. In this system, Roadside Units (RSUs) and smart vehicles can cache content in advance and broadcast the cached content to meet the content needs of other vehicles. Therefore, a vehicle only needs to obtain content from the broadcast data without issuing further requests, which reduces user privacy exposure. At the same time, blockchain technology is introduced to provide a more secure and reliable transaction mode that guarantees the reliability of the content. Additionally, Nikolaou et al. (Nikolaou et al., 2016) proposed two cache placement strategies for the joint caching of users. The first strategy considers the graph network structure between user terminals, and the second one focuses on the workload change of the server. However, transmitting requested videos between clients leaks the privacy of both sides. To mitigate this, the requesting user proactively fetches and caches obfuscated content. At the same time, the server adds randomly obfuscated addresses when sending feasible retrieval address lists to reduce the risk of privacy exposure.

4.1.2. Trusted distributed computing-based methods

The second category, trusted distributed computing-based methods for safeguarding request records, primarily comprises Secret Sharing (SS), a secure multi-party computation technique that can effectively prevent attackers from acquiring valuable request records. Acs et al. (Acs et al., 2019) proposed two timing attack defence methods for the edge router cache in ICN. For interactive traffic, random naming and secret sharing are used to prevent attackers from obtaining specific traffic information. For content distribution traffic, an artificial delay is added to private content that hits the router cache, which prevents adversaries from determining the hit status of privacy-sensitive content.

4.1.3. Cryptology-based methods

Cryptology-based methods have been widely used to protect the security and privacy of user request records and other information in Internet communications. However, there are also three challenging problems when using cryptographic methods to protect the privacy of request records in edge caching systems. Firstly, due to the existence of encryption, third-party edge cache probably cannot directly use encrypted requests to retrieve related content, leading to the unavailability of edge cache (Yuan et al., 2016; Leguay et al., 2017). Secondly, introducing encryption technology may pose heavy computational pressure on the resource-constrained edge and end devices. Lastly, cryptology-based methods fail to prevent the leakage of record privacy from content providers or service providers, who have the key to decrypt requests. Therefore, how to apply cryptology-based techniques to edge caching systems is still a challenging problem.

Encryption Communication

To prevent the monitoring of users’ request records by Internet Service Providers (ISPs), efforts have been made to encrypt request records and the corresponding transmitted data using encryption algorithms while ensuring the availability of the cache within the ISP. Yuan et al. (Yuan et al., 2016) designed a system to achieve efficient encrypted video delivery in the ISP network. The content cached in the network is encrypted and distributed in the ISP network. This system can efficiently and safely locate and retrieve related content from the ISP network with a proposed encrypted content fingerprint index for a given encrypted request.

In order to improve privacy in the Content Delivery Network (CDN), Cui et al. (Cui et al., 2020a) proposed a novel encryption method that combines Searchable Encryption (SE) and a multi-CDN strategy to achieve both content delivery performance and security in edge CDN nodes. The work introduces the SE method to realize content security and searchability. In addition, a semantically secure algorithm is used to encrypt user requests so that the same query can correspond to different request content. To further protect user preference privacy, a one-time nonce is also used for secondary encryption and is transmitted together with the content transferred between CDN node clusters. For each request, the node must receive the nonce to search, and after a search hit, the nonce must be regenerated and the content re-encrypted before delivery continues.

Homomorphic Encryption (HE)

In previous works, HE has been introduced to protect the privacy of vehicles’ request records in IoV while collaborating with RSUs to improve the efficiency of edge caching. Cui et al. (Cui et al., 2020c) proposed a cooperative download scheme in the IoV network, considering the security and privacy protection of request traces. This scheme uses an edge computing architecture to reduce transmission delay. It uses lightweight cryptographic methods, such as elliptic-curve cryptography, TESLA broadcast authentication, and additive HE, to protect user privacy and content security. The strategy proposed in this work is composed of two phases, the non-accelerated phase and the accelerated phase, the details of which can be found in (Cui et al., 2020c).

Kong et al. (Kong et al., 2019) utilized an invertible matrix to construct multiple content requests sent by different vehicles such that the RSUs can recover each request without being associated with a specific car. Specifically, when a vehicle needs to initiate a request, it will first generate a $k \times k$ random invertible matrix and send secret information required for HE to $k$ vehicle users within a unified range. Then, in the response, a collaborative request group is randomly selected for the requested vehicle. Other vehicles in the group first generate the requested information according to the Paillier HE algorithm and send it to the RSU, returning the HE information to the requested vehicle. That vehicle completes the corresponding HE according to the returned information and the invertible matrix. Finally, it sends the encrypted request to the RSU to retrieve the private content without exposing its privacy.
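To make the additive property of Paillier HE concrete, the following is a minimal toy sketch in Python. It is not the protocol of (Kong et al., 2019) and omits the invertible-matrix step entirely; the function names, the tiny fixed primes, and the choice $g = n + 1$ are assumptions made for illustration only and are far too small to be secure.

```python
import math
import random

def keygen(p=1789, q=1861):
    """Toy Paillier key generation with tiny fixed primes (insecure, illustration only)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    g = n + 1                   # common simplification: g = n + 1
    mu = pow(lam, -1, n)        # modular inverse of lam mod n (Python 3.8+)
    return (n, g), (lam, mu)

def encrypt(pub, m, rng=random):
    """Encrypt an integer 0 <= m < n under the public key."""
    n, g = pub
    n2 = n * n
    while True:
        r = rng.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(pub, priv, c):
    """Recover the plaintext from ciphertext c."""
    n, _ = pub
    lam, mu = priv
    u = pow(c, lam, n * n)
    return ((u - 1) // n * mu) % n

def add_encrypted(pub, c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts."""
    n, _ = pub
    return (c1 * c2) % (n * n)

pub, priv = keygen()
c_sum = add_encrypted(pub, encrypt(pub, 17), encrypt(pub, 25))
print(decrypt(pub, priv, c_sum))
```

The point of the sketch is that a party holding only ciphertexts can combine them, while only the key holder learns the combined result.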

Private Information Retrieval (PIR)

By utilizing PIR methods, users are able to obtain the content they desire while preventing potential leaks of their private interests. Kumar et al. (Kumar et al., 2019) were the first to introduce a PIR strategy based on encoding cache into wireless edge caching. Erasure-correcting codes are used to encode cached content, and different bit rates can be selected for videos with varying popularity to conserve backhaul bandwidth usage. Additionally, the scheme is based on general Reed-Solomon coding to safeguard user privacy from SBSs that may collude with one another. Furthermore, ensuring the integrity of content in the edge cache is essential for maintaining a stable edge caching system. This is particularly important because edge devices owned by individuals or small organizations are susceptible to cache tampering attacks and internal hardware failures. However, verifying the integrity of the content can compromise its privacy, especially when third-party verifiers are involved. To address this issue, Tong et al. (Tong et al., 2022) proposed an integrity-checking protocol for edge storage based on provable data possession to verify the integrity of cached content on a single EN. The protocol employs a PIR scheme and homomorphic verifiable tags to prevent the disclosure of sensitive information (e.g., user request traces, edge download schemes, and private content) to verifiers.
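The basic idea of PIR can be illustrated with the classic two-server XOR construction, sketched below in Python. This is a deliberately simpler technique than the Reed-Solomon coded scheme of (Kumar et al., 2019): it assumes two caches that hold identical copies of equal-length content blocks and do not collude, and all function names are hypothetical.

```python
import secrets

def make_queries(num_items, wanted):
    """Build two bit vectors that differ only at the wanted index.
    Each vector on its own is uniformly random, so it reveals nothing."""
    q1 = [secrets.randbelow(2) for _ in range(num_items)]
    q2 = list(q1)
    q2[wanted] ^= 1
    return q1, q2

def answer(database, query):
    """Cache side: XOR together every block selected by the query."""
    result = bytes(len(database[0]))
    for block, bit in zip(database, query):
        if bit:
            result = bytes(a ^ b for a, b in zip(result, block))
    return result

def recover(ans1, ans2):
    """User side: the XOR of both answers equals the wanted block."""
    return bytes(a ^ b for a, b in zip(ans1, ans2))

# Hypothetical catalogue of equal-length content blocks.
catalogue = [b"video-A1", b"video-B2", b"video-C3", b"video-D4"]
q1, q2 = make_queries(len(catalogue), wanted=2)
print(recover(answer(catalogue, q1), answer(catalogue, q2)))
```

Every block selected by both queries cancels out in the final XOR, leaving only the block at the index where the two queries differ.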

4.1.4. Other methods

Other methods, such as optimization-based methods, have also been introduced to enhance the privacy of request traces or user preferences in EC. Sivaraman et al. (Sivaraman and Sikdar, 2021) used game theory to formulate an off-path and cooperative caching problem at the edge of ICN, where users can choose their optimal routers in the edge network to cache content. Constraints in the problem include network latency, caching cost, and the amount of exposed user privacy. Two different privacy measures (i.e., conditional entropy and differential privacy) are used as constraints in the work. Finally, it is proved that a Nash equilibrium point exists in the game, which can be solved by an iterative method.

Furthermore, Cao et al. (Cao et al., 2020) studied the reliable and efficient performance of multimedia transmission services between Base Stations (BS) and MUs through a two-stage joint optimization. In the first stage of optimization, a service reliability evaluation mechanism is designed to evaluate the credibility of BS to ensure the security of user privacy information. Then, the price and reliability competition among BSes and the strategic interaction of all players are modelled by the Stackelberg game (He et al., 2007). A resource allocation problem is further proposed in the second stage to coordinate multiple MUs serving on the same BS. The potential game model is used to improve the transmission service performance. Additionally, Shi et al. (Shi et al., 2018) proposed a model for the cache placement problem in wireless edge caching, considering a multi-attacker scenario where both benign users’ and attackers’ locations follow a homogeneous Poisson Point Process (PPP). An optimization problem is formulated to determine the probability of each caching file, considering the average probability of successful eavesdropper attacks and transmissions in the wireless edge network. Finally, the genetic algorithm is used to maximize the secure transmission performance of the system.

4.2. Privacy of Personal Information in Edge Cache

Personal identity information is also sensitive in the network, which can be used by edge cache for carrying out sensitive operations such as permission control and cache admission control. However, excessive disclosure of users’ personal identity information makes it convenient for malicious nodes and attackers to spam users with advertisements and recommendations and attack edge servers by polluting cached content.

4.2.1. Blockchain-based methods

Previous works mainly employ blockchain to protect users’ identity information (Lei et al., 2020; Dai et al., 2020; Vu et al., 2019; Liu et al., 2020). Specifically, Vu et al. (Vu et al., 2019) proposed a blockchain-based CDN (B-CDN) architecture for content delivery, which enables anonymous operations on users. The B-CDN leverages smart contracts to maintain the blockchain and provide CPs with users’ registration and subscription functions while ensuring user privacy. Additionally, the B-CDN can reduce the cost of CP management by utilizing a public database of request traces, which allows CPs to estimate users’ preferences with virtual identities and maximize the efficiency of their caching services.

Named Data Network (NDN) is a variant of the ICN, where content can be retrieved by the content name. Lei et al. (Lei et al., 2020) introduced a blockchain-based security architecture for improving the security and privacy of NDN-based vehicular edge computing systems. This work deploys blockchain nodes in edge servers and ISP nodes, where a delegated consensus algorithm is designed to enhance the efficiency of the blockchain. A three-layer management framework and an access control strategy are proposed for key management based on blockchain verification and vehicle attributes, respectively. A resource requester needs to prove to blockchain consensus nodes that it satisfies the access condition according to the access policy of the resource owner.

Dai et al. (Dai et al., 2020) designed a content caching mechanism based on the permissioned blockchain technology to address the problem of privacy and security in the vehicle edge computing network. A new block validator selection method is proposed to achieve a fast and efficient blockchain consensus mechanism. In addition, this work presents a deep reinforcement learning-based vehicle content caching algorithm.

Liu et al. (Liu et al., 2020) designed a decentralized caching framework empowered with blockchain credentials to tackle the challenges of content data verification and edge device authentication. In the designed system, it is possible to trace each transaction in an active edge network without a central manager. A cache order matching technique is devised to use the cache resources efficiently. Further, data integrity is verified with the help of a content trading mechanism, which supports data sharing among the edge devices of the edge network and ensures the efficiency of trading in the edge caching system.

4.2.2. Other methods

Access control is also exploited to protect users’ identity information (Nguyen et al., 2023; Zhang et al., 2022c; Xue et al., 2019, 2018). In (Xue et al., 2019, 2018), Xue et al. proposed a secure and efficient network access framework (SEAF) for cache resources at the edge of ICN. The SEAF provides several security and privacy features, including content confidentiality, user privacy protection, user privilege revocation, accountability, and efficiency. In SEAF, routers at the network edge authenticate user requests to separate access control from content provisioning. Only authenticated requests can enter the network; thus, only authorized users can access the bandwidth and cache resources inside the ICN. Meanwhile, to protect privacy, users can prove their identity to the edge router by generating a valid group signature, thereby remaining anonymous to the edge router. Zhang et al. (Zhang et al., 2022c) focused on the security issues of cache-based software-defined networks, using the TESLA protocol to achieve fast authentication of the caches of vehicles and fog nodes. Besides, the Pedersen commitment mechanism is used to directly authenticate vehicles and fog nodes without exposing user identity privacy. Considering the limited computing power and delay-sensitive characteristics of the IoV, the authors designed a set of cryptographic mechanisms supporting batch verification.

4.3. Privacy of Location in Edge Cache

Location information is a critical and highly valuable kind of private data, including moving trajectory (Wu et al., 2023; Li et al., 2023; Zhang et al., 2022a), spatial coordinates (Zhang et al., 2019), and other unique features (Zhang et al., 2022a). Noise-based methods comprise the primary class of techniques employed to enhance location privacy, as illustrated in Fig. 5.

4.3.1. Noise-based methods

Noise-based methods are mainly introduced to protect location privacy, including geographic differential privacy (Zhang et al., 2022a), spatial obfuscation (Amini et al., 2011; GU Yi-ming, BAI Guang-wei, SHEN Hang, 2019; Zhang et al., 2019), $k$-anonymity (Cui et al., 2020b; Nisha et al., 2022; Yang and Kong, 2016; Sen et al., 2018; Zhang et al., 2019), etc.

Differential Privacy

With the increasing mobility of users and the constant threat of malicious attacks from third parties, there is a growing risk of privacy breaches in mobile edge caching. In order to address this issue, Zhang et al. (Zhang et al., 2022a) proposed a DP-based method for improving the Video Quality of Experience (VQoE) for mobile users while protecting their location and preference privacy in mobile edge caching. The proposed scheme utilizes a privacy-preserving approach for computing the location transfer model and aggregating user preferences, achieving a balance between caching service efficiency and privacy protection in mobile edge networks. Specifically, the Laplacian perturbation model satisfying the LDP mechanism is employed to protect users’ location and preferences when submitting their information. Based on the perturbed information, mobile edge caching nodes can evaluate the popularity of videos in the user’s area, and Q-learning (Sutton and Barto, 1998) is employed to achieve cache optimization goals combined with transcoding technologies.
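The Laplace perturbation step can be sketched as follows. This is a minimal illustration under stated assumptions, not the scheme of (Zhang et al., 2022a): each user is assumed to report a single preference score bounded in $[0, 1]$, so the sensitivity equals the range, and the function names and parameters are hypothetical.

```python
import random

def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def perturb_preference(value, epsilon, rng, lower=0.0, upper=1.0):
    """User side: clip the value to [lower, upper] and add Laplace noise
    with scale (upper - lower) / epsilon before reporting it."""
    clipped = min(max(value, lower), upper)
    sensitivity = upper - lower
    return clipped + laplace_noise(sensitivity / epsilon, rng)

def estimate_mean(reports):
    """Edge node side: the noise has zero mean, so averaging many
    perturbed reports approximates the true average preference."""
    return sum(reports) / len(reports)

rng = random.Random(7)
reports = [perturb_preference(0.3, epsilon=1.0, rng=rng) for _ in range(20000)]
print(round(estimate_mean(reports), 2))
```

A smaller `epsilon` gives stronger privacy for each user but requires more reports before the aggregate becomes a useful popularity estimate, which is the efficiency and privacy trade-off discussed above.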

Spatial Obfuscation

Amini et al. (Amini et al., 2011) were among the first to utilize devices’ cache to protect users’ location information, where location-based content can be periodically prefetched to devices in large geographic blocks before it is actually consumed. When content has been cached for a user’s local area, the user can access it directly on their device without needing external network services. This can effectively reduce privacy exposure risks for the user.

Additionally, privacy protection can be achieved through a distributed collaborative cache that forms anonymous user groups within the vicinity. Zhang et al. (Zhang et al., 2019) proposed a multi-level caching strategy to reduce the number of users directly requesting LBS from the Local Service Provider (LSP). In turn, users can obtain the required services from the local cache, surrounding neighbour caches, and trusted anonymizers. In this way, the interaction with untrusted LBS is reduced, and privacy exposure is mitigated. When a request misses in all of these caches, a stealth zone is generated and the request is sent to the LSP. The anonymizer will select the optimal K-space anonymity to request content according to the prediction result (considering a user’s future geographic location, the caching contribution rate of each unit, and the freshness of the content in the unit).

However, the high communication overhead and computational energy consumption incurred when users collaborate as a group make this form of privacy protection costly. Moreover, centralized anonymizers are vulnerable to attacks, and if one is breached, all users’ private data may be compromised. To address these limitations, Gu et al. (GU Yi-ming, BAI Guang-wei, SHEN Hang, 2019) proposed a method that employs trusted ESs to preprocess user requests and blur their location information during the snapshot query (i.e., one-shot query) of their POI. The ESs cache the requested POI for further queries, thus minimizing the number of queries exposed to LBS providers and potential attackers. Additionally, in continuous queries, fuzzy prediction queries are generated and correlated with the actual query to enhance the queries’ utility while interfering with attackers.

Anonymity

The utilization of cache in edge devices, such as user devices (Yang and Kong, 2016; Cui et al., 2020b; Nisha et al., 2022), ENs (Sen et al., 2018) and RSUs (Hu et al., 2018), can keep users hidden from LBS providers by reusing the users’ POI within a specific region. This approach allows users to access the cached POI directly at the network edge instead of relying on remote LBS service providers. Additional privacy protections (e.g., $k$-anonymity (Yang and Kong, 2016; Hu et al., 2018; Zhang et al., 2023), $l$-diversity (Cui et al., 2020b), anonymity groups (Sen et al., 2018; Nisha et al., 2022)) are exploited when resources have to be obtained from LBS providers. As a result, the likelihood of exposing sensitive location information to the service provider is reduced.

Zhang et al. (Zhang et al., 2023) devised a Caching-based Dual $k$-Anonymous (CDKA) mechanism to preserve location privacy. CDKA uses double anonymity and multilevel caching to reduce communication overhead while providing location privacy. For this, an edge server is used to intervene between the user and the LBS server, and location privacy is ensured by making mobile clients and edge servers anonymous. The proposed mechanism is assessed for computational efficiency, communication overhead, and cache hit ratio. Additionally, to deal with the high-speed movement of vehicles in vehicular networks, Hu et al. (Hu et al., 2018) designed a privacy protection algorithm combining proactive caching and the $k$-anonymity method. When a vehicle user requests a specific POI, it needs to send $k-1$ obfuscated requests simultaneously. Besides, the corresponding request content will be obtained through multiple passing RSUs to protect the user’s location information, including the actual geographic position and the POI.
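The idea of hiding one real request among $k-1$ obfuscated ones can be sketched in a few lines of Python. This is a generic illustration rather than the algorithm of (Hu et al., 2018): the decoys are assumed to be drawn uniformly from a hypothetical candidate list, whereas a practical scheme would choose decoys that are plausible for the user's area.

```python
import random

def build_k_anonymous_requests(real_poi, candidate_pois, k, rng=None):
    """Return k requests: the real POI plus k - 1 distinct decoys,
    shuffled so that the position of the real one reveals nothing."""
    rng = rng or random.Random()
    # dict.fromkeys removes duplicates while keeping a stable order.
    decoy_pool = [p for p in dict.fromkeys(candidate_pois) if p != real_poi]
    if len(decoy_pool) < k - 1:
        raise ValueError("not enough candidate POIs to build k - 1 decoys")
    requests = rng.sample(decoy_pool, k - 1) + [real_poi]
    rng.shuffle(requests)
    return requests

# Hypothetical POI identifiers.
pois = ["fuel-12", "cafe-7", "park-3", "mall-9", "clinic-4", "hotel-1"]
print(build_k_anonymous_requests("cafe-7", pois, k=4, rng=random.Random(0)))
```

An observer who sees the batch can only guess the real request with probability $1/k$, provided the decoys are indistinguishable from genuine requests.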

To further prevent users’ location and personal information from being accessed by untrustworthy EC and malicious users, Nisha et al. (Nisha et al., 2022) proposed a caching scheme called Group Collaboration Scheme (GCS) to request POI combining with spatial obfuscation. In this scheme, users who need to find POI in a specific area will modify the requested area according to the proposed random area obfuscation algorithm and then register with the group authenticator to obtain virtual group identity information and cooperative anonymous user groups. The collaboration is one-time, and the anonymous group changes as the user moves. Users with request requirements will cooperate with nearby users to query whether the cache of other users in the anonymous group meets the request requirements. If the request POI is unavailable in the user group, the required content will be requested in the name of the anonymous group.

4.3.2. Trusted distributed computing

To enhance the Quality of Service (QoS), CPs collaborate with ISPs to deploy edge caching resources as close to the users as possible. ISPs can support edge cache by placing Virtual Servers (VSes) at the network’s edge and assigning them to CPs. However, CPs only possess the request records of users, while ISPs only have access to their geographic location information. In the caching process, CPs do not want to disclose all the requested information to the ISPs, and vice versa. To deal with this challenge, Andreoletti et al. (Andreoletti et al., 2018) proposed a secure multi-party computation protocol to facilitate cooperation between ISPs and CPs without requiring either party to disclose sensitive information. The protocol enables ISPs to obtain the number of requests for specific video content in a given area at a low computational cost. Once the ISP has this information, it can deploy VSes efficiently, and the CP can use these VSes to place the edge cache, thereby minimizing the number of hops for content delivery and reducing communication delays.

Although we have introduced the major solutions, our discussion is not exhaustive. Thus, we provide Table 3 as a supplement, briefly introducing other solutions for protecting user privacy in edge caching systems that have not been discussed in detail in Section 4.

Table 3. A brief supplement of solutions to protect user privacy in the edge caching systems.
| User Privacy | Refs. | Edge Cache Entities | Mitigation Methods | Key Ideas | Potential Attackers |
| --- | --- | --- | --- | --- | --- |
| Request Traces | (Zhang et al., 2018) | APs | LDP | Add LDP noise to the users’ preference content information. | CP |
| | (Zeng et al., 2021) | SBSs | LDP | Add LDP noise to the caching policy when the spread of the caching policy is needed. | Other SBSs |
| | (Leguay et al., 2017) | ENs | Encrypt. Comm. / Pseudonyms | Cache symmetrically encrypted content with pseudo-identifiers. | ENs |
| | (Yan and Tuninetti, 2021) | User Devices | PIR | Propose a collaborative caching scheme with encoding methods based on PIR. | Other Users |
| Request Traces / Personal Information | (Jiang et al., 2020) | ENs / Vehicles | Encrypt. Comm. / Access Control | Design a double-layer encryption scheme to achieve access control and data integrity verification in the edge cache of IoV. | Other Vehicles |
| | (Nguyen et al., 2023) | ENs | Opt.-based | Introduce a novel distributed game-theoretic technique for collaborations among CP and ENs. | CP |
| Location | (Yang and Kong, 2016) | User Devices | Anonymity | Disturb the real POI with the $k$-anonymity method during interaction with LBS. | LBS |
| | (Cui et al., 2020b) | User Devices | Anonymity | Combine a peer-to-peer caching technique and $l$-diversity to reduce privacy exposure during interaction with LBS. | LBS |
| Request Traces / Location | (Nguyen et al., 2023) | ISP | Anonymity / Spatial Obf. | Propose a double cache strategy with a pair of caches for users in a specific region to jointly request their POIs. | LBS / Other users |

5. Enhancing Content Privacy in Edge Cache

In this section, we move on to discuss defence methods that can preserve the second type of sensitive privacy, i.e., content privacy, in edge caching systems. For these defence methods, we present a timeline, as depicted in Fig. 6, summarizing the methods employed to safeguard content privacy, encompassing private content data and content popularity.

5.1. Privacy of Content Data in Edge Cache

Other than caching content for CPs, edge nodes are also able to cache private content generated by users. However, due to the presence of incompletely trusted ENs (Pu et al., 2019; Xu et al., 2019) or malicious and unauthorized users (Xu et al., 2019, 2020) in the edge network, stored content in EC may face privacy leakage risks.

DP-based methods are used to upload local data to the network cache while preserving its privacy. For example, Wang et al. (Wang et al., 2022b) proposed a Differential Privacy-Preserving Deep Learning Caching Framework (DP-DLCF) to deal with the privacy leakage problem of private content in edge caching networks. The privacy budget is utilized adaptively to strike a trade-off between privacy and prediction accuracy. In the proposed technique, users upload their data after perturbing it with a randomized response technique based on LDP to preserve the privacy of their local data. Next, the neighboring BS accumulates the uploaded data and transfers it to the deep model for training. Moreover, the prediction accuracy of the model training is improved by the bootstrap aggregation algorithm.
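A minimal sketch of binary randomized response under LDP is given below. It is not the DP-DLCF pipeline of (Wang et al., 2022b): it assumes each user holds one private bit (for example, whether a given content item was requested), and the function names are hypothetical.

```python
import math
import random

def keep_probability(epsilon):
    """Probability of reporting the true bit under epsilon-LDP."""
    return math.exp(epsilon) / (1.0 + math.exp(epsilon))

def randomize_bit(bit, epsilon, rng):
    """User side: report the true bit with probability p, otherwise flip it."""
    return bit if rng.random() < keep_probability(epsilon) else 1 - bit

def estimate_fraction(reports, epsilon):
    """BS side: debias the observed share of ones.
    E[observed] = (1 - p) + f * (2p - 1), solved here for f."""
    p = keep_probability(epsilon)
    observed = sum(reports) / len(reports)
    return (observed - (1.0 - p)) / (2.0 * p - 1.0)

rng = random.Random(11)
true_bits = [1] * 6000 + [0] * 14000          # true fraction is 0.3
reports = [randomize_bit(b, 1.0, rng) for b in true_bits]
print(round(estimate_fraction(reports, 1.0), 2))
```

Each individual report is plausibly deniable, yet the aggregate remains estimable, which is what allows the BS to train on the accumulated uploads.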

Cryptology-based methods can also be leveraged to protect the private content in the edge cache. Pu et al. (Pu et al., 2019) proposed a secure and privacy-aware content-sharing strategy to protect shared data stored and delivered by incompletely trusted ESs. To ensure the secure sharing of content, the content generator first encrypts the content using the Ciphertext-Policy Attribute-Based Encryption (CP-ABE) algorithm and calculates its signature based on its private key. Additionally, by utilizing the public key cached at the nearest ES, the generator performs secondary encryption of the content to the nearest ES. When the ES receives the encrypted content from the content generator, it will first decrypt the content with its private key and check the security of the content. According to the secret sharing scheme, the ES randomly divides the content into $n$ parts and distributes the content parts to the other $n-1$ ESs for storage. The proposed scheme can effectively ensure the integrity and recoverability of the content in case any edge cache node goes offline.

Optimization-based methods are introduced to enhance the privacy of content caching in edge servers. To prevent private content from leaking to the unreliable edges and make optimal caching decisions for MUs, Xu et al. (Xu et al., 2019) used the multi-leader and multi-follower Stackelberg game to model a multi-link cache scenario in the mobile edge network. In the scenario, Edge Computing Small Base Stations (ECSBS) act as leaders and, firstly, set pricing strategies in a non-cooperative game. Then, a trust mechanism is proposed to evaluate the reliability of each ECSBS, which consists of two parts: direct trust degree and indirect trust degree. Based on the caching reliability and pricing offered by ECSBS, MUs can make their optimal caching decisions as followers. Additionally, Xu et al. (Xu et al., 2020) proposed a Stackelberg game model to encourage Edge Cache Devices (ECDs) to provide secure caching services in both static and dynamic scenarios. The model takes into account the selfish and open nature of ECDs and employs a zero-payment mechanism to penalize ECDs that provide poor services. The optimal strategies for the CP and ECDs in a static game are analyzed, proving the existence of a unique equilibrium in the Stackelberg game. Besides, in dynamic games with incomplete information, the Q-learning algorithm is used to solve the problem.

Figure 6. A brief timeline of solutions for enhancing content privacy, including private content data and content popularity.

5.2. Privacy of Content Popularity in Edge Cache

Content popularity, which can be used as the key knowledge to improve caching efficiency, is business-critical information for the CPs and edge caching managers (e.g., ISP). Due to the limited number of records in the service scope of edge cache (e.g., serving a specific geographical location range or a particular network level), edge caching suppliers may require content providers and other edge caching entities to provide the critical content popularity information so that they can judiciously make caching decisions so as to shrink bandwidth consumption of the core network.

Andreoletti et al. (Andreoletti et al., 2019a) improved the solution proposed in (Yuan et al., 2016) by allowing CPs to encrypt content and associate it with pseudonyms to prevent privacy leakage to edge caching managers. ISPs only count the occurrences of these pseudonyms to infer content popularity without examining the original content. The authors introduced a mathematical definition of privacy and studied the trade-off between privacy and the hit rate, retrieval latency, and traffic load metrics. Additionally, Andreoletti et al. (Andreoletti et al., 2019b) proposed a protocol for spatial partitioning of ISP caches based on the popularity of different CPs’ content, which aims to improve the quality of service (QoS) of edge caching services while protecting the privacy of CPs’ popularity information. The protocol employs the Shamir Secret Sharing scheme for CPs to share the popularity information between the ISP and the Regulator Authority (RA), which guarantees a fair subdivision of the cache storage and the preservation of privacy. The ISP can calculate the caching space requirement for each CP using the secret information, thus protecting CPs’ privacy.
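The Shamir Secret Sharing primitive used in such protocols can be sketched as follows. This is a generic textbook construction, not the protocol of (Andreoletti et al., 2019b): the prime field, the two-party setting with a threshold of two, and the function names are assumptions chosen for illustration.

```python
import secrets

PRIME = 2**61 - 1  # a Mersenne prime; all arithmetic is done modulo PRIME

def split(secret, n, t):
    """Split `secret` into n shares so that any t of them reconstruct it."""
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(t - 1)]
    shares = []
    for x in range(1, n + 1):
        y = 0
        for c in reversed(coeffs):      # Horner evaluation of the polynomial
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares

def reconstruct(shares):
    """Lagrange interpolation at x = 0 recovers the secret."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * (-xj)) % PRIME
                den = (den * (xi - xj)) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret

def add_shares(a, b):
    """Shares held at the same x can be added locally; the result
    is a sharing of the sum of the two secrets."""
    return [(xa, (ya + yb) % PRIME) for (xa, ya), (_, yb) in zip(a, b)]

# Two hypothetical CPs share their request counts between ISP (x=1) and RA (x=2).
cp1 = split(1200, n=2, t=2)
cp2 = split(800, n=2, t=2)
print(reconstruct(add_shares(cp1, cp2)))
```

Because shares are additive, the two parties can aggregate the contributions of several CPs without either of them seeing an individual CP's popularity value.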

Similarly, Araldo et al. (Araldo et al., 2018) proposed a caching space partitioning method that protects the popularity information of CPs while ensuring the efficiency of edge caching. The method divides the ISP’s caching space into multiple slices and assigns each slice to different CPs using the stochastic dynamic cache partitioning algorithm. The algorithm takes an initial slice allocation as input and iteratively optimizes the slice allocation scheme by testing the Cache Miss rate of the allocation scheme in each round. However, unlike the partitioning method proposed by Andreoletti et al. (Andreoletti et al., 2019b), this method does not depend on the private information of CPs’ popularity. Additionally, this architecture also supports a transparent cache of encrypted content deployed at the edge of the ISP network.

Figure 7. A brief timeline of solutions for enhancing knowledge privacy.

6. Enhancing Knowledge Privacy in Edge Cache

In this section, we discuss defence methods that can preserve the last type of privacy, i.e., knowledge privacy, in edge caching systems. All edge caching service providers have the motivation to extract knowledge for improving caching performance, which gives rise to the trade-off between caching performance and privacy protection. Due to the high dynamics and complicated access patterns driven by users’ interest (Muller et al., 2017; Yang et al., 2019), it is essential to come up with intelligent edge caching algorithms to improve the caching performance. Machine learning-based methods provide a feasible framework to extract user access patterns by exploiting collected datasets related to users, which may contain sensitive information. For example, video request access patterns are driven by users’ interest in different locations (Ma et al., 2017; Dhar and Varshney, 2011). Users may keep moving (Dai et al., 2020), and their interests evolve over time (Zhang et al., 2022d). Thus, it is necessary to make edge caching decisions based on features that can be extracted from localized and private user information by machine learning methods.

Federated Learning (FL) as a distributed machine learning framework is the most popular method to preserve knowledge privacy. FL trains a learning-based algorithm across multiple decentralized devices or edge servers holding local data samples without exposing them. Additionally, we provide a comprehensive summary of the FL-based methods employed to safeguard knowledge privacy, presented in a timeline illustrated in Fig. 7. Table 4 offers a detailed classification of these solutions based on the combination of methods used.

Table 4. Protection methods of private data in extracted knowledge.
Methods Refs. Training Dataset Model or Gradient Data Other Machine Learning Data
Origin FL framework (Yu et al., 2018; Wang et al., 2019a, 2020; Liu et al., 2022; Yu et al., 2021, 2020; Li et al., 2020)
Combination of origin FL framework and other privacy protection methods (Qi and Yang, 2020; Cheng et al., 2021; Zheng et al., 2022; Wang and Deng, 2022; Saputra et al., 2022)
Noising FL framework (Yu et al., 2022)
Combination of noising FL framework and other privacy protection methods (Wang et al., 2022a; Cui et al., 2022; Chen et al., 2022a)

6.1. Enhancing Knowledge Privacy with FL Frameworks

The most common approach is to use an FL framework to train prediction models. Unlike traditional machine learning methods, FL does not collect raw data for model training (Yu et al., 2018, 2021, 2020). This framework encourages models to be trained on local data, and all training workers upload model parameters or gradients rather than sensitive raw data. Yu et al. (Yu et al., 2018) were probably the first to propose a learning-based proactive content caching method following the FL framework. This work proposes a hybrid filtering method based on the autoencoder to calculate the user-content similarity and predict the content of a user’s interest. Yu et al. (Yu et al., 2021) also designed an FL-based proactive caching method for vehicular networks. Considering the high mobility of vehicles and dynamic content popularity in vehicular networks, RSUs integrate the mobility-aware cache replacement policy to make proactive caching decisions. Following the FL framework, the above works enable users to train machine learning models (e.g., autoencoder model) with their private datasets, locally and distributively, and upload trained models to the corresponding parameter server for aggregation.
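The train-locally, aggregate-centrally loop described above can be sketched with a toy federated averaging example. This is a minimal illustration and not the autoencoder-based method of (Yu et al., 2018): the model is assumed to be a single scalar weight fitted by least squares, aggregation is weighted by local sample count, and all names and hyperparameters are hypothetical.

```python
def local_update(w, data, lr=0.01, epochs=5):
    """Client side: a few gradient steps on private (x, y) pairs
    for the model y = w * x. The raw data never leaves the client."""
    for _ in range(epochs):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w

def federated_averaging(client_datasets, rounds=50, w0=0.0):
    """Server side: broadcast w, collect locally trained weights,
    and average them weighted by each client's sample count."""
    w = w0
    for _ in range(rounds):
        updates = [(local_update(w, data), len(data)) for data in client_datasets]
        total = sum(n for _, n in updates)
        w = sum(wi * n for wi, n in updates) / total
    return w

# Hypothetical private datasets, all generated from y = 2x.
clients = [
    [(1.0, 2.0), (2.0, 4.0)],
    [(3.0, 6.0)],
    [(0.5, 1.0), (1.5, 3.0), (2.5, 5.0)],
]
print(round(federated_averaging(clients), 3))
```

Only the scalar weight crosses the network in each round, which mirrors how the parameter server in the surveyed works sees model parameters rather than request histories.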

Reinforcement learning can be realized in the FL framework to solve the complex dynamic control problem and mitigate the privacy leakage problem in edge caching systems, improving caching performance and privacy protection simultaneously (Wang et al., 2019a; Li et al., 2020; Wang et al., 2020; Liu et al., 2022; Abadi et al., 2016; Qiao et al., 2022; Xiao et al., 2018). Wang et al. (Wang et al., 2019a) proposed an “In-EDGE AI” system with deep reinforcement learning in FL. It delegates the reinforcement learning training task to the device side to protect the private dataset and brings more intelligence to edge systems. Liu et al. (Liu et al., 2022; Zheng et al., 2021) proposed a privacy-preserving distributed deep deterministic policy gradient scheme to make caching decisions for EC. At the same time, to preserve user privacy, the model only predicts content popularity and avoids mining sensitive historical information. The model training process is completed by FL in order to prevent users from leaking privacy to ESes. Qiao et al. (Qiao et al., 2022) proposed an FL-based proactive content caching scheme to shorten content retrieval latency and protect users’ private datasets. Firstly, the edge computing architecture reduces energy consumption and transmission overhead. The problems of client selection and local iteration round selection in the FL process are modeled as an MDP, which is solved by the deep reinforcement learning algorithm. The solution can alleviate the problems of non-independent and identically distributed (Non-IID) data and limited resources for end users.

In vehicular networks, privacy-preserving caching at edge nodes, such as RSUs, can also be achieved effectively by combining FL and DRL frameworks. However, the high mobility of vehicles introduces additional challenges to edge caching efficiency and privacy security. To tackle these challenges, Wu et al. (Wu et al., 2023) designed an asynchronous federated learning model to evaluate regional content popularity, taking into account vehicle movement speed, RSU coverage, and network channel conditions. They modified the selection of training vehicles and the aggregation function’s weight, assigning different weights to vehicles with varying dwell times and channel conditions. They proposed a joint content placement strategy based on dueling DRL to overcome the caching efficiency degradation caused by high vehicle mobility. This strategy further reduces content transmission delay while ensuring user data privacy and RSU joint caching efficiency in edge vehicle computing scenarios. Li et al. (Li et al., 2023) tackled the privacy and long-term training delay issues in high-precision map caching in intelligent connected vehicles (ICV) by formulating a framework called federated deep reinforcement learning (F-DRL). F-DRL is an MDP-based edge cooperative caching technique in which Dueling-Deep-Q-Network (Dueling-DQN) is employed to optimize the adaptive edge caching scheme with an improved FL approach to preserve the privacy of ICV. For FL, resource provision and member vehicle selection are made using joint optimization to minimize the delays in training and load on the edge cache.

6.2. Combining FL with Other Methods

Other than requiring sensitive data to train machine learning models, the edge caching system may also need private data to make edge caching decisions. Therefore, some works (Qi and Yang, 2020; Cheng et al., 2021; Zheng et al., 2022; Wang and Deng, 2022) have introduced additional privacy protection methods into the FL framework to enhance data privacy during the model training process. Zheng et al. (Zheng et al., 2022) proposed a privacy-preserving FL model to predict popularity in an unsupervised manner. The prediction method introduces two concepts: local and global popularity, considering both efficiency and privacy. Local popularity can be evaluated by historical information by the Long Short-Term Memory (LSTM) model on users. In contrast, global popularity can only be predicted by the information at the current moment, which will be erased immediately at the next moment. FL is applied to perform offline training and online popularity evaluation with distributed information to avoid exposing privacy. Wang et al. (Wang and Deng, 2022) proposed a private FL-based caching scheme, which utilizes an FL framework and a pseudo rating matrix to collect statistical characteristics of user groups. With this distorted information, the server can predict the popularity of content and make caching decisions. The scheme also protects the privacy of individual users from being accessed by servers and other users. Saputra et al. (Saputra et al., 2022) introduced the HE method into the FL framework to protect the privacy of MUs with constrained computing resources. The scheme allows MUs to upload encrypted training data to ENs, which can perform additional training processes. The portions of the encrypted decision problem are modeled as a multi-objective profit maximization problem considering both privacy and training costs. The optimization problem is proved to be a concave function that can be solved by the interior point method. 
At the same time, the training data cached at the EN or the cloud node is homomorphically encrypted based on the Brakerski-Fan-Vercauteren (BFV) method.

6.3. Combining Noise-based FL with Other Methods

Model parameters or gradients (representing knowledge extracted from user-related data) can expose user privacy because attackers may infer and restore user information from exposed model information. In addition, model parameters may have a huge economic value, and directly uploading model parameters will compromise the self-interest of model owners. Therefore, there are works dedicated to upgrading the FL framework by adding noise (Lu et al., 2020; Yu et al., 2022; Jiang et al., 2023; Nair et al., 2023) or other interference (Wang et al., 2022a; Chen et al., 2022a; Cui et al., 2022) to model information prior to exposure.

The FL framework has further employed DP-based noise to safeguard the parameters or gradients in previous works. Lu et al. (Lu et al., 2020) designed a differentially private asynchronous federated learning scheme to share resources in vehicular networks. The proposed scheme uses LDP to perturb the local model parameters with noise drawn from the Gaussian distribution. Moreover, a distributed random update method is used to preserve the privacy of the global ML model during the update process. Yu et al. (Yu et al., 2022) proposed an FL framework based on privacy protection so that the user dataset is always kept locally. Further, the LDP mechanism is added while exchanging model parameters for aggregation to protect user privacy. In addition, this work proposes a hierarchical joint caching mechanism to combine the characteristics of local caching and global caching. A weighted aggregation method is used to solve the data imbalance problem. Jiang et al. (Jiang et al., 2023) developed a privacy-preserving FL framework for industrial data processing. This framework works by compressing adaptive gradients in the first place during model training at the edge terminal. Afterwards, hybrid DP is applied to optimize the FL framework, and the privacy-preserved gradients are transferred in the industrial environment.
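
As a rough illustration of perturbing local model parameters before they are shared, the sketch below clips an update to a bounded L2 norm and then adds zero-mean Gaussian noise. The function names and the parameters `max_norm` and `sigma` are assumptions for illustration; calibrating `sigma` to a concrete privacy budget is deliberately left out, and the code does not reproduce the exact mechanisms of the cited works.

```python
import math
import random
from typing import List, Optional


def clip_by_l2_norm(vec: List[float], max_norm: float) -> List[float]:
    """Scale the vector down so that its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(v * v for v in vec))
    if norm <= max_norm or norm == 0.0:
        return list(vec)
    scale = max_norm / norm
    return [v * scale for v in vec]


def perturb_update(vec: List[float], max_norm: float, sigma: float,
                   rng: Optional[random.Random] = None) -> List[float]:
    """Clip the local update, then add Gaussian noise with std sigma * max_norm."""
    rng = rng or random.Random()
    clipped = clip_by_l2_norm(vec, max_norm)
    return [v + rng.gauss(0.0, sigma * max_norm) for v in clipped]


local_update = [3.0, 4.0]
noisy_update = perturb_update(local_update, max_norm=1.0, sigma=0.5,
                              rng=random.Random(42))
```

Clipping bounds the influence of any single update, so the noise scale can be chosen relative to `max_norm`; a larger `sigma` strengthens protection at the cost of model accuracy.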

Furthermore, some efforts try to enhance privacy protection in edge caching systems by integrating the Generative Adversarial Network (GAN) technique with FL. Wang et al. (Wang et al., 2022a) combined FL and Wasserstein Generative Adversarial Network (WGAN) to further improve the efficiency of model training and the accuracy of the popularity prediction model. With the fake data generated by WGAN, the privacy of users’ real preferences can be enhanced. Besides, gradient clipping and model parameter restriction are applied at training time to protect model privacy and security.

Privacy preservation in vehicular edge computing is demanded since new attack types are developed continuously. To cope with the situation, Chen et al. (Chen et al., 2022a) proposed a novel edge computing approach that utilizes unmanned aerial vehicle swarms as edge computing nodes to aggregate and cache model parameters, thereby reducing the communication cost of the core network and protecting users’ datasets. To enhance the security and privacy protection of cached model parameters, the authors designed a comprehensive protocol for model aggregation, storage, and transmission, which can effectively prevent potential security threats, such as poisoning attacks, man-in-the-middle attacks, and eavesdropping attacks. Meanwhile, to defend against pollution attacks, the cosine similarity between local parameters and their edge aggregation parameters is calculated to exclude parameters uploaded by malicious nodes. Then, parameters are re-aggregated, and the aggregated parameters are sent to the cloud servers for final processing. A Schnorr signature is also added before uploading the aggregated model parameters to ensure the reliability of the parameters.
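
The similarity-based filtering step described above can be sketched as follows. The plain averaging rule, the `threshold` value, and the function names are illustrative assumptions rather than the protocol of Chen et al. (Chen et al., 2022a); signatures and secure transmission are omitted.

```python
import math
from typing import List, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def mean_vector(vectors: List[List[float]]) -> List[float]:
    n, dim = len(vectors), len(vectors[0])
    return [sum(v[i] for v in vectors) / n for i in range(dim)]


def filter_and_reaggregate(local_updates: List[List[float]],
                           threshold: float = 0.0) -> Tuple[List[float], List[List[float]]]:
    """Aggregate once, drop updates dissimilar to that aggregate, then aggregate again."""
    first_pass = mean_vector(local_updates)
    kept = [u for u in local_updates
            if cosine_similarity(u, first_pass) >= threshold]
    if not kept:
        raise ValueError("no update passed the similarity check")
    return mean_vector(kept), kept


updates = [[1.0, 1.0], [1.0, 0.9], [0.9, 1.0], [-1.0, -1.0]]  # last one is poisoned
aggregate, kept = filter_and_reaggregate(updates)
```

This simple filter only works while honest updates dominate the first aggregation; a more robust reference direction would be needed if a large share of nodes were malicious.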

In the Internet of Things (IoT) realm, edge computing architectures can expedite data processing, while edge caching can accelerate file delivery speeds for IoT devices. To ensure the reliability and privacy of data in IoT networks, Cui et al. (Cui et al., 2022) proposed a blockchain system comprising four contracts to predict content popularity, cache, and deliver sensitive content. Meanwhile, to improve the security and throughput of the system, the Proof of Stake (PoS) consensus mechanism based on reputation is modified and applied to reach consensus more efficiently. Besides, the FL algorithm based on compressed gradients is used to protect the private information of edge nodes and reduce communication overhead. The K-means algorithm filters important gradients that must be uploaded accurately. These gradients are then quantized using a clustering-based quantization algorithm to reduce the amount of data uploaded. Meanwhile, an averaged gradient value is uploaded to the server for other gradients with a small value. The blockchain is also used to verify uploaded data. Recently, the Internet of Medical Things (IoMT) has become popular. However, it is also prone to privacy threats like other edge computing-based approaches. To tackle these challenges for IoMT-based big data analytics, Nair et al. (Nair et al., 2023) proposed an edge computing-based FL scheme called Fed_select, which ensures privacy and provides load reduction at the central FL server by introducing an edge server. To ensure privacy, Fed_select performs user anonymity at the edge server by employing hybrid encryption techniques with client and attribute selection performed at the edge server. Moreover, DP with Laplace noise is applied to the shared gradients to make them private during transfer.
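
The gradient compression idea (upload important gradients accurately and replace the small ones by a single averaged value) can be approximated with the simplified sketch below. It uses a one-dimensional two-cluster K-means on gradient magnitudes to separate important from small entries. The clustering-based quantization and the blockchain verification steps are omitted, and all function names are hypothetical, so this should not be read as the algorithm of Cui et al. (Cui et al., 2022).

```python
from typing import Dict, List, Tuple


def two_means_threshold(magnitudes: List[float], max_iters: int = 50) -> float:
    """1-D K-means with k=2; returns the boundary between the two centroids."""
    low_c, high_c = min(magnitudes), max(magnitudes)
    for _ in range(max_iters):
        boundary = (low_c + high_c) / 2
        low = [m for m in magnitudes if m <= boundary]
        high = [m for m in magnitudes if m > boundary]
        if not low or not high:
            break
        new_low, new_high = sum(low) / len(low), sum(high) / len(high)
        if (new_low, new_high) == (low_c, high_c):
            break  # centroids stopped moving
        low_c, high_c = new_low, new_high
    return (low_c + high_c) / 2


def compress(gradient: List[float]) -> Tuple[Dict[int, float], float]:
    """Keep large-magnitude entries exactly; summarise the rest by their mean."""
    if not gradient:
        return {}, 0.0
    threshold = two_means_threshold([abs(g) for g in gradient])
    important = {i: g for i, g in enumerate(gradient) if abs(g) > threshold}
    small = [g for i, g in enumerate(gradient) if i not in important]
    small_mean = sum(small) / len(small) if small else 0.0
    return important, small_mean


def decompress(dim: int, important: Dict[int, float], small_mean: float) -> List[float]:
    """Server-side reconstruction of the compressed gradient."""
    return [important.get(i, small_mean) for i in range(dim)]


gradient = [0.01, -0.02, 0.9, 0.03, -1.1]
important, small_mean = compress(gradient)
restored = decompress(len(gradient), important, small_mean)
```

In this example only two of the five entries are uploaded individually, and the remaining three are represented by one scalar, which reduces both the communication volume and the amount of fine-grained gradient information exposed.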

Due to limited space, we briefly present an overview of additional solutions for safeguarding knowledge privacy in edge caching systems in Table 5, which covers methods not fully discussed in Section 6.

Table 5. A brief supplement of solutions to protect knowledge privacy in the edge caching systems.
| Protected Entities | Refs. | Edge Cache Entities | Mitigation Methods | Key Ideas | Potential Attackers |
|---|---|---|---|---|---|
| Users | (Li et al., 2020) | BSs / User Devices | Distributed ML | Proposed a weighted distributed DRL model for edge caching replacement in D2D networks. | BSs |
| Vehicle Devices | (Yu et al., 2021) | RSUs / Vehicles | FL | Designed a peer-to-peer-based FL framework for proactive caching in vehicular edge networks. | BS / RSUs |
| IoT Devices | (Wang et al., 2020) | BSs | FL | Proposed an FL-based cooperative edge caching framework with the DRL technique. | BSs |
| Users | (Zeng et al., 2021) | User Devices | FL / Blockchain | Proposed a privacy-preserving D2D caching method with the combination of FL framework and a two-layer blockchain structure. | User Devices |
| Users | (Qi and Yang, 2020) | BSs | Noise-based FL | Proposed FL-based method to predict content popularity with obfuscated feature information. | BSs |

7. Open Challenges and Future Research Directions

In this section, we discuss open challenges and future research directions worth exploring in Privacy-Preserving Edge Caching (PPEC).

7.1. Trade-off Between Collaboration and Privacy in PPEC

Due to the large scale of network applications, it is common to deploy multiple edge servers for collaboratively caching content. To enable collaborations between edge servers, critical information such as cached content or other private information will be exchanged between edge servers, which can expose user privacy and raise privacy concerns. We outline two open privacy concerns when multiple edge servers share sensitive information.

7.1.1. Content-right confirmation

Digital content can be easily copied and distributed, which is a double-edged sword making content-right confirmation difficult. For instance, when social media content cached on a particular edge server is accessed by other edge servers, the edge server completely loses control of cached social media content because other edge servers can easily copy and redistribute this social media content (Araldo et al., 2018; Cui et al., 2020a; Yuan et al., 2016). Content-right confirmation is essential for content owners to maintain availability and accountability when using edge caching services to preserve content privacy. On the one hand, with content-right confirmation, it is easy to determine content ownership. Privacy strategies can be implemented to ensure that only authorized parties can access the content cached in an edge server. On the other hand, content-right confirmation is the basis of content accountability. With content-right confirmation, the content right can be authenticated when the content is used once, with the right changed accordingly.

However, realizing content-right confirmation in edge caching systems is non-trivial due to several challenges. First, the distributed nature of edge caching systems makes it difficult to maintain a centralized and trusted authority for content ownership verification. Second, edge nodes’ dynamic and heterogeneous nature introduces complexity in implementing content-right confirmation mechanisms, which must be scalable and adaptable to different edge devices. Furthermore, the use of encryption and privacy-enhancing technologies in edge caching systems further complicates content-right confirmation. While these technologies are essential for protecting the privacy of cached content, they may also prevent content owners from verifying the use of their content in the cache.

The challenges of realizing content-right confirmation in edge caching systems call for future work in several directions. Firstly, new verification mechanisms are needed that can handle the distributed nature, dynamics, and heterogeneity of edge nodes. To overcome these challenges, several approaches have been proposed in the literature, such as blockchain-based solutions (Lei et al., 2020; Vu et al., 2019). However, these mechanisms fall short in scalability and in the ability to adapt to different edge devices. Additionally, privacy-preserving verification methods that can coexist with encryption and other privacy-enhancing techniques should be explored. One possible solution is to leverage secure multi-party computation (Andreoletti et al., 2018) to enable verification while preserving the privacy of cached content. Finally, standardization efforts are needed to ensure interoperability between different edge caching systems and content providers. For example, trust management mechanisms (Zhong et al., 2021; Xu et al., 2019) have been proposed to enable content-right confirmation in ISP and D2D edge caching, respectively. Thus, promoting the adoption of content-right confirmation mechanisms and facilitating the collaboration between different stakeholders in different edge caching scenarios need to be further discussed.

7.1.2. Coalition mechanism design

Collaborative edge caching is essential for enhancing QoS. However, many edge caches are deployed on leased nodes provided by profit-oriented third-party providers, which are often decentralized (Cui et al., 2020a), unreliable (Leguay et al., 2017) or self-interested (Cui et al., 2020a). For instance, edge caching routers can be unreliable (Leguay et al., 2017; Cui et al., 2020a) in CDN, while RSUs and vehicles can be semi-trusted (Qian et al., 2020; Zhang et al., 2022c; Hu et al., 2018; Jiang et al., 2020) or self-interested (Dai et al., 2020; Cui et al., 2020c; Kong et al., 2019) in the edge IoV caching network. Similarly, in social media networks, edge servers can be self-interested (Xu et al., 2019, 2020). To enable privacy-preserving applications and technology cooperation among edge caching systems, coalition mechanisms are required. These mechanisms involve designing an incentive and allocation model that encourages participants in edge caching systems to join the coalition and maximize their benefits through a reasonable selection. Additionally, punishment mechanisms should be considered when there are untrustworthy or dishonest nodes in the system. Hence, incentive and allocation mechanisms should be designed thoughtfully to foster participation and cooperation among edge caching systems.

However, designing coalition mechanisms for PPEC is complicated because it is necessary to balance several conflicting objectives. On the one hand, the mechanisms should encourage participants to contribute their resources to the coalition, ensuring that the costs and benefits of participants are distributed fairly (Andreoletti et al., 2018). On the other hand, they must incentivize participants to prioritize the interests of the coalition over their individual interests and punish illegal strategies (Xu et al., 2020), which is challenging when participants are profit-oriented with conflicting goals (Xu et al., 2019).

Game theory is a powerful mathematical framework for investigating decision-making processes and interactions among rational individuals or entities in coalition mechanisms. Its applications can optimize edge caching, capturing interactions between content providers, network operators and end users. Various game-theoretic models such as the non-cooperative game (Sivaraman and Sikdar, 2021), Stackelberg game (Xu et al., 2019, 2020; Cao et al., 2020), coalition games and potential games (Cao et al., 2020) can analyze the interaction among participants in edge caching systems. For instance, in a Stackelberg game, one player acts as the leader while the others follow. In the context of edge caching, the content provider can be modeled as the leader, while network caching operators are the followers (Xu et al., 2020). Similarly, the edge cache can act as the leader, followed by end users (Xu et al., 2019). Nevertheless, game theory-based approaches with complete information (Xu et al., 2019, 2020) cannot be directly applied in privacy-preserving scenarios since the information is likely incomplete for players in edge caching systems. Besides, it is impractical to assume that every player is benign in an open-access edge network. There may exist semi-honest and even malicious players. Therefore, it is necessary to conduct further investigations into the coalition mechanisms when analyzing the complex interactions between different kinds of players in collaborative PPEC.
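
To make the leader-follower structure concrete, the following toy Stackelberg game is solved by backward induction. The utility functions are assumptions chosen for illustration and are not taken from the cited works: a content provider (leader) pays a price p per unit of cache, a caching operator (follower) with quadratic cost c·x² chooses the cache amount x, and the provider earns a value v per cached unit. The model also assumes complete information, which is exactly the assumption questioned above.

```python
from typing import Tuple


def follower_best_response(price: float, cost: float) -> float:
    """Follower maximises price * x - cost * x**2, giving x* = price / (2 * cost)."""
    return price / (2.0 * cost)


def leader_utility(price: float, value: float, cost: float) -> float:
    """Leader anticipates the follower's reaction when evaluating a price."""
    cached = follower_best_response(price, cost)
    return (value - price) * cached


def solve_stackelberg(value: float, cost: float, steps: int = 1000) -> Tuple[float, float]:
    """Backward induction with a grid search over the leader's price in [0, value]."""
    candidates = [value * i / steps for i in range(steps + 1)]
    best_price = max(candidates, key=lambda p: leader_utility(p, value, cost))
    return best_price, follower_best_response(best_price, cost)


price, cached_amount = solve_stackelberg(value=4.0, cost=1.0)
```

For this toy model the closed-form equilibrium price is v/2, so the grid search can be checked against it. With incomplete information the leader would not know the follower's cost c, and the best response could no longer be anticipated exactly.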

7.2. Limited Capacity for Running Privacy-enhancing Algorithms

Edge devices are becoming increasingly crucial in edge caching networks. However, these devices are typically limited in processing power, memory, caching space, and energy capacity, which present challenges for running privacy-enhancing algorithms:

  (1) Limited computing power and memory pose a significant challenge to implementing complex privacy-preserving algorithms on edge devices (Ni et al., 2021). To address this challenge, the development of lightweight privacy-preserving algorithms is desired to protect user privacy without compromising caching performance. Lightweight homomorphic encryption (Cui et al., 2020c), identity authentication (Xue et al., 2019, 2018; Zhang et al., 2022c; Tong et al., 2022), and differential privacy (Sivaraman and Sikdar, 2021; Acs et al., 2019) are prospective approaches that can reconcile privacy protection and computational efficiency. Additionally, it is vital to ensure that the developed algorithms are robust and efficient, meeting the needs of edge devices.

  (2) Energy constraints: Edge devices such as autonomous vehicles and smartphones are often battery-powered (Mao et al., 2017), which can limit their caching (Saputra et al., 2022) and communication (Cui et al., 2022) capabilities, and hence lower the performance of privacy-preserving algorithms (Qiao et al., 2022; Saputra et al., 2022). To address this challenge, energy-efficient caching management techniques should be developed to minimize the energy consumption of privacy-preserving edge caching approaches. Techniques such as data compression (Cui et al., 2022) and optimization models (Qiao et al., 2022) can be adopted to reduce the cost of caching and communication and thus minimize energy usage. For instance, an energy-aware client selection and communication method for FL was proposed in (Qiao et al., 2022) that reduced energy consumption by up to 50% compared to traditional FL methods when protecting the privacy of data sources.

  (3) Cache space is another significant constraint for edge caching systems due to the limited storage capacity compared to the vast amount of content that can be cached. However, research has shown that only a small fraction of content is popular, while the majority of users concentrate their access on popular content, implying a long-tail distribution of content popularity (Ma et al., 2017; Wang et al., 2019b). Therefore, it is crucial to determine which content should be cached based on popularity and user preferences while considering privacy concerns. Privacy-preserving edge caching approaches need to trade off privacy protection against caching performance. The next subsection will further discuss the challenges and potential solutions for addressing this trade-off.
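
The effect of a long-tail popularity distribution on a small cache can be illustrated numerically. The sketch below assumes that popularity follows a Zipf law with exponent `alpha`, which is a common modelling choice and not a result reported in the cited measurements; the catalogue size and cache size are likewise hypothetical.

```python
from typing import List


def zipf_popularity(num_items: int, alpha: float) -> List[float]:
    """Request probability of each item, ranked from most to least popular."""
    weights = [1.0 / (rank ** alpha) for rank in range(1, num_items + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def hit_ratio_top_k(popularity: List[float], cache_size: int) -> float:
    """Expected hit ratio when the cache holds the cache_size most popular items."""
    ranked = sorted(popularity, reverse=True)
    return sum(ranked[:cache_size])


popularity = zipf_popularity(num_items=1000, alpha=1.0)
ratio_10_percent = hit_ratio_top_k(popularity, cache_size=100)
ratio_1_percent = hit_ratio_top_k(popularity, cache_size=10)
```

Under this assumed distribution, caching a small fraction of the catalogue already serves a large share of requests, which is why accurate (and therefore privacy-sensitive) popularity estimates are so valuable to a cache operator.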

Implementing privacy-preserving algorithms can complicate the system, which adversely impacts caching performance (Liang and Liu, 2019). Conversely, simplifying the system may increase the risk of privacy breaches. Challenges associated with PPEC include balancing the complexity of protection algorithms with caching performance (Xue et al., 2019, 2018; Zhu et al., 2021) or limited resources (Sivaraman and Sikdar, 2021; Tong et al., 2022; Cui et al., 2022), and ensuring user privacy while enabling efficient content distribution (Cui et al., 2020c; Yuan et al., 2016). To sum up, an in-depth understanding of the limitations of edge devices and designing practical and feasible solutions are vital in enhancing PPEC.

7.3. Trade-off between Privacy and Intelligence in PPEC

Machine learning-based methods have become a powerful tool for optimizing edge caching performance and developing intelligent caching algorithms (Qiao et al., 2022; Liu et al., 2022). However, there has been some controversy regarding privacy violations in edge caching. Privacy concerns arise when edge caching providers analyze and manage content in their cache since the storage spaces of edge cache are limited, and the content scale is growing rapidly (Wang et al., 2019b). To provide intelligent edge caching, providers may be curious about the content stored in their cache (e.g., popular content (Araldo et al., 2018; Cui et al., 2020c, a)) and the confidential information about consumers (e.g., request record (Wu et al., 2016; Yuan et al., 2016; Nikolaou et al., 2016), identifiable information (Nguyen et al., 2023; Zhang et al., 2022c; Xue et al., 2019, 2018)). Providers may use monitoring and inference attacks to compromise consumers’ privacy to improve caching efficiency and gain economic benefits. Therefore, developing effective privacy-preserving mechanisms in intelligent edge caching algorithms is crucial to address these problems.

However, the open-edge network provides an ideal surface for attackers to obtain private data or knowledge from machine learning methods designed for edge caching systems. Reconciling privacy and efficiency in intelligent caching methods in the open-edge network is challenging for several reasons. Firstly, the diversity of user requirements and content in machine learning-based edge networks can be far greater than in traditional caching systems (Zhou et al., 2019), which makes it difficult to apply traditional privacy-enhancing mechanisms directly. Secondly, the edge network is usually open and multi-access (Xu et al., 2019), implying that it is difficult to control access to the cache so as to preserve privacy. Finally, semi-trusted or unreliable third-party caching service providers exacerbate the challenge of designing protection methods (Xue et al., 2019, 2018; Qiao et al., 2022; Araldo et al., 2018; Liu et al., 2022). Therefore, designing intelligent caching methods that balance privacy and efficiency well remains an open problem.

The FL framework is one of the essential methods to preserve private data in the machine learning process (Wang et al., 2020, 2019a). Based on the FL framework, different parties may predict critical information in a distributed manner, e.g., content popularity (Qiao et al., 2022; Liu et al., 2022; Zheng et al., 2022) or user location migratory pattern (Wu et al., 2023), for intelligent caching decisions at the network edge. However, some parties may be dishonest and malicious. In particular, malicious users in FL may inject poisoned data to affect the overall computation of the global model. In addition, dishonest parties may collect their partners’ gradients to back-infer their models and private information (Cui et al., 2022; Yu et al., 2022). These attacks can lead to the disclosure of critical private information or destroy caching performance. Additionally, some adversaries even deliberately provide incorrect model parameters during horizontal FL to disrupt the overall computation and impact model performance (Wang et al., 2022a). The security of private computing in the FL framework is also a challenging open topic for edge caching.

7.3.1. Reconcile privacy and efficiency in dynamic networks

Online learning has become increasingly popular for solving complex problems in various fields, including edge caching in dynamic network environments (Krishnendu et al., 2022; Zhang et al., 2022b; Chen et al., 2022b; Cui et al., 2023). However, traditional edge caching algorithms often rely on pre-determined and non-optimal edge caching policies in dynamic network environments where network conditions (Zhang et al., 2022b) and user behaviour (Ma et al., 2017) can change rapidly over time. Besides, in a real network system, the access pattern of resources is highly dynamic (Zhang et al., 2022d). Some existing works proposed to spend a high cost to train the machine learning model, which may not be acceptable for resource-constrained edge or terminal devices (Cui et al., 2023). At the same time, user interests, geographical locations, or IoT device connectivity in edge scenarios are highly dynamic, where one-time trained models may not adapt well to such scenarios (Chen et al., 2022b).

Unfortunately, online learning algorithms can also pose a risk to user privacy when collecting and processing sensitive user data. Several open challenges exist in privacy-preserving online learning algorithms. One challenge is to balance the privacy protection strength and the accuracy of the model prediction. The decision-making process in online scenarios is already highly challenging, and the introduction of privacy protection methods, such as noise perturbation, can further compromise the algorithm’s performance or even make it unusable. Another challenge is to develop privacy-preserving algorithms that are computationally efficient and can be easily deployed in dynamic network environments. Therefore, how to safely use the latest historical information to make efficient online caching decisions is a problem worthy of discussion. Few efforts have been made to address the privacy concerns associated with online learning algorithms. FL may be a possible framework to allow multiple parties to process data jointly without revealing their raw datasets in dynamic scenarios (Krishnendu et al., 2022). Additionally, differential privacy techniques can be used to add random noise to the data in dynamic caching environments to obscure individual information (Zhou et al., 2023).

7.4. Privacy Quantification for PPEC

Privacy quantification is a critical aspect of privacy-preserving edge caching systems as it allows for the measurement and assessment of privacy protection levels provided by these systems (Sivaraman and Sikdar, 2021; Yan and Tuninetti, 2021; Wu et al., 2016; Acs et al., 2019). However, most current work on privacy-enhanced intelligent edge caching lacks specific privacy metrics. Rather than developing clear and effective privacy metrics, researchers often combine existing privacy protection schemes and claim that their works can protect privacy. Unfortunately, without clear quantification of the privacy protection effect, it is difficult to identify weaknesses and improve privacy protection (Acs et al., 2019; Yan and Tuninetti, 2021). The development of effective privacy metrics for privacy-preserving edge caching systems is therefore an important research topic.

One of the primary challenges in privacy quantification is developing an accurate and consistent metric for measuring privacy protection levels. It is a complicated task to propose a universal method to measure different types of data leakage in the edge cache. Therefore, an appropriate privacy quantification method with a formalized definition should be established to guide the design of privacy-preserving edge caching systems. For instance, in intelligent caching algorithms based on reinforcement learning models, a good privacy exposure quantitative index can guide the model’s reward design and help the agent make better caching decisions (Xu et al., 2020). Various privacy metrics have been proposed in the literature, such as the information-theoretic converse bound (Yan and Tuninetti, 2021) and the size of the anonymity set (Wu et al., 2016). However, each metric has its limitations and may not be suitable for general privacy-preserving edge caching systems. Future work for designing privacy metrics (similar to the privacy budget in differential privacy (Sivaraman and Sikdar, 2021; Acs et al., 2019) and entropy in information theory (Sivaraman and Sikdar, 2021)) is desired. An innovative definition of privacy measurement applicable in intelligent edge caching scenarios (Zhang et al., 2022a) should be designed for guiding PPEC.
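
Two of the metrics mentioned above, the size of the anonymity set and entropy, can be computed from a request log as in the following sketch. The log format (pairs of user and content identifiers), the adversary model (an observer who sees that a content item was requested and guesses the requester in proportion to request counts), and the function names are assumptions for illustration; the cited works define their metrics in their own settings.

```python
import math
from collections import Counter
from typing import Iterable, List, Tuple

RequestLog = List[Tuple[str, str]]  # (user_id, content_id) pairs, hypothetical format


def shannon_entropy(probabilities: Iterable[float]) -> float:
    """Entropy in bits; zero-probability outcomes contribute nothing."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


def anonymity_set_size(log: RequestLog, content_id: str) -> int:
    """Number of distinct users who could have issued a request for content_id."""
    return len({user for user, content in log if content == content_id})


def adversary_uncertainty(log: RequestLog, content_id: str) -> float:
    """Entropy of the requester's identity given that content_id was requested."""
    counts = Counter(user for user, content in log if content == content_id)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return shannon_entropy(c / total for c in counts.values())


log = [("u1", "a"), ("u2", "a"), ("u3", "a"), ("u4", "a"), ("u1", "b")]
set_size_a = anonymity_set_size(log, "a")
uncertainty_a = adversary_uncertainty(log, "a")
uncertainty_b = adversary_uncertainty(log, "b")
```

Content requested by a single user yields zero uncertainty, which flags it as highly identifying, whereas content requested evenly by many users gives the adversary little to work with. A quantity of this kind could, for example, enter the reward of a learning-based caching policy.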

Evaluating the privacy protection degree in dynamic network environments is another challenge in PPEC. Edge caching systems operate in a constantly changing environment, and various factors can impact privacy protection levels, which makes it challenging to determine an accurate and consistent privacy metric that can be applied in a dynamic network environment. To address these challenges, researchers can explore the use of online algorithms (Pang et al., 2022; Li et al., 2021) to predict privacy protection levels in real time based on network traffic patterns and user behaviour. This approach can help to dynamically adjust privacy protection levels in response to changes in the network environment and improve the effectiveness of PPEC.

8. Conclusion

Edge caching has shown significant potential for improving network performance and resource utilization, but privacy concerns must be considered when deploying edge caches. This article has analyzed and summarized the most prominent privacy issues in edge caching from the perspective of sensitive private data, based on which a critical classification has been proposed. The recent countermeasures for alleviating the exposed threats to different private data have been retrospectively reviewed. The article concludes with lessons learned and highlights open challenges for future research in PPEC. Further investigations are needed to ensure the privacy and performance of edge caching while also reconciling the trade-off between privacy protection and caching performance optimization.

References

  • Abadi et al. (2016) Martín Abadi, H. Brendan McMahan, Andy Chu, Ilya Mironov, Li Zhang, Ian Goodfellow, and Kunal Talwar. 2016. Deep learning with differential privacy. In Proceedings of the ACM Conference on Computer and Communications Security (CCS). ACM, 308–318.
  • Acs et al. (2019) Gergely Acs, Mauro Conti, Paolo Gasti, Cesar Ghali, Gene Tsudik, and Christopher A. Wood. 2019. Privacy-Aware Caching in Information-Centric Networking. IEEE Transactions on Dependable and Secure Computing 16, 2 (Mar 2019), 313–328.
  • Amini et al. (2011) Shahriyar Amini, Janne Lindqvist, Jason Hong, Jialiu Lin, Eran Toch, and Norman Sadeh. 2011. Caché: Caching location-enhanced content to improve user privacy. In Proceedings of the 9th International Conference on Mobile Systems, Applications, and Services (MobiSys’11). ACM, 197–209.
  • Andreoletti et al. (2019a) Davide Andreoletti, Omran Ayoub, Silvia Giordano, Giacomo Verticale, and Massimo Tornatore. 2019a. Privacy-preserving caching in ISP networks. In IEEE 20th International Conference on High Performance Switching and Routing (HPSR). IEEE, 1–6.
  • Andreoletti et al. (2018) Davide Andreoletti, Silvia Giordano, Giacomo Verticale, and Massimo Tornatore. 2018. Discovering the Geographic Distribution of Live Videos’ Users: A Privacy-Preserving Approach. In IEEE Global Communications Conference (GLOBECOM). IEEE, 1–6.
  • Andreoletti et al. (2019b) Davide Andreoletti, Cristina Rottondi, Silvia Giordano, Giacomo Verticale, and Massimo Tornatore. 2019b. An Open Privacy-Preserving and Scalable Protocol for a Network-Neutrality Compliant Caching. In IEEE International Conference on Communications (ICC). IEEE, 1–6.
  • Araldo et al. (2018) Andrea Araldo, Gyorgy Dan, and Dario Rossi. 2018. Caching Encrypted Content Via Stochastic Cache Partitioning. IEEE/ACM Transactions on Networking 26, 1 (Jan 2018), 548–561.
  • Brendan McMahan et al. (2017) H. Brendan McMahan, Eider Moore, Daniel Ramage, Seth Hampson, and Blaise Agüera y Arcas. 2017. Communication-efficient learning of deep networks from decentralized data. In Proceedings of the 20th International Conference on Artificial Intelligence and Statistics. PMLR, 1273–1282.
  • Cao et al. (2020) Tengfei Cao, Changqiao Xu, Junping Du, Yawen Li, Han Xiao, Changhui Gong, Lujie Zhong, and Dusit Niyato. 2020. Reliable and Efficient Multimedia Service Optimization for Edge Computing-Based 5G Networks: Game Theoretic Approaches. IEEE Transactions on Network and Service Management 17, 3 (Sep 2020), 1610–1625.
  • Chen et al. (2022a) Qi Chen, Bing Chen, Feng Hu, and Jiale Zhang. 2022a. Edge-based Protection Against Malicious Poisoning for Distributed Federated Learning. In IEEE 25th International Conference on Computer Supported Cooperative Work in Design (CSCWD). IEEE, 459–464.
  • Chen et al. (2022b) Zhiqi Chen, Sheng Zhang, Zhi Ma, Shuai Zhang, Zhuzhong Qian, Mingjun Xiao, Jie Wu, and Sanglu Lu. 2022b. An Online Approach for DNN Model Caching and Processor Allocation in Edge Computing. In IEEE/ACM 30th International Symposium on Quality of Service (IWQoS). IEEE, 1–10.
  • Cheng et al. (2021) Runze Cheng, Yao Sun, Yijing Liu, Le Xia, Sanshan Sun, and Muhammad Ali Imran. 2021. A Privacy-preserved D2D Caching Scheme Underpinned by Blockchain-enabled Federated Learning. In IEEE Global Communications Conference (GLOBECOM). IEEE, 1–6.
  • Cui et al. (2020c) Jie Cui, Lu Wei, Hong Zhong, Jing Zhang, Yan Xu, and Lu Liu. 2020c. Edge computing in VANETs-An efficient and privacy-preserving cooperative downloading scheme. IEEE Journal on Selected Areas in Communications 38, 6 (Apr 2020), 1191–1204.
  • Cui et al. (2023) Laizhong Cui, Erchao Ni, Yipeng Zhou, Zhi Wang, Lei Zhang, Jiangchuan Liu, and Yuedong Xu. 2023. Towards Real-Time Video Caching at Edge Servers: A Cost-Aware Deep Q-Learning Solution. IEEE Transactions on Multimedia 25 (Nov 2023), 302–314.
  • Cui et al. (2022) Laizhong Cui, Xiaoxin Su, Zhongxing Ming, Ziteng Chen, Shu Yang, Yipeng Zhou, and Wei Xiao. 2022. CREAT: Blockchain-Assisted Compression Algorithm of Federated Learning for Content Caching in Edge Computing. IEEE Internet of Things Journal 9, 16 (Aug 2022), 14151–14161.
  • Cui et al. (2020a) Shujie Cui, Muhammad Rizwan Asghar, and Giovanni Russello. 2020a. Multi-CDN: Towards Privacy in Content Delivery Networks. IEEE Transactions on Dependable and Secure Computing 17, 5 (Sep 2020), 984–999.
  • Cui et al. (2020b) Yuanbo Cui, Fei Gao, Wenmin Li, Yijie Shi, Hua Zhang, Qiaoyan Wen, and Emmanouil Panaousis. 2020b. Cache-based privacy preserving solution for location and content protection in location-based services. Sensors 20, 16 (Aug 2020), 4651.
  • Dai et al. (2020) Yueyue Dai, Du Xu, Ke Zhang, Sabita Maharjan, and Yan Zhang. 2020. Deep Reinforcement Learning and Permissioned Blockchain for Content Caching in Vehicular Edge Computing and Networks. IEEE Transactions on Vehicular Technology 69, 4 (Apr 2020), 4312–4324.
  • Dhar and Varshney (2011) Subhankar Dhar and Upkar Varshney. 2011. Challenges and business models for mobile location-based services and advertising. Communications of the ACM 54, 5 (May 2011), 121–129.
  • Gu et al. (2019) Yi-ming Gu, Guang-wei Bai, Hang Shen, and Yu-jia Hu. 2019. Pre-cache Based Privacy Protection Mechanism in Continuous LBS Queries. Computer Science 46, 5 (May 2019), 122–128.
  • Guo et al. (2022) Jianxiong Guo, Xingjian Ding, Tian Wang, and Weijia Jia. 2022. Combinatorial resources auction in decentralized edge-thing systems using blockchain and differential privacy. Information Sciences 607 (Aug 2022), 211–229.
  • Hassanpour et al. (2023) Seyedeh Bahereh Hassanpour, Ahmad Khonsari, Masoumeh Moradian, and Seyed Pooya Shariatpanahi. 2023. Privacy-preserving edge caching: A probabilistic approach. Computer Networks 226 (May 2023), 109654.
  • He et al. (2007) Xiuli He, Ashutosh Prasad, Suresh P. Sethi, and Genaro J. Gutierrez. 2007. A survey of Stackelberg differential game models in supply and marketing channels. Journal of Systems Science and Systems Engineering 16, 4 (Nov 2007), 385–413.
  • Hu et al. (2018) Long Hu, Yongfeng Qian, Min Chen, M. Shamim Hossain, and Ghulam Muhammad. 2018. Proactive Cache-Based Location Privacy Preserving for Vehicle Networks. IEEE Wireless Communications 25, 6 (Dec 2018), 77–83.
  • Jiang et al. (2023) Bin Jiang, Jianqiang Li, Huihui Wang, and Houbing Song. 2023. Privacy-Preserving Federated Learning for Industrial Edge Computing via Hybrid Differential Privacy and Adaptive Compression. IEEE Transactions on Industrial Informatics 19, 2 (Feb 2023), 1136–1144.
  • Jiang et al. (2020) Shunrong Jiang, Jianqing Liu, Longxia Huang, Haiqin Wu, and Yong Zhou. 2020. Vehicular Edge Computing Meets Cache: An Access Control Scheme for Content Delivery. In IEEE International Conference on Communications (ICC). IEEE, 1–6.
  • Jiang et al. (2017) Wei Jiang, Gang Feng, and Shuang Qin. 2017. Optimal Cooperative Content Caching and Delivery Policy for Heterogeneous Cellular Networks. IEEE Transactions on Mobile Computing 16, 5 (May 2017), 1382–1393.
  • Kong et al. (2019) Qinglei Kong, Rongxing Lu, Maode Ma, and Haiyong Bao. 2019. A Privacy-Preserving and Verifiable Querying Scheme in Vehicular Fog Data Dissemination. IEEE Transactions on Vehicular Technology 68, 2 (Feb 2019), 1877–1887.
  • Krishnendu et al. (2022) S. Krishnendu, B. N. Bharath, Navneet Garg, Vimal Bhatia, and Tharmalingam Ratnarajah. 2022. Learning to Cache: Federated Caching in a Cellular Network with Correlated Demands. IEEE Transactions on Communications 70, 3 (Mar 2022), 1653–1665.
  • Kumar et al. (2019) Siddhartha Kumar, Alexandre Graell i Amat, Eirik Rosnes, and Linda Senigagliesi. 2019. Private information retrieval from a cellular network with caching at the edge. IEEE Transactions on Communications 67, 7 (Jul 2019), 4900–4912.
  • Leguay et al. (2017) Jeremie Leguay, Georgios S. Paschos, Elizabeth A. Quaglia, and Ben Smyth. 2017. CryptoCache: Network caching with confidentiality. In IEEE International Conference on Communications (ICC). IEEE, 1–6.
  • Lei et al. (2020) Kai Lei, Junjie Fang, Qichao Zhang, Junjun Lou, Maoyu Du, Jiyue Huang, and Jianping Wang. 2020. Blockchain-Based Cache Poisoning Security Protection and Privacy-Aware Access Control in NDN Vehicular Edge Computing Networks. Journal of Grid Computing 18, 4 (Aug 2020), 593–613.
  • Li et al. (2023) Chunlin Li, Yong Zhang, and Youlong Luo. 2023. A Federated Learning-Based Edge Caching Approach for Mobile Edge Computing-Enabled Intelligent Connected Vehicles. IEEE Transactions on Intelligent Transportation Systems 24, 3 (Nov 2023), 3360–3369.
  • Li et al. (2020) Ruibin Li, Yiwei Zhao, Chenyang Wang, Xiaofei Wang, Victor C.M. Leung, and Xiuhua Li. 2020. Edge Caching Replacement Optimization for D2D Wireless Networks via Weighted Distributed DQN. In IEEE Wireless Communications and Networking Conference (WCNC). IEEE, 1–6.
  • Li et al. (2021) Weiting Li, Liyao Xiang, Zhou Zhou, and Feng Peng. 2021. Privacy budgeting for growing machine learning datasets. In IEEE Conference on Computer Communications (INFOCOM). IEEE, 1–10.
  • Liang and Liu (2019) Jie Liang and Yinlong Liu. 2019. A Cache Privacy Protection Strategy Based on Content Privacy and User Security Classification in CCN. In IEEE Wireless Communications and Networking Conference (WCNC). IEEE, 1–6.
  • Liu et al. (2016) Dong Liu, Binqiang Chen, Chenyang Yang, and Andreas F. Molisch. 2016. Caching at the wireless edge: Design aspects, challenges, and future directions. IEEE Communications Magazine 54, 9 (Sep 2016), 22–28.
  • Liu et al. (2020) Jiadi Liu, Songtao Guo, Yawei Shi, Liang Feng, and Cong Wang. 2020. Decentralized Caching Framework Toward Edge Network Based on Blockchain. IEEE Internet of Things Journal 7, 9 (Jun 2020), 9158–9174.
  • Liu et al. (2022) Shengheng Liu, Chong Zheng, Yongming Huang, and Tony Q.S. Quek. 2022. Distributed Reinforcement Learning for Privacy-Preserving Dynamic Edge Caching. IEEE Journal on Selected Areas in Communications 40, 3 (Mar 2022), 749–760.
  • Lu et al. (2020) Yunlong Lu, Xiaohong Huang, Yueyue Dai, Sabita Maharjan, and Yan Zhang. 2020. Differentially Private Asynchronous Federated Learning for Mobile Edge Computing in Urban Informatics. IEEE Transactions on Industrial Informatics 16, 3 (Mar 2020), 2134–2143.
  • Ma et al. (2017) Ge Ma, Zhi Wang, Miao Zhang, Jiahui Ye, Minghua Chen, and Wenwu Zhu. 2017. Understanding Performance of Edge Content Caching for Mobile Video Streaming. IEEE Journal on Selected Areas in Communications 35, 5 (May 2017), 1076–1089.
  • Mao et al. (2017) Yuyi Mao, Changsheng You, Jun Zhang, Kaibin Huang, and Khaled B. Letaief. 2017. A Survey on Mobile Edge Computing: The Communication Perspective. IEEE Communications Surveys and Tutorials 19, 4 (Oct 2017), 2322–2358.
  • Muller et al. (2017) Sabrina Muller, Onur Atan, Mihaela Van Der Schaar, and Anja Klein. 2017. Context-Aware Proactive Content Caching with Service Differentiation in Wireless Networks. IEEE Transactions on Wireless Communications 16, 2 (Feb 2017), 1024–1036.
  • Nair et al. (2023) Akarsh K Nair, Jayakrushna Sahoo, and Ebin Deni Raj. 2023. Privacy preserving Federated Learning framework for IoMT based big data analysis using edge computing. Computer Standards & Interfaces 86 (Jan 2023), 103720.
  • Nguyen et al. (2023) Duong Thuy Anh Nguyen, Jiaming Cheng, Duong Tung Nguyen, and Angelia Nedich. 2023. CrowdCache: A Decentralized Game-Theoretic Framework for Mobile Edge Content Sharing. arXiv:2304.13246 [cs.GT]
  • Ni et al. (2021) Jianbing Ni, Kuan Zhang, and Athanasios V. Vasilakos. 2021. Security and Privacy for Mobile Edge Caching: Challenges and Solutions. IEEE Wireless Communications 28, 3 (Jun 2021), 77–83.
  • Nikolaou et al. (2016) Stavros Nikolaou, Robbert Van Renesse, and Nicolas Schiper. 2016. Proactive Cache Placement on Cooperative Client Caches for Online Social Networks. IEEE Transactions on Parallel and Distributed Systems 27, 4 (Apr 2016), 1174–1186.
  • Nisha et al. (2022) Nisha Nisha, Iynkaran Natgunanathan, Shang Gao, and Yong Xiang. 2022. A novel privacy protection scheme for location-based services using collaborative caching. Computer Networks 213 (Aug 2022), 109107.
  • Cisco (2019) Cisco. 2019. Cisco Visual Networking Index (VNI) Update Global Mobile Data Traffic Forecast. http://www.cisco.com/c/en/us/solutions/collateral/service-provider/visual-networking-index-vni/complete-white-paper-c11-481360.html
  • Pang et al. (2022) Xiaoyi Pang, Zhibo Wang, Jingxin Li, Ruiting Zhou, Ju Ren, and Zhetao Li. 2022. Towards Online Privacy-preserving Computation Offloading in Mobile Edge Computing. In IEEE Conference on Computer Communications (INFOCOM), Vol. 2022-May. IEEE, 1179–1188.
  • Pu et al. (2019) Yuwen Pu, Ying Wang, Feihong Yang, Jin Luo, Chunqiang Hu, and Haibo Hu. 2019. An Efficient and Recoverable Data Sharing Mechanism for Edge Storage. In International Conference on Wireless Algorithms, Systems, and Applications (WASA). Springer Verlag, 247–259.
  • Qi and Yang (2020) Kaiqiang Qi and Chenyang Yang. 2020. Popularity Prediction with Federated Learning for Proactive Caching at Wireless Edge. In IEEE Wireless Communications and Networking Conference (WCNC). IEEE, 1–6.
  • Qian et al. (2020) Yongfeng Qian, Yingying Jiang, Long Hu, M. Shamim Hossain, Mubarak Alrashoud, and Muneer Al-Hammadi. 2020. Blockchain-based privacy-aware content caching in cognitive internet of vehicles. IEEE Network 34, 2 (Mar 2020), 46–51.
  • Qiao et al. (2022) Dewen Qiao, Songtao Guo, Defang Liu, Saiqin Long, Pengzhan Zhou, and Zhetao Li. 2022. Adaptive Federated Deep Reinforcement Learning for Proactive Content Caching in Edge Computing. IEEE Transactions on Parallel and Distributed Systems 33, 12 (Dec 2022), 4767–4782.
  • Saputra et al. (2022) Yuris Mulya Saputra, Diep N. Nguyen, Dinh Thai Hoang, and Eryk Dutkiewicz. 2022. In-Network Caching and Learning Optimization for Federated Learning in Mobile Edge Networks. In IEEE International Conference on Communications (ICC). IEEE, 1653–1658.
  • Schlegel et al. (2022) Reent Schlegel, Siddhartha Kumar, Eirik Rosnes, and Alexandre Graell i Amat. 2022. Privacy-Preserving Coded Mobile Edge Computing for Low-Latency Distributed Inference. IEEE Journal on Selected Areas in Communications 40, 3 (Mar 2022), 788–799.
  • Sen et al. (2018) Adnan A. Abi Sen, Fathy B. Eassa, Mohammad Yamin, and Kamal Jambi. 2018. Double Cache Approach with Wireless Technology for Preserving User Privacy. Wireless Communications and Mobile Computing 2018 (2018), 1–11.
  • Shi et al. (2018) Fang Shi, Lisheng Fan, Xin Liu, Zhenyu Na, and Yanchen Liu. 2018. Probabilistic Caching Placement in the Presence of Multiple Eavesdroppers. Wireless Communications and Mobile Computing 2018 (May 2018), 1–10.
  • Shokri et al. (2017) Reza Shokri, Marco Stronati, Congzheng Song, and Vitaly Shmatikov. 2017. Membership Inference Attacks Against Machine Learning Models. In IEEE Symposium on Security and Privacy (S&P). IEEE, 3–18.
  • Sivaraman and Sikdar (2021) Vignesh Sivaraman and Biplab Sikdar. 2021. A Defense Mechanism against Timing Attacks on User Privacy in ICN. IEEE/ACM Transactions on Networking 29, 6 (Dec 2021), 2709–2722.
  • Sutton and Barto (1998) R.S. Sutton and A.G. Barto. 1998. Reinforcement Learning: An Introduction. IEEE Transactions on Neural Networks 9, 5 (1998), 1054–1054.
  • Tong et al. (2022) Wei Tong, Wenjie Chen, Bingbing Jiang, Fengyuan Xu, Qun Li, and Sheng Zhong. 2022. Privacy-Preserving Data Integrity Verification for Secure Mobile Edge Storage. IEEE Transactions on Mobile Computing Early Access (Mar 2022), 1–1.
  • Tourani et al. (2018) Reza Tourani, Satyajayant Misra, Travis Mick, and Gaurav Panwar. 2018. Security, Privacy, and Access Control in Information-Centric Networking: A Survey. IEEE Communications Surveys and Tutorials 20, 1 (Jan 2018), 556–600.
  • Vu et al. (2019) Thang X. Vu, Symeon Chatzinotas, and Bjorn Ottersten. 2019. Blockchain-based Content Delivery Networks: Content Transparency Meets User Privacy. In IEEE Wireless Communications and Networking Conference (WCNC). IEEE, 1–6.
  • Wang et al. (2022b) Huanhuan Wang, Xiao Zhang, Youbing Xia, and Xiang Wu. 2022b. A differential privacy-preserving deep learning caching framework for heterogeneous communication network systems. International Journal of Intelligent Systems 37, 12 (Aug 2022), 11142–11166.
  • Wang and Deng (2022) Kailun Wang and Na Deng. 2022. A Privacy-Protected Popularity Prediction Scheme for Content Caching Based on Federated Learning. IEEE Transactions on Vehicular Technology 71, 9 (Jun 2022), 10191–10196.
  • Wang et al. (2022a) Kailun Wang, Na Deng, and Xuanheng Li. 2022a. An Efficient Content Popularity Prediction of Privacy Preserving Based on Federated Learning and Wasserstein GAN. IEEE Internet of Things Journal 10, 5 (May 2022), 3786–3798.
  • Wang et al. (2019b) Mu Wang, Changqiao Xu, Xingyan Chen, Hao Hao, Lujie Zhong, and Shui Yu. 2019b. Differential privacy oriented distributed online learning for mobile social video prefetching. IEEE Transactions on Multimedia 21, 3 (Jan 2019), 636–651.
  • Wang et al. (2017) Tianhao Wang, Jeremiah Blocki, Ninghui Li, and Somesh Jha. 2017. Locally differentially private protocols for frequency estimation. In Proceedings of the 26th USENIX Security Symposium. USENIX Association, 729–745.
  • Wang et al. (2019a) Xiaofei Wang, Yiwen Han, Chenyang Wang, Qiyang Zhao, Xu Chen, and Min Chen. 2019a. In-edge AI: Intelligentizing mobile edge computing, caching and communication by federated learning. IEEE Network 33, 5 (Sep 2019), 156–165.
  • Wang et al. (2020) Xiaofei Wang, Chenyang Wang, Xiuhua Li, Victor C.M. Leung, and Tarik Taleb. 2020. Federated Deep Reinforcement Learning for Internet of Things with Decentralized Cooperative Edge Caching. IEEE Internet of Things Journal 7, 10 (Oct 2020), 9441–9455.
  • Wu et al. (2016) Qinghua Wu, Zhenyu Li, Gareth Tyson, Steve Uhlig, Mohamed Ali Kaafar, and Gaogang Xie. 2016. Privacy-Aware Multipath Video Caching for Content-Centric Networks. IEEE Journal on Selected Areas in Communications 34, 8 (Aug 2016), 2219–2230.
  • Wu et al. (2023) Qiong Wu, Yu Zhao, Qiang Fan, Pingyi Fan, and Jiangzhou Wang. 2023. Mobility-Aware Cooperative Caching in Vehicular Edge Computing Based on Asynchronous Federated and Deep Reinforcement Learning. IEEE Journal on Selected Topics in Signal Processing 17, 1 (Jan 2023), 66–81.
  • Xiao et al. (2018) Liang Xiao, Xiaoyue Wan, Canhuang Dai, Xiaojiang Du, Xiang Chen, and Mohsen Guizani. 2018. Security in Mobile Edge Caching with Reinforcement Learning. IEEE Wireless Communications 25, 3 (Jun 2018), 116–122.
  • Xu et al. (2020) Qichao Xu, Zhou Su, and Rongxing Lu. 2020. Game Theory and Reinforcement Learning Based Secure Edge Caching in Mobile Social Networks. IEEE Transactions on Information Forensics and Security 15 (Mar 2020), 3415–3429.
  • Xu et al. (2019) Qichao Xu, Zhou Su, Qinghua Zheng, Minnan Luo, Bo Dong, and Kuan Zhang. 2019. Game theoretical secure caching scheme in multihoming edge computing-enabled heterogeneous networks. IEEE Internet of Things Journal 6, 3 (Jun 2019), 4536–4546.
  • Xue et al. (2019) Kaiping Xue, Peixuan He, Xiang Zhang, Qiudong Xia, David S.L. Wei, Hao Yue, and Feng Wu. 2019. A Secure, Efficient, and Accountable Edge-Based Access Control Framework for Information Centric Networks. IEEE/ACM Transactions on Networking 27, 3 (Jun 2019), 1220–1233.
  • Xue et al. (2018) Kaiping Xue, Xiang Zhang, Qiudong Xia, David S.L. Wei, Hao Yue, and Feng Wu. 2018. SEAF: A Secure, Efficient and Accountable Access Control Framework for Information Centric Networking. In IEEE Conference on Computer Communications (INFOCOM). IEEE, 2213–2221.
  • Yan and Tuninetti (2021) Qifa Yan and Daniela Tuninetti. 2021. Fundamental Limits of Caching for Demand Privacy Against Colluding Users. IEEE Journal on Selected Areas in Information Theory 2, 1 (Jan 2021), 192–207.
  • Yang et al. (2019) Peng Yang, Ning Zhang, Shan Zhang, Li Yu, Junshan Zhang, and Xuemin Shen. 2019. Content Popularity Prediction Towards Location-Aware Mobile Edge Caching. IEEE Transactions on Multimedia 21, 4 (Apr 2019), 915–929.
  • Yang and Kong (2016) Qiuwei Yang and Pan Kong. 2016. RuleCache: A mobility pattern based multi-level cache approach for location privacy protection. In IEEE 22nd International Conference on Parallel and Distributed Systems (ICPADS). IEEE, 448–455.
  • Yu et al. (2018) Zhengxin Yu, Jia Hu, Geyong Min, Haochuan Lu, Zhiwei Zhao, Haozhe Wang, and Nektarios Georgalas. 2018. Federated Learning Based Proactive Content Caching in Edge Computing. In IEEE Global Communications Conference (GLOBECOM). IEEE, 1–6.
  • Yu et al. (2022) Zhengxin Yu, Jia Hu, Geyong Min, Zi Wang, Wang Miao, and Shancang Li. 2022. Privacy-Preserving Federated Deep Learning for Cooperative Hierarchical Caching in Fog Computing. IEEE Internet of Things Journal 9, 22 (May 2022), 22246–22255.
  • Yu et al. (2020) Zhengxin Yu, Jia Hu, Geyong Min, Han Xu, and Jed Mills. 2020. Proactive content caching for internet-of-vehicles based on peer-to-peer federated learning. In IEEE 26th International Conference on Parallel and Distributed Systems (ICPADS). IEEE, 601–608.
  • Yu et al. (2021) Zhengxin Yu, Jia Hu, Geyong Min, Zhiwei Zhao, Wang Miao, and M. Shamim Hossain. 2021. Mobility-Aware Proactive Edge Caching for Connected Vehicles Using Federated Learning. IEEE Transactions on Intelligent Transportation Systems 22, 8 (Aug 2021), 5341–5351.
  • Yuan et al. (2016) Xingliang Yuan, Xinyu Wang, Jinfan Wang, Yilei Chu, Cong Wang, Jianping Wang, Marie Jose Montpetit, and Shucheng Liu. 2016. Enabling secure and efficient video delivery through encrypted in-network caching. IEEE Journal on Selected Areas in Communications 34, 8 (Aug 2016), 2077–2090.
  • Zeng et al. (2020) Yiming Zeng, Yaodong Huang, Ji Liu, and Yuanyuan Yang. 2020. Privacy-preserving distributed edge caching for mobile data offloading in 5G networks. In IEEE 40th International Conference on Distributed Computing Systems (ICDCS). IEEE, 541–551.
  • Zeng et al. (2021) Yiming Zeng, Yaodong Huang, Zhenhua Liu, Ji Liu, and Yuanyuan Yang. 2021. Privacy-Preserving Decentralized Edge Caching in 5G Networks. In IEEE 14th International Conference on Cloud Computing (CLOUD). 189–199.
  • Zhang et al. (2022b) Chi Zhang, Haisheng Tan, Guopeng Li, Zhenhua Han, Shaofeng H.C. Jiang, and Xiang Yang Li. 2022b. Online File Caching in Latency-Sensitive Systems with Delayed Hits and Bypassing. In IEEE Conference on Computer Communications (INFOCOM). IEEE, 1059–1068.
  • Zhang et al. (2023) Shiwen Zhang, Biao Hu, Wei Liang, Kuan-Ching Li, and Brij B. Gupta. 2023. A Caching-based Dual K-Anonymous Location Privacy-Preserving Scheme for Edge Computing. IEEE Internet of Things Journal 10, 11 (Jan 2023), 9768–9781.
  • Zhang et al. (2019) Shaobo Zhang, Xiong Li, Zhiyuan Tan, Tao Peng, and Guojun Wang. 2019. A caching and spatial K-anonymity driven privacy enhancement scheme in continuous location-based services. Future Generation Computer Systems 94 (May 2019), 40–50.
  • Zhang et al. (2018) Xinyue Zhang, Jingyi Wang, Hongning Li, Yuanxiong Guo, Qingqi Pei, Pan Li, and Miao Pan. 2018. Data-Driven Caching with Users’ Local Differential Privacy in Information-Centric Networks. In IEEE Global Communications Conference (GLOBECOM). IEEE, 1–6.
  • Zhang et al. (2022c) Xiaoyu Zhang, Hong Zhong, Chunyang Fan, Irina Bolodurina, and Jie Cui. 2022c. CBACS: A Privacy-Preserving and Efficient Cache-Based Access Control Scheme for Software Defined Vehicular Networks. IEEE Transactions on Information Forensics and Security 17 (May 2022), 1930–1945.
  • Zhang et al. (2022d) Xianzhi Zhang, Yipeng Zhou, Di Wu, Miao Hu, Xi Zheng, Min Chen, and Song Guo. 2022d. Optimizing Video Caching at the Edge: A Hybrid Multi-Point Process Approach. IEEE Transactions on Parallel and Distributed Systems 33, 10 (Oct 2022), 2597–2611.
  • Zhang et al. (2022a) Zizhen Zhang, Tengfei Cao, Xiaoying Wang, Han Xiao, and Jianfeng Guan. 2022a. VC-PPQ: Privacy-preserving Q-learning Based Video Caching Optimization in Mobile Edge Networks. IEEE Transactions on Network Science and Engineering 9, 6 (Aug 2022), 4129–4144.
  • Zhao et al. (2020) Bo Zhao, Konda Reddy Mopuri, and Hakan Bilen. 2020. iDLG: Improved Deep Leakage from Gradients. arXiv:2001.02610 [cs.LG]
  • Zheng et al. (2021) Chong Zheng, Shengheng Liu, Yongming Huang, and Tony Q.S. Quek. 2021. Privacy-Preserving Federated Reinforcement Learning for Popularity-Assisted Edge Caching. In IEEE Global Communications Conference (GLOBECOM). IEEE, 1–6.
  • Zheng et al. (2022) Chong Zheng, Shengheng Liu, Yongming Huang, Wei Zhang, and Luxi Yang. 2022. Unsupervised Recurrent Federated Learning for Edge Popularity Prediction in Privacy-Preserving Mobile-Edge Computing Networks. IEEE Internet of Things Journal 9, 23 (2022), 24328–24345.
  • Zhong et al. (2021) Yuqing Zhong, Zhaohua Li, and Liping Liao. 2021. A Privacy-Preserving Caching Scheme for Device-to-Device Communications. Security and Communication Networks 2021 (Jan 2021), 10958.
  • Zhou et al. (2019) Pan Zhou, Kehao Wang, Jie Xu, and Dapeng Wu. 2019. Differentially-private and trustworthy online social multimedia big data retrieval in edge computing. IEEE Transactions on Multimedia 21, 3 (Mar 2019), 539–554.
  • Zhou et al. (2023) Yipeng Zhou, Xuezheng Liu, Yao Fu, Di Wu, Jessie Hui Wang, and Shui Yu. 2023. Optimizing the Numbers of Queries and Replies in Convex Federated Learning with Differential Privacy. IEEE Transactions on Dependable and Secure Computing Early Access (Jan 2023), 1–15.
  • Zhu et al. (2021) Pengcheng Zhu, Jun Xu, Jiamin Li, Dongming Wang, and Xiaohu You. 2021. Learning-Empowered Privacy Preservation in beyond 5G Edge Intelligence Networks. IEEE Wireless Communications 28, 2 (Apr 2021), 12–18.