Fuzzy Q-learning-based Opportunistic Communication for MEC-enhanced Vehicular Crowdsensing

Trung Thanh Nguyen^2, Truong Thao Nguyen^3, Thanh-Hung Nguyen^{2,1}, Phi Le Nguyen^{2,1}
^2 School of Information and Communication Technology, Hanoi University of Science and Technology, Hanoi, Vietnam. E-mail: {thanh.nt176874@sis, hungnt@soict, lenp@soict}.hust.edu.vn
^3 The National Institute of Advanced Industrial Science and Technology (AIST), Japan. E-mail: [email protected]
^1 Corresponding authors.
Abstract

This study focuses on MEC-enhanced, vehicle-based crowdsensing systems that rely on devices installed on automobiles. We investigate an opportunistic communication paradigm in which devices can transmit measured data directly to a crowdsensing server over a 4G communication channel, or via Wi-Fi to nearby devices or to so-called Road Side Units (RSUs) positioned along the road. We tackle a new problem: how to reduce the cost of 4G communication while preserving latency. To this end, we propose an offloading strategy that combines a reinforcement learning technique known as Q-learning with Fuzzy logic. Q-learning helps devices learn to choose the communication channel, while Fuzzy logic is used to optimize the reward function in Q-learning. The experimental results show that our offloading method cuts the 4G communication cost by around 30-40% while keeping the latency of 99% of packets below the required threshold.

Index Terms:
Vehicle-based mobile crowdsensing, MEC, Opportunistic Communication, Reinforcement learning, Fuzzy logic.

I Introduction

With substantial advances in sensing, communication, and mobile computing technologies in recent years, a new computing and sensing paradigm named mobile crowdsensing has proven to be an effective solution for the capillary gathering of huge quantities of information in densely populated regions [1]. In a crowdsensing system, participants equipped with sensing and computing capabilities work collaboratively to collect, share, and extract data about a phenomenon of common interest. In the past, crowdsensing systems mainly relied on mobile devices such as smartphones. Nowadays, vehicles with increasing sensing, computing, and storage capabilities have emerged as viable alternatives for mobile crowdsensing systems. Numerous vehicle-based crowdsensing applications have been proposed, including traffic monitoring and prediction and advertisement dissemination [2, 3, 4]. The conventional crowdsensing paradigm is a centralized cloud-based method in which data is sent from participants to a cloud server over a broadband network [5]. This approach, however, generates significant traffic on the network and a heavy computation burden on the cloud; thus, it cannot efficiently support real-time and large-scale mobile crowdsensing systems. One solution is to deploy Mobile Edge Computing (MEC) servers at Roadside Units (RSUs) that are close to vehicles. Thanks to this closeness, MEC can help collect data from vehicles quickly and reduce the load on the radio access network. In the literature, significant efforts have been devoted to various aspects of MEC-enhanced vehicle-based mobile crowdsensing systems. The authors in [6] focused on participant selection and task offloading problems. Liu et al. in [7] leveraged a meta-heuristic to address network selection and traffic allocation. The authors in [8] proposed a hierarchical task allocation framework that consists of two tiers: cloud and edge. 
The cloud layer evaluates participants’ reputations and offers the most promising candidates to the edge layer. The edge layer then contacts the participants and optimizes the task allocation. In [9], the authors introduced a quality-aware sparse data collecting technique. The goal is to guarantee spatiotemporal coverage while minimizing redundant data. The main idea is to leverage the correlation among sensing data to identify the smallest subset of grids to allocate tasks. Using correctly acquired data, the cloud server then infers the missing values. Zhao in [10] designed an optimal sensing strategy for all vehicles.

Different from existing works, this study considers a novel problem: minimizing the communication budget while maintaining the freshness of information in MEC-enhanced Vehicle-based Mobile Crowdsensing (MVMC) systems. Specifically, we focus on MVMC systems in which the sensory data from vehicles are opportunistically transferred to the server via three communication routes: (1) sending directly to the server through a cellular network such as 4G, (2) transferring to a roadside unit (RSU) via Wi-Fi and then transmitting from the RSU to the server over the wired network, and (3) relaying to a nearby vehicle via Wi-Fi and then following the nearby vehicle's policy to transfer to the server. The network model is illustrated in Fig. 3. Our goal is to ensure data freshness while reducing transmission expenses. We define data freshness as the amount of time between when the data was generated and when the cloud server collected it. We then aim to lessen the fraction of packets whose freshness exceeds a specified threshold. Furthermore, we assume that the 4G network is widespread and that the device can always send data to the cloud in real time through 4G. On the other hand, Wi-Fi has a limited coverage area, so the device cannot always communicate data to the RSU or other devices. The drawback of 4G compared with Wi-Fi is that the cost per unit of data communicated over a 4G network is substantially higher (see Table I). As a result, we should design an offloading mechanism that minimizes the number of packets sent directly from the device to the server (to save expenses) while maintaining packet freshness. Additionally, 4G transmission uses more energy than Wi-Fi. For instance, we performed an experiment to determine how much energy an air quality monitoring device consumes when communicating through Wi-Fi or 4G. The experimental findings in Table II indicate that 4G consumes about 1.3 times as much energy as Wi-Fi. 
We name our targeted problem OCVC (Opportunistic Communication for Vehicle-based mobile Crowdsensing). The OCVC problem asks to minimize the use of 4G communication while guaranteeing that the information latency does not exceed a threshold. Here, information latency is defined as the time interval from when the data is collected until it reaches the server.
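
The two quantities that OCVC trades off can be made concrete with a small sketch. The `Packet` record, its field names, and the sample values below are hypothetical and serve only as an illustration; the sketch simply counts packets delivered over 4G and measures the fraction of packets whose latency exceeds the threshold.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    generated_at: float  # time the data was measured (hypothetical unit: seconds)
    received_at: float   # time the server received the data
    via_4g: bool         # True if the packet reached the server over 4G


def ocvc_metrics(packets, delta):
    """Return (number of 4G packets, fraction of packets with latency > delta)."""
    if not packets:
        return 0, 0.0
    n_4g = sum(p.via_4g for p in packets)
    n_late = sum((p.received_at - p.generated_at) > delta for p in packets)
    return n_4g, n_late / len(packets)


# Illustrative data only: latencies are 2.0, 8.0, 1.5 and 11.0 seconds.
packets = [
    Packet(0.0, 2.0, via_4g=True),
    Packet(1.0, 9.0, via_4g=False),
    Packet(3.0, 4.5, via_4g=False),
    Packet(5.0, 16.0, via_4g=False),
]
n_4g, late_fraction = ocvc_metrics(packets, delta=10.0)
print(n_4g, late_fraction)
```

An offloading policy is better in the OCVC sense when it lowers the first number without pushing the second above the tolerated level.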

TABLE I: Average prices of 4G and wired network communication around the world

Region               4G communication cost     Wired network communication cost
                     ($ per 1 GB of data)      ($ per month)
Global               4.07                      57.07
Asia                 1.79                      40.29
Baltics              2.09                      19.19
Caribbean            4.44                      78.44
Central America      2.40                      43.87
CIS (Former USSR)    2.84                      13.96
Eastern Europe       4.64                      19.90
Near East            3.94                      60.62
Northern Africa      1.53                      22.41
Northern America     8.21                      89.44
Oceania              5.51                      85.14
South America        5.52                      55.17
Sub-Saharan Africa   6.44                      77.70
Western Europe       2.47                      49.56
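
As a rough reading of Table I, the sketch below compares a volume-based 4G bill with the flat monthly wired price. The monthly data volume of 10 GB and the notion of a "break-even volume" are illustrative assumptions rather than quantities taken from the study; only three regions from the table are included.

```python
# Prices copied from Table I, in USD: (4G cost per GB, wired cost per month).
PRICES = {
    "Global": (4.07, 57.07),
    "Asia": (1.79, 40.29),
    "Northern America": (8.21, 89.44),
}


def cost_4g(region, gigabytes):
    """4G bill for sending `gigabytes` of data in the given region."""
    per_gb, _ = PRICES[region]
    return per_gb * gigabytes


def break_even_gb(region):
    """Monthly volume at which the 4G bill equals the flat wired price."""
    per_gb, wired_monthly = PRICES[region]
    return wired_monthly / per_gb


for region in PRICES:
    print(f"{region}: 10 GB over 4G costs ${cost_4g(region, 10):.2f}, "
          f"break-even at {break_even_gb(region):.1f} GB/month")
```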

To the best of our knowledge, this study is an early attempt to minimize the communication budget while maintaining the freshness of information in vehicular mobile crowdsensing. Our idea is to exploit Q-learning combined with Fuzzy logic. In our Q-learning paradigm, each crowdsensing device is considered an agent that maintains its own Q table. We split the whole timeline into discrete time units called time slots. Packets generated by a device are temporarily stored in the device's buffer. At every time slot, the agent uses the proposed Q-learning model to perform one of the following actions: 1) storing data in the local buffer, 2) transferring to the RSU, 3) relaying to a neighbor device, and 4) transmitting to the server. Each Q table entry holds a Q value that reflects how well an action performs in a particular state. At each time slot, the agent chooses an action based on the Q values (usually, actions with a higher Q value have a higher chance of being selected). When the agent performs an action, the environment provides feedback indicating how beneficial the action was. This goodness is quantified by a so-called reward, which is used to update the Q table. Q-learning selects actions so as to maximize the cumulative reward. Consequently, our reward function is designed to encourage actions that help reduce 4G communication costs while preserving packet freshness. Furthermore, we exploit Fuzzy logic to adapt several hyperparameters in the reward function, making it more resistant to environmental changes. Our contributions are as follows:

  • We formulate the opportunistic communication in MEC-enhanced vehicle-based mobile crowdsensing systems.

  • We employ Q-learning to propose an offloading strategy that reduces communication costs while maintaining data freshness. Furthermore, to enhance the efficiency of the Q-learning-based offloading algorithm, we adopt Fuzzy logic to adjust the Q-learning hyperparameters adaptively.

  • We perform extensive experiments to evaluate the impacts of various network configurations on the performance of the proposed algorithm and compare it with several baselines. The numerical results show that our proposed protocol outperforms the baselines.

The remainder of the paper is organized as follows. We briefly introduce the related works and give an overview of Q-learning and Fuzzy logic in Sections II and III. Section IV describes our proposed protocol in detail. We present the numerical results in Section V and conclude the paper in Section VI.

TABLE II: Energy consumption of an air quality monitoring device consisting of sensors measuring PM2.5, PM10, NO2, CO2, SO2, humidity, and temperature

Parameter                          4G        Wi-Fi
Supply voltage (V)                 7.4       7.4
Average current consumption (mA)   305.68    236.56
Max current consumption (mA)       707.42    441.54
Average power consumption (W)      2.26      1.75
Max power consumption (W)          5.24      3.28
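
The average power figures in Table II can be cross-checked from the supply voltage and the average current. The sketch below assumes the simple DC relation P = V × I, which is our assumption and not something stated in the measurement setup; under it, the averages come out at about 2.26 W and 1.75 W, a ratio of roughly 1.29.

```python
SUPPLY_VOLTAGE_V = 7.4
AVG_CURRENT_MA = {"4G": 305.68, "WiFi": 236.56}  # values from Table II


def average_power_w(current_ma, voltage_v=SUPPLY_VOLTAGE_V):
    """Power in watts, assuming P = V * I with the current given in mA."""
    return voltage_v * current_ma / 1000.0


p_4g = average_power_w(AVG_CURRENT_MA["4G"])
p_wifi = average_power_w(AVG_CURRENT_MA["WiFi"])
ratio = p_4g / p_wifi
print(f"4G: {p_4g:.2f} W, Wi-Fi: {p_wifi:.2f} W, ratio: {ratio:.2f}")
```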

II Related work

The OCVC problem can be categorized as an offloading problem in V2X (i.e., Vehicle-to-Everything) networks. As such, we begin with an overview of current research addressing various aspects concerning MEC-enhanced mobile crowdsensing applications. Following that, we review the literature on MEC’s offloading issue.

In [6], the authors defined Local Edge Nodes (LENs) and a Main Edge Node (MEN) that are responsible for selecting workers available in the area of interest. They then proposed an offloading mechanism to offload sensory data from the workers to the identified LENs. Liu et al. in [7] leveraged a meta-heuristic to address the network selection and traffic allocation problem. The objective is to maximize the users' transmission capability while minimizing the transmission delay. The authors first presented the mathematical formulation and then applied particle swarm optimization (PSO) to provide a sub-optimal solution. The authors in [8] proposed a hierarchical task allocation framework. First, the cloud layer evaluates the participants' reputations based on various data and sends the optimal subset of participants to the edge layer. The edge layer then interacts with the participants and performs task-specific optimizations. In [9], the authors presented a data collection method aimed at minimizing data redundancy while maintaining the spatiotemporal coverage of the sensing grids. To do this, the authors used correlations between sensing data to determine which user group should be selected. Additionally, the compressive sensing technique was utilized to recover data for the whole sensing region. In [11], the authors proposed a novel mobile crowdsensing paradigm in which mobile users can act as mobile MEC nodes. The authors designed a probabilistic model and an algorithm to select appropriate users to act as mobile MEC nodes. The authors in [12] provided a crowdsensing-assisted vehicular distributed computing mechanism to update a high-definition map (HD map) for autonomous driving. In addition, the authors proposed a heuristic best-effort algorithm for crowdsensing node selection and task allocation with the goal of minimizing communication load. Xu et al. in [13] investigated the data uploading problem in mobile crowdsensing systems. 
They proposed a mechanism in which multiple edge nodes collaborate so that a team of edge nodes is matched with a sensing worker to satisfy the demand. The authors first proved the NP-hardness of the targeted problem and then introduced an algorithm based on Lagrangian relaxation. In [7], the authors addressed the cost reduction issue in MEC-enhanced vehicular crowdsensing systems in which users carry various kinds of data. They proposed a mechanism for determining which server should be enabled for processing each data type. The authors in [14] focused on collaborative mobile crowdsensing systems, where users are divided into groups and collaboratively exchange information inside each group. Each group has one owner that is responsible for gathering data from the other members and forwarding it to the collector. The authors then proposed three grouping algorithms: static grouping, PoI grouping, and dynamic grouping. The experimental results show that users can save a large amount of energy by using the proposed grouping methods. Gong et al. in [15] studied the data offloading issue in opportunistic social networks, wherein mobile users may decide to offload data using their cellular interface or relay it to surrounding users to reduce costs. The authors formulated the targeted problem mathematically and then proposed online and offline solutions. The numerical findings showed that taking advantage of opportunistic offloading between users may significantly reduce costs. K. Zhang et al. concentrated on optimizing energy usage in MEC-enabled 5G networks in [16]. The authors provided a mathematical model for the problem and proposed an approximation approach. The works in [17, 18, 19] addressed the offloading decision for collaborative task execution between platoons and a MEC server. Both [17] and [18] considered how to determine the location of task execution: on the vehicle itself, on other platoon members, or on an associated MEC server. 
However, [17] focused on minimizing the offloading cost, while [18] aimed at reducing the average energy consumption. The authors in [19] proposed a federated offloading method that exploits horizontal offloading paths between vehicles, with the objective of minimizing total latency. In [20], the authors aimed at minimizing the power consumption of MEC servers and vehicles. Zhao et al. recently utilized MEC and cloud computing resources simultaneously for offloading [21]. In that work, vehicles could offload their computation tasks to a MEC server or the cloud via RSUs. The objective was to maximize the system's utility by optimizing both the offloading strategy and resource allocation. In [22], Y. Lin et al. addressed the traffic and capacity allocation problem in a three-tier model and proposed an optimization algorithm consisting of two phases: adjusting the capacity allocation and optimizing the traffic allocation. Their objective was to minimize total capacity while guaranteeing that at least a certain portion of the traffic satisfies the latency constraints. Nguyen et al. in [23] proposed a 3-tier offloading model that leverages both MEC and cloud computing. The authors first provided an explicit theoretical model to formulate the average task processing latency. They then proposed a meta-heuristic approach to determine sub-optimal offloading probabilities for the vehicles. In [24], the authors proposed an offloading protocol aimed at reducing communication costs while satisfying the packet latency constraint.

Unlike previous research, we use three communication planes simultaneously, namely, vehicle-to-cloud, vehicle-to-RSU, and vehicle-to-vehicle, with the goal of reducing 4G communication costs while maintaining information freshness.

III Preliminaries

In this section, we describe two techniques that will be used in our solution, namely Q-learning and Fuzzy logic.

III-A Q-learning

Figure 1: Q-learning overview.

Q-learning is a reinforcement learning technique that has been extensively utilized to tackle decision-making problems. Reinforcement learning is mainly based on the trial-and-error paradigm. A reinforcement learning framework comprises five major components: environment, agent, action, state, and reward. The agent performs an action at each state and interacts with the environment. The environment then responds with a signal indicating the efficacy of the action. This goodness is quantified by a so-called reward. Based on the reward, the agent accumulates experience from previous actions and progressively improves its actions to maximize the cumulative reward. The Q-learning model chooses actions based on the so-called Q values stored in a Q table. After each action, the agent updates the Q table using the following Bellman equation.

$$
Q(S_t, A_t) \leftarrow (1-\alpha)\, Q(S_t, A_t) + \alpha \left[ \mathcal{R}_t + \gamma \max_{a} Q(S_{t+1}, a) \right], \tag{1}
$$

where $S_t$ and $S_{t+1}$ denote the states at time slots $t$ and $t+1$, respectively; $A_t$ represents the action performed at time slot $t$; $\mathcal{R}_t$ and $Q(S_t, A_t)$ denote the reward and the Q value when performing action $A_t$ at state $S_t$; and $\max_{a} Q(S_{t+1}, a)$ is the maximum Q value over all possible actions $a$ at the next state $S_{t+1}$. In addition, $\alpha$ and $\gamma$ are two hyperparameters named the learning rate and the discount rate, respectively. Both hyperparameters range from $0$ to $1$.
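
The update in Eq. (1) translates almost directly into code. The following is a minimal tabular sketch, not the authors' implementation: the dictionary-based Q table, the state labels `"s0"` and `"s1"`, the action names, and the values of `alpha` and `gamma` are all illustrative assumptions.

```python
from collections import defaultdict

# Hypothetical action labels mirroring the four actions described in Section I.
ACTIONS = ["store", "to_rsu", "to_neighbor", "to_server"]


def q_update(q_table, state, action, reward, next_state, actions,
             alpha=0.5, gamma=0.9):
    """Apply Eq. (1) to one (state, action) entry and return the new Q value."""
    best_next = max(q_table[(next_state, a)] for a in actions)
    old_value = q_table[(state, action)]
    q_table[(state, action)] = (1 - alpha) * old_value + alpha * (reward + gamma * best_next)
    return q_table[(state, action)]


q = defaultdict(float)        # unseen entries start at 0.0
q[("s1", "to_rsu")] = 2.0     # pretend the agent has already learned this value
new_value = q_update(q, "s0", "to_rsu", reward=1.0, next_state="s1", actions=ACTIONS)
print(new_value)  # 0.5 * 0 + 0.5 * (1.0 + 0.9 * 2.0), i.e. about 1.4
```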

III-B Fuzzy logic

Figure 2: Fuzzy logic systems architecture.

The Fuzzy logic [25] architecture depicted in Fig. 2 consists of four main components: Fuzzification module, Knowledge base, Inference engine, and Defuzzification module.

III-B1 Fuzzification Module

The fuzzification module converts the crisp values of the control inputs into fuzzy values. A fuzzy variable has values, which are defined by linguistic variables (fuzzy sets or subsets) such as low, medium, high, where each is defined by a gradually varying membership function.
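
A minimal sketch of fuzzification is given below. It assumes triangular membership functions and three fuzzy sets over an input normalized to [0, 1]; both the shapes and the breakpoints are hypothetical choices for illustration, since the membership functions are not specified at this point in the text.

```python
def triangular(x, left, peak, right):
    """Membership degree of x in a triangular fuzzy set (left < peak < right)."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


# Hypothetical fuzzy sets: (left, peak, right) for each linguistic value.
FUZZY_SETS = {
    "low": (-0.5, 0.0, 0.5),
    "medium": (0.0, 0.5, 1.0),
    "high": (0.5, 1.0, 1.5),
}


def fuzzify(x):
    """Map a crisp input to its membership degree in every fuzzy set."""
    return {name: triangular(x, *params) for name, params in FUZZY_SETS.items()}


print(fuzzify(0.25))  # partly "low" and partly "medium", not "high"
```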

III-B2 Knowledge Base

The knowledge base stores IF-THEN rules provided by experts. The expert knowledge is a collection of Fuzzy membership functions and a set of Fuzzy rules having the form: IF (conditions are fulfilled) THEN (consequences are inferred). More explicitly, a Fuzzy rule $\mathfrak{R}_i$ with $k$ inputs and one output can be represented as follows.

$$
\mathfrak{R}_i:\ \mathbf{IF}\ (I_1 \text{ is } A_{i1})\ \Theta\ (I_2 \text{ is } A_{i2})\ \Theta \ldots \Theta\ (I_k \text{ is } A_{ik})\ \mathbf{THEN}\ (O \text{ is } B_i), \tag{2}
$$

where $\{I_1, \cdots, I_k\}$ represents the crisp inputs to the rule. $\{A_{i1}, \cdots, A_{ik}\}$ and $B_i$ are linguistic variables. The operator $\Theta$ can be AND, OR, or NOT.

III-B3 Inference Engine

The inference engine deduces the Fuzzy control actions by employing Fuzzy implication and Fuzzy rules of inference. It calculates the membership degree ($\mu$) of the output for all linguistic variables by applying the rule set described in the Knowledge Base. For Fuzzy rules with many inputs, the output calculation depends on the operators used inside it. The calculation for each type of operator is described as follows:

$$
\begin{aligned}
(I_i\ \mathrm{is}\ A_i\ \mathbf{AND}\ I_j\ \mathrm{is}\ A_j):&\quad \mu_{A_i \cap A_j}(I_{ij}) = \min\left(\mu_{A_i}(I_i), \mu_{A_j}(I_j)\right) \\
(I_i\ \mathrm{is}\ A_i\ \mathbf{OR}\ I_j\ \mathrm{is}\ A_j):&\quad \mu_{A_i \cup A_j}(I_{ij}) = \max\left(\mu_{A_i}(I_i), \mu_{A_j}(I_j)\right) \\
(\mathbf{NOT}\ I_i\ \mathrm{is}\ A_i):&\quad \mu_{\bar{A_i}}(I_i) = 1 - \mu_{A_i}(I_i)
\end{aligned}
\tag{3}
$$
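
The three operators in Eq. (3) can be sketched as plain functions on membership degrees. The input degrees and the example rule below are hypothetical and only show how a rule's firing strength would be obtained from its antecedents.

```python
def fuzzy_and(mu_a, mu_b):
    """AND: minimum of the two membership degrees."""
    return min(mu_a, mu_b)


def fuzzy_or(mu_a, mu_b):
    """OR: maximum of the two membership degrees."""
    return max(mu_a, mu_b)


def fuzzy_not(mu_a):
    """NOT: complement of the membership degree."""
    return 1.0 - mu_a


# Hypothetical antecedent degrees for a rule of the form
# IF (I1 is A1) AND NOT (I2 is A2) THEN (O is B).
mu_i1_is_a1 = 0.7
mu_i2_is_a2 = 0.4
rule_strength = fuzzy_and(mu_i1_is_a1, fuzzy_not(mu_i2_is_a2))
print(rule_strength)  # min(0.7, 1 - 0.4), i.e. about 0.6
```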

III-B4 Defuzzification Module

The defuzzification module translates Fuzzy control values into crisp numbers, that is, it links a single point to a Fuzzy set, given that the point belongs to the support of the Fuzzy set. The most well-known defuzzification technique is the centre-of-area (COA) or centre-of-gravity (COG). For a continuous membership function, the defuzzified value, denoted as $x^*$, using COG is defined as:

$$
x^* = \frac{\int x \cdot \mu_A(x)\, dx}{\int \mu_A(x)\, dx}, \tag{4}
$$

where $\mu_A(x)$ is the output membership of the linguistic variable $A$.
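
In practice the integrals in Eq. (4) are usually approximated numerically. The sketch below samples the membership function on a uniform grid, so the grid spacing cancels between numerator and denominator; the triangular output set and the grid resolution are assumptions made for illustration.

```python
def centroid(xs, mus):
    """Approximate Eq. (4) on a uniform grid of points xs with memberships mus."""
    denominator = sum(mus)
    if denominator == 0:
        raise ValueError("membership is zero everywhere; the centroid is undefined")
    numerator = sum(x * mu for x, mu in zip(xs, mus))
    return numerator / denominator


def triangle(x, left, peak, right):
    """Hypothetical triangular output membership function."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


xs = [i / 100 for i in range(101)]            # uniform grid on [0, 1]
mus = [triangle(x, 0.0, 0.5, 1.0) for x in xs]
x_star = centroid(xs, mus)
print(x_star)  # close to 0.5, the centre of the symmetric triangle
```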

IV Fuzzy Q-learning-based Opportunistic Communication

IV-A Network Model

Figure 3: Network model.

Figure 3 depicts our network architecture, which comprises three parts: crowdsensing devices, a crowdsensing server, and roadside units. Crowdsensing devices are sensory data-gathering devices that are installed on buses. They include 4G and Wi-Fi network interfaces. The crowdsensing server, located in the cloud, is responsible for collecting data from devices. RSUs are roadside processing units equipped with communication and computation capabilities. The data acquired from crowdsensing devices will be sent to the crowdsensing server in one of three ways:

  1. The devices use a 4G communication channel to send sensory data directly to the server.

  2. The devices use Wi-Fi to relay data to the RSUs; the RSUs then deliver the data to the server through the wired network.

  3. Using a Wi-Fi channel, devices may communicate data to a neighboring device. The neighboring device then processes the packet, i.e., the packet can be passed to an RSU, forwarded directly to the server, or forwarded to another device.

We assume that the 4G channel is always accessible; therefore, the first transmission pathway is always feasible. Furthermore, since Wi-Fi has a limited communication range, the second and third pathways can only be used when the current device is within the coverage area of an RSU or another device. As shown in Table I, the communication charge for 4G is much higher than that of the wired network, and Wi-Fi is usually free. Therefore, this study aims to propose an offloading protocol that leverages the three transmission routes mentioned above such that the total number of packets delivered over the 4G channel is minimized while ensuring that the data latency does not exceed a certain threshold $\delta$. Here, the term "data latency" refers to the time it takes for data to reach the server from when it is measured. To ease the presentation, we use the following notation hereafter. We assume that there are $n$ crowdsensing devices mounted on $n$ buses. Each device $D_i$ $(i = 1, \dots, n)$ has a computing capacity of $C^*_i$ and a transmission range of $r_{D_i}$. We also assume that there are $m$ RSUs, denoted as $R_j$ $(j = 1, \dots, m)$, each of which has a transmission range of $r_{R_j}$.
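
The feasibility conditions above can be sketched as a simple range check. The code assumes 2-D positions, Euclidean distance, and that a Wi-Fi link is usable when the device lies within the other party's transmission range; these modelling choices, along with the function and label names, are illustrative assumptions rather than details given in the text.

```python
import math


def within_range(pos_a, pos_b, radius):
    """True if the Euclidean distance between the two positions is at most radius."""
    return math.dist(pos_a, pos_b) <= radius


def feasible_pathways(device_pos, rsus, neighbors):
    """List usable pathways; rsus and neighbors are lists of ((x, y), range)."""
    pathways = ["4G"]  # the 4G channel is assumed to be always accessible
    if any(within_range(device_pos, pos, r) for pos, r in rsus):
        pathways.append("RSU")
    if any(within_range(device_pos, pos, r) for pos, r in neighbors):
        pathways.append("neighbor")
    return pathways


# Illustrative layout: one RSU 50 m away (range 60 m), one bus 100 m away (range 50 m).
routes = feasible_pathways((0.0, 0.0), rsus=[((30.0, 40.0), 60.0)],
                           neighbors=[((100.0, 0.0), 50.0)])
print(routes)
```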

TABLE III: Notations

Notation                 Description
$n$                      the number of crowdsensing devices
$m$                      the number of RSUs
$D_i$                    the $i$-th crowdsensing device
$r_{D_i}$                the transmission range of $D_i$
$C^*_i$                  the maximum computing capacity of $D_i$
$R_j$                    the $j$-th RSU
$r_{R_j}$                the transmission range of $R_j$
$\delta$                 the data latency threshold
$\mathcal{S}_i$          the state space of agent $D_i$
$S_i(t)$                 the state of agent $D_i$ at time slot $t$
$\mathcal{A}_i$          the action space of agent $D_i$
$A_i(t)$                 the action taken by agent $D_i$ at time slot $t$
$\mu_i(t)$               the timing at which agent $D_i$ generates the last data at time slot $t$
$c_i(t)$                 the remaining computing resource of $D_i$ at time slot $t$
$c_{\mathcal{N}}(t)$     the remaining computing resource of the nearest device at time slot $t$
$\Delta_i(t)$            the time interval from when $D_i$ generates the last data until time slot $t$, $\Delta_i(t) = t - \mu_i(t)$
$\Delta c(t)$            the difference in remaining capacity between $D_i$ and the nearest device at time slot $t$, $\Delta c(t) = c_{\mathcal{N}}(t) - c_i(t)$
$\theta$                 the priority factor

IV-B Q-learning based modeling

In this section, we first define the state and action spaces of the Q-learning-based model in Section IV-B1. We then propose a novel reward function for the OCVC problem in Section IV-B2. In our Q-learning-based model, the network is considered the environment, while each crowdsensing device is an agent. We adopt a distributed approach in which each monitoring device runs its own Q-learning-based model. For ease of reading, we summarize the notations in Table III.

IV-B1 State and Action Space

For each device $D_i$, the state at a time slot $t$ is a quadruple consisting of the following items:

  • $\mu_i(t)$: the time at which $D_i$ generated its last packet; the time elapsed since then is $\Delta_i(t) = t - \mu_i(t)$.

  • $c_i(t)$: the remaining computing resource of $D_i$ at time slot $t$.

  • $c_{\mathcal{N}}(t)$: the remaining resource of the nearest device $\mathcal{N}$, if $\mathcal{N}$ is in the communication range of $D_i$.

  • $\mathcal{N}_i^R(t)$: a binary variable indicating whether $D_i$ is in the communication range of an RSU.

Furthermore, the first three entries of a state are rounded as follows: $\mu_i(t)$ is rounded to whole time steps, and $c_i(t)$ and $c_{\mathcal{N}}(t)$ are both rounded to whole megabytes. In this way, the state vector is discretized and the state space is finite.

A crowdsensing device can conduct one of the following actions at each time slot $t$:

  i) Keeping the data in the local queue,

  ii) Sending the data directly to the crowdsensing server via the 4G communication channel,

  iii) Sending the data to the nearest RSU, if $D_i$ is in the communication range of an RSU,

  iv) Sending the data to the nearest device, if they are in the communication range of each other.

Because both the states and the actions take discrete values, the size of the Q table is fixed. As a result, we can use a simple sequential search to retrieve an entry in the Q table.
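The discretization and the table layout described above can be sketched as follows. This is only an illustration under stated assumptions: the names `Action`, `discretize_state`, and `make_q_table` are hypothetical, resources are assumed to be given in megabytes, and a dictionary keyed by the state tuple is used here in place of the sequential search mentioned in the text.

```python
from collections import defaultdict
from enum import IntEnum


class Action(IntEnum):
    """The four actions of Section IV-B1 (names are illustrative)."""
    KEEP = 0        # i) keep the data in the local queue
    TO_SERVER = 1   # ii) send to the crowdsensing server over 4G
    TO_RSU = 2      # iii) send to the nearest RSU
    TO_DEVICE = 3   # iv) send to the nearest device


def discretize_state(mu, c_own, c_nearest, in_rsu_range, slot_len=1.0):
    """Round a raw observation into a hashable state tuple.

    mu is rounded to whole time slots; the two resource values
    (assumed to be in megabytes) are rounded to whole megabytes.
    """
    return (
        int(round(mu / slot_len)),
        int(round(c_own)),
        int(round(c_nearest)),
        int(bool(in_rsu_range)),
    )


def make_q_table():
    """Q table with one zero-initialised row of |A| values per visited state."""
    return defaultdict(lambda: [0.0] * len(Action))


q = make_q_table()
s = discretize_state(mu=12.4, c_own=17.6, c_nearest=3.2, in_rsu_range=True)
q[s][Action.TO_RSU] += 1.0
```

A dictionary gives constant-time lookups on average, whereas a sequential scan would be linear in the number of stored entries; either choice leaves the learning rule itself unchanged.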

IV-B2 Reward Function

We denote by $\mathcal{R}_i(t)$ the reward received when device $D_i$ performs action $A_i(t)$. Our goal is to minimize the total amount of data transmitted by 4G while guaranteeing that the data latency does not exceed a predefined threshold $\delta$. For each type of action, the action's goodness is reflected by a different indicator. Therefore, instead of defining a single formula for the reward function, we break it down into multiple cases as follows.

(6)
(8)
(9)
(11)
(13)

where $\Delta_i(t) = t - \mu_i(t)$ is the time elapsed since the data was collected, $\Delta c(t) = c_{\mathcal{N}}(t) - c_i(t)$, in which $c_{\mathcal{N}}(t)$ is the remaining resource of the nearest device, and $p$ is a large positive number. $\theta$ is a parameter in the range $[0,1]$, which we call the priority factor. The value of $\theta$ is determined by Fuzzy logic, as described in Section IV-C. The rationale behind the reward function is as follows. First, when an action exhausts the resource of the current device, or when the data's latency exceeds the threshold, the agent is punished with a substantial negative reward according to Formula (6). Otherwise, the reward is calculated by Formulas (8) to (13), depending on the action type.

The reward obtained when performing the action of sending a packet to the server is given by Formula (8). This reward decreases as $c_i(t)$, the device's remaining resource, increases: the lower $c_i(t)$, the larger the reward, so sending to the server is encouraged. Moreover, when $c_i(t)$ is large enough (more than $\theta \times C^*_i$), the reward turns negative, and transmitting to the server is discouraged. In other words, when the remaining resource is large enough, the device does not yet need to transmit the packet to the server and can thus save on 4G costs. The reward for transmitting a packet to an RSU is given by Formula (9). As with Formula (8), the smaller the remaining resource, the more strongly the action of transmitting to an RSU is encouraged. The reward for transmitting to an adjacent device is given by Formula (11). This reward is positive if $\Delta c(t)$ is greater than $0$, i.e., the neighbor's remaining resource is greater than that of the current device. Hence, transmitting to an adjacent device is recommended only if the neighboring device has more remaining resource than the current device.

Finally, the rewards for sending to the server, to an RSU, and to another device share a common term, $\frac{1}{1+\Delta_i(t)}$, which decreases as $\Delta_i(t)$ grows. As a result, this term encourages the agent to offload as early as feasible rather than hold the packet in local memory.

Algorithm 1 Action selection and Q table update

Input: $\alpha$: learning rate; $\gamma$: discount factor; $\epsilon$: a small number; current $Q$ table; $S_t$: current state.
Output: the next action; updated $Q$ table.

1: // Choose the next action
2: $r \leftarrow$ uniform random number between $0$ and $1$;
3: if $r < \epsilon$ then
4:     $A_t \leftarrow$ random action from the action space;
5: else
6:     $A_t \leftarrow \operatorname{argmax}_{A} Q(S_t, A)$;
7: end if
8: // Update Q table
9: $S_{t+1} \leftarrow$ state reached by performing $A_t$;
10: $\theta \leftarrow$ calculated by Algorithm 2;
11: $\mathcal{R}_t \leftarrow$ calculated by Formulas (6)-(13);
12: $Q(S_t, A_t) \leftarrow (1-\alpha)\, Q(S_t, A_t) + \alpha\, [\mathcal{R}_t + \gamma \max_{a} Q(S_{t+1}, a)]$;

Algorithm 2 Fuzzy logic-based $\theta$ determination

Input: $c_i(t)$: the remaining resource;
       $C_i^*$: the maximum computing capacity;
       $\Delta_i(t)$: the elapsed time;
       $\delta$: the data latency threshold.
Output: $\theta$.

function FuzzyLogic($c_i(t), C_i^*, \Delta_i(t), \delta$)
    $C \leftarrow c_i(t) / C_i^*$;
    $\Delta \leftarrow \Delta_i(t) / \delta$;
    // Fuzzification
    $\mu_{\mathrm{L}}(C) = \mathrm{Trapezoidal}(C, a_L, b_L, c_L, d_L)$;
    $\mu_{\mathrm{M}}(C) = \mathrm{Trapezoidal}(C, a_M, b_M, c_M, d_M)$;
    $\mu_{\mathrm{H}}(C) = \mathrm{Trapezoidal}(C, a_H, b_H, c_H, d_H)$;
    $\mu_{\mathrm{L}}(\Delta) = \mathrm{Trapezoidal}(\Delta, a_L, b_L, c_L, d_L)$;
    $\mu_{\mathrm{M}}(\Delta) = \mathrm{Trapezoidal}(\Delta, a_M, b_M, c_M, d_M)$;
    $\mu_{\mathrm{H}}(\Delta) = \mathrm{Trapezoidal}(\Delta, a_H, b_H, c_H, d_H)$;
    // Fuzzy controller
    $\mathrm{M} \leftarrow \text{null}$;
    for $A \in \{L, M, H\}$ do
        for $B \in \{L, M, H\}$ do
            $\mu = \min\{\mu_A(C), \mu_B(\Delta)\}$;
            $\mathrm{M}.\text{add}(\mu)$;
        end for
    end for
    // Defuzzification
    $l = \operatorname{argmax}_{\mu} \mathrm{M}$;
    $\mathrm{D} \leftarrow$ the output of rule $\mathfrak{R}_l$;
    $\theta \leftarrow$ the value of $\theta$ by CoG function;
    return $\theta$
end function

To choose the next action, we leverage the $\epsilon$-greedy policy. In particular, $\epsilon$ steadily declines over time as follows:

$$\epsilon \leftarrow \epsilon \times \max\left\{0, \frac{\text{maximum time} - \text{current time}}{\text{maximum time}}\right\}, \tag{14}$$

where the maximum time is a predefined threshold. The details of this policy are described in Algorithm 1. At each step, the agent selects a random action from the action space with probability $\epsilon$ (line 4 in Algorithm 1) and the action with the greatest Q value with probability $1-\epsilon$ (line 6 in Algorithm 1).
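The action selection, Q-value update, and $\epsilon$ decay described above can be sketched in a few lines. This is a minimal illustration rather than the authors' code: the function names and the list-per-state table layout are assumptions, and the reward is passed in as a number because its formulas are not reproduced here.

```python
import random


def choose_action(q_row, epsilon, rng=random):
    """Epsilon-greedy choice over one row of Q values."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))                  # explore: random action
    return max(range(len(q_row)), key=q_row.__getitem__)  # exploit: greedy action


def update_q(q, s, a, reward, s_next, alpha, gamma):
    """Tabular Q-learning update of the entry (s, a)."""
    target = reward + gamma * max(q[s_next])
    q[s][a] = (1.0 - alpha) * q[s][a] + alpha * target
    return q[s][a]


def decay_epsilon(epsilon, current_time, max_time):
    """Rescale epsilon by the remaining fraction of the time horizon, as in (14)."""
    return epsilon * max(0.0, (max_time - current_time) / max_time)
```

The `rng` argument is only a hook for injecting a seeded generator in experiments. Because (14) rescales the current value of $\epsilon$, calling `decay_epsilon` at every step compounds the decay; whether it is applied per step or per episode is not stated in the text.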

IV-C Fuzzy Logic-based priority factor determination

IV-C1 Motivation

Figure 4: Fuzzy input membership function with three linguistic variables: low (L), medium (M) and high (H).
Figure 5: Fuzzy output membership function.

We found that simultaneously optimizing two objectives, namely minimizing the total amount of data transmitted over the 4G communication channel and guaranteeing the data latency, is challenging. Specifically, devices tend to store data until they can send it to an RSU in order to reduce 4G usage. This strategy, however, may cause a large delay due to the waiting period. Therefore, rather than using a fixed value for $\theta$, we propose a mechanism for adjusting $\theta$ dynamically in response to the network status. We observe that the remaining resource and the time elapsed since the data was generated are two factors that influence a device's behavior. The device can tolerate more latency when its remaining resource is large or the elapsed time is small; in that case, the agent should avoid transmitting immediately to the server to save on communication expenses. Hence, $\theta$ should be small when the device's remaining resource is large and the data latency is low. In contrast, when the remaining resource is low or the data latency is large, the device should send data to the server rather than relaying it to the nearest device, so that the latency constraint is satisfied. Consequently, $\theta$ should be set high enough that the reward for sending to the server surpasses the reward for relaying to the nearest device. Furthermore, as long as the device is within an RSU's communication range, transmission to the RSU takes precedence, i.e., the action of sending to an RSU is designed to receive the highest reward of all the actions.

Motivated by the observations above, we design a Fuzzy-based algorithm to dynamically adjust the value of $\theta$. The pseudo-code of the algorithm is presented in Algorithm 2.

IV-C2 Fuzzification

The input of the fuzzification process is a pair consisting of the remaining resource and the elapsed time of the data (i.e., the time from when the data is measured until the current time). Each input is then mapped to three linguistic variables: low (L), medium (M) and high (H). The output of the Fuzzification is also mapped to three respective levels: low, medium and high. We use the trapezoidal Fuzzy set, which is described by the following formula:

$$\mathrm{Trapezoidal}(x,a,b,c,d)=\begin{cases}0 & \text{if } x \leq a\\ \frac{x-a}{b-a} & \text{if } a \leq x \leq b\\ 1 & \text{if } b \leq x \leq c\\ \frac{d-x}{d-c} & \text{if } c \leq x \leq d\\ 0 & \text{if } d \leq x,\end{cases} \tag{15}$$

where $x$ is the input, and $a, b, c, d$ are parameters. The values of $a, b, c, d$ are listed in Tables IV and V. Figs. 4 and 5 illustrate the shapes of the input and output membership functions, respectively.

TABLE IV: Input variables with their linguistic values and corresponding membership function

| Input variable | Linguistic value | Membership function |
|---|---|---|
| The remaining resource ($\%C^*$) | low (L) | [0, 0, 0.2, 0.3] |
| | medium (M) | [0.2, 0.3, 0.5, 0.6] |
| | high (H) | [0.5, 0.6, 1, 1] |
| The elapsed time ($\%\delta$) | low (L) | [0, 0, 0.4, 0.5] |
| | medium (M) | [0.4, 0.5, 0.7, 0.8] |
| | high (H) | [0.7, 0.8, 1, 1] |

TABLE V: Output variable with their linguistic values and corresponding membership function

| Output variable | Linguistic value | Membership function |
|---|---|---|
| $\theta$ | low (L) | [0, 0, 0.3, 0.4] |
| | medium (M) | [0.3, 0.4, 0.6, 0.7] |
| | high (H) | [0.6, 0.7, 1, 1] |
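As an illustration of the fuzzification step, the sketch below evaluates the trapezoidal membership function with the input parameters of Table IV. The names `trapezoidal`, `fuzzify`, `RESOURCE_SETS`, and `ELAPSED_SETS` are hypothetical. The plateau case is checked first so that sets with $a = b$ or $c = d$ (such as the low and high sets) never trigger a division by zero.

```python
def trapezoidal(x, a, b, c, d):
    """Membership degree of x in the trapezoidal set (a, b, c, d)."""
    if b <= x <= c:              # plateau, checked first so a == b or c == d is safe
        return 1.0
    if a < x < b:                # rising edge
        return (x - a) / (b - a)
    if c < x < d:                # falling edge
        return (d - x) / (d - c)
    return 0.0                   # outside the support


# Parameters copied from Table IV; inputs are normalised to [0, 1].
RESOURCE_SETS = {"L": (0, 0, 0.2, 0.3), "M": (0.2, 0.3, 0.5, 0.6), "H": (0.5, 0.6, 1, 1)}
ELAPSED_SETS = {"L": (0, 0, 0.4, 0.5), "M": (0.4, 0.5, 0.7, 0.8), "H": (0.7, 0.8, 1, 1)}


def fuzzify(x, sets):
    """Return the membership degree of x in each linguistic set."""
    return {label: trapezoidal(x, *params) for label, params in sets.items()}


degrees = fuzzify(0.25, RESOURCE_SETS)
```

For a normalised remaining resource of 0.25, the input lies on the falling edge of the low set and the rising edge of the medium set, so both degrees are about one half while the high degree is zero.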

IV-C3 Knowledge Base

We have two input variables, namely the remaining resource and the elapsed time, each of which is transformed into three Fuzzy sets. Therefore, we have $3^2 = 9$ Fuzzy rules in total. The rules are shown in Table VI. These rules are designed based on the observation described in Section IV-C1. Our rules have the form of “IF (the remaining resource is A) AND (the elapsed time is B) THEN ($\theta$ is C)”, where $A, B, C$ take the values low, medium, and high. To ease the presentation, we denote low, medium, and high as $L$, $M$, and $H$, respectively.

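TABLE VI: Fuzzy rules

| No. | Input: the remaining resource | Input: the elapsed time | Output: $\theta$ |
|---|---|---|---|
| 1 | L | L | M |
| 2 | L | M | H |
| 3 | L | H | H |
| 4 | M | L | L |
| 5 | M | M | M |
| 6 | M | H | H |
| 7 | H | L | L |
| 8 | H | M | L |
| 9 | H | H | M |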

IV-C4 Inference Engine

As our Fuzzy rules are based on the AND operator, the membership degree of the output of each rule is defined as follows.

$$\mu_i = \min\{\mu_A(\text{the remaining resource}),\ \mu_B(\text{the elapsed time})\}, \quad \forall i = 1, \cdots, 9. \tag{38}$$
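The min-based inference over the nine rules of Table VI can be sketched as below. This is an illustrative reading of the inference step, with hypothetical names (`RULES`, `strongest_rule`); ties between equally strong rules are broken in table order, which the text does not specify.

```python
def trapezoidal(x, a, b, c, d):
    """Trapezoidal membership degree, plateau checked first."""
    if b <= x <= c:
        return 1.0
    if a < x < b:
        return (x - a) / (b - a)
    if c < x < d:
        return (d - x) / (d - c)
    return 0.0


# Membership parameters from Table IV.
RESOURCE = {"L": (0, 0, 0.2, 0.3), "M": (0.2, 0.3, 0.5, 0.6), "H": (0.5, 0.6, 1, 1)}
ELAPSED = {"L": (0, 0, 0.4, 0.5), "M": (0.4, 0.5, 0.7, 0.8), "H": (0.7, 0.8, 1, 1)}

# Table VI: (remaining resource, elapsed time) -> linguistic value of theta.
RULES = {
    ("L", "L"): "M", ("L", "M"): "H", ("L", "H"): "H",
    ("M", "L"): "L", ("M", "M"): "M", ("M", "H"): "H",
    ("H", "L"): "L", ("H", "M"): "L", ("H", "H"): "M",
}


def strongest_rule(c_ratio, delta_ratio):
    """Return (firing strength, theta label) of the rule with the largest min-AND degree."""
    best_mu, best_label = -1.0, None
    for (res_label, time_label), theta_label in RULES.items():
        mu = min(
            trapezoidal(c_ratio, *RESOURCE[res_label]),
            trapezoidal(delta_ratio, *ELAPSED[time_label]),
        )
        if mu > best_mu:             # strict comparison: first rule wins ties
            best_mu, best_label = mu, theta_label
    return best_mu, best_label
```

For example, a device with little remaining resource (ratio 0.1) holding old data (ratio 0.9) fires rule 3 at full strength, which selects a high $\theta$.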

IV-C5 Defuzzification

After the steps above, the Fuzzy set with the highest membership degree is taken as the output variable. Finally, we use the CoG function in Formula (4) to calculate the crisp value of the output's Fuzzy set.
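The defuzzification step can be illustrated with a numerical centre-of-gravity computation over the output sets of Table V. Formula (4) is not reproduced in this section, so the sketch assumes the standard centroid $\int x\,\mu(x)\,dx / \int \mu(x)\,dx$ evaluated on the selected output set without clipping; the names `THETA_SETS` and `centre_of_gravity` are hypothetical.

```python
def trapezoidal(x, a, b, c, d):
    """Trapezoidal membership degree, plateau checked first."""
    if b <= x <= c:
        return 1.0
    if a < x < b:
        return (x - a) / (b - a)
    if c < x < d:
        return (d - x) / (d - c)
    return 0.0


# Output membership parameters from Table V.
THETA_SETS = {"L": (0, 0, 0.3, 0.4), "M": (0.3, 0.4, 0.6, 0.7), "H": (0.6, 0.7, 1, 1)}


def centre_of_gravity(params, lo=0.0, hi=1.0, steps=10_000):
    """Approximate the centroid of a trapezoidal set with the midpoint rule."""
    h = (hi - lo) / steps
    numerator = denominator = 0.0
    for k in range(steps):
        x = lo + (k + 0.5) * h       # midpoint of the k-th sub-interval
        m = trapezoidal(x, *params)
        numerator += x * m
        denominator += m
    return numerator / denominator


theta_medium = centre_of_gravity(THETA_SETS["M"])
```

Since the medium set is symmetric around 0.5, its centroid is 0.5; the low and high sets give values of roughly 0.18 and 0.82 under this assumption.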

IV-D Computational Complexity

We analyze the computational complexity of choosing the next action and updating the Q table as follows. Since the agent uses the $\epsilon$-greedy policy to choose the next action, it needs to check the Q values of all actions in the action space. The computational complexity of this operation is $O(|\mathcal{A}|)$, where $\mathcal{A}$ denotes the action space. To update the Q table, the agent first uses Fuzzy logic to determine the value of $\theta$ (Algorithm 2), which has a computational complexity of $O(\mathrm{L}_C \times \mathrm{L}_\Delta)$, where $\mathrm{L}_C$ and $\mathrm{L}_\Delta$ are the numbers of linguistic values of the remaining resource and the elapsed time, respectively. The agent then calculates the reward and uses Formula (1) to update the Q value. This operation has a computational complexity of $O(|\mathcal{A}| \times |\mathcal{S}|)$, where $\mathcal{S}$ denotes the state space. As a result, the total time for choosing an action and updating the Q table at each step is $O(|\mathcal{A}| + |\mathcal{A}| \times |\mathcal{S}| + \mathrm{L}_C \times \mathrm{L}_\Delta)$, which is equivalent to $O(|\mathcal{A}| \times |\mathcal{S}| + \mathrm{L}_C \times \mathrm{L}_\Delta)$.

V Evaluation

V-A Methodology

In this section, we evaluate the efficiency of our proposed algorithm in terms of optimizing the communication performance and cost.

Simulation model: We developed an in-house simulator in Python 3.8.8. The packet transmission process is implemented by extending the queue model proposed in [23]. Our code can be accessed via [26]. In the simulator, the devices iteratively generate homogeneous packets of the same size. We denote by $\lambda_d$ the packet generation interval. After being created, packets are queued in the device, waiting for transmission. If the queue is full, the packet is dropped. Data latency therefore comprises two parts: the waiting time in the queue and the transmission time from the device to the server. The transmission time is proportional to the packet size and inversely proportional to the bandwidth of the communication channel. At each time slot, the device picks a packet from the queue and transmits it in one of three ways:

  • Using 4G to communicate directly to the server.

  • Transferring data to an RSU through Wi-Fi and then transmitting data from the RSU to the server via the wired network.

  • Relaying to a nearby device via Wi-Fi and then following that device’s policy (transmitting directly to the server, transferring to RSU, or continuing to forward to another device) to send the packet to the server.

The offloading method is determined by the algorithms (our algorithm and the baselines). If the device chooses the second or third offloading mode but is not within the coverage of the RSU (or another device), it will hold the packet in the queue and wait for the next time slot (if the queue is not full), or it will transmit it immediately to the server using 4G (otherwise). We refer to the first case as offload-hit and the second case as offload-missed.
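The fallback behaviour described above can be sketched as a small decision function. This is a hypothetical helper, not taken from the simulator in [26]: the string labels, the argument names, and the assumption that the 4G channel is always reachable are all illustrative.

```python
def resolve_action(chosen, in_rsu_range, neighbour_in_range, queue_len, queue_capacity):
    """Map the mode chosen by the algorithm to what the device does in this slot."""
    reachable = {
        "server": True,               # assumption: 4G is always available
        "rsu": in_rsu_range,
        "device": neighbour_in_range,
    }
    if reachable[chosen]:
        return chosen                 # the chosen target is in coverage
    if queue_len < queue_capacity:
        return "hold"                 # keep the packet and retry in the next slot
    return "server"                   # queue is full: fall back to 4G immediately
```

For example, `resolve_action("rsu", False, False, 3, 25)` returns `"hold"`, whereas the same call with a full queue (`queue_len == queue_capacity`) returns `"server"`.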

Simulation environment: In this work, we use data collected over two days from bus routes in Seattle, Washington [27] to simulate the movement of vehicles. Each data point includes the time and position of a bus. We only use data from buses whose active time is no less than 90 minutes per day. We then generate the RSUs' positions on the map along each bus route. The RSUs are concentrated in the city center, 1-3 km apart, while in the suburbs they are sparser, 4-8 km apart.

Evaluation metrics: The target of this work is to deliver as many packets as possible to the server (i.e., maximizing the delivery ratio), while reducing the amount of data with a long latency (i.e., guaranteeing the information freshness) and lessening the 4G communication usage rate (i.e., minimizing the communication cost). Although our model takes the wired and Wi-Fi costs into account, the per-second communication costs of these two channels are negligible compared with that of 4G; we therefore focus only on the 4G communication ratio in this section. First, for the delivery ratio, we define the rate of dropped packets, called $r_{drop}$. This metric is computed by dividing the number of dropped packets by the number of packets transmitted. To analyze the second criterion, the packets' freshness, we define $\delta$-delayed packets as packets whose latency is higher than a threshold $\delta$. We also define the rate of $\delta$-delayed packets (denoted $r_{delay}$), which is the ratio of the number of $\delta$-delayed packets to the total number of packets sent. Finally, to evaluate the communication cost, we measure the proportion of packets sent using 4G out of the total number of packets. We call this proportion $r_{server}$ (footnote 1: we assume that the cost of sending a packet via the 4G communication channel is much higher than that of using Wi-Fi or a wired network, as in the example shown in Table I). In addition, we define $r_{rsu}$, which is the rate of packets relayed to the server via an RSU.
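The four rates can be computed from per-packet records as in the sketch below. The function name and the record layout are hypothetical, and the sketch assumes that "packets transmitted", "packets sent", and "the total number of packets" all refer to the same count, namely the packets that reached the server; the text does not state the denominators more precisely.

```python
def delivery_metrics(latencies, channels, n_dropped, delta):
    """Compute r_drop, r_delay, r_server and r_rsu.

    latencies[k] and channels[k] describe the k-th packet that reached
    the server; channels[k] is "4g", "rsu", or another label.
    """
    n_sent = len(latencies)          # assumed common denominator
    n_delayed = sum(1 for latency in latencies if latency > delta)
    return {
        "r_drop": n_dropped / n_sent,
        "r_delay": n_delayed / n_sent,
        "r_server": channels.count("4g") / n_sent,
        "r_rsu": channels.count("rsu") / n_sent,
    }


metrics = delivery_metrics(
    latencies=[2, 12, 5, 30],        # in time slots
    channels=["4g", "rsu", "rsu", "4g"],
    n_dropped=1,
    delta=10,
)
```

With these toy numbers, two of the four delivered packets exceed the threshold of 10 time slots, so `r_delay` is 0.5.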

TABLE VII: Simulation parameters

| Parameter | Value |
|---|---|
| Packet size | 1 Mb |
| RSU transmission range | 250 m |
| Sensor transmission range | 40 m |
| RSU-server link bandwidth (wired network) | 10 Gbps |
| Sensor-RSU link bandwidth (Wi-Fi network) | 1 Gbps |
| Sensor-server link bandwidth (4G communication) | 500 Mbps |
| The length of a time slot ($\mathcal{T}$) | 1 min |
| Packet generation interval at a sensor ($\lambda_d$) | 1 $\sim$ 5 $\mathcal{T}$ |
| Data latency threshold ($\delta$) | 5 $\sim$ 25 $\mathcal{T}$ |
| Sensor's computing capacity ($C^*$) | 25 Mb |
| Number of RSUs ($m$) | 384 |
| Number of vehicles ($n$) | 776 |
(a) $P_{keep} = 0.1$
(b) $P_{keep} = 0.3$
(c) $P_{keep} = 0.5$
Figure 6: Finding sub-optimal combination of $P_{keep}$, $P_{server}$, $P_{rsu}$ and $P_{sensor}$ for the FP baseline using grid search. $\delta$ is fixed to $10$ time slots. The bigger circle refers to the higher rate.

Reducing the energy consumption of crowdsensing devices (sensors) is well known in the literature as one of the most critical challenges. To understand how energy-efficient the strategies are, we measure the total energy consumption of the sensors under each offloading strategy. We assess only the energy consumed by communication, because in our simulation the strategies differ only in how the sensors communicate. Specifically, a sensor communicates over one of two channels, i.e., Wi-Fi or 4G. For a packet $j$ transmitted by sensor $i$ over a given channel, we estimate the energy consumption from the average power consumption of that channel ($PW_j$, stated in Table II) and the simulated transmission latency of the packet on that channel ($T_j$). That is, $E=\sum_{i=1}^{n}\sum_{j=1}^{N_i} PW_j \times T_j$, where $n$ is the number of sensors and $N_i$ is the number of packets processed by sensor $i$.
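As an illustration only, the energy estimate can be sketched in a few lines of Python. The power values below are hypothetical placeholders (the actual figures are those of Table II), and the names `ASSUMED_POWER_W` and `total_energy` are introduced here for the example rather than taken from the simulator.

```python
from typing import Dict, Iterable, Sequence, Tuple

# Hypothetical average power draw per channel, in watts. The paper's actual
# values come from its Table II and are not reproduced here.
ASSUMED_POWER_W: Dict[str, float] = {"wifi": 1.0, "4g": 2.5}

# One transmitted packet: (channel name, simulated transmission latency in seconds).
Packet = Tuple[str, float]


def total_energy(
    packets_per_sensor: Iterable[Sequence[Packet]],
    power_w: Dict[str, float] = ASSUMED_POWER_W,
) -> float:
    """Return E = sum_i sum_j PW_j * T_j over all sensors i and packets j."""
    energy = 0.0
    for packets in packets_per_sensor:      # outer sum over sensors i = 1..n
        for channel, latency_s in packets:  # inner sum over packets j = 1..N_i
            energy += power_w[channel] * latency_s
    return energy


# Two sensors: the first sends one packet per channel, the second uses 4G only.
example = [
    [("wifi", 0.5), ("4g", 1.0)],
    [("4g", 2.0)],
]
print(total_energy(example))  # joules, under the assumed power values
```

Because the power term depends only on the channel, a strategy that shifts packets from 4G to Wi-Fi lowers $E$ only if the Wi-Fi latency does not grow enough to offset the lower power draw.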

Comparison baselines: Because no existing work addresses the same problem as ours, we demonstrate the efficiency of our proposed method by comparing it with two baselines.

The first comparison baseline is a simple yet effective greedy opportunistic communication method defined below. When a new packet is generated, the device performs an action based on the following rule:

  • If the device is in the communication range of an RSU, it always sends the packet to the RSU. The RSU will send the packet to the cloud server after that.

  • Otherwise, if the device is in the communication range of another device that is nearer to an RSU than itself, it relays the packet to that device.

  • If the device is within the communication range of neither an RSU nor another device, it sends the packet directly to the cloud server.
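The three rules above amount to a short priority chain. The following Python sketch is one possible rendering; the `Action` enum and the two boolean inputs are names assumed for this example, and how a device detects an RSU or a closer neighbor is left outside the sketch.

```python
from enum import Enum


class Action(Enum):
    SEND_TO_RSU = "rsu"
    RELAY_TO_DEVICE = "device"
    SEND_TO_SERVER = "server"


def greedy_action(rsu_in_range: bool, closer_device_in_range: bool) -> Action:
    """Pick the action for a newly generated packet under the greedy baseline.

    rsu_in_range: the device is inside the communication range of an RSU.
    closer_device_in_range: a neighboring device that is nearer to an RSU
        than this device is within communication range.
    """
    if rsu_in_range:
        return Action.SEND_TO_RSU      # rule 1: an RSU always wins
    if closer_device_in_range:
        return Action.RELAY_TO_DEVICE  # rule 2: relay toward an RSU
    return Action.SEND_TO_SERVER       # rule 3: fall back to 4G
```

Note that this rule never keeps a packet in the local queue, which is why the greedy baseline cannot drop packets because of a full queue.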

The second baseline is a naive offloading strategy named FP. In FP, at each time slot, the action for a packet is chosen at random among four options, namely keeping the packet locally, sending it directly to the server, transferring it to an RSU, and relaying it to the nearest device, with fixed probabilities $P_{keep}$, $P_{server}$, $P_{rsu}$ and $P_{sensor}$, respectively. We first conduct a grid search over these four parameters to find the sub-optimal combination. We then compare the performance of our proposed solution with FP's sub-optimal configuration.
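A minimal sketch of the FP decision, assuming the four actions are drawn independently at every time slot, could look as follows. The function name `fp_action` and the action labels are illustrative and are not taken from the simulator.

```python
import random
from typing import Optional

# Action labels in the same order as P_keep, P_server, P_rsu, P_sensor.
ACTIONS = ("keep", "server", "rsu", "sensor")


def fp_action(
    p_keep: float,
    p_server: float,
    p_rsu: float,
    p_sensor: float,
    rng: Optional[random.Random] = None,
) -> str:
    """Draw one action for a packet according to the fixed probabilities."""
    weights = (p_keep, p_server, p_rsu, p_sensor)
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("the four probabilities must sum to 1")
    source = rng or random  # fall back to the module-level generator
    return source.choices(ACTIONS, weights=weights, k=1)[0]


# Example with the FP1 row of Table VIII (0.1, 0.7, 0.1, 0.1).
print(fp_action(0.1, 0.7, 0.1, 0.1, random.Random(42)))
```

Passing a seeded `random.Random` keeps a simulation run reproducible, which matters when comparing configurations in a grid search.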

In the following, we first compare the performance and cost of our proposed method with those of the baselines under particular settings of the packet generation interval $\lambda_d$ and the latency threshold $\delta$ in Section V-B. We then investigate the impacts of $\lambda_d$ and $\delta$ in Section V-C.

V-B Comparison of the proposed method and the baseline

In this comparison, we set the data latency threshold $\delta$ to 5 and 10 time slots and the packet generation interval $\lambda_d$ to 1 time slot. The remaining simulation parameters are derived from [23] and presented in Table VII.

V-B1 Sub-optimal configuration of FP

In this section, we perform a grid search on $P_{keep}$, $P_{rsu}$ and $P_{sensor}$ to identify the sub-optimal combination (where $P_{server}=1-P_{keep}-P_{rsu}-P_{sensor}$). Fig. 6 illustrates the rate of dropped packets ($r_{drop}$), the rate of $\delta$-delayed packets ($r_{delay}$) and the 4G communication ratio ($r_{server}$) when we vary these probabilities from 0.1 to 0.5. A bigger circle indicates a higher rate. From this result, we consider the configurations of FP that achieve a small packet drop rate, e.g., $r_{drop}\approx 0$. We then pick three configurations with the highest, medium and lowest 4G communication ratio $r_{server}$ as representatives of the FP baseline (named FP1, FP2, and FP3, respectively). The detailed settings of FP are summarized in Table VIII.
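The candidate configurations of such a grid search can be enumerated as sketched below. The grid values 0.1, 0.3 and 0.5 are an assumption inferred from the panels of Fig. 6 (the text only states the range 0.1 to 0.5), and the helper name `fp_grid` is hypothetical.

```python
from itertools import product
from typing import List, Sequence, Tuple

# (P_keep, P_server, P_rsu, P_sensor)
Config = Tuple[float, float, float, float]


def fp_grid(values: Sequence[float] = (0.1, 0.3, 0.5)) -> List[Config]:
    """Enumerate FP configurations; P_server is the remaining probability mass."""
    configs: List[Config] = []
    for p_keep, p_rsu, p_sensor in product(values, repeat=3):
        # Rounding removes floating-point noise such as 0.30000000000000004.
        p_server = round(1.0 - p_keep - p_rsu - p_sensor, 10)
        if p_server > 0:  # skip combinations that leave nothing for the server
            configs.append((p_keep, p_server, p_rsu, p_sensor))
    return configs


for config in fp_grid():
    print(config)
```

Each enumerated configuration would then be simulated to obtain $r_{drop}$, $r_{delay}$ and $r_{server}$; that step depends on the simulator and is not shown here.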

TABLE VIII: Configuration of the fixed-probability (FP) strategies
Strategy $P_{keep}$ $P_{server}$ $P_{rsu}$ $P_{sensor}$
FP1 0.1 0.7 0.1 0.1
FP2 0.1 0.5 0.3 0.1
FP3 0.1 0.3 0.5 0.1

V-B2 Delivery Ratio

Fig. 7 shows the breakdown of the simulated packets (in percentage) for the proposed method as well as the greedy and FP baseline strategies. For every rate shown, a lower value indicates better performance. First of all, with the FP strategy, devices tend to hold the data in their queues until they can send it to an RSU or the nearest device (the offload-hit mentioned in Section V-A). This strategy leads to a longer packet delay and a higher number of dropped packets (when the queue is full). As a result, the FP baseline strategies cannot avoid packet drops in all cases, e.g., $r_{drop}$ is 0.15% for FP3 when we set the latency threshold to 5 time slots. In contrast, the greedy strategy does not keep a packet in the queue of a sensor when there are no RSUs or devices around it; instead, it sends the data directly to the cloud server. Thus, it delivers all the packets to the server with a zero packet drop rate. Similar to the greedy strategy, our proposed method also achieves a zero packet drop rate and guarantees that all the generated packets reach the server. This is the result of flexibly constructing the priority factor $\theta$ using the Fuzzy logic approach, which considers both the number of packets in the local queue and the data latency of packets.

V-B3 Information Freshness and Communication Cost

The result in Fig. 7 also shows that our method yields fewer $\delta$-delayed packets than the FP3 strategy, e.g., 1.86% and 0.93%, which are $1.3\times$ and $2.94\times$ lower than those of FP3. In addition, although the $r_{delay}$ of the greedy, FP1, and FP2 strategies is lower than that of our proposed method, they incur a much higher 4G cost than ours. Specifically, in all the experiments, our method requires the lowest communication cost ($r_{server}$), i.e., only around 60% of packets travel through the 4G network, compared with around 70% for the greedy strategy. Furthermore, the 4G communication cost ($r_{server}$) of the FP strategies is much higher than the expected value of around $P_{server}$, due to the offload-missed issue mentioned in Section V-A. When a device decides to send a packet to the nearest RSU/device but there are no RSUs/devices around, or when the local queue is full, it sends the packet directly to the server via the 4G network. This offload-missed issue increases the 4G communication cost significantly. For example, in FP1, FP2 and FP3, more than 96%, 86% and 73% of packets use the 4G communication channel, respectively (although these strategies are designed with fixed probabilities of 70%, 50% and 30%).
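The offload-missed fallback described above can be sketched as a small resolution step applied after the FP draw. This is an interpretation rather than the simulator's code: the action labels are illustrative, and treating a full queue as a fallback only for the "keep" action is an assumption.

```python
def effective_action(
    intended: str,
    rsu_in_range: bool,
    device_in_range: bool,
    queue_full: bool,
) -> str:
    """Map the action drawn by FP to the action that is actually carried out.

    intended is one of "keep", "server", "rsu", "sensor" (illustrative labels).
    """
    if intended == "rsu" and not rsu_in_range:
        return "server"  # offload-miss: no RSU to hand the packet to
    if intended == "sensor" and not device_in_range:
        return "server"  # offload-miss: no neighboring device to relay through
    if intended == "keep" and queue_full:
        return "server"  # assumed: a full queue also forces the 4G channel
    return intended


# A packet meant for an RSU ends up on 4G when no RSU is reachable.
print(effective_action("rsu", rsu_in_range=False, device_in_range=True, queue_full=False))
```

Every fallback adds to the 4G share on top of the packets that were drawn as "server" in the first place, which is consistent with the observed $r_{server}$ exceeding $P_{server}$.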

V-B4 Energy Consumption

Figure 7: Relative breakdown of the simulation packets.

In Fig. 8, we present the total energy consumption of the offloading strategies using stacked bars. The line series shows the average energy consumption per packet that reaches the crowdsensing server. Interestingly, in the case of $\delta=5$, although the total energy consumption of our proposed strategy is higher than that of FP3, the average energy consumption per packet of our proposal is smaller than that of FP3. This is consistent with the result in Fig. 7(a), where the rate of packets delivered to the server ($1-r_{drop}$) of our proposed strategy is higher, while the rate of packets transmitted to the server via the 4G channel ($r_{server}$) is smaller than those of FP1 and FP2. On the other hand, when the numbers of packets delivered to the server by the four targeted strategies are similar, e.g., $\delta=10$, our proposal consumes the lowest energy because it uses the 4G channel least.

Figure 8: Energy consumption of the evaluated offloading strategies ($\lambda_d=1$).

V-B5 Summary

Figure 9: Impacts of the packet generation interval ($\lambda_d$): (a) 4G communication ratio, (b) rate of $\delta$-delayed packets, (c) RSU-to-server rate. $\delta$ is fixed to 5 time slots.

Figure 10: Impact of the data latency threshold ($\delta$): (a) 4G communication ratio, (b) rate of $\delta$-delayed packets, (c) RSU-to-server rate. $\lambda_d$ is fixed to 1 time slot.

In summary, the results imply that the performance and cost of the FP baseline are sensitive to the environment/network because of its coarse, fixed configuration. Thus, finding the best configuration for a specific environment requires manual effort when deploying this method in the real world. The greedy strategy focuses too much on information freshness without balancing it against the communication cost. By contrast, our proposed method learns the environment/network information to make decisions flexibly, so it avoids both the packet-dropped issue and the offload-missed issue while incurring a smaller communication cost.

V-C Discussion

In the following, we investigate the impacts of the packet generation interval, the data latency threshold, the packet size, and the contact rate on our proposed method.

V-C1 Impacts of the Packet Generation Interval $\lambda_d$

In this evaluation, we estimate the impact of the packet generation interval, i.e., how frequently the devices collect sensory data, on our proposed method. We set the data latency threshold to 5 time slots while changing the packet generation interval from 1 to 5 time slots. The result in Fig. 9 shows that the relative communication cost between our proposed method and the baselines does not change. The packet generation rate has only a trivial impact on the communication cost of all the strategies. Our proposed method always uses the least 4G communication with a small rate of delayed packets. Although the greedy and FP1 strategies achieve smaller delayed rates than ours, they sacrifice the communication cost. Specifically, FP1 sends 96% of packets via the 4G network, which is $1.5\times$ higher than that of our proposed method. When the packet generation interval is long enough, instead of being dropped, a packet is stored in a queue and, thus, the rate of delayed packets increases. In general, this trend also appears when the packet size is too large relative to the capacity of the local queue (so that fewer packets can be stored before the queue becomes full). Interestingly, the packet generation interval does not affect the packet delayed rate of the FP1, FP2 and greedy strategies, where the probability of keeping a packet in the edge devices is not too high, i.e., $P_{rsu}+P_{sensor}=0.2$ as in the FP1 strategy. In contrast, because FP3 and our proposed method try to avoid using the 4G communication channel as much as possible, the packet generation interval has a much higher impact on those strategies. For example, to maintain a 4G communication rate of $\approx 63\%$ as the packet generation interval changes, our proposed method cannot avoid delayed packets, i.e., around 2-4%.

V-C2 Impacts of the Data Latency Threshold $\delta$

In this experiment, we explore the relationship between our proposed method's performance and the data latency threshold, i.e., the maximum latency required by an application. We fix the packet generation interval $\lambda_d=1$ while changing the latency threshold $\delta$ from 5 to 25 time slots; Fig. 10 shows the results. When the latency threshold increases, the $\delta$-delayed packet rates of our proposed method and of the baseline strategies decrease. This is expected because a packet can stay longer in the local devices. Furthermore, recall the design of the reward function in our Q-learning method (Formula (6)). When the data latency exceeds the threshold, a negative reward is assigned, which makes it more likely that a packet is sent directly to the server over the 4G communication channel. As a result, when the latency threshold increases, the number of packets that meet this condition and use the 4G communication channel becomes smaller. For example, the 4G communication ratio of our proposed method changes from 64.8% to 58.7% when the latency threshold changes from 5 to 25 time slots. This trend does not appear in the greedy and FP strategies. These strategies maintain the 4G communication ratio at around 70%, 96%, 88%, and 77% for all configurations of the data latency threshold.

V-C3 Impacts of packet size

In this section, we investigate the performance of our proposed method when varying the amount of forwarded data, i.e., the packet size, from 1 to 5 MB. It is worth noting that the packet size affects both the cost of 4G and the packet delayed rate, because the maximum number of packets that can be stored in the queue decreases as the packet size increases. In this evaluation, we fix the queue size of a sensor to 25 MB, i.e., the ratio of packet size to queue size varies from $1:25$ to $1:5$. First of all, we observe that no packets are dropped, i.e., $r_{drop}=0$, in all the cases. Second, the result in Fig. 11 shows that the performance of the greedy and FP baselines hardly changes when the packet size is varied. Only our proposed method shows a decrease in the ratio of $\delta$-delayed packets as the packet size becomes smaller, because our method dynamically selects the transmission route based on the device's remaining resources (Formula (8)).
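The relation between packet size and queue occupancy can be made concrete with a short calculation. The sketch below assumes whole packets only and integer sizes in MB; the function name is illustrative.

```python
def max_packets_in_queue(queue_mb: int, packet_mb: int) -> int:
    """Largest number of whole packets that fit in a queue of the given size."""
    if packet_mb <= 0:
        raise ValueError("packet size must be positive")
    return queue_mb // packet_mb


# Queue fixed to 25 MB, packet size swept from 1 to 5 MB as in this experiment.
capacity = {size: max_packets_in_queue(25, size) for size in range(1, 6)}
print(capacity)  # {1: 25, 2: 12, 3: 8, 4: 6, 5: 5}
```

The queue therefore holds five times fewer packets at the largest packet size than at the smallest, which leaves less room for waiting for an RSU or a neighboring device.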

Figure 11: Impact of the packet size: (a) 4G communication ratio, (b) rate of $\delta$-delayed packets, (c) RSU-to-server rate. $\delta$ is fixed to 10 time slots.

V-C4 Impacts of contact rate

TABLE IX: Network contact rate for the experiment in Fig. 12
Sensor transmission range (m) Network contact rate
120 0.889
100 0.868
80 0.838
60 0.789
40 0.714
20 0.536
10 0.377
Figure 12: Impact of the contact rate: (a) 4G communication ratio, (b) rate of $\delta$-delayed packets, (c) RSU-to-server rate. $\delta$ is fixed to 10 time slots.

In this section, we investigate the impacts of the contact rate on the performance of the algorithms. We begin by defining the contact rate. We refer to each device's contact time as the total time during which it is inside the communication area of other devices. The contact rate of a device is then calculated by dividing its contact time by its entire time on the road. Furthermore, the network's average contact rate is defined as the average contact rate across all devices. We vary the contact rate by changing the transmission range of the devices from 10 to 120 meters, as shown in Table IX. The larger the sensor transmission range, the higher the network contact rate. We then observe the change in the 4G communication ratio, the delayed packet ratio, and the packet drop rate. The results are presented in Fig. 12. We observe that no packets are dropped in most cases; the exception is FP3, with a drop rate of around 0.01%. We also confirm that there is no clear correlation between the 4G communication ratio of the FP strategies and the transmission range of devices. However, when the transmission range of devices increases, the 4G communication ratio of the greedy strategy decreases significantly, e.g., from 71% to 63%. In all the configurations, our proposed method always achieves the lowest 4G communication cost. This result implies the stability of the proposed method.
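The contact-rate definition can be written down directly. The sketch below assumes that each device's contact time and total time on the road are already known (for example, extracted from a mobility trace); the function names and the sample numbers are illustrative.

```python
from typing import Sequence, Tuple


def contact_rate(contact_time: float, road_time: float) -> float:
    """Fraction of a device's time on the road spent in range of other devices."""
    if road_time <= 0:
        raise ValueError("time on the road must be positive")
    return contact_time / road_time


def network_contact_rate(devices: Sequence[Tuple[float, float]]) -> float:
    """Average contact rate over (contact_time, road_time) pairs, one per device."""
    if not devices:
        raise ValueError("at least one device is required")
    rates = [contact_rate(contact, road) for contact, road in devices]
    return sum(rates) / len(rates)


# Two devices on the road for 60 time slots, in contact for 30 and 45 of them.
print(network_contact_rate([(30, 60), (45, 60)]))  # 0.625
```

The network value is an unweighted mean of the per-device rates, so a device that is on the road only briefly counts as much as one that is on the road for the whole simulation.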

VI Conclusion

This study focused on real-time vehicular mobile crowdsensing systems that rely on devices mounted on vehicles. The devices continuously collect relevant data and transmit it to the server through 4G or Wi-Fi communication channels. We proposed an opportunistic communication algorithm that minimizes the 4G communication cost while keeping the data latency under a predefined threshold. We leveraged Q-learning to make the offloading decision and used Fuzzy logic to optimize its reward function. The experiment results indicated that the proposed method could reduce the 4G communication cost by 30-40% while guaranteeing that 99% of packets had a latency below the required threshold.

Acknowledgement

This work was funded by Vingroup Joint Stock Company (Vingroup JSC), Vingroup, and supported by Vingroup Innovation Foundation (VINIF) under project code VINIF.2021.DA00128, partially funded by Ministry of Education and Training of Vietnam under grant number B2020-BKA-13, and partially funded by Hanoi University of Science and Technology under grant number T2021-PC-019.

References

  • [1] Jinwei Liu, Haiying Shen, Husnu S. Narman, Wingyan Chung, and Zongfang Lin. A survey of mobile crowdsensing techniques: A critical component for the internet of things. ACM Trans. Cyber-Phys. Syst., 2(3), June 2018.
  • [2] Jim Cherian, Jun Luo, Hongliang Guo, Shen-Shyang Ho, and Richard Wisbrun. Parkgauge: Gauging the occupancy of parking garages with crowdsensed parking characteristics. In 2016 17th IEEE International Conference on Mobile Data Management (MDM), volume 1, pages 92–101, 2016.
  • [3] Jun Qin, Hongzi Zhu, Yanmin Zhu, Li Lu, Guangtao Xue, and Minglu Li. Post: Exploiting dynamic sociality for mobile advertising in vehicular networks. IEEE Transactions on Parallel and Distributed Systems, 27(6):1770–1782, 2016.
  • [4] Suk-Bok Lee, Joon-Sang Park, Mario Gerla, and Songwu Lu. Secure incentives for commercial ad dissemination in vehicular networks. IEEE Transactions on Vehicular Technology, 61(6):2715–2728, 2012.
  • [5] Andrea Capponi, Claudio Fiandrino, Burak Kantarci, Luca Foschini, Dzmitry Kliazovich, and Pascal Bouvry. A survey on mobile crowdsensing systems: Challenges, solutions, and opportunities. IEEE Communications Surveys Tutorials, 21(3):2419–2465, 2019.
  • [6] Hanane Lamaazi, Rabeb Mizouni, Shakti Singh, and Hadi Otrok. A mobile edge-based crowdsensing framework for heterogeneous iot. IEEE Access, 8:207524–207536, 2020.
  • [7] Luning Liu, Luhan Wang, and Xiangming Wen. Joint network selection and traffic allocation in multi-access edge computing-based vehicular crowdsensing. In IEEE INFOCOM 2020 - IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS), pages 1184–1189, 2020.
  • [8] Pan Zhou, Wenbo Chen, Shouling Ji, Hao Jiang, Li Yu, and Dapeng Wu. Privacy-preserving online task allocation in edge-computing-enabled massive crowdsensing. IEEE Internet of Things Journal, 6(5):7773–7787, 2019.
  • [9] Xingyou Xia, Yan Zhou, Jie Li, and Ruiyun Yu. Quality-Aware Sparse Data Collection in MEC-Enhanced Mobile Crowdsensing Systems. IEEE Transactions on Computational Social Systems, 6(5):1051–1062, 2019.
  • [10] Yinuo Zhao and Chi Harold Liu. Social-Aware Incentive Mechanism for Vehicular Crowdsensing by Deep Reinforcement Learning. IEEE Transactions on Intelligent Transportation Systems, 22(4):2314–2325, 2021.
  • [11] Dimitri Belli, Stefano Chessa, Luca Foschini, and Michele Girolami. A probabilistic model for the deployment of human-enabled edge computing in massive sensing scenarios. IEEE Internet of Things Journal, 7(3):2421–2431, 2020.
  • [12] Yanli Qi, Yiqing Zhou, Zhengang Pan, Ling Liu, and Jinglin Shi. Crowd-sensing assisted vehicular distributed computing for hd map update. In ICC 2021 - IEEE International Conference on Communications, pages 1–6, 2021.
  • [13] Chenghao Xu and Wei Song. Efficient data uploading for mobile crowdsensing via team collaborating and matching. IEEE Transactions on Green Communications and Networking, pages 1–1, 2021.
  • [14] Piergiorgio Vitello, Andrea Capponi, Claudio Fiandrino, Paolo Giaccone, Dzmitry Kliazovich, Ulrich Sorger, and Pascal Bouvry. Collaborative data delivery for smart city-oriented mobile crowdsensing systems. In 2018 IEEE Global Communications Conference (GLOBECOM), pages 1–6, 2018.
  • [15] Wei Gong, Xiaoyao Huang, Guanglun Huang, Baoxian Zhang, and Cheng Li. Data offloading for mobile crowdsensing in opportunistic social networks. In 2019 IEEE Global Communications Conference (GLOBECOM), pages 1–6, 2019.
  • [16] Ke Zhang, Yuming Mao, Supeng Leng, Quanxin Zhao, Longjiang Li, Xin Peng, Li Pan, Sabita Maharjan, and Yan Zhang. Energy-efficient offloading for mobile edge computing in 5g heterogeneous networks. IEEE Access, 4:5896–5907, 2016.
  • [17] ** Cui, Chunyan Cao, Qianbin Chen, and Kyung Sup Kwak. Minimum-cost offloading for collaborative task execution of mec-assisted platooning. Sensors, 19(4), 2019.
  • [18] Taiping Cui, Yuyu Hu, Bin Shen, and Qianbin Chen. Task offloading based on lyapunov optimization for mec-assisted vehicular platooning networks. Sensors, 19(22), 2019.
  • [19] Hansong Wang, Xi Li, Hong Ji, and Heli Zhang. Federated offloading scheme to minimize latency in mec-enabled vehicular networks. In 2018 IEEE Globecom Workshops (GC Wkshps), pages 1–6, 2018.
  • [20] Jing Zhang, Weiwei Xia, Feng Yan, and Lianfeng Shen. Joint computation offloading and urllc resource allocation for collaborative mec assisted cellular-v2x networks. IEEE Access, 8:24914–24926, 2020.
  • [21] Junhui Zhao, Qiuping Li, Yi Gong, and Ke Zhang. Computation offloading and resource allocation for cloud assisted mobile edge computing in vehicular networks. IEEE Transactions on Vehicular Technology, 68(8):7944–7956, 2019.
  • [22] Ying-Dar Lin, Yuan-Cheng Lai, Jian-Xun Huang, and Hsu-Tung Chien. Three-tier capacity and traffic allocation for core, edges, and devices for mobile edge computing. IEEE Transactions on Network and Service Management, 15(3):923–933, 2018.
  • [23] Phi Le Nguyen, Ren-Hung Hwang, Pham Minh Khiem, Kien Nguyen, and Ying-Dar Lin. Modeling and minimizing latency in three-tier v2x networks. In 2020 IEEE Global Communications Conference, pages 1–6, 2020.
  • [24] Trung Thanh Nguyen, Truong Thao Nguyen, Tuan Anh Nguyen Dinh, Thanh-Hung Nguyen, and Phi Le Nguyen. Q-learning-based opportunistic communication for real-time mobile air quality monitoring systems. In 2021 IEEE International Performance, Computing, and Communications Conference (IPCCC), pages 1–7, 2021.
  • [25] Lotfi A Zadeh. Fuzzy logic. Computer, 21(4):83–93, 1988.
  • [26] Opportunistic communication simulator. https://github.com/AIoT-Lab-BKAI/Fuzzy-Q-learning-based-Opportunistic-Communication, 04 2022.
  • [27] Jetcheva, Hu, PalChaudhuri, Saha, and Johnson. Design and evaluation of a metropolitan area multitier wireless ad hoc network architecture. 2003 Proceedings Fifth IEEE Workshop on Mobile Computing Systems and Applications, pages 32–43, 2003.
Trung Thanh Nguyen is a final-year student at the School of Information and Communication Technology at Hanoi University of Science and Technology. He is also a research assistant at the Intelligent Communication Networks laboratory, an International research center for Artificial Intelligence (BK.AI). His research is related to optimization, reinforcement learning, and IoT networks.
Dr. Truong Thao Nguyen received the BE and ME degrees from Hanoi University of Science and Technology, Hanoi, Vietnam, in 2011 and 2014, respectively. He received the Ph.D. in Informatics from the Graduate University for Advanced Studies, Japan in 2018. He is currently working at Digital Architecture Research Center, at National Institute of Advanced Industrial Science and Technology (AIST), where he focuses on the topics of High Performance Computing system, Distributed Deep Learning and beyond.
Thanh-Hung Nguyen holds a Ph.D. degree in computer science from the University of Grenoble. Currently, he is an assistant professor at School of Information and Communication, Hanoi University of Science and Technology, Vietnam. His research interests are in the Modeling and verification of component-based systems, Network architecture, and Artificial Intelligence.
Phi Le Nguyen received her B.E. and M.S. degrees from the University of Tokyo in 2007 and 2010, respectively. She received her Ph.D. in Informatics from The Graduate University for Advanced Studies, SOKENDAI, Tokyo, Japan, in 2019. Currently, she is a lecturer at the School of Information and Communication, Hanoi University of Science and Technology (HUST), Vietnam. In addition, she is serving as the managing director at the International research center for Artificial Intelligence (BKAI) and the coordinator of the HEDSPI program, HUST. Her research interests include Network architecture, Optimization, and Artificial Intelligence.