### Introduction

Organizations in many countries are investing in vehicular networks, leveraging wireless networking support to improve the state of the art in road transportation. The US Federal Communications Commission (FCC) has allocated 75 MHz of spectrum in the 5.9 GHz band for Dedicated Short Range Communications, a set of protocols and standards for short- to medium-range wireless communication for automotive use. Recent vehicular networking efforts include the USDOT’s Vehicle Infrastructure Integration (VII), a cooperative initiative between USDOT and automobile manufacturers that focuses on the feasibility of deploying communications systems for the safety and efficiency of road transportation systems, and ERTICO, a multi-sector partnership pursuing the development and deployment of Intelligent Transport Systems across Europe. Apart from these efforts, a variety of VANET test-beds have also been set up in academia for basic research and development of services.

This paper addresses a critical and emerging security problem in vehicular networks, namely detecting the presence of Sybil attacks. A Sybil attack is an attack on the trust of a peer-to-peer system in which an attacker assumes many pseudonymous identities. Using these identities, the attacker can gain a disproportionately large influence on system functionality. In vehicular networks, the presence of a Sybil attack can have negative consequences. For instance, in an application like road safety, consider a single malicious vehicle, *V*_{M}, assuming a large number of fake identities that incorrectly report road conditions. Other benign vehicles will tend to believe such a message, since it appears to be coming from multiple vehicles, and may adjust their routes. In such a case, *V*_{M} can potentially obtain exclusive access to the road, which it otherwise could not. A number of other applications like content exchange, intelligent traffic signalling, and ramp metering can all be compromised in the presence of Sybil attacks. Unlike in static networks such as the Internet, vehicle mobility adds spatio-temporal constraints that make Sybil detection very difficult.

#### Related work

The problem of detecting Sybil attacks in VANETs has been studied previously. In [1] and [2], the proposed solution detects Sybil attacks when vehicles may hold only one valid pseudonym at a time. When a pseudonym needs to be refreshed, a new pseudonym is obtained from a trusted Road-Side Unit (RSU). The consequence of this approach is a possibly complex pseudonym allocation mechanism implemented by the RSU network. Another technique leverages directional antennas to identify the location/direction of message arrival [3]. A vehicle launching a Sybil attack will likely be detected, as many messages will arrive from the same location/direction. However, in dense networks, localization errors can lead to frequent false positives. The scheme may also be compromised, as a smart attacker may use directional antennas to mislead its neighbors about its direction.

In [4], heavy-weight cryptographic techniques are leveraged for detecting Sybil attacks in VANETs. Specifically, each vehicle is given a list of pseudonyms to protect its privacy during communication. However, the pseudonyms of each vehicle are designed such that they all hash to a common value. By calculating the hashed values at Road-Side Units, a central server can determine whether or not certain pseudonyms came from the same pool. A Sybil attack is detected if many pseudonyms from the same pool are observed in a short interval of time. Unfortunately, the computational complexity of the cryptographic protocols in this technique is quite high.

In [5], GPS and RSSI signal measurements are used for detecting Sybil nodes. The proposed scheme uses Vehicle-to-Vehicle (V2V) communications to confirm reported positions of vehicles by referencing the RSSI measurements. To correct inaccuracies in RSSI measurement caused by vehicle mobility, the scheme uses traffic patterns and support from roadside base stations. Specifically, statistical algorithms are implemented to verify the signal strength distribution of a suspect vehicle over time to significantly reduce the detection rate. In [6], analysis is performed to quantify the performance of Sybil detection under assumptions about transmission range, antenna model, signal strength, etc. Unfortunately, the unreliability of RSSI measurements limits the practical reliability of these techniques [7]. In [8], the inability of multiple vehicles to exhibit close temporal and spatial correlations at multiple locations is exploited for Sybil defense. The idea is to have RSUs sign location and timestamp information for vehicles as they move. Upon detecting groups of vehicles having many similar locations with similar timestamps, a Sybil attack is detected. The overhead of this scheme is quite high, though, especially in urban networks, because significant cryptographic overhead is incurred as RSUs have to sign each received message.

We would also like to point out two other areas of work that are closely related to our problem of Sybil attack detection in vehicular networks. The first area is secure localization in wireless networks like sensor and mobile ad hoc networks [9, 10], wherein locations of nodes are determined in a secure manner. Our problem is similar, but orthogonal in the sense that we are attempting to verify the integrity of relative location updates as vehicles move in the network. The other area we wish to highlight is the detection of Sybil attacks and nodes in static networks like sensor, Internet-scale, and social networks [11–14]. While the goals of these works are related to ours, vehicle mobility and the unique mobility patterns of these nodes necessitate fundamentally new approaches for Sybil detection, which we attempt in this paper.

#### Contributions

Presented in this paper is an innovative protocol for Sybil detection in vehicular networks. Vehicular networks today are examples of cyber-physical systems, where there is a clear integration of cyber and physical components. The premise of this paper starts with two simple questions: a) Can the natural physics of the underlying transportation domain be integrated with the cyber domain in detecting Sybil attacks, and b) If so, can such an integration generate high quality solutions to detect Sybil attacks, while alleviating complexities (in the form of complex cryptography and additional hardware requirements) in the cyber domain? This paper yields a positive response to both questions.

The technique employs a certain number of road side units (RSUs) that periodically collect reports from communicating vehicles regarding their neighborhood. In the event of a vehicle performing a Sybil attack, the geographic proximity of the Sybil identities will be long-term and repeating, while the geographic proximity of benign vehicles will be short-term. To put it in terms of transportation engineering, Sybil identities will appear to “*platoon*” together, while identities of benign vehicles will eventually “*disperse*”. The dispersion of vehicles on roads occurs due to a combination of road conditions, vehicle dynamics and human factors. This phenomenon has been studied extensively by transportation engineers over the last five decades under the name “*platoon dispersion*” [15–18]. Integrating platoon dispersion models provides an alternative method for Sybil attack detection. To detect attacks, RSUs compare models of naturally occurring dispersion among benign vehicles with anomalously occurring platoons among Sybil nodes. Using a combination of theoretical analysis and simulations, the simplicity, efficiency, practicality and quality of the protocol for Sybil detection in vehicular networks are demonstrated. To the best of the authors’ knowledge, this paper is unique in proposing an inter-disciplinary approach for addressing cyber space attacks in emerging vehicular networks.

#### Paper organization

The rest of the paper is organized as follows. Section ‘Platoon dispersion and its application to Sybil detection’ presents a brief overview of platoon dispersion theory in transportation engineering, and its application for Sybil detection in vehicular networks. Section ‘Research design and methodology’ presents the formal attack model, problem statement, overall framework, and protocol for Sybil detection. Section ‘Performance evaluations’ will demonstrate the performance of the protocol, and the paper concludes in Section ‘Conclusions’.

### Platoon dispersion and its application to Sybil detection

Provided first is a brief overview of how models of dispersion among vehicles that naturally occur in roads have been studied by transportation engineers. Afterwards, a simplified example of how to use platoon dispersion theory for Sybil detection is presented. The discussions will help guide the proposed Sybil detection protocol discussed in the next section.

#### Platoon dispersion theory in transportation engineering

A platoon is a group of vehicles traveling in close proximity for some amount of time, as shown in Figure 1. Ideally, consistent vehicle platooning is preferable because it improves critical transportation parameters like signal optimization, congestion avoidance, road safety, and capacity [19–23]^{a}. Under normal traffic, vehicle platooning is short-term. Clearly, if all vehicles in an existing platoon travel at the same constant speed, the platoon will never disperse. However, factors such as road friction, vehicle characteristics, signalling, human factors, lane changes, and fatigue [24] cause platoons to disperse over time. The longer the travel time between points, the greater the dispersion, due to the difficulty of maintaining constant speed over longer time scales. This phenomenon is called *platoon dispersion*, a simple illustration of which is shown in Figure 2.

Platoon dispersion has been well studied in transportation engineering [15–17, 25–31], via two mathematical models. One is the (more popular) Robertson’s geometric distribution model [16] and the other is Pacey’s normal distribution model [15]. Both models assume that road segment travel times follow some probability distribution. The Robertson platoon dispersion model follows a shifted geometric series, and has been implemented in traffic-simulation software like SCOOT [32], SATURN [33] and TRAFLO [34]. The basic Robertson recursive platoon dispersion model takes the following form:

$$
q'_{t} = R \cdot q_{t - T_{\text{min}}} + (1 - R) \cdot q'_{t - \delta t} \tag{1}
$$

$$
R = \frac{1}{1 + \alpha \beta T_{\text{mean}}}, \qquad \text{where } 0 \le R \le 1 \tag{2}
$$

A numerical procedure was developed for the Robertson model in [25] by rewriting Equation 1 as,

$$
q'_{t} = \sum_{i = T_{\text{min}}}^{\infty} R \cdot (1 - R)^{i - T_{\text{min}}} \cdot q_{t - i} \tag{3}
$$

where,

$q'_{t-T}$: arrival flow at the downstream location at time t-T (veh/hr);

*q*_{t}: departure flow at the upstream location at time t (veh/hr);

*δ* *t*: time step duration;

*T*_{min}: minimum travel time on the roadway;

*T*_{mean}: mean roadway travel time, measured in units of time steps;

$\alpha = \frac{1 - \beta}{\beta}$: dimensionless platoon dispersion factor depending on the level of friction along the roadway;

$\beta = \frac{2 T_{\text{mean}} + 1 - \sqrt{1 + 4 \sigma^{2}}}{2 T_{\text{mean}}}$: dimensionless travel-time factor;

*R*: smoothing factor governing dispersion, where 0≤*R*≤1;

*σ*: the standard deviation of link travel time, assuming individual vehicle speeds follow a normal distribution and are unchanged.

As can be seen from Equations 1, 2, 3 and the definitions of the parameters, all we need to know are the speed deviations *σ* among vehicles, and the mean travel time *T*_{mean} between the upstream and downstream locations. If both can be determined (which is quite straightforward), one can compute the platoon dispersion factors *α* and *β*. These parameters can subsequently be used to compute the smoothing factor *R*, from which the degree to which an upstream platoon will disperse at the downstream location can be computed.
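The parameter chain described above (from *σ* and the mean travel time to *β*, then *α*, then *R*) can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `robertson_parameters`, its argument names, and the input checks are assumptions introduced here, and both inputs are assumed to be expressed in time steps.

```python
import math


def robertson_parameters(t_mean: float, sigma: float) -> tuple[float, float, float]:
    """Return (alpha, beta, R) following Equation 2 and the parameter definitions.

    t_mean: mean roadway travel time, in time steps (assumed > 0).
    sigma:  standard deviation of link travel time, in time steps (assumed >= 0).
    """
    if t_mean <= 0 or sigma < 0:
        raise ValueError("t_mean must be positive and sigma non-negative")

    # beta = (2*T_mean + 1 - sqrt(1 + 4*sigma^2)) / (2*T_mean)
    beta = (2 * t_mean + 1 - math.sqrt(1 + 4 * sigma ** 2)) / (2 * t_mean)
    if beta <= 0:
        # Assumption of this sketch: reject inputs that make alpha undefined.
        raise ValueError("sigma is too large relative to t_mean")

    # alpha = (1 - beta) / beta
    alpha = (1 - beta) / beta

    # R = 1 / (1 + alpha * beta * T_mean)
    r = 1 / (1 + alpha * beta * t_mean)
    return alpha, beta, r


# With no travel-time deviation, beta = 1, alpha = 0 and R = 1 (no dispersion).
no_dispersion = robertson_parameters(t_mean=4.0, sigma=0.0)

# A hypothetical case with some deviation gives 0 < R < 1.
some_dispersion = robertson_parameters(t_mean=4.0, sigma=1.0)
```

Note that when *σ* = 0 the sketch yields *R* = 1, which matches the observation that a platoon of vehicles all traveling at the same constant speed never disperses.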

Figure 3 shows an illustration of *upstream* platooning and its *downstream* dispersion, wherein the shaded portion represents *similar* vehicle speeds that tend to platoon together, while the non-shaded portion represents *varying* speeds of vehicles that *disperse* from the original platoon. A numerical example of dispersion based on the Robertson model [16–18] is shown in Figure 4. Each observation (i.e., downstream) point is one mile apart, and the minimum travel time between each point is one minute. For small speed deviations, the dispersion in the expected number of vehicles reaching the observation point is smaller than when the speed deviation is larger. As a result, platoon sizes decrease progressively as vehicles travel from one observation point to another.
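To make the progressive decrease concrete, the recursion of Equation 1 can be applied repeatedly, once per observation point. The following is a minimal sketch, assuming a time step of one unit (*δt* = 1) and zero flow before the first step; the function name `disperse`, the pulse-shaped input, and the value *R* = 0.5 are hypothetical choices for illustration and are not the data behind Figure 4.

```python
def disperse(upstream: list[float], r: float, t_min: int) -> list[float]:
    """Apply Equation 1 with a unit time step:

        q'[t] = R * q[t - t_min] + (1 - R) * q'[t - 1]

    Flows before the start of the series are assumed to be zero.
    """
    if not 0.0 <= r <= 1.0:
        raise ValueError("R must lie in [0, 1]")
    if t_min < 0:
        raise ValueError("t_min must be non-negative")

    downstream: list[float] = []
    for t in range(len(upstream)):
        released = upstream[t - t_min] if t >= t_min else 0.0
        previous = downstream[t - 1] if t > 0 else 0.0
        downstream.append(r * released + (1.0 - r) * previous)
    return downstream


# Hypothetical upstream profile: a tight platoon released in a single time step.
pulse = [100.0] + [0.0] * 19

# Flow profile at two consecutive observation points, one time step apart.
first_point = disperse(pulse, r=0.5, t_min=1)
second_point = disperse(first_point, r=0.5, t_min=1)

print(first_point[:4])                       # [0.0, 50.0, 25.0, 12.5]
print(max(first_point), max(second_point))   # 50.0 25.0
```

In this sketch the peak flow halves from the first observation point to the second, while the total flow is spread over more time steps, which is the qualitative behavior described above. Setting `r=1.0` reduces the recursion to a pure time shift, i.e. no dispersion.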

#### An illustrative example of Sybil detection using platoon dispersion

Consider a case where there are 50 vehicles in an upstream platoon. Let each vehicle have a unique identity given by {*V*_{1},*V*_{2},…,*V*_{50}}. Vehicle *V*_{50} is malicious and possesses 50 fake identities $\{\bar{V}_{1}, \bar{V}_{2}, \dots, \bar{V}_{50}\}$. When all vehicles communicate with each other (including *V*_{50} using all of its fake identities to launch a Sybil attack), the upstream platoon will appear to have 100 vehicles. With prior knowledge of road characteristics and vehicle speeds (either currently sampled or estimated beforehand), the dispersion parameters and the expected degree of dispersion downstream can be computed. Say the smoothing factor is *R* = 20%. If the Sybil vehicle, *V*_{50}, is part of the downstream platoon (recall the shaded area in Figure 3), the number of identities actually seen in the downstream platoon is *n*_{d} = 0.20 × 50 (benign vehicle identities) + 50 (Sybil identities) = 60 identities. If the Sybil vehicle falls outside of the downstream platoon (recall the non-shaded area in Figure 3), then the number of identities actually seen in the downstream platoon is *n*_{d} = 0.20 × 50 = 10 identities.

It is easy to see that abnormalities in the physical domain will manifest in the form of abnormal platooning (and ensuing dispersion) under Sybil identities in cyber space. If all the identities upstream (i.e. 100 of them) were benign, the number of vehicle identities in the downstream platoon would be expected to be *n*_{d} = 0.20 × 100 = 20. Both counts obtained under attack (60 or 10) deviate substantially from this expectation. Such abnormalities in platoon dispersion are straightforward to determine, leading to a natural, elegant, and simple technique to detect Sybil attacks. To the best of the authors’ knowledge, such a technique has not been attempted yet, and it is formalized and elaborated in the next section.
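The arithmetic of this example can be reproduced with a short Python sketch. It is not the detection protocol formalized in the next section: the function names and the `tolerance` threshold are hypothetical, introduced only to show how an observed downstream count might be compared with the count expected if all upstream identities were benign.

```python
def downstream_identities(n_benign: int, n_sybil: int, fraction: float,
                          sybil_in_platoon: bool) -> float:
    """Identities seen in the downstream platoon in the example above.

    fraction: share of benign upstream identities expected to remain in the
              downstream platoon (0.20 in the example).
    Sybil identities all travel with one physical vehicle, so they appear
    downstream together or not at all.
    """
    benign_part = fraction * n_benign
    sybil_part = n_sybil if sybil_in_platoon else 0
    return benign_part + sybil_part


def is_anomalous(observed: float, n_upstream: int, fraction: float,
                 tolerance: float) -> bool:
    """Flag a count that deviates from the all-benign expectation.

    tolerance is a hypothetical threshold chosen for this sketch only.
    """
    expected = fraction * n_upstream
    return abs(observed - expected) > tolerance


sybil_inside = downstream_identities(50, 50, 0.20, sybil_in_platoon=True)    # 60.0
sybil_outside = downstream_identities(50, 50, 0.20, sybil_in_platoon=False)  # 10.0
all_benign = downstream_identities(100, 0, 0.20, sybil_in_platoon=False)     # 20.0

for observed in (sybil_inside, sybil_outside, all_benign):
    print(observed, is_anomalous(observed, n_upstream=100, fraction=0.20, tolerance=5.0))
```

With the hypothetical tolerance of 5 identities, both attack cases (60 and 10) are flagged, while the all-benign case (20) is not.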