Spatial cloaking

Spatial cloaking is a privacy mechanism that satisfies specific privacy requirements by blurring users’ exact locations into cloaked regions. The technique is usually integrated into applications in various environments to minimize the disclosure of private information when users request a location-based service. Since the database server does not receive the accurate location, it sends back a set of candidate results that includes the satisfying solution. General privacy requirements include K-anonymity, maximum area, and minimum area.

Background
With the emergence and popularity of location-based services, people are getting more personalized services, such as the names and locations of nearby restaurants and gas stations. Receiving these services requires users to send their positions, directly or indirectly, to the service provider. A user's location information could be shared more than 5000 times in two weeks. This convenience therefore exposes users’ privacy to certain risks, since attackers may illegally identify users’ locations and further exploit their personal information. Continuous tracking of users' locations has been identified not only as a technical issue but also as a privacy concern. Quasi-identifiers, which are sets of information attributes, can be used to re-identify a user when linked with external information. For example, a social security number could be used by adversaries to identify a specific user, and the combined disclosure of birth date, zip code, and gender can uniquely identify a user. Multiple solutions have thus been proposed to preserve and enhance users’ privacy when using location-based services. Among the proposed mechanisms, spatial cloaking has been widely accepted and revised, and it has been integrated into many practical applications.

Location privacy
Location privacy is usually considered to fall into the category of information privacy, though there is little consensus on its definition. Location information often has three aspects: identity, location (spatial information), and time (temporal information). Identity usually refers to a user's name, email address, or any characteristic that makes a user distinguishable. For example, Pokémon Go requires a consistent user identity, since users are required to log in. Spatial information is considered the main means of determining a location. Temporal information can be separated into real-time and non-real-time and is usually described as a time stamp attached to a place. If a link is established between these three aspects, location privacy is considered violated. Access to personal location data has been raised as a severe privacy concern, even when the user has given permission. Therefore, privacy-aware management of location information, which is designed to protect against abuse of location information, has been identified as an essential challenge. The overall idea of preserving location privacy is to introduce enough noise and quantization to reduce the chances of successful attacks.

Spatial crowdsourcing uses devices that have GPS (Global Positioning System) to collect information. The data retrieved includes location data that can be used to analyze maps and local spatial characteristics. In recent years, researchers have been connecting the social and technological aspects of location information. For example, if co-location information is considered part of the data that potential attackers could obtain and take into account, location privacy is decreased by more than 60%. Also, from constant reports of a user's location, a movement profile could be constructed for that user through statistical analysis, and a large amount of information could be derived from this profile, such as the user's office location, medical records, financial status, and political views. Therefore, more and more researchers have taken social influence into account in their algorithms, since this socially networked information is accessible to the public and might be used by potential attackers.

History
In order to meet users' requirements for location privacy during data transmission, researchers have been exploring and investigating models to address the disclosure of private information.

The secure multi-party model is built on the idea of sharing accurate information among n parties. Each party has access to a particular segment of the precise information while being prevented from acquiring the other shares of the data. However, this introduces a computation problem, since a large amount of data processing is required to satisfy the requirement.

The minimal information sharing model uses cryptographic techniques to perform join and intersection operations. However, the model is inflexible and does not fit other kinds of queries, which makes it unsuitable for most practical applications.

The untrusted third-party model is adopted in peer-to-peer environments.

The most popular model at present is the trusted third-party model. Some practical applications have already adopted the idea of a trusted third party in their services to preserve privacy. For example, Anonymizer is integrated into various websites and offers an anonymous surfing service to its users. Also, when purchasing through PayPal, users are not required to provide their credit card information to the seller. By introducing a trusted third party, users’ private information is therefore not directly exposed to the service providers.

Approaches for preserving location information
A promising approach to preserving location privacy is to report data on users' behavior while protecting identity and location privacy. Several methods have been investigated to enhance the performance of location-preserving techniques, such as location perturbation and the reporting of landmark objects.

Location perturbation
The idea of location perturbation is to replace the exact location information with a coarser-grained spatial range, so that uncertainty is introduced when adversaries try to match the user to either a known location identity or an external observation of location identity. Location perturbation is usually achieved through spatial cloaking, temporal cloaking, or location obfuscation. Spatial and temporal cloaking refer to reporting a wrong or imprecise location and time to the service provider instead of the exact information. For example, location privacy could be enhanced by increasing the time between location reports, since higher report frequencies make re-identification through data mining more likely. In other cases, the report of location information is delayed until K users have visited the region.
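
As an illustration only, spatial and temporal cloaking can be sketched by snapping a coordinate to a grid cell and rounding a timestamp down to a reporting window. The function names, the 0.01-degree cell size, and the 15-minute window are assumptions chosen for this example and are not taken from any particular system.

```python
import math

def cloak_to_grid(lat, lon, cell_deg=0.01):
    """Return (south, west, north, east) of the grid cell containing the point.

    cell_deg is a hypothetical cell size in degrees; larger cells give
    coarser (more private, less accurate) reports.
    """
    south = math.floor(lat / cell_deg) * cell_deg
    west = math.floor(lon / cell_deg) * cell_deg
    return (south, west, south + cell_deg, west + cell_deg)

def cloak_time(timestamp_s, window_s=900):
    """Round a Unix timestamp (in seconds) down to the start of its window."""
    return timestamp_s - (timestamp_s % window_s)

# The provider would receive the cell and the coarse time, not the exact values.
cell = cloak_to_grid(40.7128, -74.0060)
coarse_time = cloak_time(1_700_000_123)
```

Every user inside the same cell during the same window produces an identical report, which is the source of the uncertainty described above.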

However, this approach could affect the service returned by the service providers, since the data they receive are not accurate. Accuracy and timeliness issues are usually discussed with this approach. Also, some attacks that exploit the idea of cloaking to break user privacy have been recognized.

Landmark objects
Based on the idea of landmark objects, a particular landmark or a significant object is reported to the service provider, instead of a region.

Avoid location tracking
In order to avoid location tracking, usually less or no location information would be reported to the service provider. For example, when requesting weather, a zip code instead of a tracked location would be accurate enough for the quality of the service received.

Centralized scheme
A centralized scheme is built around a central location anonymizer (anonymizing server), which acts as an intermediary between the user and the service provider. Generally, the responsibilities of a location anonymizer include tracking users' exact locations, blurring user-specific location information into cloaked areas, and communicating with the service provider. For example, one method is to replace the correct network addresses with fake IDs before the information is forwarded to the service provider. Sometimes the user identity is hidden while still allowing the service provider to authenticate the user and possibly charge the user for the service. These steps are usually achieved through spatial cloaking or path confusion. Except in some cases where the correct location information is sent for high service quality, the exact location or temporal information is usually modified to preserve user privacy.

Serving as an intermediary between the user and the location-based server, the location anonymizer generally conducts the following activities:


 * Receiving users’ exact location information and private profile
 * Blurring the location into cloaked areas based on the specific privacy requirements
 * In most cases, removing user identities from the location information
 * Reporting the cloaked area to the service provider and receiving from it a list of solutions that satisfy the user's request, referred to as a candidate list
 * Deciding the most appropriate solution based on the user's exact location and returning the accurate solution information to the user (some location anonymizers may not adopt this step)
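
The activities listed above can be sketched as a toy pipeline. This is a minimal illustration under stated assumptions, not a description of any deployed anonymizer: the names `anonymize_request`, `toy_provider` and `POIS`, the square grid cloaking, and the rule that the provider returns only points of interest inside the cloaked region are all hypothetical. A real provider would have to return candidates that are valid for every possible position in the region.

```python
import math

def anonymize_request(user_id, exact_pos, cell_size, provider):
    """Toy location anonymizer: blur, strip identity, query, then refine."""
    x, y = exact_pos
    # Blur the exact position into the square grid cell that contains it.
    col, row = math.floor(x / cell_size), math.floor(y / cell_size)
    region = (col * cell_size, row * cell_size,
              (col + 1) * cell_size, (row + 1) * cell_size)
    # The identity is dropped: user_id is deliberately not passed on.
    candidates = provider(region)
    # Refine the candidate list locally using the exact position.
    return min(candidates,
               key=lambda poi: math.dist(poi["pos"], exact_pos),
               default=None)

# Hypothetical points of interest held by the service provider.
POIS = [
    {"name": "A", "pos": (1.0, 1.0)},
    {"name": "B", "pos": (8.0, 8.0)},
    {"name": "C", "pos": (50.0, 50.0)},
]
seen_by_provider = []  # records what the provider actually learns

def toy_provider(region):
    """Return every POI inside the cloaked region (the candidate list)."""
    seen_by_provider.append(region)
    x_min, y_min, x_max, y_max = region
    return [p for p in POIS
            if x_min <= p["pos"][0] <= x_max and y_min <= p["pos"][1] <= y_max]

best = anonymize_request("alice", (7.0, 6.0), 10.0, toy_provider)
```

In this sketch the provider only ever sees the region, while the final choice among candidates is made where the exact position is known.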

The location anonymizer could also be considered a trusted third party, since the user trusts it with the accurate location information and private profile that it stores. However, this could also expose users’ privacy to great risks. First, since the anonymizer keeps tracking users' information and has access to users’ exact location and profile information, it is usually the target of most attackers and is thus at higher risk. Second, the extent to which users trust the location anonymizer could be essential. If a fully trusted third party is integrated into the algorithm, user location information would be reported continuously to the location anonymizer, which may cause privacy issues if the anonymizer is compromised. Third, the location anonymizer may become a performance bottleneck when a large number of requests arrive and need to be cloaked. This is because the location anonymizer is responsible for maintaining the number of users in a region in order to provide an acceptable level of service quality.

Distributed scheme (decentralized scheme)
In a distributed environment, users anonymize their location information through fixed communication infrastructures, such as base stations. Usually, a certification server with which users are registered is introduced in a distributed scheme. Before participating in the system, users are required to obtain a certificate, which indicates that they are trusted. Therefore, every time a user requests a location-based service, and before the exact location information is forwarded to the server, the auxiliary users registered in the system collaborate to hide the precise location of the user. The number of assistant users involved in cloaking the region is based on K-anonymity, which is usually set by the specific user. In cases where there are not enough users nearby, S-proximity is generally adopted to generate a high number of paired user identities and location information so that the actual user is indistinguishable in the specific area. The other profiles and location information sent to the service provider are sometimes also referred to as dummies.
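
The use of dummies can be illustrated with a minimal sketch that pads a set of real reports with random fake ones until K reports are available. This is not the S-proximity method itself; the function name, the report format, and the uniform placement of dummies inside a rectangular region are assumptions made for the example.

```python
import random

def pad_with_dummies(real_reports, k, region, rng=None):
    """Add fake (identity, position) pairs until there are at least k reports.

    region is (x_min, y_min, x_max, y_max); dummies are placed uniformly at
    random inside it, which is a simplifying assumption.
    """
    rng = rng or random.Random()
    x_min, y_min, x_max, y_max = region
    reports = list(real_reports)
    n = 0
    while len(reports) < k:
        n += 1
        pos = (rng.uniform(x_min, x_max), rng.uniform(y_min, y_max))
        reports.append((f"dummy-{n}", pos))
    # Shuffle so that the order does not reveal which entries are real.
    rng.shuffle(reports)
    return reports

real = [("u1", (1.0, 1.0)), ("u2", (2.0, 2.0))]
padded = pad_with_dummies(real, k=5, region=(0.0, 0.0, 10.0, 10.0),
                          rng=random.Random(0))
```

Uniformly placed dummies can be easy to filter out in practice (for example, if they fall in unreachable places), so realistic dummy generation is considerably more involved than this sketch.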

However, the complexity of the data structure used to anonymize the location could make it difficult to apply this mechanism to highly dynamic location-based mobile applications. The scheme also imposes heavy computation and communication overhead on the environment.

Peer-to-peer environment
A peer-to-peer (P2P) environment relies on direct communication and information exchange between devices in a community where users can only communicate through P2P multi-hop routing, without fixed communication infrastructures. The P2P environment aims to extend the scope of cellular coverage in a sparse environment. In this environment, peers have to trust each other and work together, since their location information is reported to each other when a cloaked area is constructed to achieve the desired K-anonymity while requesting location-based services.

Researchers have been discussing privacy and security requirements that would make privacy-preserving techniques appropriate for the peer-to-peer environment. For example, authentication and authorization are required to secure and identify the user, making authorized users distinguishable from unauthorized users. Confidentiality and integrity ensure that only authorized parties have access to the data transmitted between peers and that the transmitted information cannot be modified.

Some of the drawbacks identified in a peer-to-peer environment are the communication costs, not enough users and threats of potential malicious users hiding in the community.

Mobile environments
Mobile devices have been considered as an essential tool for communication, and mobile computing has thus become a research interest in recent years. From online purchase to online banking, mobile devices have frequently been connected to service providers for online activities, and at the same time sending and receiving information. Generally, mobile users can receive very personal services from anywhere at any time through location-based services. In mobile devices, Global Positioning System (GPS) is the most commonly used component to provide location information. Besides that, Global System for Mobile Communications (GSM) and WiFi signals could also help with estimating locations. There are generally two types of privacy concerns in mobile environments, data privacy and contextual privacy. Usually, location privacy and identity privacy are included in the discussion of contextual privacy in a mobile environment, while the data transferred between various mobile devices is discussed under data privacy. In the process of requesting location-based services and exchanging location data, both the quality of data transferred and the safety of information exchanged could be potentially exposed to malicious people.

Privacy requirements
Whatever specific privacy-preserving solution is used to cloak the region in which the service requester stays, it is usually designed from several angles to better satisfy different privacy requirements. These standards are either adjusted by the users or decided by the application designers. Some of the privacy parameters include K-anonymity, entropy, minimum area, and maximum area.

K-anonymity
The concept of K-anonymity was first introduced in relational data privacy to guarantee both the usefulness of the data and the privacy of users when data holders release their data. K-anonymity usually refers to the requirement that the information of the user should be indistinguishable from that of at least $$k-1$$ other people in the same region, with $$k$$ being a positive integer. Thus, the disclosed location scope is expected to keep expanding until $$k$$ users can be identified in the region, and these $$k$$ people form an anonymity set. Usually, the higher the value of K, the stricter the requirement and the higher the level of anonymity. If K-anonymity is satisfied, the probability of identifying the exact user is around $$1/k$$, depending on the algorithm, and location privacy is therefore effectively preserved. Usually, if the algorithm is designed to produce a larger cloaking region, the chances of identifying the exact service requester are much lower, even if the region containing the user is exposed to the service provider and attackers apply complex machine learning or other advanced analysis techniques.
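
The expansion of the disclosed region until $$k$$ users are covered can be sketched as follows. This is a deliberately naive illustration: the function name, the square region, the fixed expansion step, and the upper bound on the region size are assumptions, and the region is centered on the requester, which practical algorithms tend to avoid because the center of the region would then reveal the exact position.

```python
def k_anonymous_region(user_pos, other_positions, k,
                       step=1.0, max_half_width=100.0):
    """Grow a square around user_pos until it covers at least k users.

    The requester counts as one of the k users. Returns the region as
    (x_min, y_min, x_max, y_max), or None if k users cannot be covered
    within max_half_width.
    """
    x, y = user_pos
    half = step
    while half <= max_half_width:
        nearby = sum(1 for (ox, oy) in other_positions
                     if abs(ox - x) <= half and abs(oy - y) <= half)
        if nearby + 1 >= k:
            return (x - half, y - half, x + half, y + half)
        half += step
    return None

others = [(1.0, 0.0), (0.0, 3.0), (5.0, 5.0), (10.0, 10.0)]
region = k_anonymous_region((0.0, 0.0), others, k=3)
```

With the positions above, two other users are first covered at a half-width of 3, so the region spans from (-3, -3) to (3, 3) and any of the three covered users is the requester with probability of roughly 1/3.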

Some approaches have also been discussed to introduce more ambiguity into the system, such as historical K-anonymity, p-sensitivity, and l-diversity. Historical K-anonymity is proposed to protect moving objects by making sure that there are at least $$k-1$$ other users who share the same historical requests, which requires the anonymizer to track not only the current movement of the user but also the sequence of the user's locations. Therefore, even if a user's historical location points are disclosed, adversaries cannot distinguish the specific user from a group of potential users. P-sensitivity is used to ensure that critical attributes, such as identity information, have at least $$p$$ different values within the $$k$$ users. Moreover, l-diversity aims to guarantee that the user is unidentifiable from l different physical locations.
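
Under a simple reading of the definitions above, p-sensitivity and l-diversity can be checked by counting distinct values in an anonymity set. The record fields `age_group` and `place` and both function names are hypothetical and used only for illustration.

```python
def satisfies_p_sensitivity(anonymity_set, attribute, p):
    """True if the given attribute takes at least p distinct values."""
    return len({user[attribute] for user in anonymity_set}) >= p

def satisfies_l_diversity(anonymity_set, l):
    """True if the users are spread over at least l distinct places."""
    return len({user["place"] for user in anonymity_set}) >= l

# A hypothetical anonymity set of k = 4 users.
users = [
    {"age_group": "20-29", "place": "mall"},
    {"age_group": "30-39", "place": "mall"},
    {"age_group": "20-29", "place": "clinic"},
    {"age_group": "40-49", "place": "station"},
]
p_ok = satisfies_p_sensitivity(users, "age_group", p=3)
l_ok = satisfies_l_diversity(users, l=4)
```

Here the set has three distinct age groups, so it meets p = 3, but only three distinct places, so it falls short of l = 4 even though k = 4 is satisfied.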

However, setting a large K value also requires additional spatial and temporal cloaking, which lowers the resolution of the information and in turn could degrade the quality of service.

Minimum area size
Minimum area size refers to the smallest region expanded from the exact location point that satisfies the specific privacy requirements. Usually, the higher the privacy requirements, the larger the area required to increase the complexity of distinguishing the exact location of users. The idea of minimum area is particularly important in dense areas, where K-anonymity alone might not provide the guaranteed privacy-preserving performance. For example, if the requestor is in a shopping mall that is offering an attractive discount, there might be many people around him or her, and this could be considered a very dense environment. In such a situation, a large K value such as K=100 would only correspond to a small region, since it does not require a large area to include 100 people near the user. This might result in an ineffective cloaked area, since the space where the user could potentially reside is smaller than it would be at the same level of K-anonymity in a place where people are more scattered.

Maximum area size
Since there is a tradeoff between quality of service and privacy requirements in most location-based services, a maximum area size is sometimes also required. This is because a sizable cloaked area might introduce too much inaccuracy into the service received by the user, since increasing the reported cloaked area also increases the number of possible results that satisfy the user's request. These solutions would match the specific requirements of the user, yet are not necessarily applicable to the user's exact location.
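
The interplay of the minimum and maximum area sizes can be illustrated with a short sketch that enlarges a region that is too small and rejects one that is too large. The function name, the rectangular region, the scaling about the region's center, and the choice to return None on rejection are assumptions made for this example.

```python
import math

def enforce_area_bounds(region, min_area, max_area):
    """Apply minimum and maximum area requirements to a cloaked region.

    region is (x_min, y_min, x_max, y_max). A region below min_area is
    scaled up about its center; a region above max_area is rejected (None),
    since shrinking it could violate other requirements such as K-anonymity.
    """
    x_min, y_min, x_max, y_max = region
    width, height = x_max - x_min, y_max - y_min
    area = width * height
    if area <= 0 or min_area > max_area:
        raise ValueError("region must have positive area and min_area <= max_area")
    if area > max_area:
        return None
    if area < min_area:
        scale = math.sqrt(min_area / area)
        cx, cy = (x_min + x_max) / 2, (y_min + y_max) / 2
        half_w, half_h = width * scale / 2, height * scale / 2
        return (cx - half_w, cy - half_h, cx + half_w, cy + half_h)
    return region

# A dense area: K users were found in a 2 x 2 region, below the minimum of 16.
enlarged = enforce_area_bounds((0.0, 0.0, 2.0, 2.0), min_area=16.0, max_area=100.0)
```

In the dense shopping-mall case, the K-anonymous region is small, so the minimum area dominates; in sparse areas the maximum area is the constraint more likely to be hit.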

Applications
The cloaked region generated by the method of spatial cloaking could fit into multiple environments, such as snapshot location, continuous location, spatial networks, and wireless sensor networks. Sometimes, the algorithms which generate a cloaked area are designed to fit into various frameworks without changing the original coordinate. In fact, with the specification of the algorithms and well-establishment of most generally adopted mechanisms, more privacy-preserving techniques are designed specifically for the desired environment to fit into different privacy requirements better.

Geosocial applications
Geosocial applications are generally designed to provide a social interaction based on location information. Some of the services include collaborative network services and games, discount coupons, local friend recommendation for dining and shopping, and social rendezvous. For example, Motion Based allows users to share exercise path with others. Foursquare was one of the earliest location-based applications to enable location sharing among friends. Moreover, SCVNGR was a location-based platform where users could earn points by going to places.

Besides privacy requirements such as K-anonymity, maximum area size, and minimum area size, there are other requirements regarding the privacy preserved in geosocial applications. For example, location and user unlinkability require that the service provider should not be able to identify a user who makes the same request twice, or the correspondence between a given cloaked area and the user's real-time location. Also, location data privacy requires that the service provider should not have access to the content of the data at a specific location. For example, LoX is mainly designed to satisfy these privacy requirements of geosocial applications.

Location-based services
With the popularity and development of global positioning system (GPS) and wireless communication, location-based information services have been in high growth in recent years. It has already been developed and deployed in both the academia and the practical sphere. Many practical applications have integrated the idea and techniques of location-based services, such as mobile social networks, finding places of interest (POI), augmented reality (AR) games, awareness of location-based advertising, transportation service, location tracking, and location-aware services. These services usually require the service providers to analyze the received location information based on their algorithms and a database to come up with an optimum solution, and then report it back to the requesting user. Usually, the location-based services are requested either through snapshot queries or continuous queries. Snapshot queries generally require the report of an exact location at a specific time, such as “where is the nearest gas station?” while continuous queries need the tracking of location during a period of time, such as “constantly reporting the nearby gas stations.”

With the advancement of global positioning systems and the development of wireless communication which are introduced in the extensive use of location-based applications, high risks have been placed on user privacy. Both the service providers and users are under the dangers of being attacked and information being abused. It has been reported that some GPS devices have been used to exploit personal information and stalk personal locations. Sometimes, only reporting location information would already indicate much private information. One of the attacks specific to location-based services is the space or time correlated inference attacks, in which the visited location is correlated with the particular time, and this could lead to the disclosure of private life and private business.

Some of the popular location-based services include:


 * Location-aware emergency service
 * Location-based advertisement
 * Live traffic report
 * Location-based store finders
 * Map and navigation system

Continuous location-based service

Continuous location-based services require a constant report of location information to the service providers. Requesting a continuous location-based service has been recognized as putting pressure on privacy, because it can leak information. Since a series of cloaked areas is reported, increasingly capable techniques could establish a correlation between the blurred regions. Therefore, much research has been conducted to address location privacy issues in continuous location-based services.
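
One simple form such a correlation could take, shown here only as an assumed illustration, is intersecting the anonymity sets of successive cloaked regions: a user who appears in every set is a far stronger candidate than any single report suggests. The function name and the user identifiers are hypothetical.

```python
def surviving_candidates(anonymity_sets):
    """Return the users who appear in every reported anonymity set."""
    sets = [set(s) for s in anonymity_sets]
    return set.intersection(*sets) if sets else set()

# Three successive reports, each individually 3-anonymous.
reports = [
    {"a", "b", "c"},
    {"a", "c", "d"},
    {"a", "e", "f"},
]
suspects = surviving_candidates(reports)
```

Each report on its own hides the requester among three users, yet only user "a" is present in all of them, so the sequence as a whole provides no anonymity.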

Snapshot location-based services

Snapshot location, by contrast, generally refers to the linear relation between a specific location point and a point on the temporal coordinate.

Some mechanisms have been proposed to either address the privacy-preserving issues in both of the two environments simultaneously or concentrate on fulfilling each privacy requirement respectively. For example, a privacy grid called a dynamic grid system is proposed to fit into both snapshot and continuous location-based service environments.

Other privacy mechanisms
The existing privacy solutions generally fall into two categories: data privacy and context privacy. Besides addressing the issues in location privacy, these mechanisms might be applied to other scenarios. For example, tools such as cryptography, anonymity, obfuscation and caching have been proposed, discussed, and tested to better preserve user privacy. These mechanisms usually try to solve location privacy issues from different angles and thus fit into different situations.


 * Cryptography
 * Anonymity
 * Obfuscation
 * Caching
 * Pseudonymous technique

Concerns
Even though the effectiveness of spatial cloaking has been widely accepted and the idea has been integrated into multiple designs, there are still some concerns about it. First, the two schemes of spatial cloaking both have their limitations. For example, in the centralized scheme, although users' other private information, including identity, has been cloaked, the location itself could reveal sensitive information, especially when a specific user requests service multiple times with the same pseudonym. In a decentralized scheme, there are issues with heavy computation and with not having enough peers in a region.

Second, the ability of attackers requires a more in-depth consideration and investigation according to the advancement of technology such as machine learning and its connection with social relations, particularly the share of information online.

Third, the credibility of a trusted third party has also been identified as one of the issues. A large number of software applications are published on app markets every day, and some of them have not undergone strict examination. Software bugs, configuration errors at the trusted third party, and malicious administrators could put private user data at high risk. Based on a study from 2010, two-thirds of all the trusted-third-party applications in the Android market are considered to be suspicious in their handling of sensitive information.

Fourth, location privacy has been recognized as a personalized requirement that is sensitive to context. Customizing privacy parameters has been explored in recent years, since different people have different expectations about the amount of privacy preserved and the default settings sometimes do not fully satisfy user needs. Considering that there is often a trade-off between privacy and personalization, and that personalization usually leads to better service, people have different preferences. In situations where users can change the default configuration, accepting the default instead of customizing seems to be the more popular choice. Also, people's attitudes towards disclosing their location information can vary with the service's usefulness, its privacy safeguards, and the quantity of information disclosed. In most situations, people weigh the price of sharing private information against the benefits they receive.

Fifth, many protection mechanisms have been proposed in the literature, yet few of them have been practically integrated into commercial applications. Since there is little analysis regarding the implementation of location privacy-preserving mechanisms, there is still a large gap between theory and practice.

Attack
During the process of exchanging data, the three main parties (the user, the server, and the network) can be attacked by adversaries. The knowledge held by adversaries that could be used to carry out location attacks includes observed location information, precise location information, and context knowledge. Machine learning and big data techniques have also led to an emerging trend in location privacy attacks, and the popularity of smart devices has led to an increasing number of attacks. Some of the adopted approaches include viruses, Trojan applications, and several kinds of cyber-attack.


 * Man-in-the-middle attack

Man-in-the-middle attacks usually occur in the mobile environment. They assume that all the information transferred from the user to the service provider could be intercepted and might be further manipulated by attackers to reveal more personal information.


 * Cross-service attack

Cross-service attacks usually take place when users are using poorly protected wireless connectivity, especially in public places.


 * Video-based attack

Video-based attacks are more prevalent on mobile devices, usually due to the use of Bluetooth, camera, and video capabilities, since malicious software applications can secretly record users’ behavior data and report that information to a remote device. Stealthy Video Capture is one such intentionally designed application, which spies on an unaware user and reports the information.


 * Sensor sniffing attack

Sensor sniffing attacks usually refer to cases where intentionally designed applications are installed on a device. In this situation, even if adversaries do not have physical contact with the mobile device, users’ personal information is still at risk of being disclosed.


 * Context linking attack

Context linking attacks combine contextual knowledge with location information. In a localization attack, contextual knowledge is combined with observed location information to disclose a precise location. Contextual knowledge can also be combined with precise location information to carry out identity attacks.


 * Machine/deep learning attack

The integration of learning algorithms and other deep learning methods, along with the massive amount of data online, poses a huge challenge to location privacy. For example, current deep learning methods can predict geolocations from personal photos on social networks and perform various kinds of object detection, based on their ability to analyze millions of photos and videos.

Regulations and policies
Policy approaches that intend to revise relevant guidelines or propose new regulations to better manage location-based service applications have also been discussed in recent years. The current state of technology is not matched by a sufficiently aligned policy and legal environment, and there are efforts from both academia and industry to address this issue. Two uniformly accepted and well-established requirements are users' awareness of the location privacy policies of a specific service and their consent to sending their personal location to a service provider. Besides these two approaches, researchers have also been focusing on guarding the app markets, since an insecure app market would expose unaware users to several privacy risks. For example, much malware designed to carry out cyber attacks on Android devices has been identified in the Android app market. The absence of effective and clear guidelines for regulating location information would create both ethical and legal problems. Therefore, many guidelines for monitoring the use of location information have been discussed in recent years.

European data protection guideline
The European data protection guideline was recently revised to include and specify the privacy of an individual's data and personally identifiable information (PII). These adjustments are intended to create a safe yet effective service environment. Specifically, location privacy is enhanced by making sure that users are fully aware of, and have consented to, the location information that is sent to the service providers. Another important adjustment is that complete responsibility is given to the service providers when users’ private information is being processed.

European Union's Directive
The European Union's Directive 95/46/EC on the protection of individuals with regard to the processing of personal data and on the free movement of such data specifies that the limited data transfer to non-EU countries which are with "an adequate level of privacy protection". The notion of explicit consent is also introduced in the Directive, which stated that except for legal and contractual purpose, personal data might only be processed if the user has unambiguously given his or her consent.

The European Union's Directive 2002/58/EC on privacy and electronic communications explicitly defines location information, user consent requirements, and corporate disposal requirements, which help to regulate and protect European citizens' location privacy. When data are unlinkable to the user, legal frameworks such as the EU Directive place no restriction on the collection of anonymous data.

The Electronic Communications Privacy Act of 1986
The Electronic Communications Privacy Act discusses the legal framework of privacy protection and sets standards for law enforcement access to electronic records and communications. It is also very influential in deciding electronic surveillance issues.

Global system for mobile communication association (GSMA)
GSMA published a new privacy guideline, and some mobile companies in Europe have signed it and started to implement it so that users have a better understanding of the information recorded and analyzed when they use location-based services. GSMA has also recommended that operating companies inform their customers about who has access to users’ private information.

Corporate examples
Even though many privacy-preserving mechanisms have not been integrated into common use because of concerns about effectiveness, efficiency, and practicality, some location-based service providers have started to address privacy issues in their applications. For example, Twitter enables its users to customize location accuracy. Locations posted in Glympse automatically expire. Also, SocialRadar allows its users to choose to be anonymous or invisible when using the application.

Google
It has been stated that Google does not comply with the European Union’s data privacy law, and thus increasing attention has been placed on advocating guidelines and policies regarding data privacy.

Facebook
It has been argued that less than a week after Facebook began using its “Places” feature, the location information it exposed was exploited by thieves to conduct a home invasion.

United States v. Knotts case
In this case, the police used a beeper to keep track of the suspect's vehicle. After using the beeper alone to track the suspect, the officers secured a search warrant and confirmed that the suspect was producing illicit drugs in the van. The suspect tried to suppress the evidence on the grounds that a tracking device had been used during the monitoring process, but the court denied this. The court concluded that “A person traveling in an automobile on a public thouroughfare [sic] has no reasonable expectation of privacy in his movement from one place to another.” Nevertheless, the court reserved the discussion of whether twenty-four-hour surveillance would constitute a search.

However, cases involving GPS and other tracking devices differ from this case, since GPS tracking can be conducted without human interaction, while the beeper is considered a method of increasing the police's sensory perception by maintaining visual contact with the suspect. Police presence is required when using beepers, yet it is not needed when using GPS to conduct surveillance. Therefore, law enforcement agents are required to secure a warrant before obtaining a vehicle's location information with GPS tracking devices.

United States v. Jones
In this case (https://www.oyez.org/cases/2011/10-1259), the police had a search warrant to install a Global Positioning System (GPS) device on the car of the respondent's wife, but the actual installation took place on the 11th day in Maryland, outside the authorized installation district and beyond the approved ten days. The District Court ruled that the data recorded on public roads was admissible, since the respondent Jones had no reasonable expectation of privacy on public streets, yet the D.C. Circuit reversed this, holding that the warrantless use of the GPS device violated the Fourth Amendment.

Popular culture

 * George Orwell's novel 1984 depicts a world where everyone is being watched, practically at all times and in all places.
 * Brønnøysund Register Center (https://www.brreg.no) in Norway provides a free public register service, where people can register and specify that they do not want to receive direct marketing or sales phone calls or mail.