Reality mining

Reality mining is the collection and analysis of machine-sensed environmental data pertaining to human social behavior, with the goal of identifying predictable patterns of behavior. In 2008, MIT Technology Review called it one of the "10 technologies most likely to change the way we live."

Reality mining studies human interactions through the use of wireless devices such as mobile phones and GPS systems, which provide a more accurate picture of what people do, where they go, and with whom they communicate than more subjective sources such as a person's own account. Reality mining is one aspect of digital footprint analysis.

Reality mining uses big data to research and analyze how people interact with technology every day, with the aim of building systems that allow for positive change from the individual to the global community. Reality mining also deals with data exhaust.

Individual Scale (1 person)
Individuals use mobile phones, tablets, laptops, cameras, and other internet-connected devices for a variety of purposes, generating a variety of data ranging from GPS locations to frequently asked questions on Google. Mobile phones carry so much data about an individual that they can suggest restaurants based on searches, visited places, and book preferences, and even guess the ends of sentences being typed. One application of reality mining is listening to voices and analyzing speech patterns to help diagnose medical problems ranging from the flu to early-onset Parkinson's. More powerful phones also allow calendar customization and event tracking, which reveal what an individual deems important enough to track. Social websites allow researchers to view snapshots of a person's life by following status updates on Facebook or tweets on Twitter. More specifically still, the app Snapchat allows users to post videos, pictures, or even live streams of exactly what they are doing as they do it, which are strong indicators of behaviors and interactions with the world.

In 2004, MIT conducted the Reality Mining Project, which gave 100 MIT students a Nokia 6600 that the researchers tracked in a variety of ways: the cell tower IDs (a very cheap and unobtrusive way to measure location), the status of the phone (charging or idle), and any use of the phone's applications (games, web surfing, etc.). They found that by collecting this kind of data they could predict the students' behaviors with high accuracy. For example, if one of the students woke up on a Saturday morning at 10 AM, the researchers could predict what the student was going to do that day using "eigenbehaviors". This new way of understanding data opened doors for new research, and possibly for larger survey research with detailed and accurate statistics.
There are hundreds of websites offering software for mobile phones that will track just about everything the phone does, useful for worried parents or people who want to increase their personal productivity. This data is then uploaded to a server and can be accessed at any time.
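
The "eigenbehaviors" mentioned above are generally described as the principal components of a person's daily behavior data. The following is a minimal sketch of that idea, not the MIT researchers' actual code: it assumes each day is encoded as a list of numbers (here a hypothetical "at work" flag for four time slots), and it uses power iteration only to avoid external libraries. The function names and the toy data are illustrative assumptions.

```python
import math


def top_eigenbehavior(days, iterations=100):
    """Approximate the first principal component of daily behavior vectors.

    `days` is a list of equal-length numeric lists, one per day.
    Returns the mean day and a unit-length component vector.
    """
    n, dim = len(days), len(days[0])
    mean = [sum(day[j] for day in days) / n for j in range(dim)]
    centered = [[day[j] - mean[j] for j in range(dim)] for day in days]

    # Deterministic, non-uniform starting vector (an arbitrary choice).
    vec = [float(j + 1) for j in range(dim)]
    norm = math.sqrt(sum(x * x for x in vec))
    vec = [x / norm for x in vec]

    for _ in range(iterations):
        # Multiply by the covariance matrix without forming it: C v = X^T (X v) / n
        scores = [sum(c * v for c, v in zip(row, vec)) for row in centered]
        new = [sum(row[j] * s for row, s in zip(centered, scores)) / n
               for j in range(dim)]
        norm = math.sqrt(sum(x * x for x in new))
        if norm == 0:  # the data has no variance along this direction
            break
        vec = [x / norm for x in new]
    return mean, vec


def project(day, mean, component):
    """Score one day along the component (how strongly it follows the pattern)."""
    return sum((d - m) * c for d, m, c in zip(day, mean, component))


# Hypothetical toy data: 1 means "at work" during that time slot.
WEEKDAY = [0, 1, 1, 0]
WEEKEND = [0, 0, 0, 0]
DAYS = [WEEKDAY] * 5 + [WEEKEND] * 2

MEAN, COMPONENT = top_eigenbehavior(DAYS)
print(COMPONENT)
print(project(WEEKDAY, MEAN, COMPONENT), project(WEEKEND, MEAN, COMPONENT))
```

In this toy case the component loads only on the two middle time slots, so weekdays score positive and weekend days score negative. A real study would use far richer vectors, for example location labels per hour of the day.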

Although a lot of data can already be collected from personal devices, these devices capture only part of a person's life. Reality miners can also use biometric devices to measure physical health and activity. There are many such devices, including the Fitbit, Nike+, and Polar and Garmin GPS watches. There is also an app called Sleep Cycle for iPhone and Android that measures sleep quality, including the amount of sleep, and suggests optimal alarm settings. Using this data, reality miners may be able to measure a person's actual health and the processes that allow the body to function (or dysfunction). Heart attacks generally do not have longitudinal indicators, but this data, especially when a person engages in lifelogging, can be useful to the medical field: tracking the lifestyles of those who suffer heart attacks could help create preventative guidelines. There are several ways to start lifelogging; for instance, Google has its own device called Google Glass, which has a heads-up display (HUD), a microphone, a processor, and a camera. These are all ways to log information in specific directories.

Community Scale (10 to 1,000 people)
Researchers began observing and recording behaviors in large groups by using RFID badges. Data is also recorded in workplaces using knowledge management systems that try to improve worker productivity and efficiency, although a shortcoming of these systems is their inability to bring together the social and technological cultures of the workplace, so the behavioral data they provide is incomplete. Another way to measure larger groups of people in a community is through conference attendance. This data tells researchers where participants are from, their ethnic demographics, and the actual number of people attending the event. Some conferences use smart badges with more functions than standard RFID badges. Companies like Microsoft and IBM have used them to record the number of people attendees interact with during the conference and to let people answer survey questions. Smart badges also record vocal interactions and when attendees are at certain booths, and can even alert booth workers when certain profiles come within a certain range of the booth. Smart badges have obvious advantages for reality miners gathering data. In 2009, nTag, a company that was then acquired by Alliance Technology, used technology that notifies users whom to talk to and can exchange business cards electronically.

Another type of data reality miners look at is climate and environmental information. They collect data from neighborhoods by deploying air-quality sensors, which record carbon dioxide and nitrogen oxides as well as the general climate. Information like this could help policy makers decide whether to act and gauge progress. Another way to collect data about the surroundings is through Project Noah, an effort to collect data on plant species by geotagging pictures of plants and fungi that people upload, allowing users to see the kinds of ecosystems they live in.
This helps schools and students who want to collect data for projects, and also lets bird-watchers know what kinds of birds are in the area.

City Scale (1,000 to 1,000,000 people)
For the purposes of this section, a city is defined as 1,000 to 1,000,000 people. One way data is collected on a city scale is by monitoring traffic with traffic signals and speed cameras. Data can also be collected from police reports and road scanners, as well as from GPS in mobile applications. Using this kind of traffic data, cities can design routes that allow for the most efficient movement and flow of traffic. A company called Inrix, started in 2010, has been compiling traffic data and buys data from bridge operators and other transportation systems. It uses this data to predict traffic routes and times of congestion. Traffic can also be monitored through Bluetooth technology, which Inrix does not consider. The University of Maryland completed a project in 2012 demonstrating that two Bluetooth sensors permanently placed two miles apart could accurately detect traffic speeds. All of this can be combined to build route-suggestion algorithms that help people get to and from places efficiently; in addition, a suggested route can update itself in real time using these types of sensors and information. Waze, a notable start-up that is now a subsidiary of Google, also collected data (anonymously) from users who reported accidents, which earned them in-app currency and rewards.

For crime on the city scale, the most basic way to collect and view data is through historical research of previous reports within an area. Now, more complex algorithms automatically direct officers to places with high crime rates before any actual crime has been committed. Since 2005, the Memphis Police Department has been using a program called Blue CRUSH (Crime Reduction Utilizing Statistical History), which draws on police reports and uses heat maps to distinguish between high- and low-crime areas. The program updates itself weekly and allows the police department to change tactics accordingly.
Using this kind of data allows police departments to interact with society in a much more meaningful way and to do preventative rather than rehabilitative work.
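
The two-sensor Bluetooth measurement described in this section comes down to dividing the known sensor spacing by each device's travel time between the sensors. The following is a minimal sketch of that arithmetic, not the University of Maryland project's actual method; the function name, the dictionary layout, and the sample detection times are all hypothetical.

```python
from statistics import median

SENSOR_SPACING_MILES = 2.0  # spacing stated for the two sensors


def segment_speeds(upstream, downstream, spacing_miles=SENSOR_SPACING_MILES):
    """Estimate per-device speeds in mph from detections at two sensors.

    `upstream` and `downstream` map a device identifier to the time, in
    seconds, at which that sensor detected the device (assumed layout).
    """
    speeds = {}
    for device, t_in in upstream.items():
        t_out = downstream.get(device)
        if t_out is None or t_out <= t_in:
            continue  # never seen at the second sensor, or inconsistent times
        hours = (t_out - t_in) / 3600.0
        speeds[device] = spacing_miles / hours
    return speeds


# Hypothetical detections: device id -> detection time in seconds.
UPSTREAM = {"dev-a": 0, "dev-b": 10, "dev-c": 20}
DOWNSTREAM = {"dev-a": 120, "dev-b": 250, "dev-d": 300}

SPEEDS = segment_speeds(UPSTREAM, DOWNSTREAM)
print(SPEEDS)                   # dev-a is about 60 mph, dev-b about 30 mph
print(median(SPEEDS.values()))  # a robust summary of speed on the segment
```

Only devices seen at both sensors contribute, and taking the median rather than the mean limits the influence of a device that stopped along the way.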

National Scale (1,000,000 to 100,000,000 people)
On the national scale, governments play a much larger role. Census data are by far the easiest to acquire. Many nations make their census findings public via websites from which data can be downloaded and visualized for further analysis. "In addition, the World Bank conducts international surveys and compiles census data from all participating nations— a sort of one-stop shop for information on its member countries. These data are publicly accessible: they can be downloaded and independently sorted and analyzed. Importantly, the World Bank offers an open API that allows programmers to integrate various data into software applications. Using World Bank data, Google has integrated a simple visualization tool into its search results; a search query on the population of Botswana will pull up the number, the dated World Bank source, and a graph showing population change over decades". Another way to collect data is through call data records (or call detail records, CDRs), which are logs of phone calls and texts with information such as the time and the location of both the caller or sender and the recipient. CDRs allow phone companies to view human mobility trends. Major data companies like Google, Facebook, and Twitter also allow researchers to track cultural trends and even when and where resources are allocated in times of natural disaster.
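
As a rough illustration of how call detail records can reveal mobility trends, the sketch below counts how often subscribers move from one cell tower to another between consecutive records. It is a simplified assumption of what such an analysis might look like: the record layout (subscriber, timestamp, tower) and all identifiers are hypothetical, and real CDRs contain more fields and are anonymized before study.

```python
from collections import defaultdict


def tower_transitions(records):
    """Count tower-to-tower movements across all subscribers.

    Each record is a (subscriber_id, timestamp, tower_id) tuple; this
    layout is an assumption made for illustration.
    """
    by_subscriber = defaultdict(list)
    for subscriber, timestamp, tower in records:
        by_subscriber[subscriber].append((timestamp, tower))

    transitions = defaultdict(int)
    for events in by_subscriber.values():
        events.sort()  # order each subscriber's records by time
        for (_, origin), (_, destination) in zip(events, events[1:]):
            if origin != destination:  # ignore repeated use of the same tower
                transitions[(origin, destination)] += 1
    return dict(transitions)


# Hypothetical records, deliberately out of time order.
RECORDS = [
    ("u1", 1, "A"), ("u1", 5, "B"), ("u1", 3, "A"),
    ("u2", 2, "A"), ("u2", 4, "B"), ("u2", 9, "C"),
]

FLOWS = tower_transitions(RECORDS)
print(FLOWS)
```

Aggregating the counts over many subscribers gives an origin-destination view of movement without needing to follow any single person.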

Global Scale (100,000,000 to 7,000,000,000 people)
The biggest worry for the world is the spread of disease, and tracking it is one of reality mining's best applications. With globalization, the ability to travel is unprecedented. The United Nations created an agenda called the Millennium Development Goals (MDGs), eight goals that aim to improve the world. Collecting population data is the first step toward policy making on disease control; nations must also collect data on air travel, as billions of people travel by air each year, and on sea travel. Air travel carries more people each year than sea shipments, but the primary reason for collecting data on shipments is that they often carry pests that carry diseases, food-borne illnesses, and sometimes invasive plant and animal species. The task of collecting and managing this data seems monumental, but the World Bank has already started, supporting statistical efforts such as MAPS, the Marrakech Action Plan for Statistics. MAPS aims to complete six objectives, which include these three:
 * Planning statistical systems and preparing national statistical development strategies for all low-income countries
 * Ensuring full participation of developing countries
 * Setting up the International Household Survey Network, a global collection of household-based socioeconomic data sets

For people traveling on flights, a source of data is the International Air Transport Association (IATA), which has been collecting data on about 90% of global air traffic on a monthly basis since 2000. This data allows researchers and professionals to assess how disease can spread from a given location on Earth. Ships carry about 90% of global trade; in 2001, the Automatic Identification System was implemented to record the "comings and goings of sea traffic".