Personalized search

Personalized search is a web search tailored specifically to an individual's interests by incorporating information about the individual beyond the specific query provided. There are two general approaches to personalizing search results, involving modifying the user's query and re-ranking search results.

History
Google introduced personalized search in 2004 and it was implemented in 2005 to Google search. Google has personalized search implemented for all users, not only those with a Google account. There is not much information on how exactly Google personalizes their searches; however, it is believed that they use user language, location, and web history.

Early search engines, like Google and AltaVista, found results based only on key words. Personalized search, as pioneered by Google, has become far more complex with the goal to "understand exactly what you mean and give you exactly what you want." Using mathematical algorithms, search engines are now able to return results based on the number of links to and from sites; the more links a site has, the higher it is placed on the page. Search engines have two degrees of expertise: the shallow expert and the deep expert. An expert from the shallowest degree serves as a witness who knows some specific information on a given event. A deep expert, on the other hand, has comprehensible knowledge that gives it the capacity to deliver unique information that is relevant to each individual inquirer. If a person knows what he or she wants then the search engine will act as a shallow expert and simply locate that information. But search engines are also capable of deep expertise in that they rank results indicating that those near the top are more relevant to a user's wants than those below.

While many search engines take advantage of information about people in general, or about specific groups of people, personalized search depends on a user profile that is unique to the individual. Research systems that personalize search results model their users in different ways. Some rely on users explicitly specifying their interests or on demographic/cognitive characteristics. However, user-supplied information can be difficult to collect and keep up to date. Others have built implicit user models based on content the user has read or their history of interaction with Web pages.

There are several publicly available systems for personalizing Web search results (e.g., Google Personalized Search and Bing's search result personalization ). However, the technical details and evaluations of these commercial systems are proprietary. One technique Google uses to personalize searches for its users is to track log in time and if the user has enabled web history in his browser. If a user accesses the same site through a search result from Google many times, it believes that they like that page. So when users carry out certain searches, Google's personalized search algorithm gives the page a boost, moving it up through the ranks. Even if a user is signed out, Google may personalize their results because it keeps a 180-day record of what a particular web browser has searched for, linked to a cookie in that browser.

In search engines on social networking platforms like Facebook or LinkedIn, personalization could be achieved by exploiting homophily between searchers and results. For example, in People search, searchers are often interested in people in the same social circles, industries or companies. In Job search, searchers are usually interested in jobs at similar companies, jobs at nearby locations and jobs requiring expertise similar to their own.

In order to better understand how personalized search results are being presented to the users, a group of researchers at Northeastern University compared an aggregate set of searches from logged in users against a control group. The research team found that 11.7% of results show differences due to personalization; however, this varies widely by search query and result ranking position. Of various factors tested, the two that had measurable impact were being logged in with a Google account and the IP address of the searching users. It should also be noted that results with high degrees of personalization include companies and politics. One of the factors driving personalization is localization of results, with company queries showing store locations relevant to the location of the user. So, for example, if a user searched for "used car sales", Google may produce results of local car dealerships in their area. On the other hand, queries with the least amount of personalization include factual queries ("what is") and health.

When measuring personalization, it is important to eliminate background noise. In this context, one type of background noise is the carry-over effect. The carry-over effect can be defined as follows: when a user performs a search and follow it with a subsequent search, the results of the second search is influenced by the first search. A noteworthy point is that the top-ranked URLs are less likely to change based on personalization, with most personalization occurring at the lower ranks. This is a style of personalization based on recent search history, but it is not a consistent element of personalization because the phenomenon times out after 10 minutes, according to the researchers.

The filter bubble
Several concerns have been brought up regarding personalized search. It decreases the likelihood of finding new information by biasing search results towards what the user has already found. It introduces potential privacy problems in which a user may not be aware that their search results are personalized for them, and wonder why the things that they are interested in have become so relevant. Such a problem has been coined as the "filter bubble" by author Eli Pariser. He argues that people are letting major websites drive their destiny and make decisions based on the vast amount of data they've collected on individuals. This can isolate users in their own worlds or "filter bubbles" where they only see information that they want to, such a consequence of "The Friendly World Syndrome". As a result, people are much less informed of problems in the developing world which can further widen the gap between the North (developed countries) and the South (developing countries).

The methods of personalization, and how useful it is to "promote" certain results which have been showing up regularly in searches by like-minded individuals in the same community. The personalization method makes it very easy to understand how the filter bubble is created. As certain results are bumped up and viewed more by individuals, other results not favored by them are relegated to obscurity. As this happens on a community-wide level, it results in the community, consciously or not, sharing a skewed perspective of events. Filter bubbles have become more frequent in search results and are envisaged as disruptions to information flow in online more specifically social media.

An area of particular concern to some parts of the world is the use of personalized search as a form of control over the people utilizing the search by only giving them particular information (selective exposure). This can be used to give particular influence over highly talked about topics such as gun control or even gear people to side with a particular political regime in different countries. While total control by a particular government just from personalized search is a stretch, control of the information readily available from searches can easily be controlled by the richest corporations. The biggest example of a corporation controlling the information is Google. Google is not only feeding you the information they want but they are at times using your personalized search to gear you towards their own companies or affiliates. This has led to a complete control of various parts of the web and a pushing out of their competitors such as how Google Maps took a major control over the online map and direction industry with MapQuest and others forced to take a backseat.

Many search engines use concept-based user profiling strategies that derive only topics that users are highly interested in but for best results, according to researchers Wai-Tin and Dik Lun, both positive and negative preferences should be considered. Such profiles, applying negative and positive preferences, result in highest quality and most relevant results by separating alike queries from unalike queries. For example, typing in 'apple' could refer to either the fruit or the Macintosh computer and providing both preferences aids search engines' ability to learn which apple the user is really looking for based on the links clicked. One concept-strategy the researchers came up with to improve personalized search and yield both positive and negative preferences is the click-based method. This method captures a user's interests based on which links they click on in a results list, while downgrading unclicked links.

The feature also has profound effects on the search engine optimization industry, due to the fact that search results will no longer be ranked the same way for every user. An example of this is found in Eli Pariser's, The Filter Bubble, where he had two friends type in "BP" into Google's search bar. One friend found information on the BP oil spill in the Gulf of Mexico while the other retrieved investment information. The aspect of information overload is also prevalent when using search engine optimization. However, one means of managing information overload is through accessing value-added information—information that has been collected, processed, filtered, and personalized for each individual user in some way. For instance, Google uses various ‘‘signals’’ in order to personalize searches including location, previous search keywords and recently contacts in a user’s social network while on the other hand, Facebook registers the user’s interactions with other users, the so-called ‘‘social gestures’’. The social gestures in this case include things such as use likes, shares, subscribe and comments. When the user interacts with the system by consuming a set of information, the system registers the user interaction and history. On a later date, on the basis of this interaction history, some critical information is filtered out. This include content produced by some friends might be hidden from the user. This is because the user did not interact with the excluded friends over a given time. It is also essential to note that within the social gestures, photos and videos receives higher ranking than regular status posts and other related posts.

The filter bubble has made a heavy effect on the search for information of health. With the influence of search results based upon search history, social network, personal preference and other aspects, misinformation has been a large contributor in the drop of vaccination rate. In 2014/15 there was an outbreak of measles in America with there being 644 reported cases during the time period. The key contributors to this outbreak were anti-vaccine organizations and public figures, who at the time were spreading fear about the vaccine.

Some have noted that personalized search results not only serve to customize a user's search results, but also advertisements. This has been criticized as an invasion on privacy.

The case of Google
An important example of search personalization is Google. There are a host of Google applications, all of which can be personalized and integrated with the help of a Google account. Personalizing search does not require an account. However, one is almost deprived of a choice, since so many useful Google products are only accessible if one has a Google account. The Google Dashboard, introduced in 2009, covers more than 20 products and services, including Gmail, Calendar, Docs, YouTube, etc. that keeps track of all the information directly under one's name. The free Google Custom Search is available for individuals and big companies alike, providing the Search facility for individual websites and powering corporate sites such as that of the New York Times. The high level of personalization that was available with Google played a significant part in helping it remain the world's favorite search engine.

One example of Google's ability to personalize searches is in its use of Google News. Google has geared its news to show everyone a few similar articles that can be deemed interesting, but as soon as the user scrolls down, it can be seen that the news articles begin to differ. Google takes into account past searches as well as the location of the user to make sure that local news gets to them first. This can lead to a much easier search and less time going through all of the news to find the information one want. The concern, however, is that the very important information can be held back because it does not match the criteria that the program sets for the particular user. This can create the "filter bubble" as described earlier.

An interesting point about personalization that often gets overlooked is the privacy vs personalization battle. While the two do not have to be mutually exclusive, it is often the case that as one becomes more prominent, it compromises the other. Google provides a host of services to people, and many of these services do not require information to be collected about a person to be customizable. Since there is no threat of privacy invasion with these services, the balance has been tipped to favor personalization over privacy, even when it comes to search. As people reap the rewards of convenience from customizing their other Google services, they desire better search results, even if it comes at the expense of private information. Where to draw the line between the information versus search results tradeoff is new territory and Google gets to make that decision. Until people get the power to control the information that is being collected about them, Google is not truly protecting privacy. Google's popularity as a search engine and Internet browser has allowed it to gain a lot of power. Their popularity has created millions of usernames, which have been used to collect vast amounts of information about individuals. Google can use multiple methods of personalization such as traditional, social, geographic, IP address, browser, cookies, time of day, year, behavioral, query history, bookmarks, and more. Although having Google personalize search results based on what users searched previously may have its benefits, there are negatives that come with it. With the power from this information, Google has chosen to enter other sectors it owned, such as videos, document sharing, shopping, maps, and many more. Google has done this by steering searchers to their own services offered as opposed to others such as MapQuest.

Using search personalization, Google has doubled its video market share to about eighty percent. The legal definition of a monopoly is when a firm gains control of seventy to eighty percent of the market. Google has reinforced this monopoly by creating significant barriers of entry such as manipulating search results to show their own services. This can be clearly seen with Google Maps being the first thing displayed in most searches.

The analytical firm Experian Hitwise stated that since 2007, MapQuest has had its traffic cut in half because of this. Other statistics from around the same time include Photobucket going from twenty percent of market share to only three percent, Myspace going from twelve percent market share to less than one percent, and ESPN from eight percent to four percent market share. In terms of images, Photobucket went from 31% in 2007 to 10% in 2010 and Yahoo Images has gone from 12% to 7%. It becomes apparent that the decline of these companies has come because of Google's increase in market share from 43% in 2007 to about 55% in 2009.

It can be said that Google is more dominant because they provide better services. However, Experian Hitwise has also created graphs to show the market share of about fifteen different companies at once. This has been done for every category for the market share of pictures, videos, product search, and more. The graph for product search is evidence enough for Google's influence because their numbers went from 1.3 million unique visitors to 11.9 unique visitors in one month. That kind of growth can only come with the change of a process.

In the end, there are two common themes with all of these graphs. The first is that Google's market share has a direct inverse relationship to the market share of the leading competitors. The second is that this directly inverse relationship began around 2007, which is around the time that Google began to use its "Universal Search" method.

Benefits
One of the most critical benefits personalized search has is to improve the quality of decisions consumers make. The internet has made the transaction cost of obtaining information significantly lower than ever. However, human ability to process information has not expanded much. When facing overwhelming amount of information, consumers need a sophisticated tool to help them make high quality decisions. Two studies examined the effects of personalized screening and ordering tools, and the results show a positive correlation between personalized search and the quality of consumers' decisions.

The first study was conducted by Kristin Diehl from the University of South Carolina. Her research discovered that reducing search cost led to lower quality choices. The reason behind this discovery was that 'consumers make worse choices because lower search costs cause them to consider inferior options.' It also showed that if consumers have a specific goal in mind, they would further their search, resulting in an even worse decision. The study by Gerald Haubl from the University of Alberta and Benedict G.C. Dellaert from Maastricht University mainly focused on recommendation systems. Both studies concluded that a personalized search and recommendation system significantly improved consumers' decision quality and reduced the number of products inspected.

On the same note the use of the use of filter bubbles in personalized search has also led to several benefits to the users. For instance filter bubbles have the potential of enhancing opinion diversity by allowing like-minded citizens to come together and reinforce their beliefs. This also helps in protecting users from fake and extremist content by enclosing them in bubbles of reliable and verifiable information. Filter bubbles can be an important element of information freedom by providing users more choice.

Personalized search has also proved to work on the benefit of the user in the sense that they improve the information search results. Personalized search tailors search result to the needs of the user in the sense that it matches what the user wants with past search history. This also helps reduce the amount of irrelevant information and also reduces the amount of time users spend in searching for information. For instance, in Google, the search history of user is kept and matched with the user query in the user's next searches. Google achieves this through three important techniques. The three techniques include (i) query reformulation using extra knowledge, i.e., expansion or refinement of a query, (ii) post filtering or re-ranking of the retrieved documents (based on the user profile or the context), and (iii) improvement of the IR model.

Models
Personalized search gains popularity because of the demand for more relevant information and the fact that most people could really use some personal information such as personalized search gains. Research has indicated low success rates among major search engines in providing relevant results; in 52% of 20,000 queries, searchers did not find any relevant results within the documents that Google returned. Personalized search can improve search quality significantly and there are mainly two ways to achieve this goal.

The first model available is based on the users' historical searches and search locations. People are probably familiar with this model since they often find the results reflecting their current location and previous searches.

There is another way to personalize search results. In Bracha Shapira and Boaz Zabar's "Personalized Search: Integrating Collaboration and Social Networks", Shapira and Zabar focused on a model that utilizes a recommendation system. This model shows results of other users who have searched for similar keywords. The authors examined keyword search, the recommendation system, and the recommendation system with social network working separately and compares the results in terms of search quality. The results show that a personalized search engine with the recommendation system produces better quality results than the standard search engine, and that the recommendation system with social network even improves more.

Recent paper “Search personalization with embeddings” shows that a new embedding model for search personalization, where users are embedded on a topical interest space, produces better search results than strong learning-to-rank models.

Disadvantages
While there are documented benefits of the implementation of search personalization, there are also arguments against its use. The foundation of this argument against its use is because it confines internet users' search engine results to material that aligns with the users' interests and history. It limits the users' ability to become exposed to material that would be relevant to the user's search query but due to the fact that some of this material differs from the user's interests and history, the material is not displayed to the user. Search personalization takes the objectivity out of the search engine and undermines the engine. "Objectivity matters little when you know what you are looking for, but its lack is problematic when you do not". Another criticism of search personalization is that it limits a core function of the web: the collection and sharing of information. Search personalization prevents users from easily accessing all the possible information that is available for a specific search query. Search personalization adds a bias to user's search queries. If a user has a particular set of interests or internet history and uses the web to research a controversial issue, the user's search results will reflect that. The user may not be shown both sides of the issue and miss potentially important information if the user's interests lean to one side or another. A study done on search personalization and its effects on search results in Google News resulted in different orders of news stories being generated by different users, even though each user entered the same search query. According to Bates, "only 12% of the searchers had the same three stories in the same order. This to me is prima facie evidence that there is filtering going on". If search personalization was not active, all the results in theory should have been the same stories in an identical order.

Another disadvantage of search personalization is that internet companies such as Google are gathering and potentially selling their users' internet interests and histories to other companies. This raises a privacy issue concerning whether people are comfortable with companies gathering and selling their internet information without their consent or knowledge. Many web users are unaware of the use of search personalization and even fewer have knowledge that user data is a valuable commodity for internet companies.

Sites that use it
E. Pariser, author of The Filter Bubble, explains how there are differences that search personalization has on both Facebook and Google. Facebook implements personalization when it comes to the amount of things people share and what pages they "like". An individual's social interactions, whose profile they visit the most, who they message or chat with are all indicators that are used when Facebook uses personalization. Rather than what people share being an indicator of what is filtered out, Google takes into consideration what we "click" to filter out what comes up in our searches. In addition, Facebook searches are not necessarily as private as the Google ones. Facebook draws on the more public self and users share what other people want to see. Even while tagging photographs, Facebook uses personalization and face recognition that will automatically assign a name to face. Facebook's like button utilizes its users to do their own personalization for the website. What posts the user comments on or likes tells Facebook what type of posts they will be interested in for the future. In addition to this, it helps them predict what type of posts they will “comment on, share, or spam in the future.” The predictions are combined to produce one relevancy score which helps Facebook decide what to show you and what to filter out.

In terms of Google, users are provided similar websites and resources based on what they initially click on. There are even other websites that use the filter tactic to better adhere to user preferences. For example, Netflix also judges from the users search history to suggest movies that they may be interested in for the future. There are sites like Amazon and personal shopping sites also use other peoples history in order to serve their interests better. Twitter also uses personalization by "suggesting" other people to follow. In addition, based on who one "follows", "tweets" and "retweets" at, Twitter filters out suggestions most relevant to the user. LinkedIn personalizes search results at two levels. LinkedIn federated search exploits user intent to personalize vertical order. For instance, for the same query like "software engineer", depending on whether a searcher has hiring or job seeking intent, he or she is served with either people or jobs as the primary vertical. Within each vertical, e.g., people search, result rankings are also personalized by taking into account the similarity and social relationships between searchers and results. Mark Zuckerberg, founder of Facebook, believed that people only have one identity. E. Pariser argues that is completely false and search personalization is just another way to prove that isn't true. Although personalized search may seem helpful, it is not a very accurate representation of any person. There are instances where people also search things and share things in order to make themselves look better. For example, someone may look up and share political articles and other intellectual articles. There are many sites being used for different purposes and that do not make up one person's identity at all, but provide false representations instead.

Online shopping
Search engines such as Google and Yahoo! utilize personalized search to attract possible customers to products that fit their presumed desires. Based on a large amount of collected data aggregated from an individual's web clicks, search engines can use personalized search to put advertisements that may pique the interest of an individual. Utilizing personalized search can help consumers find what they want faster, as well as help match up products and services to individuals within more specialized and/or niche markets. Many of these products or services that are sold via personalized online results would struggle to sell in brick-and-mortar stores. These types of products and services are called long tail items. Using personalized search allows faster product and service discoveries for consumers, and reduces the amount of necessary advertisement money spent to reach those consumers. In addition, utilizing personalized search can help companies determine which individuals should be offered online coupon codes to their products and/or services. By tracking if an individual has perused their website, considered purchasing an item, or has previously made a purchase a company can post advertisements on other websites to reach that particular consumer in an attempt to have them make a purchase.

Aside from aiding consumers and businesses in finding one another, the search engines that provide personalized search benefit greatly. The more data collected on an individual, the more personalized results will be. In turn, this allows search engines to sell more advertisements because companies understand that they will have a better opportunity to sell to high percentage matched individuals then medium and low percentage matched individuals. This aspect of personalized search angers many scholars, such as William Badke and Eli Pariser, because they believe personalized search is driven by the desire to increase advertisement revenues. In addition, they believe that personalized search results are frequently utilized to sway individuals into using products and services that are offered by the particular search engine company or any other company in partnered with them. For example, Google searching any company with at least one brick-and-mortar location will offer a map portraying the closest company location using the Google Maps service as the first result to the query. In order to use other mapping services, such as MapQuest, a user would have to dig deeper into the results. Another example pertains to more vague queries. Searching the word "shoes" using the Google search engine will offer several advertisements to shoe companies that pay Google to link their website as a first result to consumer's queries.