User:Emil LaMarca/Personalized search

Personalized search is web search results that are tailored specifically to an individual's interests by incorporating information about the individual beyond the specific query provided. There are two general approaches to personalizing search results, involving modifying the user's query and re-ranking search results. There are three information access paradigms that users undertake each time they need to meet particular information needs on web environment. These information access paradigms include searching by surfing (or browsing), searching by query and recommendation.

History
Google introduced personalized search in 2004 and it was implemented in 2005 to Google search. Google has personalized search implemented for all users, not only those with a Google account. There is not much information on how exactly Google personalizes their searches; however, it is believed that they use user language, location, and web history.

Early search engines, like Google and AltaVista, found results based only on key words. Personalized search, as pioneered by Google, has become far more complex with the goal to "understand exactly what you mean and give you exactly what you want." Using mathematical algorithms, search engines are now able to return results based on the number of links to and from sites; the more links a site has, the higher it is placed on the page. Search engines have two degrees of expertise: the shallow expert and the deep expert. An expert from the shallowest degree serves as a witness who knows some specific information on a given event. A deep expert, on the other hand, has comprehensible knowledge that gives it the capacity to deliver unique information that is relevant to each individual inquirer. If a person knows what he or she wants then the search engine will act as a shallow expert and simply locate that information. But search engines are also capable of deep expertise in that they rank results indicating that those near the top are more relevant to a user's wants than those below.

Links can be classified based on destination of the link. The destination in this case refers to whether it leads users to another page on the same site or a different website. Internal links are links between the pages with the website. Search engines determine them by looking at the domain name. If the link on a page link to other pages within the same domain, they are considered internal links.

The filter bubble
Main article: Filter bubble

Filter bubble is a persistent concept in which an internet user encounters only information and opinions that conform to and reinforce their own beliefs, caused by algorithms that personalize an individual’s online experience. It decreases the likelihood of finding new information by biasing search results towards what the user has already found. It introduces potential privacy problems in which a user may not be aware that their search results are personalized for them, and wonder why the things that they are interested in have become so relevant. Such a problem has been coined as the "filter bubble" by Eli Pariser. He argues that people are letting major websites drive their destiny and make decisions based on the vast amount of data they've collected on individuals. This can isolate users in their own worlds or "filter bubbles" where they only see information that they want to, such a consequence of "The Friendly World Syndrome". As a result, people are much less informed of problems in the developing world which can further widen the gap between the North (developed countries) and the South (developing countries).

The methods of personalization, and how useful it is to "promote" certain results which have been showing up regularly in searches by like-minded individuals in the same community. The personalization method makes it very easy to understand how the filter bubble is created. As certain results are bumped up and viewed more by individuals, other results not favored by them are relegated to obscurity. As this happens on a community-wide level, it results in the community, consciously or not, sharing a skewed perspective of events. Filter bubbles have become more frequent in search results and are envisaged as disruptions to information flow in online more specifically social media.

The case of Google
Main article: Google Personalized Search

An important example of search personalization is Google. There are a host of Google applications, all of which can be personalized and integrated with the help of a Google account. Personalizing search does not require an account. However, one is almost deprived of a choice, since so many useful Google products are only accessible if one has a Google account. However, in order to incorporate personalization into full-scale Web search tools, we must study the behavior of the users as they interact with information sources. The Google Dashboard, introduced in 2009, covers more than 20 products and services, including Gmail, Calendar, Docs, YouTube, etc. that keeps track of all the information directly under one's name. The free Google Custom Search is available for individuals and big companies alike, providing the Search facility for individual websites and powering corporate sites such as that of the New York Times. The high level of personalization that was available with Google played a significant part in helping remain the world's most favorite search engine.

One example of Google's ability to personalize searches is in its use of Google News. Google has geared its news to show everyone a few similar articles that can be deemed interesting, but as soon as the user scrolls down, it can be seen that the news articles begin to differ. Google takes into account past searches as well as the location of the user to make sure that local news gets to them first. This can lead to a much easier search and less time going through all of the news to find the information one want. The concern, however, is that the very important information can be held back because it does not match the criteria that the program sets for the particular user. This can create the "filter bubble" as described earlier.

An interesting point about personalization that often gets overlooked is the privacy vs. personalization battle. While the two do not have to be mutually exclusive, it is often the case that as one becomes more prominent, it compromises the other. Google provides a host of services to people, and many of these services do not require information to be collected about a person to be customizable. Since there is no threat of privacy invasion with these services, the balance has been tipped to favor personalization over privacy, even when it comes to search. As people reap the rewards of convenience from customizing their other Google services, they desire better search results, even if it comes at the expense of private information. Where to draw the line between the information versus search results tradeoff is new territory and Google gets to make that decision. Until people get the power to control the information that is being collected about them, Google is not truly protecting privacy. Google's popularity as a search engine and Internet browser has allowed it to gain a lot of power. Their popularity has created millions of usernames, which have been used to collect vast amounts of information about individuals. Google can use multiple methods of personalization such as traditional, social, geographic, IP address, browser, cookies, time of day, year, behavioral, query history, bookmarks, and more. Although having Google personalize search results based on what users searched previously may have its benefits, there are negatives that come with it. With the power from this information, Google has chosen to enter other sectors it owned, such as videos, document sharing, shopping, maps, and many more. Google has done this by steering searchers to their own services offered as opposed to others such as MapQuest.

Using search personalization, Google has doubled its video market share to about eighty percent. The legal definition of a monopoly is when a firm gains control of seventy to eighty percent of the market. Google has reinforced this monopoly by creating significant barriers of entry such as manipulating search results to show their own services. This can be clearly seen with Google Maps being the first thing displayed in most searches.

The analytical firm Experian Hitwise stated that since 2007, MapQuest has had its traffic cut in half because of this. Other statistics from around the same time include Photobucket going from twenty percent of market share to only three percent, Myspace going from twelve percent market share to less than one percent, and ESPN from eight percent to four percent market share. In terms of images, Photobucket went from 31% in 2007 to 10% in 2010 and Yahoo Images has gone from 12% to 7%. It becomes apparent that the decline of these companies has come because of Google's increase in market share from 43% in 2007 to about 55% in 2009.

It can be said that Google is more dominant because they provide better services. However, Experian Hitwise has also created graphs to show the market share of about fifteen different companies at once. This has been done for every category for the market share of pictures, videos, product search, and more. The graph for product search is evidence enough for Google's influence because their numbers went from 1.3 million unique visitors to 11.9 unique visitors in one month. That kind of growth can only come with the change of a process.

In the end, there are two common themes with all of these graphs. The first is that Google's market share has a directly inverse relationship to the market share of the leading competitors. The second is that this directly inverse relationship began around 2007, which is around the time that Google began to use its "Universal Search" method.

Benefits
One of the most critical benefits personalized search has is to improve the quality of decisions consumers make. The internet has made the transaction cost of obtaining information significantly lower than ever. However, human ability to process information has not expanded much. When facing overwhelming amount of information, consumers need a sophisticated tool to help them make high quality decisions. Two studies examined the effects of personalized screening and ordering tools, and the results show a positive correlation between personalized search and the quality of consumers' decisions.

The first study was conducted by Kristin Diehl from the University of South Carolina. Her research discovered that reducing search cost led to lower quality choices. The reason behind this discovery was that 'consumers make worse choices because lower search costs cause them to consider inferior options.' It also showed that if consumers have a specific goal in mind, they would further their search, resulting in an even worse decision. The study by Gerald Haubl from the University of Alberta and Benedict G.C. Dellaert from Maastricht University mainly focused on recommendation systems. Both studies concluded that a personalized search and recommendation system significantly improved consumers' decision quality and reduced the number of products inspected.

Models
Personalized search gains popularity because of the demand for more relevant information and the fact that most people could really use some personal information such as personalized search gains. Research has indicated low success rates among major search engines in providing relevant results; in 52% of 20,000 queries, searchers did not find any relevant results within the documents that Google returned. Personalized search can improve search quality significantly and there are mainly two ways to achieve this goal.

The first model available is based on the users' historical searches and search locations. People are probably familiar with this model since they often find the results reflecting their current location and previous searches.

There is another way to personalize search results. In Bracha Shapira and Boaz Zabar's "Personalized Search: Integrating Collaboration and Social Networks", Shapira and Zabar focused on a model that utilizes a recommendation system. This model shows results of other users who have searched for similar keywords. The authors examined keyword search, the recommendation system, and the recommendation system with social network working separately and compares the results in terms of search quality. The results show that a personalized search engine with the recommendation system produces better quality results than the standard search engine, and that the recommendation system with social network even improves more.

Recent paper “Search personalization with embeddings” shows that a new embedding model for search personalization, where users are embedded on a topical interest space, produces better search results than strong learning-to-rank models.

Disadvantages
While there are documented benefits of the implementation of search personalization, there are also arguments against its use. The foundation of this argument against its use is because it confines internet users' search engine results to material that aligns with the users' interests and history. It limits the users' ability to become exposed to material that would be relevant to the user's search query but due to the fact that some of this material differs from the user's interests and history, the material is not displayed to the user. Search personalization takes the objectivity out of the search engine and undermines the engine. "Objectivity matters little when you know what you are looking for, but its lack is problematic when you do not". Another criticism of search personalization is that it limits a core function of the web: the collection and sharing of information. Search personalization prevents users from easily accessing all the possible information that is available for a specific search query. Search personalization adds a bias to user's search queries. If a user has a particular set of interests or internet history and uses the web to research a controversial issue, the user's search results will reflect that. The user may not be shown both sides of the issue and miss potentially important information if the user's interests lean to one side or another. A study done on search personalization and its effects on search results in Google News resulted in different orders of news stories being generated by different users, even though each user entered the same search query. According to Bates, "only 12% of the searchers had the same three stories in the same order. This to me is prima facie evidence that there is filtering going on". If search personalization was not active, all the results in theory should have been the same stories in an identical order.

Another disadvantage of search personalization is that internet companies such as Google are gathering and potentially selling their users' internet interests and histories to other companies. This raises a privacy issue concerning whether people are comfortable with companies gathering and selling their internet information without their consent or knowledge. Many web users are unaware of the use of search personalization and even fewer have knowledge that user data is a valuable commodity for internet companies.

Sites that Use It
E. Pariser, author of The Filter Bubble, explains how there are differences that search personalization has on both Facebook and Google. Facebook implements personalization when it comes to the amount of things people share and what pages they "like". An individual's social interactions, whose profile they visit the most, who they message or chat with are all indicators that are used when Facebook uses personalization. Rather than what people share being an indicator of what is filtered out, Google takes into consideration what we "click" to filter out what comes up in our searches. In addition, Facebook searches are not necessarily as private as the Google ones. Facebook draws on the more public self and users share what other people want to see. Even while tagging photographs, Facebook uses personalization and face recognition that will automatically assign a name to face. Facebook’s like button utilizes its users to do their own personalization for the website. What posts the user comments on or likes tells Facebook what type of posts they will be interested in for the future. In addition to this, it helps them predict what type of posts they will “comment on, share, or spam in the future.” The predictions are combined together to produce one relevancy score which helps Facebook decide what to show you and what to filter out. In 2016, Facebook introduced reactions (Love, Thankful, Haha, Wow, Sad, and Angry) in addition to liking a post. “Facebook has learned that any Reaction left on a post is a strong indicator that the user was more interested in that post than any other ‘liked’ posts.” Facebook is starting to weigh reactions the same way as likes. So even if you leave the “angry” reaction on a post, Facebook will show posts on the user’s feed because the user showed an interest in it.

In terms of Google, users are provided similar websites and resources based on what they initially click on. There are even other websites that use the filter tactic to better adhere to user preferences. For example, Netflix also judges from the users search history to suggest movies that they may be interested in for the future. There are sites like Amazon and personal shopping sites also use other peoples history in order to serve their interests better. Twitter also uses personalization by "suggesting" other people to follow. In addition, based on who one "follows", "tweets" and "retweets" at, Twitter filters out suggestions most relevant to the user. LinkedIn personalizes search results at two levels. LinkedIn federated search exploits user intent to personalize vertical order. For instance, for the same query like "software engineer", depending on whether a searcher has hiring or job seeking intent, he or she is served with either people or jobs as the primary vertical. Within each vertical, e.g., people search, result rankings are also personalized by taking into account the similarity and social relationships between searchers and results. Mark Zuckerberg, founder of Facebook, believed that people only have one identity. E. Pariser argues that is completely false and search personalization is just another way to prove that isn't true. Although personalized search may seem helpful, it is not a very accurate representation of any person. There are instances where people also search things and share things in order to make themselves look better. For example, someone may look up and share political articles and other intellectual articles. There are many sites being used for different purposes and that do not make up one person's identity at all, but provide false representations instead.

Online Shopping
Main article: Online shopping

Search engines such as Google and Yahoo! utilize personalized search to attract possible customers to products that fit their presumed desires. Based on a large amount of collected data aggregated from an individual's web clicks, search engines can use personalized search to put advertisements that may pique the interest of an individual. Utilizing personalized search can help consumers find what they want faster, as well as help match up products and services to individuals within more specialized and/or niche markets. Many of these products or services that are sold via personalized online results would struggle to sell in brick-and-mortar stores. These types of products and services are called long tail items. Using personalized search allows faster product and service discoveries for consumers, and reduces the amount of necessary advertisement money spent to reach those consumers. In addition, utilizing personalized search can help companies determine which individuals should be offered online coupon codes to their products and/or services. By tracking if an individual has perused their website, considered purchasing an item, or has previously made a purchase a company can post advertisements on other websites to reach that particular consumer in an attempt to have them make a purchase.

Aside from aiding consumers and businesses in finding one another, the search engines that provide personalized search benefit greatly. The more data collected on an individual, the more personalized results will be. In turn, this allows search engines to sell more advertisements because companies understand that they will have a better opportunity to sell to high percentage matched individuals then medium and low percentage matched individuals. This aspect of personalized search angers many scholars, such as William Badke and Eli Pariser, because they believe personalized search is driven by the desire to increase advertisement revenues. In addition, they believe that personalized search results are frequently utilized to sway individuals into using products and services that are offered by the particular search engine company or any other company in partnered with them. For example, Google searching any company with at least one brick-and-mortar location will offer a map portraying the closest company location using the Google Maps service as the first result to the query. In order to use other mapping services, such as MapQuest, a user would have to dig deeper into the results. Another example pertains to more vague queries. Searching the word "shoes" using the Google search engine will offer several advertisements to shoe companies that pay Google to link their website as a first result to consumer's queries.