Peer-to-peer file sharing

Peer-to-peer file sharing is the distribution and sharing of digital media using peer-to-peer (P2P) networking technology. P2P file sharing allows users to access media files such as books, music, movies, and games using a P2P software program that searches for other connected computers on a P2P network to locate the desired content. The nodes (peers) of such networks are end-user computers and, optionally, distribution servers.

In its early days, file sharing was done predominantly through client-server transfers from web pages, FTP and IRC, before Napster popularised a Windows application that allowed users to both upload and download with a freemium-style service. Record companies and artists called for its shutdown, and FBI raids followed. Napster had been incredibly popular at its peak, spawning a grass-roots movement that followed on from the mixtape scene of the 1980s, and its closure left a significant gap in music availability for its followers. After much discussion on forums and in chat rooms, it was decided that Napster had been vulnerable because of its reliance on centralised servers and their physical location, and competing groups raced to build a decentralised peer-to-peer system.

Peer-to-peer file sharing technology has evolved through several design stages, from early networks like Gnutella, which popularized the technology in several iterations that used various front ends such as Kazaa, Limewire and WinMX, through eDonkey, to later models like the BitTorrent protocol. Microsoft uses it for update distribution (Windows 10), and online video games use it as their content distribution network for downloading large amounts of data without incurring the high bandwidth costs inherent in serving everything from a single source.

Several factors contributed to the widespread adoption and facilitation of peer-to-peer file sharing. These included increasing Internet bandwidth, the widespread digitization of physical media, and the increasing capabilities of residential personal computers. Users are able to transfer one or more files from one computer to another across the Internet through various file transfer systems and other file-sharing networks.



History
Peer-to-peer file sharing saw its first wave of popularity after the introduction of Napster, a file sharing application that used P2P technology.

Napster's central index server indexed the users and their shared content. When someone searched for a file, the server looked up all available copies of that file and presented them to the user. The files were then transferred directly between private computers (peers/nodes). A limitation was that only music files could be shared. Because the indexing and searching took place on a central server, however, Napster was held liable for copyright infringement and shut down in July 2001. It later reopened as a pay service.
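The central-index model described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Napster's actual protocol: the class name `CentralIndex`, its methods, and the peer addresses are all hypothetical. The point it shows is that the server only answers the question "who has this file?", while the transfer itself would happen directly between peers.

```python
from collections import defaultdict


class CentralIndex:
    """Hypothetical stand-in for a central index server (not Napster's real API)."""

    def __init__(self):
        # Maps a lower-cased file name to the set of peers currently sharing it.
        self._peers_by_file = defaultdict(set)

    def register(self, peer, filenames):
        """Record that `peer` is online and shares the given files."""
        for name in filenames:
            self._peers_by_file[name.lower()].add(peer)

    def unregister(self, peer):
        """Remove a peer that has gone offline from every entry."""
        for peers in self._peers_by_file.values():
            peers.discard(peer)

    def search(self, query):
        """Return sorted (file name, peer) pairs whose name contains the query."""
        q = query.lower()
        return sorted(
            (name, peer)
            for name, peers in self._peers_by_file.items()
            if q in name
            for peer in peers
        )


# Two hypothetical peers announce their shared files to the index.
index = CentralIndex()
index.register("alice:6699", ["Song A.mp3", "Song B.mp3"])
index.register("bob:6699", ["Song A.mp3"])

# The index only reports who has the file; the download itself would
# take place directly between the two peers.
matches = index.search("song a")
```

If the index is switched off, no peer can find any other peer, which is the single point of failure that later decentralized designs tried to remove.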

After Napster was shut down, new peer-to-peer services such as Gnutella and Kazaa were developed. These services also allowed users to download files other than music, such as movies and games.

Technology evolution
Napster and eDonkey2000 both used a central server-based model. These systems relied on the operation of their respective central servers, and thus were susceptible to centralized shutdown. Their demise led to the rise of networks like Limewire, Kazaa, Morpheus, Gnutella, and Gnutella2, which are able to operate without any central servers, eliminating the central vulnerability by connecting users directly to each other. However, these networks still relied on specific, centrally distributed client programs, so they could be crippled by taking legal action against a sufficiently large number of publishers of the client programs. Sharman Networks, the publisher of Kazaa, has been inactive since 2006. StreamCast Networks, the publisher of Morpheus, shut down on April 22, 2008. Limewire LLC was shut down in late 2010 or early 2011.

This cleared the way for the dominance of the BitTorrent protocol, which differs from its predecessors in two major ways. The first is that no individual, group, or company owns the protocol or the terms "Torrent" or "BitTorrent", meaning that anyone can write and distribute client software that works with the network. The second is that BitTorrent clients have no search functionality of their own. Instead, users must rely on third-party websites like Isohunt or The Pirate Bay to find "torrent" files, which function like maps that tell the client how to find and download the files that the user actually wants. These two characteristics combined offer a level of decentralization that makes BitTorrent practically impossible to shut down.

File-sharing networks are sometimes organized into three "generations" based on these different levels of decentralization. Darknets, including networks like Freenet, are sometimes considered to be third-generation file-sharing networks.

Peer-to-peer file sharing is also efficient in terms of cost. The system administration overhead is smaller because the user is the provider, and usually the provider is the administrator as well. Hence each network can be monitored by the users themselves. By contrast, large servers sometimes require additional storage, which increases the cost because that storage has to be rented or bought exclusively for the server. Peer-to-peer file sharing usually does not require a dedicated server.

Economic impact
There is ongoing discussion about the economic impact of P2P file sharing. Norbert Michel, a policy analyst at The Heritage Foundation, said that studies had produced "disparate estimates of file sharing's impact on album sales".

In the book The Wealth of Networks, Yochai Benkler states that peer-to-peer file sharing is economically efficient and that the users pay the full transaction cost and marginal cost of such sharing even if it "throws a monkey wrench into the particular way in which our society has chosen to pay musicians and recording executives. This trades off efficiency for longer-term incentive effects for the recording industry. However, it is efficient within the normal meaning of the term in economics in a way that it would not have been had Jack and Jane used subsidized computers or network connections".

A calculation example:

with peer-to-peer file sharing: $$\text{total cost} = \frac{\text{filesize}}{\text{customers}} \times \text{cost-per-byte}$$

with conventional content delivery networks: $$\text{total cost} = \text{filesize} \times \text{customers} \times \text{cost-per-byte}$$
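As a worked illustration, the two formulas above can be evaluated side by side. The sketch below simply transcribes them as written in this section; the function names and the sample figures (a 1 GB file, 1,000 customers, and a price per byte) are hypothetical values chosen for the arithmetic, not data from any study.

```python
def p2p_total_cost(filesize_bytes, customers, cost_per_byte):
    """Cost under the peer-to-peer formula: filesize / customers * cost-per-byte."""
    if customers < 1:
        raise ValueError("customers must be at least 1")
    return filesize_bytes / customers * cost_per_byte


def cdn_total_cost(filesize_bytes, customers, cost_per_byte):
    """Cost under the content delivery formula: filesize * customers * cost-per-byte."""
    if customers < 1:
        raise ValueError("customers must be at least 1")
    return filesize_bytes * customers * cost_per_byte


# Hypothetical inputs, chosen only to make the arithmetic easy to follow.
filesize = 1_000_000_000   # 1 GB file, in bytes
customers = 1_000          # number of people downloading it
cost_per_byte = 1e-10      # assumed price of transferring one byte

p2p = p2p_total_cost(filesize, customers, cost_per_byte)
cdn = cdn_total_cost(filesize, customers, cost_per_byte)

print(f"P2P: {p2p:.6f}  content delivery: {cdn:.2f}")
```

With these inputs the peer-to-peer figure is about 0.0001 and the content delivery figure is about 100. Under these two formulas the ratio between them is always the square of the number of customers, so the gap widens quickly as an audience grows.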

Music industry
The economic effect of copyright infringement through peer-to-peer file sharing on music revenue has been controversial and difficult to determine. Unofficial studies found that file sharing had a negative impact on record sales. It has proven difficult to untangle the cause-and-effect relationships among a number of different trends, including an increase in legal online purchases of music; illegal file sharing; a drop in the prices of compact discs; and the closure of many independent music stores with a concomitant shift to sales by big-box retailers.

Film industry
The Motion Picture Association of America (MPAA) reported that American studios lost $2.373 billion in 2005, representing approximately one third of the total cost of film piracy in the United States. The MPAA's estimate was doubted by commentators since it was based on the assumption that one download was equivalent to one lost sale, whereas downloaders might not have purchased the movie even if illegal downloading had not been an option. Due to the private nature of the study, the figures could not be publicly checked for methodology or validity. In January 2008, as the MPAA was lobbying for a bill that would compel universities to crack down on piracy, the MPAA admitted that its figures on piracy in colleges had been inflated by up to 300%.

A 2010 study, commissioned by the International Chamber of Commerce and conducted by independent Paris-based economics firm TERA, estimated that unlawful downloading of music, film and software cost Europe's creative industries several billion dollars in revenue each year. A further TERA study predicted losses due to piracy reaching as much as 1.2 million jobs and €240 billion in retail revenue by 2015 if the trend continued. Researchers applied a substitution rate of ten percent to the volume of copyright infringements per year. This rate corresponded to the number of units that would potentially have been traded had unlawful file sharing not occurred. Piracy of popular software and operating systems has been common, even in regions with strong intellectual property enforcement, such as the United States or the European Union.
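The substitution-rate step described above amounts to a single multiplication, sketched here in Python. Only the ten percent rate comes from the text; the function names, the infringement count, the unit price, and the further step of multiplying displaced units by a price are assumptions made for illustration, not the study's actual method or data.

```python
def displaced_units(infringements_per_year, substitution_rate=0.10):
    """Units that might have been sold if the infringing copies had not existed."""
    if not 0.0 <= substitution_rate <= 1.0:
        raise ValueError("substitution_rate must be between 0 and 1")
    return infringements_per_year * substitution_rate


def lost_retail_revenue(infringements_per_year, average_unit_price,
                        substitution_rate=0.10):
    """Assumed follow-on step: displaced units valued at an average retail price."""
    units = displaced_units(infringements_per_year, substitution_rate)
    return units * average_unit_price


# Hypothetical inputs, not figures from the TERA study.
infringements = 1_000_000
unit_price = 10.0

units = displaced_units(infringements)
revenue = lost_retail_revenue(infringements, unit_price)
```

With a ten percent rate, one million infringing copies translate into roughly 100,000 displaced sales. A one-download-equals-one-lost-sale assumption corresponds to a rate of 1.0, which would give a figure ten times larger.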

Public perception and usage
In 2004, an estimated 70 million people participated in online file sharing. According to a CBS News poll, nearly 70 percent of 18- to 29-year-olds thought file sharing was acceptable in some circumstances, and 58 percent of all Americans who followed the file sharing issue considered it acceptable in at least some circumstances. In January 2006, 32 million Americans over the age of 12 had downloaded at least one feature-length movie from the Internet, 80 percent of whom had done so exclusively over P2P. Of the population sampled, 60 percent felt that downloading copyrighted movies off the Internet did not constitute a very serious offense; however, 78 percent believed taking a DVD from a store without paying for it constituted a very serious offense.

In July 2008, 20 percent of Europeans used file sharing networks to obtain music, while 10 percent used paid-for digital music services such as iTunes. In February 2009, a survey undertaken by Tiscali in the UK found that 75 percent of the English public polled were aware of what was legal and illegal in relation to file sharing, but there was a divide as to where they felt the legal burden should be placed: 49 percent of people believed P2P companies should be held responsible for illegal file sharing on their networks and 18 percent viewed individual file sharers as the culprits.

According to an earlier poll, 75 percent of young voters in Sweden (18-20) supported file sharing when presented with the statement: "I think it is OK to download files from the Net, even if it is illegal." Of the respondents, 38 percent said they "adamantly agreed" while 39 percent said they "partly agreed". An academic study among American and European college students found that users of file-sharing technologies were relatively anti-copyright and that copyright enforcement created backlash, hardening pro-file sharing beliefs among users of these technologies.

Communities in P2P file sharing networks
Communities have a prominent role in many peer-to-peer networks and applications, such as BitTorrent, Gnutella and DC++. Different elements contribute to the formation, development and stability of these communities, including interests, user attributes, cost reduction, user motivation and the size of the community.

Interest attributes
Peer communities are formed on the basis of common interests. For Khambatti, Ryu and Dasgupta common interests can be labelled as attributes "which are used to determine the peer communities in which a particular peer can participate". There are two ways in which these attributes can be classified: explicit and implicit attributes.

Explicit values are information that peers provide about themselves to a specific community, such as their interest in a subject or their taste in music. With implicit values, users do not directly express information about themselves, although it is still possible to find information about a specific user by uncovering his or her past queries and research carried out in a P2P network. Khambatti, Ryu and Dasgupta divide these interests further into three classes: personal, claimed and group attributes.

The full set of attributes (common interests) of a specific peer is defined as its personal attributes, and is the collection of information a peer has about him or herself. Peers may decide not to disclose information about themselves in order to maintain their privacy and online security. It is for this reason that the authors specify that "a subset of...attributes is explicitly claimed public by a peer", and they define such attributes as "claimed attributes". The third category of interests is group attributes, defined as "location or affiliation oriented" and "needed to form a...basis for communities", an example being the "domain name of an internet connection", which acts as an online location and group identifier for certain users.

Cost reduction
Cost reduction influences the sharing component of P2P communities. Users who share do so to attempt "to reduce...costs" as made clear by Cunningham, Alexander and Adilov. In their work Peer-to-peer File Sharing Communities, they explain that "the act of sharing is costly since any download from a sharer implies that the sharer is sacrificing bandwidth". As sharing represents the basis of P2P communities, such as Napster, and without it "the network collapses", users share despite its costs in order to attempt to lower their own costs, particularly those associated with searching, and with the congestion of internet servers.

User motivation and size of community
User motivation and the size of the P2P community contribute to its sustainability and activity. In her work Motivating Participation in Peer to Peer Communities, Vassileva studies these two aspects through an experiment carried out at the University of Saskatchewan (Canada), where a P2P application (COMUTELLA) was created and distributed among students. In her view, motivation is "a crucial factor" in encouraging users to participate in an online P2P community, particularly because the "lack of a critical mass of active users" in the form of a community will not allow P2P sharing to function properly.

Users value usefulness when joining a P2P community. The specific P2P system must be perceived as "useful" by the user and must be able to fulfil his or her needs and pursue his or her interests. Consequently, the "size of the community of users defines the level of usefulness" and "the value of the system determines the number of users". This two-way process is defined by Vassileva as a feedback loop, and has allowed for the birth of file-sharing systems like Napster and KaZaA. However, in her research Vassileva has also found that "incentives are needed for the users in the beginning", particularly for motivating users and getting them into the habit of staying online. This can be done, for example, by providing the system with a large amount of resources or by having an experienced user provide assistance to a less experienced one.

User classification
Users participating in P2P systems can be classified in different ways. According to Vassileva, users can be classified depending on their participation in the P2P system. There are five types of users: users who create services, users who allow services, users who facilitate search, users who allow communication, and users who are uncooperative and free ride.

In the first instance, the user creates new resources or services and offers them to the community. In the second, the user provides the community with disk space "to store files for downloads" or with "computing resources" to facilitate a service provided by other users. In the third, the user provides a list of relationships to help other users find specific files or services. In the fourth, the user participates actively in the "protocol of the network", contributing to keeping the network together. In the last, the user does not contribute to the network, downloading what he or she needs and going offline as soon as the service is no longer needed, thus free-riding on the network and community resources.

Tracking
Corporations continue to combat the use of the internet as a tool to illegally copy and share various files, especially copyrighted music. The Recording Industry Association of America (RIAA) has been active in leading campaigns against infringers. Lawsuits have been launched against individuals as well as programs such as Napster in order to "protect" copyright owners. One effort of the RIAA has been to implant decoy users to monitor the use of copyrighted material from a firsthand perspective.

Risks
In early June 2002, researcher Nathaniel Good at HP Labs demonstrated that user interface design issues could contribute to users inadvertently sharing personal and confidential information over P2P networks.

In 2003, Congressional hearings before the House Committee on Government Reform (Overexposed: The Threats to Privacy & Security on File Sharing Networks) and the Senate Judiciary Committee (The Dark Side of a Bright Idea: Could Personal and National Security Risks Compromise the Potential of P2P File-Sharing Networks?) were convened to address and discuss the issue of inadvertent sharing on peer-to-peer networks and its consequences for consumer and national security.

Researchers have examined potential security risks including the release of personal information, bundled spyware, and viruses downloaded from the network. Some proprietary file sharing clients have been known to bundle malware, though open source programs typically have not. Some open source file sharing packages have even provided integrated anti-virus scanning.

From approximately 2004 the threat of identity theft became more prevalent, and in July 2008 vast amounts of personal information were again inadvertently revealed through P2P sites. The "names, dates of birth, and Social Security numbers of about 2,000 of (an investment) firm's clients" were exposed, "including [those of] Supreme Court Justice Stephen Breyer." A drastic increase in inadvertent P2P file sharing of personal and sensitive information became evident in 2009, at the beginning of President Obama's administration, when the blueprints to the helicopter Marine One were made available to the public through a breach in security via a P2P file sharing site. Access to this information has the potential to be detrimental to US security. Furthermore, shortly before this security breach, the Today show had reported that more than 150,000 tax returns, 25,800 student loan applications and 626,000 credit reports had been inadvertently made available through file sharing.

The United States government then attempted to make users more aware of the potential risks involved with P2P file sharing programs through legislation such as H.R. 1319, the Informed P2P User Act, in 2009. Under this act, individuals would have to be made aware of the risks associated with peer-to-peer file sharing before purchasing software, and the informed consent of the user would be required prior to use of such programs. In addition, the act would allow users to block and remove P2P file sharing software from their computers at any time, with the Federal Trade Commission enforcing regulations. US-CERT also warns of the potential risks.

Nevertheless, in 2010, researchers discovered thousands of documents containing sensitive patient information on popular peer-to-peer (P2P) networks, including insurance details, personally identifying information, physician names and diagnosis codes on more than 28,000 individuals. Many of the documents contained sensitive patient communications, treatment data, medical diagnoses and psychiatric evaluations.

Copyright issues
The act of file sharing is not illegal per se, and peer-to-peer networks are also used for legitimate purposes. The legal issues in file sharing involve violations of copyright law. Most discussions about the legality of file sharing implicitly concern only copyrighted material. Many countries have fair use exceptions that permit limited use of copyrighted material without acquiring permission from the rights holders. Such uses include commentary, news reporting, research and scholarship. Copyright laws are territorial: they do not extend beyond the territory of a specific state unless that state is a party to an international agreement. Most countries today are parties to at least one such agreement.

In the area of privacy, recent court rulings seem to indicate that there can be no expectation of privacy in data exposed over peer-to-peer file-sharing networks. In a 39-page ruling released November 8, 2013, US District Court Judge Christina Reiss denied the motion to suppress evidence gathered by authorities without a search warrant through an automated peer-to-peer search tool.

Curtailing the sharing of copyrighted materials
Media industries have made efforts to curtail the spread of copyrighted materials through P2P systems. Initially, the corporations were able to successfully sue distribution platforms such as Napster and have them shut down. Additionally, they took legal action against users who prominently shared copyrighted materials en masse. However, as more decentralized systems such as FastTrack were developed, this approach proved unenforceable. There are also millions of users worldwide who use P2P systems illegally, which made it impractical to seek widespread legal action. One major effort involves distributing polluted files into the P2P network. For instance, one may distribute unrelated files that carry the metadata of a copyrighted work. This way, users who download the media receive something unrelated to what they were expecting.