BitTorrent is a peer-to-peer file sharing protocol used to distribute large amounts of data. The initial distributor of the complete file or collection acts as the first seed. Each peer who downloads the data also uploads them to other peers. Relative to standard internet hosting, this provides a significant reduction in the original distributor's hardware and bandwidth resource costs. It also provides redundancy against system problems and reduces dependence on the original distributor.
Programmer Bram Cohen designed the protocol in April 2001 and released a first implementation on 2 July 2001. It is now maintained by Cohen's company BitTorrent, Inc. Usage of the protocol accounts for significant Internet traffic, though the precise amount has proven difficult to measure. There are numerous BitTorrent clients available for a variety of computing platforms.
A BitTorrent client is any program that implements the BitTorrent protocol. Each client is capable of preparing, requesting, and transmitting any type of computer file over a network, using the protocol. A peer is any computer running an instance of a client.
To share a file or group of files, a peer first creates a small file called a "torrent" (e.g. MyFile.torrent). This file contains metadata about the files to be shared and about the tracker, the computer that coordinates the file distribution. Peers that want to download the file must first obtain a torrent file for it, and connect to the specified tracker, which tells them from which other peers to download the pieces of the file.
Though both ultimately transfer files over a network, a BitTorrent download differs from a classic full-file HTTP request in several fundamental ways:
Taken together, these differences allow BitTorrent to achieve much lower cost, much higher redundancy, and much greater resistance to abuse or to "flash crowds" than a regular HTTP server. However, this protection comes at a cost: downloads can take time to rise to full speed because it may take time for enough peer connections to be established, and it takes time for a node to receive sufficient data to become an effective uploader. As such, a typical BitTorrent download will gradually rise to very high speeds, and then slowly fall back down toward the end of the download. This contrasts with an HTTP server that, while more vulnerable to overload and abuse, rises to full speed very quickly and maintains this speed throughout.
In general, BitTorrent's non-contiguous download methods have prevented it from supporting "progressive downloads" or "streaming playback". However, comments made by Bram Cohen in January 2007 suggested that streaming torrent downloads would soon be commonplace, and ad-supported streaming appears to be the result of those comments.
The peer distributing a data file treats the file as a number of identically sized pieces, typically between 64 kB and 4 MB each. The peer creates a checksum for each piece, using the SHA-1 hashing algorithm, and records it in the torrent file. Pieces larger than 512 kB reduce the size of the torrent file for a very large payload, but are claimed to reduce the efficiency of the protocol. When another peer later receives a particular piece, the checksum of the piece is compared to the recorded checksum to test that the piece is error-free. Peers that provide a complete file are called seeders, and the peer providing the initial copy is called the initial seeder.
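The piece checksums described above can be sketched in a few lines of Python. This is a minimal illustration rather than any client's actual code: the function names and the sample payload are assumptions made for the example, and the last piece is simply allowed to be shorter than the others.

```python
import hashlib


def piece_hashes(data: bytes, piece_length: int) -> list:
    """Split data into fixed-size pieces and return the SHA-1 digest of each.

    The final piece may be shorter than piece_length.
    """
    return [
        hashlib.sha1(data[i:i + piece_length]).digest()
        for i in range(0, len(data), piece_length)
    ]


def verify_piece(piece: bytes, index: int, hashes: list) -> bool:
    """Compare a received piece with the digest recorded for its index."""
    return hashlib.sha1(piece).digest() == hashes[index]


if __name__ == "__main__":
    payload = bytes(range(256)) * 1024      # 256 KiB of sample data (hypothetical)
    piece_length = 64 * 1024                # low end of the typical range
    hashes = piece_hashes(payload, piece_length)

    good = payload[:piece_length]
    corrupted = b"\xff" + good[1:]          # first byte altered in transit
    print(len(hashes), verify_piece(good, 0, hashes), verify_piece(corrupted, 0, hashes))
```

A downloader would discard a piece that fails the check and request it again from another peer.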
The exact information contained in the torrent file depends on the version of the BitTorrent protocol. By convention, the name of a torrent file has the suffix .torrent. Torrent files have an "announce" section, which specifies the URL of the tracker, and an "info" section, containing (suggested) names for the files, their lengths, the piece length used, and a SHA-1 hash code for each piece, all of which are used by clients to verify the integrity of the data they receive.
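As a rough illustration of the "announce" and "info" sections, the sketch below builds the metadata for a single file and serializes it. Two things here go beyond the text above and should be read as assumptions of the example: the serialization format (bencoding, which torrent files conventionally use) and the key names inside "info", which follow the commonly documented single-file layout. The tracker URL is hypothetical.

```python
import hashlib


def bencode(value) -> bytes:
    """Encode ints, strings, lists and dicts in bencoding."""
    if isinstance(value, bool):
        raise TypeError("bencoding has no boolean type")
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        # Dictionary keys are byte strings and are written in sorted order.
        items = [(k.encode("utf-8") if isinstance(k, str) else k, v)
                 for k, v in value.items()]
        items.sort(key=lambda kv: kv[0])
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError("cannot bencode " + type(value).__name__)


def make_metainfo(name: str, data: bytes, piece_length: int, announce: str) -> dict:
    """Build torrent metadata for a single file held in memory."""
    pieces = b"".join(
        hashlib.sha1(data[i:i + piece_length]).digest()
        for i in range(0, len(data), piece_length)
    )
    return {
        "announce": announce,               # tracker URL
        "info": {
            "name": name,                   # suggested file name
            "length": len(data),            # file length in bytes
            "piece length": piece_length,
            "pieces": pieces,               # concatenated 20-byte SHA-1 digests
        },
    }


if __name__ == "__main__":
    meta = make_metainfo("MyFile", b"x" * 100_000, 64 * 1024,
                         "http://tracker.example/announce")  # hypothetical tracker
    print(len(bencode(meta)), "bytes of metadata")
```

Writing the result of `bencode(meta)` to a file named MyFile.torrent would give a file of the general shape described above; real clients add further optional fields.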
Completed torrent files are typically published on websites or elsewhere, and registered with a tracker. The tracker maintains lists of the clients currently participating in the torrent. Alternatively, in a trackerless system (decentralized tracking) every peer acts as a tracker. This is implemented by the BitTorrent, µTorrent, BitComet, KTorrent and Deluge clients through the distributed hash table (DHT) method. Vuze also supports a trackerless method that is incompatible (as of April 2007) with the DHT offered by all other supporting clients.
Clients incorporate mechanisms to optimize their download and upload rates; for example they download pieces in a random order to increase the opportunity to exchange data, which is only possible if two peers have different pieces of the file.
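The random-order idea can be sketched as follows. The function names and the use of integer piece indices are assumptions made for the example; real clients apply further refinements that the text does not describe.

```python
import random
from typing import Optional, Set


def exchangeable(mine: Set[int], theirs: Set[int]) -> bool:
    """Two peers can trade only if each holds a piece the other lacks."""
    return bool(theirs - mine) and bool(mine - theirs)


def pick_piece(mine: Set[int], theirs: Set[int], rng: random.Random) -> Optional[int]:
    """Pick, at random, a piece the remote peer has and we still need."""
    wanted = sorted(theirs - mine)
    return rng.choice(wanted) if wanted else None


if __name__ == "__main__":
    rng = random.Random()
    mine, theirs = {0, 1, 5}, {1, 2, 3}
    print(exchangeable(mine, theirs), pick_piece(mine, theirs, rng))
```

Because each peer requests pieces in a different random order, two peers that joined at about the same time are unlikely to end up holding exactly the same set, which keeps exchanges possible.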
The effectiveness of this data exchange depends largely on the policies that clients use to determine to whom to send data. Clients may prefer to send data to peers who send data back to them (a tit-for-tat scheme), which encourages fair trading. But strict policies often result in suboptimal situations, such as when newly joined peers are unable to receive any data because they do not yet have any pieces to trade, or when two peers with a good connection between them do not exchange data simply because neither of them wants to take the initiative. To counter these effects, the official BitTorrent client program uses a mechanism called “optimistic unchoking,” where the client reserves a portion of its available bandwidth for sending pieces to random peers (not necessarily known-good partners, the so-called preferred peers), in the hope of discovering even better partners and to ensure that newcomers get a chance to join the swarm.
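A minimal sketch of such a policy is shown below. It is not the official client's algorithm: the slot count, the single optimistic slot, and the use of observed download rates as the ranking key are all assumptions chosen to keep the example short.

```python
import random
from typing import Dict, Set


def choose_unchoked(download_rates: Dict[str, float],
                    regular_slots: int,
                    rng: random.Random) -> Set[str]:
    """Return the peers to upload to during the next round.

    regular_slots go to the peers that have sent us the most data
    (tit for tat); one extra slot goes to a randomly chosen remaining
    peer (optimistic unchoking), so newcomers with nothing to trade
    still receive some data.
    """
    ranked = sorted(download_rates, key=lambda p: download_rates[p], reverse=True)
    unchoked = set(ranked[:regular_slots])
    others = ranked[regular_slots:]
    if others:
        unchoked.add(rng.choice(others))
    return unchoked


if __name__ == "__main__":
    rates = {"a": 50.0, "b": 30.0, "c": 0.0, "d": 0.0, "e": 10.0}  # kB/s received
    print(choose_unchoked(rates, regular_slots=2, rng=random.Random()))
```

Peers "c" and "d" have sent nothing, so a strict tit-for-tat policy would never serve them; the random slot is what gives them a first piece to trade.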
CableLabs, the research organization of the North American cable industry, estimates that BitTorrent represents 18% of all broadband traffic. In 2004, CacheLogic put that number at roughly 35% of all traffic on the Internet. The discrepancies in these numbers are caused by differences in the method used to measure P2P traffic on the Internet.
Routers that use Network Address Translation (NAT) must maintain tables of source and destination IP addresses and ports. Typical home routers are limited to about 2000 table entries, while some more expensive routers have larger table capacities. BitTorrent frequently contacts 300-500 servers per second, rapidly filling the NAT tables. This is a common cause of home routers locking up.
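Taking the figures above at face value, a short calculation shows how quickly such a table could fill. The sketch assumes that every contact creates one new table entry and that no entries expire in the meantime; both are simplifications, so the result is only an order-of-magnitude estimate.

```python
def seconds_to_fill(table_entries: int, new_entries_per_second: int) -> float:
    """Time until a NAT table is full, assuming no entries expire."""
    return table_entries / new_entries_per_second


if __name__ == "__main__":
    for rate in (300, 500):
        print(rate, "contacts/s:", round(seconds_to_fill(2000, rate), 1), "s")
```

Under these assumptions a 2000-entry table fills in roughly four to seven seconds.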
The BitTorrent protocol provides no way to index torrent files. As a result, a comparatively small number of websites have hosted the large majority of torrents linking to (possibly) copyrighted material, rendering those sites especially vulnerable to lawsuits. Several types of websites support the discovery and distribution of data on the BitTorrent network.
Public tracker sites such as The Pirate Bay allow users to search in and download from their collection of .torrent files; they also run BitTorrent trackers for those files. Users can typically also upload .torrent files for content they wish to distribute.
Private tracker sites such as Demonoid operate like public ones except that they restrict access to registered users and keep track of the amount of data each user uploads and downloads, in an attempt to reduce leeching.
There are specialized tracker sites such as FlixFlux for films, MVgroup for educational content, Metal-Torrents.com for metal music, cheggit.net for pornographic content, and tv torrents for television series. Often these will also be private.
Search engines allow the discovery of .torrent files that are hosted and tracked on other sites; examples include Mininova, Monova, BTJunkie, Torrentz and isoHunt. These sites allow the user to ask for content meeting specific criteria (such as containing a given word or phrase) and retrieve a list of links to .torrent files matching those criteria. This list is often sorted by relevance or number of seeders. Bram Cohen launched a BitTorrent search engine on http://search.bittorrent.com that commingles licensed content with search results. Metasearch engines allow users to search several BitTorrent indices and search engines at once.
BitTorrent does not offer its users anonymity. It is possible to obtain the IP addresses of all current, and possibly previous, participants in a swarm from the tracker. This may expose users with insecure systems to attacks.
BitTorrent is best suited to continuously connected broadband environments, since dial-up users find it less efficient due to frequent disconnects and slow download rates.
BitTorrent file sharers, compared to users of client/server technology, often have little incentive to become seeders after they finish downloading. As a result, torrent swarms gradually die out, making older torrents harder to obtain. Some BitTorrent websites have attempted to address this by recording each user's download and upload ratio, visible either to everyone or only to that user, and by granting access to newer torrent files to people with better ratios. Users who have low upload ratios may see slower download speeds until they upload more. This prevents (statistical) leeching, since after a while such users become unable to download much faster than 1-10 kB/s on a high-speed connection. Some trackers exempt dial-up users from this policy, because they cannot upload faster than 1-3 kB/s.
To combat this leeching problem, some seeders deliberately withhold one final piece from the seed, thus leaving a large number of potential seeders once they receive the withheld piece of data. With clients each awaiting that one final piece, the seeder ensures that there will be many more seeds once the final piece is released.
It is considered good etiquette to watch the "Share Ratio" figure and to upload as much as (a ratio of 1.000) or twice as much as (a ratio of 2.000) one has downloaded. This compensates for one's own leeching and supports both the torrent and the cooperative nature of the protocol. While this is usually most easily accomplished with a DSL connection, those using dial-up will not be able to conform easily to this rule of etiquette.
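The share ratio is simply the amount uploaded divided by the amount downloaded. The sketch below computes it, along with the additional upload needed to reach a target ratio; the function names and the handling of a zero download total are assumptions of the example, since sites may treat that case differently.

```python
import math


def share_ratio(uploaded: int, downloaded: int) -> float:
    """Uploaded bytes divided by downloaded bytes."""
    if downloaded == 0:
        # Nothing downloaded yet: treat any upload as an unbounded ratio.
        return float("inf") if uploaded else 0.0
    return uploaded / downloaded


def upload_needed(target_ratio: float, uploaded: int, downloaded: int) -> int:
    """Further bytes to upload before the ratio reaches target_ratio."""
    return max(0, math.ceil(target_ratio * downloaded - uploaded))


if __name__ == "__main__":
    mb = 1024 * 1024
    up, down = 350 * mb, 700 * mb
    print(share_ratio(up, down))                    # ratio so far
    print(upload_needed(1.0, up, down) // mb, "MB to reach 1.000")
    print(upload_needed(2.0, up, down) // mb, "MB to reach 2.000")
```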
There are "cheating" clients like BitThief which claim to be able to download without uploading. Such exploitation negatively affects the cooperative nature of the BitTorrent protocol.
The BitTorrent protocol is still under development and therefore may still acquire new features and other enhancements such as improved efficiency.
In June 2005, BitTorrent, Inc. released version 4.2.0 of the Mainline BitTorrent client. This release supported "trackerless" torrents, featuring a DHT implementation which allowed the client to use torrents that do not have a working BitTorrent tracker. Current versions of the official BitTorrent client, µTorrent, BitComet, and BitSpirit all share a compatible DHT implementation that is based on Kademlia. Vuze uses its own incompatible DHT system called the "distributed database", but a plugin is available which allows use of the mainline DHT.
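The central idea of Kademlia, on which the mainline DHT is based, is that the "distance" between two identifiers is their bitwise XOR, and that information about a torrent is kept by the nodes whose identifiers are closest to the torrent's identifier. The sketch below shows only that distance calculation. Deriving node identifiers by hashing a label is an assumption made for the example, and the routing tables and network messages of a real DHT are omitted entirely.

```python
import hashlib
from typing import List


def make_id(label: str) -> int:
    """Derive a 160-bit identifier by hashing a label (illustrative only)."""
    return int.from_bytes(hashlib.sha1(label.encode("utf-8")).digest(), "big")


def xor_distance(a: int, b: int) -> int:
    """Kademlia distance: the bitwise XOR of two identifiers."""
    return a ^ b


def closest_nodes(target: int, nodes: List[int], k: int) -> List[int]:
    """Return the k node identifiers nearest to target under XOR distance."""
    return sorted(nodes, key=lambda n: xor_distance(n, target))[:k]


if __name__ == "__main__":
    nodes = [make_id("node-%d" % i) for i in range(20)]     # hypothetical nodes
    torrent = make_id("some torrent")                       # hypothetical key
    for n in closest_nodes(torrent, nodes, k=3):
        print("%040x" % n)
```

A client looking for peers would query the nodes returned here instead of a tracker.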
Another idea that has surfaced in Vuze is that of virtual torrents. This idea is based on the distributed tracker approach and is used to describe some web resource. Currently, it is used for instant messaging. It is implemented using a special messaging protocol and requires an appropriate plugin. Anatomic P2P is another approach, which uses a decentralized network of nodes that route traffic to dynamic trackers.
Most BitTorrent clients also use Peer exchange (PEX) to gather peers in addition to trackers and DHT. Peer exchange checks with known peers to see if they know of any other peers. With the 126.96.36.199 release of Azureus, now known as Vuze, all major BitTorrent clients now have compatible peer exchange.
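Conceptually, peer exchange lets a client grow its peer list by asking the peers it already knows. The sketch below models this with an in-memory lookup in place of network messages; the `peers_known_to` callback and the sample swarm are hypothetical, and the actual PEX message format is not shown.

```python
from typing import Callable, Iterable, Set


def gather_peers(start: Set[str],
                 peers_known_to: Callable[[str], Iterable[str]]) -> Set[str]:
    """Expand a peer list by asking each known peer for the peers it knows."""
    known = set(start)
    to_ask = list(start)
    while to_ask:
        peer = to_ask.pop()
        for other in peers_known_to(peer):
            if other not in known:
                known.add(other)
                to_ask.append(other)
    return known


if __name__ == "__main__":
    # Hypothetical swarm: each peer maps to the peers it can report.
    swarm = {
        "10.0.0.1": ["10.0.0.2"],
        "10.0.0.2": ["10.0.0.3", "10.0.0.1"],
        "10.0.0.3": [],
    }
    found = gather_peers({"10.0.0.1"}, lambda p: swarm.get(p, []))
    print(sorted(found))
```

Starting from a single peer obtained from a tracker or the DHT, the client ends up knowing every peer reachable through such referrals.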
Web seeding was implemented in 2006. With this feature, a site may distribute a torrent for a particular file or batch of files and make those files available for download from that same web server; this can greatly simplify seeding and load balancing once support for the feature is implemented in the various BitTorrent clients. In theory, this would make using BitTorrent almost as easy for a web publisher as creating a direct download, while allowing some of the upload bandwidth demands to be placed upon the downloaders (who normally use only a very small portion of their upload bandwidth capacity). The feature was created by John "TheSHAD0W" Hoffman, who created BitTornado. From version 5.0 onward the Mainline BitTorrent client also supports web seeds, and the BitTorrent web site has a simple publishing tool that creates web-seeded torrents. µTorrent added support for web seeds in version 1.7. The latest version of the popular download manager GetRight supports downloading a file over the HTTP, FTP, and BitTorrent protocols.
Broadcatching combines RSS with the BitTorrent protocol to create a content delivery system, further simplifying and automating content distribution. Steve Gillmor explained the concept in a column for Ziff-Davis in December, 2003. The discussion spread quickly among bloggers (Techdirt, Ernest Miller, Chris Pirillo, etc.). In an article entitled Broadcatching with BitTorrent, Scott Raymond explained:
The BitTorrent web-service MoveDigital has the ability to make torrents available to any web application capable of parsing XML through its standard Representational State Transfer (REST) based interface. Additionally, Torrenthut is developing a similar torrent API that will provide the same features, as well as further intuition to help bring the torrent community to Web 2.0 standards. Alongside this release is a first PHP application built using the API called PEP, which will parse any Really Simple Syndication (RSS 2.0) feed and automatically create and seed a torrent for each enclosure found in that feed.
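The first step of such a tool, finding the enclosures in an RSS 2.0 feed, can be sketched with Python's standard XML parser. This is not the PEP application's code: the function name, the sample feed, and its URLs are hypothetical, and creating and seeding a torrent for each enclosure is left out.

```python
import xml.etree.ElementTree as ET
from typing import List


def enclosure_urls(rss_text: str) -> List[str]:
    """Return the URL of every <enclosure> element in an RSS 2.0 feed."""
    root = ET.fromstring(rss_text)
    return [enc.get("url") for enc in root.iter("enclosure") if enc.get("url")]


SAMPLE_FEED = """
<rss version="2.0">
  <channel>
    <title>Example show</title>
    <item>
      <title>Episode 1</title>
      <enclosure url="http://media.example/ep1.mp4" length="1000" type="video/mp4"/>
    </item>
    <item>
      <title>Episode 2</title>
      <enclosure url="http://media.example/ep2.mp4" length="2000" type="video/mp4"/>
    </item>
  </channel>
</rss>
"""

if __name__ == "__main__":
    for url in enclosure_urls(SAMPLE_FEED):
        print(url)
```

Each URL found this way identifies one file for which a torrent could then be generated and published.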
Since BitTorrent makes up a large proportion of total traffic, some ISPs have chosen to throttle (slow down) BitTorrent transfers to ensure network capacity remains available for other uses. For this reason methods have been developed to disguise BitTorrent traffic in an attempt to thwart these efforts.
Protocol header encrypt (PHE) and Message stream encryption/Protocol encryption (MSE/PE) are features of some BitTorrent clients that attempt to make BitTorrent hard to detect and throttle. At the moment Vuze, BitComet, KTorrent, Transmission, Deluge, µTorrent, MooPolice, Halite, rTorrent and the latest official BitTorrent client (v6) support MSE/PE encryption.
Reports in August 2007 indicated that Comcast was preventing BitTorrent seeding by monitoring and interfering with the communication between peers. Protection against these efforts is provided by proxying the client-tracker traffic through the Tor anonymity network or via an encrypted tunnel to a point outside of the Comcast network. Comcast has more recently called a 'truce' with BitTorrent, Inc. with the intention of shaping traffic in a protocol-agnostic manner. Questions about the ethics and legality of Comcast's behavior have led to renewed debate about Net neutrality in the United States.
In general, although encryption can make it difficult to determine what is being shared, BitTorrent is vulnerable to traffic analysis. Thus even with MSE/PE, it may be possible for an ISP to recognize BitTorrent and also to determine that a system is no longer downloading, only uploading, information and terminate its connection by injecting TCP RST (reset flag) packets.
Another unofficial feature is an extension to the BitTorrent metadata format proposed by John Hoffman and implemented by several indexing websites. It allows the use of multiple trackers per file, so if one tracker fails, others can continue supporting file transfer. It is implemented in several clients, such as Vuze, BitComet, BitTornado, KTorrent and µTorrent. Trackers are placed in groups, or tiers, with a tracker randomly chosen from the top tier and tried, moving to the next tier if all the trackers in the top tier fail.
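The tier behaviour described above can be sketched as follows. The `try_tracker` callback stands in for a real announce request and, like the tracker URLs, is hypothetical; details such as reordering trackers after a successful announce are not covered.

```python
import random
from typing import Callable, List, Optional


def announce(tiers: List[List[str]],
             try_tracker: Callable[[str], bool],
             rng: random.Random) -> Optional[str]:
    """Return the first tracker that responds, or None if all fail.

    Trackers within a tier are tried in random order; the next tier is
    used only after every tracker in the current tier has failed.
    """
    for tier in tiers:
        candidates = list(tier)
        rng.shuffle(candidates)
        for url in candidates:
            if try_tracker(url):
                return url
    return None


if __name__ == "__main__":
    tiers = [
        ["http://t1.example/announce", "http://t2.example/announce"],
        ["http://backup.example/announce"],
    ]
    alive = {"http://backup.example/announce"}      # both top-tier trackers are down
    print(announce(tiers, lambda url: url in alive, random.Random()))
```

With both top-tier trackers unreachable, the client falls back to the second tier and the transfer can continue.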
Torrents with multiple trackers can decrease the time it takes to download a file, but also have a few consequences:
Even with distributed trackers, a third party is still required to find a specific torrent. This is usually done in the form of a direct hyperlink from the website of the content owner or through indexing websites like The Pirate Bay or Torrentz.
In May 2007 Cornell University published a paper proposing a new approach to searching a peer-to-peer network for inexact strings which could replace the functionality of a central indexing site. A year later, the same team implemented the system as a plugin for Vuze called Cubit and published a follow-up paper reporting its success.
Because of the open nature of the protocol, many clients have been developed; they support numerous platforms and are written in various programming languages. The official client is also named BitTorrent.
Some clients, like Torrentflux, can be run straight from a server, allowing hosting companies to offer speeds unavailable to most users. Sites such as Torrent2FTP offer services to download torrents and then make them available to the customer on an FTP server.
Torrent Relay is a service that allows users to load torrents remotely and have them download as a simple link. Unlike Torrent2FTP, this site offers a free version that is not based on the common PHP-based Torrentflux, which has been widely available for years. This implementation offers features such as ZIP compression, RAR decompression, and PlayStation 3 streaming and download support that are not seen in any other client to date.
An as-yet (2 February 2008) unimplemented unofficial feature is Similarity Enhanced Transfer (SET), a technique for improving the speed at which peer-to-peer file sharing and content distribution systems can share data. SET, proposed by researchers Pucha, Andersen, and Kaminsky, works by spotting chunks of identical data in files that are an exact or near match to the one needed and transferring these data to the client if the 'exact' data are not present. Their experiments suggested that SET will help greatly with less popular files, but not as much for popular data, where many peers are already downloading it. Andersen believes that this technique could be immediately used by developers with the BitTorrent file sharing system.
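The underlying idea, reusing identical chunks found in a similar file, can be illustrated with a toy example. This is not the researchers' algorithm: comparing fixed-size chunks at aligned offsets by SHA-1 digest is a simplification chosen for the sketch, and all names and sample data are hypothetical.

```python
import hashlib
from typing import Dict, List


def chunk_digests(data: bytes, chunk_size: int) -> List[bytes]:
    """SHA-1 digest of each fixed-size chunk of data, in order."""
    return [
        hashlib.sha1(data[i:i + chunk_size]).digest()
        for i in range(0, len(data), chunk_size)
    ]


def reusable_chunks(wanted: List[bytes], similar: bytes,
                    chunk_size: int) -> Dict[int, bytes]:
    """Map each index of the wanted file to a matching chunk of a similar file.

    wanted is the digest list of the file being downloaded; only chunks
    whose digests also occur in the similar file are returned.
    """
    available = {}
    for i in range(0, len(similar), chunk_size):
        chunk = similar[i:i + chunk_size]
        available.setdefault(hashlib.sha1(chunk).digest(), chunk)
    return {idx: available[d] for idx, d in enumerate(wanted) if d in available}


if __name__ == "__main__":
    target = b"AAAABBBBCCCC"            # file the client wants
    near_match = b"XXXXBBBBCCCC"        # similar file held by another peer
    found = reusable_chunks(chunk_digests(target, 4), near_match, 4)
    print(sorted(found))                # indices that need not come from exact sources
```

In this toy case two of the three chunks can be taken from the near match, leaving only the first chunk to be fetched from a peer holding the exact file.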
There has been much controversy over the use of BitTorrent trackers. BitTorrent metafiles themselves do not store copyrighted data, hence BitTorrent itself is not illegal—it is the use of it to copy copyrighted material that contravenes laws in some locations.
Various jurisdictions have pursued legal action against websites that host BitTorrent trackers. High-profile examples include the closing of Suprnova.org, Torrentspy, LokiTorrent, Demonoid (now back online), OiNK.cd and EliteTorrents.org. The Pirate Bay torrent website, formed by a Swedish anti-copyright group, is noted for the "legal" section of its website in which letters and replies on the subject of alleged copyright infringements are publicly displayed. On 31 May 2006, The Pirate Bay's servers in Sweden were raided by Swedish police on allegations by the MPAA of copyright infringement; however, the tracker was up and running again three days later.
HBO, in an effort to combat the distribution of its programming on BitTorrent networks, has sent cease and desist letters to the Internet Service Providers of BitTorrent users. Many users have reported receiving letters from their ISPs that threatened to cut off their internet service if the alleged infringement continues. HBO, unlike the RIAA, has not been reported to have filed suit against anyone for sharing files as of April 2007. In 2005 HBO began "poisoning" torrents of its show Rome, by providing bad chunks of data to clients.
On 23 November 2005, the movie industry and BitTorrent, Inc. CEO Bram Cohen signed a deal they hoped would reduce the number of unlicensed copies available through bittorrent.com's search engine, run by BitTorrent, Inc. It meant BitTorrent.com had to remove any links to unlicensed copies of films made by seven of Hollywood's major movie studios.
More recently, the BitTorrent network has been subject to scrutiny by the British Phonographic Industry (BPI). There are suggestions that they are using the network to obtain the IPs of those currently connected to the tracker. The information is then used to contact the ISP of each downloader so that notifications can be made (this was given sizeable coverage in the UK press with regard to Virgin Media sending letters out to customers suspected of using P2P networks).
There are two major differences between BitTorrent and many other peer-to-peer file-trading systems, which advocates suggest make it less useful to those sharing copyrighted material without authorization. First, BitTorrent itself does not offer a search facility to find files by name. A user must find the initial torrent file by other means, such as a web search. Second, BitTorrent makes no attempt to conceal the host ultimately responsible for facilitating the sharing: a person who wishes to make a file available must run a tracker on a specific host or hosts and distribute the tracker address(es) in the .torrent file. Because it is possible to operate a tracker on a server that is located in a jurisdiction where the copyright holder cannot take legal action, the protocol does offer some vulnerability that other protocols lack. It is far easier to request that the server's ISP shut down the site than it is to find and identify every user sharing a file on a peer-to-peer network. However, with the use of a distributed hash table (DHT), trackers are no longer required, though often used for client software that does not support DHT to connect to the stream.