HTTP cookie

HTTP cookies, more commonly referred to as Web cookies, tracking cookies, or just cookies, are parcels of text sent by a server to a Web client (usually a browser) and then sent back unchanged by the client each time it accesses that server. HTTP cookies are used for authentication, session tracking (state maintenance), and maintaining specific information about users, such as site preferences or the contents of their electronic shopping carts. The term "cookie" is derived from "magic cookie," a well-known concept in UNIX computing which inspired both the idea and the name of HTTP cookies.

Because they can be used for tracking browsing behavior, cookies have been of concern for Internet privacy. As a result, they have been subject to legislation in various countries such as the United States, as well as the European Union. Cookies have also been criticized because the identification of users they provide is not always accurate and because they could potentially be a target of network attackers. Some alternatives to cookies exist, but each has its own uses, advantages, and drawbacks.

Cookies are also subject to a number of misconceptions, mostly based on the erroneous notion that they are computer programs. In fact, cookies are simple pieces of data unable to perform any operation by themselves. In particular, they are neither spyware nor viruses, although cookies from certain sites are described as spyware by many anti-spyware products because they allow users to be tracked when they visit various sites.

Most modern browsers allow users to decide whether to accept cookies, but rejection makes some websites unusable. For example, shopping carts implemented using cookies do not work if cookies are rejected.

Use

HTTP cookies are used by Web servers to differentiate users and to maintain data related to the user during navigation, possibly across multiple visits. HTTP cookies were introduced to provide a way to implement a "shopping cart" (or "shopping basket"), a virtual device into which the user can "place" items to purchase, so that users can navigate a site where items are shown, adding or removing items from the shopping basket at any time.

Allowing users to log in to a website is another use of cookies. Users typically log in by inserting their credentials into a login page; cookies allow the server to know that the user is already authenticated, and therefore is allowed to access services or perform operations that are restricted to logged-in users.

Many websites also use cookies for personalization based on users' preferences. Sites that require authentication often use this feature, although it is also present on sites not requiring authentication. Personalization includes presentation and functionality. For example, the Wikipedia website allows authenticated users to choose the webpage skin they like best; the Google search engine allows users (even non-registered ones) to decide how many search results per page they want to see.

Cookies are also used to track users across a website. Third-party cookies and Web bugs, explained below, also allow for tracking across multiple sites. Tracking within a site is typically done with the aim of producing usage statistics, while tracking across sites is typically used by advertising companies to produce anonymous user profiles, which are then used to target advertising (deciding which advertising image to show) based on the user profile.

Implementation

Technically, cookies are arbitrary pieces of data chosen by the Web server and sent to the browser. The browser returns them unchanged to the server, introducing a state (memory of previous events) into otherwise stateless HTTP transactions. Without cookies, each retrieval of a Web page or component of a Web page is an isolated event, mostly unrelated to all other views of the pages of the same site. By returning a cookie to a web server, the browser provides the server a means of connecting the current page view with prior page views. Other than being set by a web server, cookies can also be set by a script in a language such as JavaScript, if supported and enabled by the Web browser.
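The round trip described above can be sketched with Python's standard http.cookies module. This is only an illustration: the cookie name cart_id and its value are hypothetical, and the browser is simulated by a plain string rather than a real HTTP client.

```python
from http.cookies import SimpleCookie

# Server side: choose an arbitrary piece of data and emit a Set-Cookie header.
# The name "cart_id" and the value "abc123" are hypothetical.
outgoing = SimpleCookie()
outgoing["cart_id"] = "abc123"
set_cookie_header = outgoing.output(header="Set-Cookie:")

# Browser side (simulated): the name=value pair is returned unchanged
# in the Cookie header of each later request to the same server.
cookie_header_value = "cart_id=abc123"

# Server side again: parse the returned header so this request can be
# connected with the earlier one.
incoming = SimpleCookie()
incoming.load(cookie_header_value)
returned_value = incoming["cart_id"].value

print(set_cookie_header)
print(returned_value)
```

Because the value comes back unchanged, the server can use it as a key to whatever state it associated with the earlier request.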

Cookie specifications suggest that browsers should support a minimal number of cookies or amount of memory for storing them. In particular, an internet browser is expected to be able to store at least 300 cookies of four kilobytes each, and at least 20 cookies per server or domain.

The maximum number of cookies stored per domain in the major browsers is:

  • Firefox 1.5: 50
  • Firefox 2.0: 50
  • Opera 9: 30
  • Internet Explorer 6: 20 (raised to 50 in update on August 14, 2007)
  • Internet Explorer 7: 20 (raised to 50 in update on August 14, 2007)

In practice cookies must be smaller than 4 kilobytes. Internet Explorer imposes a 4KB total for all cookies stored in a given domain.

Cookie names are case insensitive according to section 3.1 of RFC 2965.

The cookie setter can specify a deletion date, in which case the cookie is removed on that date. If the cookie setter does not specify a date, the cookie is removed once the user quits the browser. Specifying a date is therefore a way of making a cookie survive across sessions, and cookies with an expiration date are called persistent. As an example application, a shopping site can use persistent cookies to store the items users have placed in their basket. (In practice, the cookie may refer to an entry in a database stored at the shopping site, not on the user's computer.) This way, if users quit their browser without making a purchase and return later, they still find the same items in the basket and do not have to look for them again. If these cookies were not given an expiration date, they would expire when the browser is closed, and the information about the basket content would be lost.

Cookies can also be limited in scope to a specific domain, subdomain or path on the web server which created them.
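As a rough sketch of how expiration and scope are expressed, the following uses Python's http.cookies module to build one session cookie and one persistent, scoped cookie. The helper build_cookie, the cookie name basket, and the domain shop.example.com are assumptions made for illustration only.

```python
from http.cookies import SimpleCookie

def build_cookie(name, value, lifetime_seconds=None, domain=None, path=None):
    """Return a Set-Cookie header value.

    Leaving lifetime_seconds as None produces a session cookie, which the
    browser discards when it is closed.
    """
    cookie = SimpleCookie()
    cookie[name] = value
    if lifetime_seconds is not None:
        # http.cookies turns an integer into a date that many seconds
        # in the future, which becomes the cookie's deletion date.
        cookie[name]["expires"] = lifetime_seconds
    if domain is not None:
        cookie[name]["domain"] = domain
    if path is not None:
        cookie[name]["path"] = path
    return cookie[name].OutputString()

# Session cookie: no deletion date, no explicit scope.
session_cookie = build_cookie("basket", "42")

# Persistent cookie kept for one week and limited to one host and path.
persistent_cookie = build_cookie(
    "basket", "42",
    lifetime_seconds=7 * 24 * 3600,
    domain="shop.example.com",
    path="/cart",
)

print(session_cookie)
print(persistent_cookie)
```

The only difference between the two cookies is the presence of attributes; the name and value are the same in both cases.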

Misconceptions

Since their introduction, misconceptions about cookies have circulated on the Internet and in the media. In 1998, CIAC, a computer incident response team of the United States Department of Energy, found the security vulnerability "essentially nonexistent" and explained that "information about where you come from and what web pages you visit already exists in a web server's log files". In 2005, Jupiter Research published the results of a survey, according to which a consistent percentage of respondents believed a number of false claims about cookies.

Cookies are in fact only data, not program code: they cannot erase or read information from the user's computer. However, cookies allow for detecting the Web pages viewed by a user on a given site or set of sites. This information can be collected in a profile of the user. Such profiles are often anonymous, that is, they do not contain personal information of the user (name, address, etc.). More precisely, they cannot contain personal information unless the user has made it available to some sites. Even if anonymous, these profiles have been the subject of some privacy concerns.

According to the same survey, a large percentage of Internet users do not know how to delete cookies.

Browser settings

Most modern browsers support cookies. However, a user can usually also choose whether cookies should be used or not. The following are common options:

  1. To enable or disable cookies completely, so that they are always accepted or always blocked.
  2. To allow the user to see the cookies that are active with respect to a given page by typing javascript:alert("Cookies: "+document.cookie) in the browser URL field.

Some browsers incorporate a cookie manager for the user to see and selectively delete the cookies currently stored in the browser.

Privacy and third-party cookies

Cookies have some important implications on the privacy and anonymity of Web users. While cookies are only sent to the server setting them or one in the same Internet domain, a Web page may contain images or other components stored on servers in other domains. Cookies that are set during retrieval of these components are called third-party cookies.

Advertising companies use third-party cookies to track a user across multiple sites. In particular, an advertising company can track a user across all pages where it has placed advertising images or web bugs. Knowledge of the pages visited by a user allows the advertisement company to target advertisement to the user's presumed preferences.

The possibility of building a profile of users has been considered by some a potential privacy threat, even when the tracking is done on a single domain but especially when tracking is done across multiple domains using third-party cookies. For this reason, some countries have legislation about cookies.

The United States government set strict rules on setting cookies in 2000, after it was disclosed that the White House drug policy office used cookies to track computer users viewing its online anti-drug advertising. In 2002, privacy activist Daniel Brandt found that the CIA had been leaving persistent cookies on computers for ten years. When notified that it was violating policy, the CIA stated that these cookies were not intentionally set and stopped setting them. On December 25, 2005, Brandt discovered that the National Security Agency had been leaving two persistent cookies on visitors' computers due to a software upgrade. After being informed, the National Security Agency immediately disabled the cookies.

The 2002 European Union telecommunication privacy Directive contains rules about the use of cookies. In particular, Article 5, Paragraph 3 of this directive mandates that storing data (like cookies) in a user's computer can only be done if: 1) the user is provided information about how this data is used; and 2) the user is given the possibility of denying this storing operation. However, this article also states that storing data that is necessary for technical reasons is exempted from this rule. This directive was expected to have been applied since October 2003, but a December 2004 report says (page 38) that this provision was not applied in practice, and that some member countries (Slovakia, Latvia, Greece, Belgium, and Luxembourg) did not even implement the provision in national law. The same report suggests a thorough analysis of the situation in the Member States.

The P3P specification includes the possibility for a server to state a privacy policy, which specifies which kind of information it collects and for which purpose. These policies include (but are not limited to) the use of information gathered using cookies. According to the P3P specification, a browser can accept or reject cookies by comparing the privacy policy with the stored user preferences or ask the user, presenting them the privacy policy as declared by the server.

Many web browsers, including Apple's Safari and Microsoft Internet Explorer versions 6 and 7, support P3P, which allows the web browser to determine whether to allow third-party cookies to be stored. The Opera web browser allows users to refuse third-party cookies and to create global and specific security profiles for Internet domains. Firefox 2.x dropped this option from its menu system but restored it with the release of version 3.x.

Blocking third-party cookies

Third-party cookies can be blocked by most browsers to increase privacy and reduce tracking by advertising and tracking companies, generally without causing problems for the user. A secondary benefit is that it stops tracking by third-party companies' web bugs that are on many web pages.

Drawbacks of cookies

Besides privacy concerns, cookies also have some technical drawbacks. In particular, they do not always accurately identify users, they can be used for security attacks, and they are at odds with the Representational State Transfer (REST) software architectural style.

Inaccurate identification

If more than one browser is used on a computer, each usually has a separate storage area for cookies. Hence cookies do not identify a person, but a combination of a user account, a computer, and a Web browser. Thus, anyone who uses multiple accounts, computers, or browsers has multiple sets of cookies.

Likewise, cookies do not differentiate between multiple users who share a computer and browser, if they do not use different user accounts.

Cookie hijacking

During normal operation cookies are sent back and forth between a server (or a group of servers in the same domain) and the computer of the browsing user. Since cookies may contain sensitive information (user name, a token used for authentication, etc.), their values should not be accessible to other computers. Cookie theft is the act of intercepting cookies by an unauthorized party.

Cookies can be stolen via packet sniffing in an attack called session hijacking. Traffic on a network can be intercepted and read by computers on the network other than its sender and its receiver (particularly on unencrypted public Wi-Fi networks). This traffic includes cookies sent on ordinary unencrypted HTTP sessions. Where network traffic is not encrypted, malicious users can therefore read the communications of other users on the network, including their cookies, using programs called packet sniffers.

This issue can be overcome by securing the communication between the user's computer and the server by employing Transport Layer Security (the HTTPS protocol) to encrypt the connection. A server can specify the secure flag while setting a cookie; the browser will then send it only over a secure channel, such as an SSL connection.

However, a large number of websites, although using secure HTTPS communication for user authentication (i.e. the login page), subsequently send session cookies and other data over ordinary unencrypted HTTP connections for performance reasons. Attackers can therefore easily intercept the cookies of other users and impersonate them on the relevant websites or use them in a cookiemonster attack.

A different way to steal cookies is cross-site scripting, which makes the browser itself send cookies to servers that should not receive them. Modern browsers allow execution of pieces of code retrieved from the server. If cookies are accessible during execution, their value may be communicated in some form to servers that should not access them. Encrypting cookies before sending them on the network does not help against this attack.

This type of cross-site scripting is typically exploited by attackers on sites that allow users to post HTML content. By embedding a suitable piece of code in an HTML post, an attacker may receive cookies of other users. Knowledge of these cookies can then be exploited by connecting to the same site using the stolen cookies, thus being recognised as the user whose cookies have been stolen.

One way to prevent such attacks is the HttpOnly flag, an option first introduced by Microsoft and implemented in PHP since version 5.2.0, which is intended to make a cookie inaccessible to client-side script. However, web developers should also develop their websites so that they are immune to cross-site scripting.
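As an illustrative sketch, both the secure flag and the HttpOnly flag can be set on a cookie with Python's http.cookies module. The cookie name session and its value are hypothetical placeholders.

```python
from http.cookies import SimpleCookie

cookie = SimpleCookie()
cookie["session"] = "opaque-token"   # hypothetical name and value

# Secure: the browser sends the cookie only over an encrypted connection,
# which protects it from packet sniffing.
cookie["session"]["secure"] = True

# HttpOnly: the cookie is hidden from client-side script, which limits
# what a cross-site scripting attack can read.
cookie["session"]["httponly"] = True

header_value = cookie["session"].OutputString()
print(header_value)
```

The two flags address different attacks, so a session cookie would normally carry both.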

Cookie poisoning

While cookies are supposed to be stored and sent back to the server unchanged, an attacker may modify the value of cookies before sending them back to the server. If, for example, a cookie contains the total value a user has to pay for the items in their shopping basket, changing this value exposes the server to the risk of making the attacker pay less than the supposed price. The process of tampering with the value of cookies is called cookie poisoning, and is sometimes used after cookie theft to make an attack persistent.

Most websites, however, only store a session identifier — a randomly generated unique number used to identify the user's session — in the cookie itself, while all the other information is stored on the server. In this case, the problem of cookie poisoning is largely eliminated.
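A minimal sketch of this approach follows. It assumes an in-memory dictionary in place of a real database, and the function names create_session and load_session are invented for the example.

```python
import secrets

# Server-side store mapping session identifiers to session data.
# A plain dict stands in for a database here (an assumption for brevity).
SESSIONS = {}

def create_session(data):
    """Store data on the server and return the identifier to put in the cookie."""
    session_id = secrets.token_urlsafe(32)   # random and hard to guess
    SESSIONS[session_id] = dict(data)
    return session_id

def load_session(session_id):
    """Return the stored data, or None if the identifier is unknown."""
    return SESSIONS.get(session_id)

# Only the identifier travels in the cookie; the basket total stays on the server.
sid = create_session({"basket_total": 59.90})

# Changing the cookie value does not change the price: an altered
# identifier simply matches no session.
forged = load_session(sid + "x")

print(load_session(sid))
print(forged)
```

Since the client never holds the sensitive values, modifying the cookie can at most invalidate the session.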

Cross-site cooking

Each site is supposed to have its own cookies, so a site like example.com should not be able to alter or set cookies for another site, like example.org. Cross-site cooking vulnerabilities in web browsers allow malicious sites to break this rule. This is similar to cookie poisoning, but the attacker exploits non-malicious users with vulnerable browsers, instead of attacking the actual site directly. The goal of such attacks may be to perform session fixation.

Users are advised to use the more recent versions of web browsers, in which this issue is mitigated.

Inconsistent state on client and server

The use of cookies may generate an inconsistency between the state of the client and the state as stored in the cookie. If the user acquires a cookie and then clicks the "Back" button of the browser, the state on the browser is generally not the same as before that acquisition. As an example, if the shopping cart of an online shop is realized using cookies, the content of the cart may not change when the user goes back in the browser's history: if the user presses a button to add an item to the shopping cart and then clicks on the "Back" button, the item remains in the shopping cart. This might not be the intention of the user, who possibly wanted to undo the addition of the item. This can lead to unreliability, confusion, and bugs. Web developers should therefore be aware of this issue and implement measures to handle such situations.

Cookie expiration

Persistent cookies have been criticized by privacy experts for not being set to expire soon enough, and thereby allowing some websites to track users and build up a profile of them over time. This aspect of cookies also compounds the issue of session hijacking, because a stolen persistent cookie can potentially be used to impersonate a user for a considerable period of time.

Alternatives to cookies

Some of the operations that can be realised using cookies can also be realised using other mechanisms. However, these alternatives to cookies have their own drawbacks, which make cookies usually preferred to them in practice. Most of the following alternatives allow for user tracking, even if not as reliably as cookies. As a result, privacy is an issue even if cookies are rejected by the browser or not set by the server.

IP address

An unreliable technique for tracking users is based on storing the IP addresses of the computers requesting the pages. This technique has been available since the introduction of the World Wide Web, as downloading pages requires the server holding them to know the IP address of the computer running the browser or the proxy, if any is used. This information is available for the server to be stored regardless of whether cookies are used or not.

However, these addresses are typically less reliable in identifying a user than cookies because computers and proxies may be shared by several users, and the same computer may be assigned different Internet addresses in different work sessions (this is often the case for dial-up connections). The reliability of this technique can be improved by using another feature of the HTTP protocol: when a browser requests a page because the user has followed a link, the request that is sent to the server contains the URL of the page where the link is located. If the server stores these URLs, the path of pages viewed by the user can be tracked more precisely. However, these traces are less reliable than the ones provided by cookies, as several users may access the same page from the same computer, NAT router, or proxy and then follow two different links. Moreover, this technique only allows tracking and cannot replace cookies in their other uses.

Tracking by IP address can be impossible with some systems that are used to retain Internet anonymity, such as Tor. With such systems, not only could one browser carry multiple addresses throughout a session, but multiple users could appear to be coming from the same IP address, thus making IP address use for tracking wholly unreliable.

Some major ISPs, including AOL, route all web traffic through a small number of proxies which makes this scheme particularly unworkable.

URL (query string)

A more precise technique is based on embedding information into URLs. The query string part of the URL is the one that is typically used for this purpose, but other parts can be used as well. The Java Servlet and PHP session mechanisms both use this method if cookies are not enabled.

This method consists of the Web server appending query strings to the links of a Web page it holds when sending it to a browser. When the user follows a link, the browser returns the attached query string to the server.
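The rewriting step can be sketched with Python's urllib.parse module. The parameter name sid and the two helper functions are hypothetical; real frameworks choose their own names and encodings.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

def add_session_to_url(url, session_id, param="sid"):
    """Append the session identifier to the query string of a link."""
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append((param, session_id))
    return urlunsplit(parts._replace(query=urlencode(query)))

def read_session_from_url(url, param="sid"):
    """Recover the identifier when the browser requests the rewritten link."""
    return dict(parse_qsl(urlsplit(url).query)).get(param)

# The server rewrites each link in the page before sending it to the browser.
link = add_session_to_url("http://www.example.com/items?page=2", "abc123")
print(link)

# When the user follows the link, the server reads the identifier back.
print(read_session_from_url(link))
```

Every link on the page has to be rewritten this way, since a link without the parameter would lose the session.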

Query strings used in this way and cookies are very similar, both being arbitrary pieces of information chosen by the server and sent back by the browser. However, there are some differences: since a query string is part of a URL, if that URL is later reused, the same attached piece of information is sent to the server. For example, if the preferences of a user are encoded in the query string of a URL and the user sends this URL to another user by e-mail, those preferences will be used for that other user as well.

Moreover, even if the same user accesses the same page two times, there is no guarantee that the same query string is used in both views. For example, if the same user arrives at the same page coming from a page internal to the site the first time and from an external search engine the second time, the respective query strings are typically different while the cookies would be the same. For more details, see query string.

Other drawbacks of query strings are related to security: storing data that identifies a session in a query string enables or simplifies session fixation attacks, referer logging attacks and other security exploits. Transferring session identifiers as HTTP cookies is more secure.

Hidden form fields

A form of session tracking, used by ASP.NET, is to use web forms with hidden fields. This technique is very similar to using URL query strings to hold the information and has many of the same advantages and drawbacks; if the form is handled with the HTTP GET method, the fields actually become part of the URL the browser will send upon form submission. Most forms, however, are handled with HTTP POST, which causes the form information, including the hidden fields, to be appended as extra input that is neither part of the URL nor of a cookie.

This approach presents two advantages from the point of view of the tracker: first, having the tracking information placed in the HTML source and POST input rather than in the URL means it will not be noticed by the average user; second, the session information is not copied when the user copies the URL (to save the page on disk or send it via email, for example). A drawback of this technique is that session information is in the HTML code; therefore, each web page must be generated dynamically each time someone requests it, placing an additional workload on the web server.
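A small sketch of how a server might generate such a form for each request is shown below. The field name session_id, the action path, and the function render_form are assumptions for illustration, not the mechanism of any specific framework.

```python
from html import escape

def render_form(action, session_id, field="session_id"):
    """Build a POST form that carries the session identifier in a hidden field."""
    return (
        f'<form method="post" action="{escape(action, quote=True)}">\n'
        f'  <input type="hidden" name="{escape(field, quote=True)}" '
        f'value="{escape(session_id, quote=True)}">\n'
        f'  <input type="submit" value="Continue">\n'
        f'</form>'
    )

# The page must be generated per request, because the identifier
# differs from one session to the next.
form_html = render_form("/checkout", "abc123")
print(form_html)
```

The value is escaped before being placed in the markup so that an unusual identifier cannot break out of the attribute.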

window.name

All current web browsers can store a fairly large amount of data (2-32 MB) via JavaScript using the DOM property window.name. This data can be used instead of session cookies and is also cross domain. The technique can be coupled with JSON/JavaScript objects to store complex sets of session variables on the client side.

The downside is that every separate window or tab initially has an empty window.name; with tabbed browsing, this means that tabs opened individually by the user will not have a window name. Furthermore, window.name can be used for tracking visitors across different web sites, making it of concern for Internet privacy.

HTTP authentication

As for authentication, the HTTP protocol includes the basic access authentication and the digest access authentication protocols, which allow access to a Web page only when the user has provided the correct username and password. If the server requires such credentials for granting access to a Web page, the browser requests them from the user; once obtained, the browser stores them and uses them for accessing subsequent pages, without requiring the user to provide them again. From the point of view of the user, the effect is the same as if cookies were used: username and password are only requested once, and from that point on the user is given access to the site. In the basic access authentication protocol, a combination of username and password is sent to the server in every browser request. This means that someone listening in on this traffic can simply read this information and store it for later use. This problem is overcome in the digest access authentication protocol, in which the username and password are not sent directly but are hashed together with a random nonce created by the server.
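The difference between the two schemes can be sketched as follows. The digest computation shown is the original MD5 form without the optional qop extension, and the function names and sample credentials are illustrative assumptions.

```python
import base64
import hashlib

def basic_authorization(username, password):
    """Basic scheme: the credentials are only base64-encoded, not protected."""
    raw = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")

def md5_hex(text):
    return hashlib.md5(text.encode("utf-8")).hexdigest()

def digest_response(username, realm, password, method, uri, nonce):
    """Digest scheme (no qop): only a hash involving the server nonce is sent."""
    ha1 = md5_hex(f"{username}:{realm}:{password}")
    ha2 = md5_hex(f"{method}:{uri}")
    return md5_hex(f"{ha1}:{nonce}:{ha2}")

# Basic: anyone who captures this header can decode the password.
print(basic_authorization("Aladdin", "open sesame"))

# Digest: the response changes with every nonce the server issues,
# so a captured value is of no use with a different nonce.
first = digest_response("alice", "example", "secret", "GET", "/private", "nonce-1")
second = digest_response("alice", "example", "secret", "GET", "/private", "nonce-2")
print(first != second)
```

In both schemes the browser repeats the Authorization header on later requests, which is why the user is prompted only once.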

Macromedia Flash Local Shared Objects

If a browser includes the Macromedia Flash Player plugin, the Local Shared Objects functionality can be used in a way very similar to cookies. Local Shared Objects may be an attractive choice to web developers because a majority of Windows users have Flash Player installed, the default size limit is 100 kB, and the security controls are distinct from the user controls for cookies, so Local Shared Objects may be enabled when cookies are not.

The major drawback of this approach is the same as that of every platform- or vendor-specific approach: it breaks the web's global accessibility and interoperability, ties web development to a specific client platform, and excludes users of standards-compliant web user agents, forcing them to use platform- or vendor-specific agents instead, which encourages vendor lock-in.

Client-side persistence

Some web browsers support a script-based persistence mechanism that allows the page to store information locally for later retrieval. Internet Explorer, for example, supports persisting information in the browser's history, in favorites, in an XML store, or directly within a Web page saved to disk. A different mechanism relies on browsers normally caching (holding in memory instead of reloading) JavaScript programs used in web pages. As an example, a page may contain a link such as