Definitions

Internet Movie Database

The Internet Movie Database (IMDb) is an online database of information related to movies, actors, television shows, production crew personnel, and video games. IMDb launched on October 17, 1990, and was acquired by Amazon.com in 1998.

Overview

The IMDb website hosts one of the largest accumulations of data about films, television programs, direct-to-video products, and video games, reaching back to each medium's respective beginning. In many cases, the information goes beyond simple title or press credits to include complete cast and crew credits, uncredited personnel, production and distribution companies, plot summaries, memorable quotes, awards, reviews, box office performance, filming locations, technical specs, promotional content, trivia, and links to official and other websites. The IMDb also tracks titles in production, including major announced projects still in development. The database houses filmographies for all persons, cast and crew, identified in listed titles. Filmographies include biographical details, awards listings, external links, and information about other professional work not covered by title entries in the database, such as theatrical and commercial advertising appearances. The IMDb offers ancillary material as well, such as daily movie and TV news, weekly box office reports, TV listings, cinema showtimes, user polls and ratings, and special features about various movie events such as the Academy Awards. The website also has an active message board system. There are message boards for each database entry, found at the bottom of each respective page, as well as general discussion boards on various topics.

All of the basic database information is available without registration and without providing any personal information. However, registration is required to submit information, to use the message boards, to search for information about adult movies, or to use certain other site features. Some advanced features require verification, which can involve personal financial information such as credit card details. As of October 10, 2007, IMDb had 57 million visitors, 17 million of whom were registered users.

Database content is largely provided and updated by a cadre of volunteer contributors, although 20 members of the IMDb staff are dedicated to monitoring received data.

For automated queries, most of the database can be downloaded as compressed plain text files and the information can be extracted using the tools provided, typically using a command line interface.

In 2002, the IMDb spun off a private, subscription-funded, for-profit site, IMDbPro, offering the entire content of the database plus additional information for business professionals, such as personnel contact details, titles in development, movie event calendars, and a greater range of industry news.

In 2006, IMDb introduced its "Résumé subscription service", where actors and crew can post their own résumé and upload photos of themselves for a yearly fee. IMDb résumé pages are kept separately from the regular entry about that person, but a regular entry is automatically created for each résumé subscriber who does not already have one.

History

In rec.arts.movies

The database originated from two lists started as independent projects in early 1989 by participants in the Usenet newsgroup rec.arts.movies. In each case, a single maintainer recorded items emailed by newsgroup readers, and posted updated versions of his list from time to time. Google Groups coverage of rec.arts.movies is incomplete during the relevant time period, with a 6-month gap in late 1988 and early 1989 and a number of missing articles after that.

It began with a posting titled "Those Eyes", on the subject of actresses with beautiful eyes. Hank Driskill began to collect a list of attractive actresses and what movies they had appeared in, and as the size of the repeated posting grew far beyond a normal newsgroup article, it soon became known simply as "THE LIST".

The other project, started by Chuck Musciano, was briefly called the "Movie Ratings List" and soon became the "Movie Ratings Report". Musciano simply asked readers to rate movies on a scale of one to ten, and reported on the votes. He soon began posting "ballots" with lists of movies for people to rate, so his list also grew quickly.

In 1990, Col Needham collated the two lists and produced a "Combined LIST & Movie Ratings Report". (His first posting of the database scripts is not available.) Needham soon started a (male) "Actors List", while Dave Knight began a "Directors List", and Andy Krieg took over THE LIST, which would later be renamed as the "Actress List". Both this and the Actors List had been restricted to people who were still alive and working, but retired people began to be added, and Needham also started what was then (but did not remain) a separate "Dead Actors/Actresses List". The goal now was to make the lists as inclusive as the maintainers could manage. In late 1990, the lists included almost 10,000 movies and television series. On October 17, 1990, Needham posted a collection of Unix shell scripts which could be used to search the four lists, and the database that would become the IMDb was born. At the time, it was known as the "rec.arts.movies movie database".

On the web

By 1993, the database had been expanded to include additional categories of filmmakers and other demographic material, as well as trivia, biographies, and plot summaries; the movie ratings had been properly integrated with the list data; and a centralized email interface for querying the database had been created. Later in the year, it moved onto the World Wide Web (then in its infancy) under the name of Cardiff Internet Movie Database. The database resided on the servers of the computer science department of Cardiff University in the UK. Rob Hartill was the original web interface author. In 1994, the email interface was revised to accept the submission of all information, meaning that people no longer had to email the specific list maintainer with their updates. However, information received on a single film was still divided among multiple section managers, the sections being defined by categories of film personnel and the individual filmographies contained therein. Management also remained in the hands of a small contingent of underpaid or volunteer "section managers" who were receiving ever-growing quantities of information on films from around the world and across time, from contributors with widely varying levels of expertise and informational resources. Despite Needham's annual claims, in a year-end newsletter to the top fifty contributors, that "fewer holes" must now remain for the coming year, the amount of information still missing from the database was vastly underestimated. Over the next few years, the database was run on a network of mirrors across the world with donated bandwidth.

As an independent company

In 1995, it became obvious to the principal site managers that the project had become too large to maintain merely through donations and in their spare time. The decision was made to become a commercial venture and in 1996, IMDb was incorporated in the United Kingdom, becoming the Internet Movie Database Ltd, with Col Needham the primary owner as well as identified figurehead. The remaining shareholders were the people maintaining the database. Revenue was generated through advertising, licensing and partnerships.

This state of affairs continued until 1998. The database was growing every day, and it was again reaching a critical point. Most revenues were being spent on equipment, and there was not enough money left over to pay full-time salaries. The system was also suffering noticeable slowdowns both in accessing the site and in having new data posted. Offers were solicited and received from major businesses to purchase the database; however, the shareholders were unwilling to sell if it could not be guaranteed that the information would be accessible to the internet community for free.

As a subsidiary company

In 1998, Jeff Bezos, founder, owner and CEO of Amazon.com, struck a deal with Col Needham and other principal shareholders to buy IMDb outright and attach it to Amazon as a private subsidiary company. This gave IMDb the ability to pay the shareholders salaries for their work, while Amazon.com would be able to use the IMDb as an advertising resource for selling DVDs and videotapes. Volunteer contributors were not advised in advance of even the possibility that IMDb, and their contributions along with it, would be sold to a private business, which created some initial discord and defection of regulars.

IMDb continues to expand its functionality. In 2002, it added a subscription service known as IMDbPro, aimed at entertainment professionals. It provides a variety of services including production and box office details, as well as a company directory. Most information contained in the IMDb database proper continues to come from volunteer researchers. As an additional incentive, since 2003, those identified as being among "the top 100 contributors" in terms of the amount of hard data submitted receive complimentary access to IMDbPro for the following calendar year; for 2006 this was increased to the top 150 contributors, and for 2007 to the top 175.

TV episodes

On January 26, 2006, the long-awaited "Full Episode Support" came online, allowing the database to support separate cast and crew listings for each episode of every TV series. This was described by Col Needham as "the largest change we've ever made to our data model", and increased the number of titles in the database from 485,000 to nearly 755,000.

At present, the database entries for TV series are in a state of flux, as listings are migrated from series titles to individual episodes. The maintainers anticipated "a couple of months for data to settle down and bugs to be ironed out", but inaccuracies were still present one year later.

Characters filmography

On October 2, 2007, the characters filmography feature was launched. The feature is similar to the existing title, name and company features, except that users can now see who played a given character and read memorable quotes and a biography of the character. All data in the characters filmography is submitted by regular users and is largely not verified by the IMDb staff, unlike most other data, which is verified first and may be rejected by the staff. This is possibly because very little new data is sent in; rather, existing data is being connected together.

Instant viewing

On September 15, 2008, a feature was added that enables instant viewing of over 6,000 movies and television shows from CBS, Sony and a number of independent filmmakers, with direct links from their profiles. Due to licensing restrictions, this feature is available only to viewers in the United States.

Ancillary features

User ratings of films

As one adjunct to data, the IMDb offers a rating scale that allows users to rate films by choosing one of ten categories in the range 1–10, with each user able to submit one rating. The points of reference given to users of these categories are the descriptions "1 (awful)" and "10 (excellent)"; and these are the only descriptions of categories. Due to the minimum category being scored one, the mid-point of the range of scores is 5.5, rather than 5.0 as might intuitively be expected given a maximum score of ten. This rating system has also recently been implemented for television programming on an episode-by-episode basis.
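The mid-point arithmetic above can be checked directly. The following is a minimal sketch in Python; the variable names are illustrative only and are not part of any IMDb interface.

```python
# The ten rating categories run from 1 to 10 inclusive (there is no 0).
scores = range(1, 11)

# Mid-point of the range: halfway between the lowest and highest category.
midpoint = (min(scores) + max(scores)) / 2

# Equivalently, the mean of the ten categories if each were chosen equally often.
uniform_mean = sum(scores) / len(scores)

print(midpoint, uniform_mean)
```

Both values come out to 5.5, because the scale starts at 1 rather than 0.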

In adopting this method, IMDb follows widespread practice; the method is equivalent to rating in the range of a half star to five stars. The simplicity of this method makes it popular, but in terms of psychometric, statistical and other criteria, the method suffers from shortcomings (see online rating scales).

Filters and weights

IMDb indicates that submitted ratings are filtered and weighted in various ways in order to produce a weighted mean that is displayed for each film, series, and so on. It states that filters are used to avoid vote stuffing; the method is not described in detail to avoid attempts to circumvent it.

Ranking

The IMDb Top 250 is intended to be a listing of the top 'rated' 250 films, based on ratings by the registered users of the website using the methods described. Only non-documentary theatrical releases running at least forty-five minutes with over 1300 ratings are considered; all other products are ineligible. Also, the 'top 250' rating is based on only the ratings of "regular voters". The exact number of votes a registered user would have to make to be considered to be a user who votes regularly has been kept secret. IMDb has stated that to maintain the effectiveness of the top 250 list they "deliberately do not disclose the criteria used for a person to be counted as a regular voter". In addition to other weightings, the top 250 films are also based on a weighted rating formula referred to in actuarial science as a credibility formula. This label arises because a statistic is taken to be more credible the greater the number of individual pieces of information; in this case from eligible users who submit ratings. IMDb uses the following formula to calculate the weighted rating:

$W = \frac{Rv + Cm}{v + m}$
where:
$W$ = Weighted Rating
$R$ = average for the movie as a number from 0 to 10 (mean) = (Rating)
$v$ = number of votes for the movie = (votes)
$m$ = minimum votes required to be listed in the Top 250 (currently 1300)
$C$ = the mean vote across the whole report (currently 6.7)
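As a rough illustration of how this credibility weighting behaves, the formula can be written as a short Python function. This is a sketch only: the function name and the sample inputs are hypothetical, the defaults for $m$ and $C$ are the values quoted above, and the additional filtering and "regular voter" weighting that IMDb applies are not modelled.

```python
def weighted_rating(R, v, m=1300, C=6.7):
    """Return W = (R*v + C*m) / (v + m).

    R: mean rating for the movie
    v: number of votes for the movie
    m: minimum votes required for the list (value quoted in the text)
    C: mean vote across the whole report (value quoted in the text)
    """
    return (R * v + C * m) / (v + m)


# Hypothetical inputs: the same mean rating with few and with many votes.
few_votes = weighted_rating(9.0, 1300)      # v equals m
many_votes = weighted_rating(9.0, 130000)   # v is 100 times m

print(round(few_votes, 2), round(many_votes, 2))
```

With $v = m$, the result lies exactly halfway between the film's own mean and the overall mean $C$ (7.85 here). As $v$ grows, the result approaches the film's own mean (about 8.98 here), which is the sense in which a rating backed by more votes is treated as more credible.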

An extended listing of the Top 500 - following the same formula - is available to IMDbPro subscribers.

The IMDb also has a Bottom 100 feature which is assembled through a similar process although only 650 votes must be received to qualify for the list.

The Top 250 list comprises a wide range of films, including major releases, cult films, independent films, critically acclaimed films, silent films and non-English language films.

Ranking criticisms

The validity of the Top 250 has come under scrutiny. The skepticism includes accusations of such things as ballot-box stuffing and voting ambiguity, as well as considerable objections about the overall placement of any given movie in the Top 250.

One example occurred in July 2008, when The Dark Knight took the number one spot away from The Godfather. Addressing the controversy, CNET's Harrison Hoffman theorized that the hype surrounding the movie outweighed clear thinking. Hoffman went on to say that the number of "10" votes for the latest Batman movie and the number of "1" votes that had suddenly appeared in The Godfather's voting bin (knocking that movie down to #3 on the list) were the markings of a "drastic shift" that "hardly seems the work of a wise crowd." A "mob mentality", he maintained, can "greatly skew a product of its collective wisdom."

Dave Thomas, a blogger on E-Gear, takes a more positive and understanding tone, saying that he believes IMDb's Top 250 to be a better list than AFI's Top 100. Thomas notes that he admires the "always shifting" list because it recognizes modern movies (giving the example of WALL-E, then standing at #30) and mixes them in with all-time greats, which is a product of the Top 250 being "updated every ten minutes instead of every ten years". However, Thomas admits that the list is "imperfect", saying that users can "cheat like crazy", and adding that there is "no guarantee anyone who voted ever actually saw The Dark Knight, much less liked it."

Plot-related features and spoiler warnings

IMDb main pages for each film include one or more of the sections titled Plot outline, Plot synopsis, and Plot keywords, and separate pages for Plot summary and Plot synopsis. The Plot synopsis pages are accessed through links that notify the reader a spoiler may be included.

The plot outline is a short summary of the premise with a general overview, usually not including details that may be considered spoilers. The plot outline is presented in full on the main page for the film if it is short enough; if it extends beyond a couple of lines, it is truncated and followed by a "more" link that opens the Plot summary page for the film.

On the Plot summary page, IMDb includes the full text of the plot outline, along with the first few lines of the plot synopsis, followed by a link to a further, more detailed page, with the link text written as "more (warning! contains spoilers)".

The plot synopsis is a more complete summary of the plot that can be edited by readers of IMDb, often including twists and turns that some readers may consider to be spoilers and may not want to know about if they have not yet seen the film. IMDb places the synopsis on a separate page, with a link on the film's main page using text that advises the reader as follows: "View full synopsis. (warning! may contain spoilers)". The separate Plot synopsis page includes the headline "Warning! This synopsis contains spoilers. See plot summary for non-spoiler summarized description."

The IMDb User's Guide advises user contributors to avoid revealing spoilers outside of the synopsis section where they are covered by the spoiler warning in the page headline. IMDb also provides a spoiler warning template for use when spoilers occur in an unexpected location, for example, according to their help page, when a synopsis includes a spoiler for a different movie. In the IMDb Submission Guide for the "Trivia and Goofs" page section and for their message boards, the guide states that spoilers should be avoided in general in those sections, but that if a spoiler is included, it must be preceded by an announcement, such as using the word "SPOILER:" or their provided spoiler template.

Plot keywords are terms submitted by IMDb contributors that describe objects and occurrences in each film.

Message boards

Among the most used features of the Internet Movie Database are the message boards that accompany every title (excepting TV episodes) and name entry, along with forty-seven main boards. These boards allow registered users to share, discuss and debate information about a movie and the people who worked on it. They were not originally part of the IMDb, but were added only after its purchase by Amazon.com, some time in the year 2000.

Because the IMDb expires older posts from its message boards at varying rates, it is difficult to measure traffic precisely for individual boards, but the Soapbox, the Sandbox, and Oscar Buzz are amongst the highest-traffic boards on IMDb. The Soapbox is a general purpose discussion board, where users can go for "their more heated discussions". The Sandbox is a general purpose, anything-goes board designated for test messages and off-topic posts.

Boards for various political persons (most notably President George W. Bush) have also been used for political discussion. On May 9, 2007, a "Politics" message board was created.

All volunteers who contribute content to the database retain copyright to their contributions but grant full rights to copy, modify, and sublicense the content to IMDb.

Foreign-language films

Although the IMDb is written primarily in English (it also has versions in Brazilian Portuguese, Finnish, French, German, Hungarian, Italian, Polish, Portuguese, Romanian and Spanish), it lists the titles of foreign-language films in their original language. For example, the Japanese anime film Spirited Away is listed under its original Japanese title in romanized form, Sen to Chihiro no kamikakushi (Sen and Chihiro's Spiriting Away). English-speaking readers must look at the "Also Known As" section to find the title by which the film is known in their country, although a search by the English title will find the film.