Tool for finding information, especially on the Internet or World Wide Web. Search engines are essentially massive databases that cover wide swaths of the Internet. Most consist of three parts: at least one program, called a spider, crawler, or bot, which “crawls” through the Internet gathering information; a database, which stores the gathered information; and a search tool, with which users search through the database by typing in keywords describing the information desired (usually at a Web site dedicated to the search engine). Metasearch engines, which search a subset (usually 10 or so) of the huge number of search engines and then compile and index the results, are increasingly being used.
The list of items that meet the criteria specified by the query is typically sorted, or ranked. Ranking items by relevance (from highest to lowest) reduces the time required to find the desired information. Probabilistic search engines rank items based on measures of similarity between each item and the query (typically on a scale from 0 to 1, with 1 being most similar), sometimes combined with popularity or authority (see Bibliometrics) or with relevance feedback. Boolean search engines typically return only items that match the query exactly, without regard to order, although the term Boolean search engine may simply refer to the use of Boolean-style syntax (the operators AND, OR, NOT, and XOR) in a probabilistic context.
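The contrast between ranked and Boolean retrieval can be illustrated with a minimal sketch. It assumes cosine similarity over raw term counts as the similarity measure, which is only one of many possible choices and is not named in the text above; the function names and the sample collection DOCS are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def cosine_similarity(a, b):
    """Cosine similarity between two term-count vectors; 0 to 1 for counts."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def ranked_search(query, docs):
    """Return (doc_id, score) pairs sorted from most to least similar."""
    q = Counter(tokenize(query))
    scored = []
    for doc_id, text in docs.items():
        score = cosine_similarity(q, Counter(tokenize(text)))
        if score > 0:
            scored.append((doc_id, score))
    # Highest score first; ties broken by doc_id for a stable order.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored


def boolean_and_search(query, docs):
    """Return the unordered set of items containing every query term."""
    terms = set(tokenize(query))
    return {d for d, text in docs.items() if terms <= set(tokenize(text))}


# Hypothetical three-item collection, used only for illustration.
DOCS = {
    "a": "web search engine",
    "b": "search engine index and search cache",
    "c": "cooking recipes",
}

if __name__ == "__main__":
    print(ranked_search("search engine", DOCS))
    print(boolean_and_search("search engine", DOCS))
```

Both functions select items "a" and "b" for the query "search engine", but only the ranked version orders them: the shorter item "a" scores higher because a larger share of its terms match the query.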
To quickly provide a set of matching items sorted according to some criteria, a search engine will typically collect metadata about the group of items under consideration beforehand, through a process referred to as indexing. The index typically requires less computer storage than the items themselves, which is why some search engines store only the indexed information and not the full content of each item, and instead provide a method of navigating to the items from the search engine result page. Alternatively, the search engine may store a copy of each item in a cache, so that users can see the state of the item at the time it was indexed, for archival purposes, or to make repetitive processes faster and more efficient.
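One common way to realize such an index is an inverted index, which maps each term to the set of items containing it. The sketch below is an assumption about how indexing might be done, not a description of any particular engine; build_index, lookup, and the example.com URLs are hypothetical names chosen for illustration.

```python
from collections import defaultdict


def build_index(docs):
    """Map each term to the set of item identifiers that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def lookup(index, query):
    """Return identifiers of items containing every query term, sorted."""
    terms = query.lower().split()
    if not terms:
        return []
    # .get() avoids creating empty entries for unknown terms.
    postings = [index.get(term, set()) for term in terms]
    return sorted(set.intersection(*postings))


# Hypothetical items keyed by URL; only the index is kept after building,
# so results point back to the items rather than reproducing them.
DOCS = {
    "https://example.com/a": "search engines build an index",
    "https://example.com/b": "a cache stores a copy of each item",
    "https://example.com/c": "the index is smaller than the cache",
}

if __name__ == "__main__":
    index = build_index(DOCS)
    print(lookup(index, "index"))
    print(lookup(index, "index cache"))
```

Because a lookup touches only the entries for the query terms, the engine does not need to scan every item at query time, and the original text can be discarded or moved to a separate cache.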
Other types of search engines do not store an index. Crawler- or spider-type search engines (also known as real-time search engines) may collect and assess items at the time of the search query, dynamically considering additional items based on the contents of a starting item (known as a seed, or seed URL in the case of an Internet crawler). Metasearch engines store neither an index nor a cache; instead, they reuse the index or results of one or more other search engines to provide an aggregated, final set of results.
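Both index-free approaches can be sketched briefly. The example below makes several assumptions: the "web" is a hypothetical in-memory dictionary PAGES rather than real network fetches, the crawl is breadth-first from the seed, and the metasearch merge scores each item by the sum of its reciprocal ranks across engines, a simple scheme chosen here for illustration and not taken from the text.

```python
from collections import deque

# Hypothetical in-memory "web": identifier -> (page text, outgoing links).
PAGES = {
    "seed": ("search engine basics", ["a", "b"]),
    "a": ("crawler and spider programs", ["c"]),
    "b": ("cooking recipes", []),
    "c": ("spider traps and crawler politeness", ["seed"]),
}


def crawl_and_search(seed, keyword, pages, max_pages=10):
    """Breadth-first crawl from the seed at query time, with no stored index.

    Returns the pages whose text contains the keyword, in visiting order.
    """
    seen = {seed}
    queue = deque([seed])
    hits = []
    visited = 0
    while queue and visited < max_pages:
        url = queue.popleft()
        visited += 1
        text, links = pages.get(url, ("", []))
        if keyword.lower() in text.lower().split():
            hits.append(url)
        # Follow links found on this page, skipping ones already queued.
        for link in links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return hits


def metasearch(result_lists):
    """Merge ranked result lists from several engines into one ranking.

    Assumed scoring: each item earns 1 / (rank + 1) per list it appears in.
    """
    scores = {}
    for results in result_lists:
        for rank, item in enumerate(results):
            scores[item] = scores.get(item, 0.0) + 1.0 / (rank + 1)
    # Highest combined score first; ties broken alphabetically.
    return sorted(scores, key=lambda item: (-scores[item], item))


if __name__ == "__main__":
    print(crawl_and_search("seed", "crawler", PAGES))
    print(metasearch([["x", "y", "z"], ["y", "x"], ["y", "w"]]))
```

The seen set keeps the crawl from looping when pages link back to the seed, and max_pages bounds the work done per query, which matters because nothing is precomputed.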