As an expression of a schema, a DTD specifies, in effect, the syntax of an "application" of SGML or XML, such as the derivative language HTML or XHTML. This syntax is usually a less general form of the syntax of SGML or XML.
In a DTD, the structure of a class of documents is described via element and attribute-list declarations. Element declarations name the allowable set of elements within the document, and specify whether and how declared elements and runs of character data may be contained within each element. Attribute-list declarations name the allowable set of attributes for each declared element, and give for each attribute either the type of its value or an explicit set of valid values.
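As an illustration of what an attribute-list declaration carries, the following sketch uses Python's standard xml.parsers.expat module to report the declarations found in a document's internal subset. The person element and its id, gender and status attributes are hypothetical names chosen for this example, not part of any DTD discussed here.

```python
import xml.parsers.expat

# Hypothetical document: the internal subset declares one element and
# three attributes. All names here are illustrative assumptions.
DOC = """<!DOCTYPE person [
  <!ELEMENT person (#PCDATA)>
  <!ATTLIST person
      id      ID             #REQUIRED
      gender  (male|female)  #IMPLIED
      status  CDATA          "active">
]>
<person id="p1">Fred Bloggs</person>"""


def declared_attributes(doc):
    """Return {attribute name: (type, default, required)} as reported by expat."""
    found = {}

    def on_attlist(element, attribute, att_type, default, required):
        # Called once per attribute definition in an ATTLIST declaration.
        found[attribute] = (att_type, default, bool(required))

    parser = xml.parsers.expat.ParserCreate()
    parser.AttlistDeclHandler = on_attlist
    parser.Parse(doc, True)
    return found


attributes = declared_attributes(DOC)
for name, (att_type, default, required) in attributes.items():
    print(name, att_type, default, required)
```

The id attribute has a declared type (ID), gender is restricted to an explicit set of values, and status has a type together with a default value.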
A DTD is associated with an XML document via a Document Type Declaration, which is a markup declaration that appears near the start of the XML document, before the root element. The declaration establishes that the document is an instance of the type defined by the referenced DTD.
The declarations in a DTD are divided into an internal subset and an external subset. The declarations in the internal subset are embedded in the Document Type Declaration in the document itself. The declarations in the external subset are located in a separate text file. The external subset may be referenced via a public identifier and/or a system identifier. Programs for reading documents may not be required to read the external subset.
Here is an example of a Document Type Declaration containing both public and system identifiers:
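One such declaration is the one used by XHTML 1.0 Transitional documents. The sketch below embeds it in a minimal document (the document body is an assumption added only so the parser has something to read) and uses Python's xml.parsers.expat module to pull out the public and system identifiers. The helper name doctype_info is hypothetical.

```python
import xml.parsers.expat

# The XHTML 1.0 Transitional Document Type Declaration, followed by a
# minimal illustrative document body.
DOC = """<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Example</title></head>
  <body><p>Hello</p></body>
</html>"""


def doctype_info(doc):
    """Return the name and identifiers given in the Document Type Declaration."""
    info = {}

    def on_doctype(name, system_id, public_id, has_internal_subset):
        info.update(
            name=name,
            public_id=public_id,
            system_id=system_id,
            has_internal_subset=bool(has_internal_subset),
        )

    parser = xml.parsers.expat.ParserCreate()
    parser.StartDoctypeDeclHandler = on_doctype
    # expat does not fetch the external subset unless asked to, so this
    # call performs no network access.
    parser.Parse(doc, True)
    return info


info = doctype_info(DOC)
print(info)
```

Because this declaration has no bracketed section, the parser reports that there is no internal subset; every declaration lives in the external subset named by the two identifiers.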
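All HTML 4.01 documents are expected to conform to one of three SGML DTDs. The public identifiers of these DTDs are constant and are as follows:
"-//W3C//DTD HTML 4.01//EN"
"-//W3C//DTD HTML 4.01 Transitional//EN"
"-//W3C//DTD HTML 4.01 Frameset//EN"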
The system identifiers of these DTDs, if present in the Document Type Declaration, will be URI references. System identifiers can vary, but are expected to point to a specific set of declarations in a resolvable location. SGML allows for public identifiers to be mapped to system identifiers in catalogs that are optionally made available to the URI resolvers used by document parsing software.
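The catalog idea can be sketched as a simple lookup table. The example below is a simplified illustration, not an implementation of the SGML catalog format: the names HTML401_CATALOG and resolve_dtd are hypothetical, and preferring the catalog entry over the system identifier given in the document is an assumption (real catalogs make this configurable).

```python
# Hypothetical in-memory catalog mapping the HTML 4.01 public identifiers
# to the system identifiers published by the W3C.
HTML401_CATALOG = {
    "-//W3C//DTD HTML 4.01//EN": "http://www.w3.org/TR/html4/strict.dtd",
    "-//W3C//DTD HTML 4.01 Transitional//EN": "http://www.w3.org/TR/html4/loose.dtd",
    "-//W3C//DTD HTML 4.01 Frameset//EN": "http://www.w3.org/TR/html4/frameset.dtd",
}


def resolve_dtd(public_id, system_id=None, catalog=HTML401_CATALOG):
    """Pick a location for a DTD.

    Assumption: a catalog entry for the public identifier wins; otherwise
    the system identifier from the document is used as given.
    """
    if public_id in catalog:
        return catalog[public_id]
    if system_id is not None:
        return system_id
    raise LookupError(f"no location known for {public_id!r}")


print(resolve_dtd("-//W3C//DTD HTML 4.01//EN"))
print(resolve_dtd("-//Example//DTD Unknown//EN", "local/unknown.dtd"))
```

A local catalog of this kind lets parsing software use a cached copy of a well-known DTD instead of retrieving it from the network each time.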
The XML DTD syntax is one of several XML schema languages.
A common misconception is that non-validating XML parsers are not required to read DTDs, when in fact, the DTD must still be scanned for correct syntax as well as for declarations of entities and default attributes. A non-validating parser may, however, elect not to read external entities, including the external subset of the DTD. If the XML document depends on declarations found only in external entities, it should assert standalone="no" in its XML declaration.
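The effect of the internal subset on a non-validating parser can be observed with expat, the non-validating parser behind Python's xml.parsers.expat module. In this sketch the note element, its lang attribute and the signature entity are hypothetical; the point is only that the default attribute value and the entity replacement text both come from declarations the parser had to read.

```python
import xml.parsers.expat

# Hypothetical document: the internal subset supplies a default attribute
# value and an internal entity. No external entity is referenced, so the
# document can declare itself standalone.
DOC = """<?xml version="1.0" standalone="yes"?>
<!DOCTYPE note [
  <!ELEMENT note (#PCDATA)>
  <!ATTLIST note lang CDATA "en">
  <!ENTITY signature "Fred Bloggs">
]>
<note>Regards, &signature;</note>"""


def read_root(doc):
    """Parse without validation; return the root's attributes and text."""
    result = {"attributes": None, "text": []}

    def on_start(name, attributes):
        if result["attributes"] is None:
            result["attributes"] = dict(attributes)

    def on_text(data):
        result["text"].append(data)

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = on_start
    parser.CharacterDataHandler = on_text
    parser.Parse(doc, True)
    return result["attributes"], "".join(result["text"])


attributes, text = read_root(DOC)
print(attributes, text)
```

The note start tag carries no attributes in the document itself, yet the parser reports lang with its declared default, and the entity reference is replaced by its declared text.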
The syntaxes of SGML and XML DTDs are very similar, but not identical.
A very simple XML DTD to describe a list of persons can be built from six element declarations. Taking these declarations one by one, the DTD says:
people_list is a valid element name, and an instance of such an element contains any number of person elements. The * denotes there can be 0 or more person elements within the people_list element.
person is a valid element name, and an instance of such an element contains one element named name, followed by one named birthdate (optional), then gender (also optional) and socialsecuritynumber (also optional). The ? indicates that an element is optional. The reference to the name element name has no ?, so a person element must contain a name element.
name is a valid element name, and an instance of such an element contains "parsed character data" (#PCDATA).
birthdate is a valid element name, and an instance of such an element contains parsed character data.
gender is a valid element name, and an instance of such an element contains parsed character data.
socialsecuritynumber is a valid element name, and an instance of such an element contains parsed character data.
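The declarations described above can be written out and exercised as follows. The DTD text is reconstructed from the description, and the checker is a deliberately simplified sketch: it assumes content models that are flat, comma-separated sequences (as in this DTD), turns each one into a regular expression over child element names, and ignores character data and attributes. It is not a validating parser, and the function names are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# DTD reconstructed from the prose description of its six declarations.
PEOPLE_DTD = """\
<!ELEMENT people_list (person*)>
<!ELEMENT person (name, birthdate?, gender?, socialsecuritynumber?)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT birthdate (#PCDATA)>
<!ELEMENT gender (#PCDATA)>
<!ELEMENT socialsecuritynumber (#PCDATA)>
"""

ELEMENT_DECL = re.compile(r"<!ELEMENT\s+(\S+)\s+\(([^)]*)\)>")


def content_models(dtd):
    """Map each declared element to a regex over its sequence of child names."""
    models = {}
    for name, model in ELEMENT_DECL.findall(dtd):
        if model.strip() == "#PCDATA":
            models[name] = re.compile("")  # no child elements allowed
            continue
        parts = []
        for item in model.split(","):
            item = item.strip()
            occurrence = item[-1] if item[-1] in "?*+" else ""
            child = item.rstrip("?*+")
            # Each child name is followed by a space acting as a separator.
            parts.append(f"(?:{re.escape(child)} ){occurrence}")
        models[name] = re.compile("".join(parts))
    return models


def conforms(element, models):
    """Check element nesting against the models (structure only)."""
    model = models.get(element.tag)
    if model is None:
        return False  # element was never declared
    children = "".join(child.tag + " " for child in element)
    return bool(model.fullmatch(children)) and all(
        conforms(child, models) for child in element
    )


MODELS = content_models(PEOPLE_DTD)
good = ET.fromstring(
    "<people_list><person><name>Fred Bloggs</name>"
    "<gender>Male</gender></person></people_list>"
)
missing_name = ET.fromstring(
    "<people_list><person><gender>Male</gender></person></people_list>"
)
print(conforms(good, MODELS), conforms(missing_name, MODELS))
```

The second document is rejected because its person element lacks the mandatory name child, which mirrors the role of the missing ? in the person declaration.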
An XML file that makes use of and conforms to this DTD references it from its Document Type Declaration. Here the DTD is assumed to be identifiable by the relative URI reference "example.dtd", and the "people_list" after "!DOCTYPE" names the root element of the document. For the document to be valid, its root element must have that name, whether or not it is the first element declared in the DTD.
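A document of this kind, with illustrative data for a single person, is embedded in the sketch below. The sketch uses Python's xml.parsers.expat module to confirm that the name given after !DOCTYPE matches the root element; the helper name doctype_and_root is hypothetical, and the external file example.dtd is not read by this code.

```python
import xml.parsers.expat

# Illustrative instance document; the person data is made up for the example.
PEOPLE_XML = """<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE people_list SYSTEM "example.dtd">
<people_list>
  <person>
    <name>Fred Bloggs</name>
    <birthdate>2008-11-27</birthdate>
    <gender>Male</gender>
  </person>
</people_list>
"""


def doctype_and_root(doc):
    """Return the declared document type name, system identifier and root tag."""
    seen = {"doctype": None, "system_id": None, "root": None}

    def on_doctype(name, system_id, public_id, has_internal_subset):
        seen["doctype"] = name
        seen["system_id"] = system_id

    def on_start(name, attributes):
        if seen["root"] is None:  # the first start tag is the root element
            seen["root"] = name

    parser = xml.parsers.expat.ParserCreate()
    parser.StartDoctypeDeclHandler = on_doctype
    parser.StartElementHandler = on_start
    parser.Parse(doc.encode("utf-8"), True)
    return seen


seen = doctype_and_root(PEOPLE_XML)
print(seen["doctype"] == seen["root"], seen["system_id"])
```

The declaration states standalone="no" because the element declarations live only in the external subset.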
It is possible to render this in an XML-enabled browser (such as IE5 or Mozilla) by pasting and saving the DTD component above to a text file named example.dtd and the XML file to a differently-named text file, and opening the XML file with the browser. The files should both be saved in the same directory. However, many browsers do not check that an XML document conforms to the rules in the DTD; they are only required to check that the DTD is syntactically correct. For security reasons, they may also choose not to read the external DTD.
Several alternatives to DTDs have since become available: