MIME's use, however, has grown beyond describing the content of e-mail to describing content type in general.
Virtually all human-written Internet e-mail and a fairly large proportion of automated e-mail is transmitted via SMTP in MIME format. Internet e-mail is so closely associated with the SMTP and MIME standards that it is sometimes called SMTP/MIME e-mail.
The content types defined by MIME standards are also of importance outside of e-mail, such as in communication protocols like HTTP for the World Wide Web. HTTP requires that data be transmitted in the context of e-mail-like messages, even though the data may not actually be e-mail.
MIME is specified in six linked RFC memoranda: RFC 2045, RFC 2046, RFC 2047, RFC 4288, RFC 4289 and RFC 2077, which together define the specifications.
MIME defines mechanisms for sending other kinds of information in e-mail. These include text in languages other than English using character encodings other than ASCII, and 8-bit binary content such as files containing images, sounds, movies, and computer programs. MIME is also a fundamental component of communication protocols such as , which requires that data be transmitted in the context of e-mail-like messages even though the data might not fit this context. Mapping messages into and out of MIME format is typically done automatically by an e-mail client or by mail servers when sending or receiving Internet (SMTP/MIME) e-mail.
The basic format of Internet e-mail is defined in RFC 2822, which is an updated version of RFC 822. These standards specify the familiar formats for text e-mail headers and body and rules pertaining to commonly used header fields such as "To:", "Subject:", "From:", and "Date:". MIME defines a collection of e-mail headers for specifying additional attributes of a message including content type, and defines a set of transfer encodings which can be used to represent 8-bit binary data using characters from the 7-bit ASCII character set. MIME also specifies rules for encoding non-ASCII characters in e-mail message headers, such as "Subject:", allowing these header fields to contain non-English characters.
MIME is extensible. Its definition includes a method to register new content types and other MIME attribute values.
The goals of the MIME definition included requiring no changes to extant e-mail servers, and allowing plain text e-mail to function in both directions with extant clients. These goals were achieved by using additional RFC 822-style headers for all MIME message attributes and by making the MIME headers optional with default values ensuring a non-MIME message is interpreted correctly by a MIME-capable client. A simple MIME text message is therefore likely to be interpreted correctly by a non-MIME client although it has e-mail headers the non-MIME client won't know how to interpret. Similarly, if the quoted printable transfer encoding (see below) is used, the ASCII part of the message will be intelligible to users with non-MIME clients.
It should be noted that implementers have attempted to change the version number in the past and the change had unforeseen results. It was decided at an IETF meeting to leave the version number as is even though there have been many updates and versions of MIME.
Through the use of the multipart type, MIME allows messages to have parts arranged in a tree structure where the leaf nodes are any non-multipart content type and the non-leaf nodes are any of a variety of multipart types. This mechanism supports:
The following example is taken from RFC 2183, where the header is defined
Content-Disposition: attachment; filename=genome.jpeg;
modification-date="Wed, 12 Feb 1997 16:29:51 -0500";
The filename may be encoded as defined by RFC 2231. Besides attachment, one can specify inline, or any other disposition type. Unfortunately, no name is defined for the nominal "default" disposition that corresponds to no content-disposition being present. Thus the recommended practice for generating agents is to only include filename information when it is necessary, also to avoid leaking sensitive information. If filename information has to be included, an agent should either put it in a filename= parameter or both a filename= and name= parameter. Never ever use just a name= parameter because that opens up to gratuitous interpretation of the part using an unintended disposition value.
The RFC and the IANA's list of transfer encodings define the values shown below, which are not case sensitive. Note that '7bit', '8bit', and 'binary' mean that no binary-to-text encoding on top of the original encoding was used. In these cases, the header is actually redundant for the email client to decode the message body, but it may still be useful as an indicator of what type of object is being sent. Values 'quoted-printable' and 'base64' tell the email client that a binary-to-text encoding scheme was used and that appropriate initial decoding is necessary before the message can be read with its original encoding (e.g. UTF-8).
There is no encoding defined which is explicitly designed for sending arbitrary binary data through SMTP transports with the 8BITMIME extension. Thus base64 or quoted-printable (with their associated inefficiency) must sometimes still be used. This restriction does not apply to other uses of MIME such as Web Services with MIME attachments or MTOM
The form is: "
Q" denoting Q-encoding that is similar to the quoted-printable encoding, or "
B" denoting base64 encoding.
Difference between Q-encoding and quoted-printable
The ASCII codes for the question mark (?) and equals sign may not be represented directly as they are used to delimit the encoded-word. The ASCII code for space may not be represented directly because it could cause older parsers to split up the encoded word undesirably. To make the encoding smaller and easier to read the underscore is used to represent the ASCII code for space creating the side effect that underscore cannot be represented directly. Use of encoded words in certain parts of headers imposes further restrictions on which characters may be represented directly.
is interpreted as "Subject: ¡Hola, señor!".
The encoded-word format is not used for the names of the headers (for example
Subject). These header names are always in English in the raw message. When viewing a message with a non-English e-mail client, the header names are usually translated by the client.
Content-type: multipart/mixed; boundary="frontier"
This is a message with multiple parts in MIME format.
This is the body of the message.
Each part consists of its own content header (zero or more Content- header fields) and a body. Multipart content can be nested. The content-transfer-encoding of a multipart type must always be "7bit", "8bit" or "binary" to avoid the complications that would be posed by multiple levels of decoding. The multipart block as a whole does not have a charset; non-ASCII characters in the part headers are handled by the Encoded-Word system, and the part bodies can have charsets specified if appropriate for their content-type.
The RFC initially defined 4 subtypes: mixed, digest, alternative and parallel. A minimally compliant application must support mixed and digest; other subtypes are optional. Additional subtypes, such as signed and form-data, have since been separately defined in other RFCs.
The following is a list of the most commonly used subtypes; it is not intended to be a comprehensive list.
Defined in RFC 2046, Section 5.1.3
Defined in RFC 2046.
Defined in RFC 2231, Section 5.1.5
Since a client is unlikely to want to send a version that is less faithful than the plain text version this structure places the plain text version (if present) first. This makes life easier for users of clients that do not understand multipart messages.
Most commonly multipart/alternative is used for email with two parts, one plain text (text/plain) and one HTML (text/html). The plain text part provides backwards compatibility while the HTML part allows use of formatting and hyperlinks. Most email clients offer a user option to prefer plain text over HTML; this is an example of how local factors may affect how an application chooses which "best" part of the message to display.
While it is intended that each part of the message represent the same content, the standard does not require this to be enforced in any way. At one time, anti-spam filters would only examine the text/plain part of a message, because it is easier to parse than the text/html part. But spammers eventually took advantage of this, creating messages with an innocuous-looking text/plain part and advertising in the text/html part. Anti-spam software eventually caught up on this trick, penalizing messages with very different text in a multipart/alternative message.
Defined in RFC 2046, Section 5.1.4
One common usage of this subtype is to send a web page complete with images in a single message. The root part would contain the HTML document, and use image tags to reference images stored in the latter parts.
Defined in RFC 2387
Defined in RFC 3462
Defined in RFC 1847, Section 2.1
Defined in RFC 1847, Section 2.2
Defined in RFC 2388
All parts of a mixed-replace message have the same semantic meaning. However, each part invalidates - "replaces" - the previous parts as soon as it is received completely. Clients should process the individual parts as soon as they arrive and should not wait for the whole message to finish.
Originally developed by Netscape, it is still supported by Mozilla, Firefox, Safari (but not in Safari on the iPhone) and Opera, but traditionally ignored by Microsoft. It is commonly used in IP cameras as the MIME type for MJPEG streams.