The Latin alphabet, used for English, was originally written in scripta continua, without any word separators. Interpuncts (centred dots) were later added to make reading easier, and these were replaced by spaces from roughly AD 600–800. In typesetting, spaces have historically come in several widths, each used for a specific typographic purpose, such as separating words, separating sentences, or separating punctuation from words. Following the invention of the typewriter, and the later overlap of designer style preferences with the limitations of computer technology, much of this reader-centric variation has been lost in normal use.
In computer representation of text, spaces of various sizes, styles, or language characteristics (different space characters) are indicated with unique code points.
Modern English uses a space to separate words, but not all languages follow this practice. Spaces were not used to separate words in Latin until roughly AD 600–800. Ancient Hebrew and Arabic did use spaces, partly to compensate in clarity for the lack of vowels. Traditionally, the CJK languages were written without spaces: modern Chinese and Japanese (except when written with little or no kanji) still are, but modern Korean uses spaces.
There are three main conventions relating to the number of spaces used to separate sentences within the same paragraph:
Double spacing can also refer to a style of line spacing: the insertion of a full additional empty line between lines of text. This is commonly used for text which may incorporate later markup or modifications, such as proof-readers' copies, legal documents, or academic assignments for correction.
In addition to this general-purpose space, it is possible to encode a space of a specific width. See the table below for a complete list.
In monospaced proofreading copy, only em- and en-spaces are represented using this character (which is called an em-quad or an en-quad), while other types of spaces are represented with a number sign.
|Normal space||left right|
|Normal space with em dash||left — right|
|Hair space with em dash||left — right|
|No space with em dash||left—right|
|Code||No break||Character||Name||In Block||Display||Description|
|U+0020|| || ||Space||Basic Latin||] [||Normal space, same as ASCII character 0x20|
|U+00A0||✓|| ||No-Break Space||Latin-1 Supplement||] [||Identical to U+0020, but not a point at which a line may be broken|
|U+1680|| || ||Ogham Space Mark||Ogham||] [||Used for interword separation in Ogham text. Normally a vertical line in vertical text or a horizontal line in horizontal text, but may also be a blank space in "stemless" fonts. Requires an Ogham font.|
|U+180E|| ||᠎||Mongolian Vowel Separator||Mongolian||]᠎[||A thin space character used in Mongolian to cause the final two characters of a word to take on different shapes.|
|U+2002|| || ||En Space||General Punctuation||] [||Width of one en (half of one em). U+2000 En Quad is canonically equivalent to this character (En Space is preferred).|
|U+2003|| || ||Em Space||General Punctuation||] [||Width of one em. U+2001 Em Quad is canonically equivalent to this character (Em Space is preferred).|
|U+2004|| || ||Three-Per-Em Space, or Thick Space||General Punctuation||] [||One third of an em wide|
|U+2005|| || ||Four-Per-Em Space, or Mid Space||General Punctuation||] [||One fourth of an em wide|
|U+2006|| || ||Six-Per-Em Space||General Punctuation||] [||One sixth of an em wide. In computer typography sometimes equated to U+2009.|
|U+2007||✓|| ||Figure Space||General Punctuation||] [||In fonts with monospaced digits, equal to the width of one digit|
|U+2008|| || ||Punctuation Space||General Punctuation||] [||As wide as the narrow punctuation in a font|
|U+2009|| || ||Thin Space||General Punctuation||] [||One fifth (sometimes one sixth) of an em wide|
|U+200A|| || ||Hair Space||General Punctuation||] [||Thinner than a thin space|
|U+200B|| ||​||Zero Width Space||General Punctuation||]​[||Used to indicate word boundaries to text processing systems when using scripts that do not use explicit spacing; normally not a visible separation, but it may expand in passages that are fully justified. In HTML pages this space can be used as a potential line break in long words.|
|U+200C|| ||‌||Zero Width Non Joiner, ZWNJ||General Punctuation||]‌[||When placed between two characters that would otherwise be connected, a ZWNJ causes them to be printed in their final and initial forms, respectively.|
|U+200D|| ||‍||Zero Width Joiner, ZWJ||General Punctuation||]‍[||When placed between two characters that would otherwise not be connected, a ZWJ causes them to be printed in their connected forms.|
|U+202F||✓|| ||Narrow No-Break Space||General Punctuation||] [||Similar to U+00A0 No-Break Space|
|U+205F|| || ||Medium Mathematical Space||General Punctuation||] [||Used in mathematical formulae|
|U+2060||✓||⁠||Word Joiner||General Punctuation||]⁠[||Identical to U+200B, but not a point at which a line may be broken. Introduced in Unicode 3.2 to replace the deprecated "zero width no-break space" function of the U+FEFF character.|
|U+3000|| ||　||Ideographic Space||CJK Symbols and Punctuation||]　[||As wide as a CJK character cell (fullwidth)|
|U+FEFF||✓||﻿||Zero Width No-Break Space = Byte Order Mark (BOM)||Arabic Presentation Forms-B||]﻿[||Used primarily as a Byte Order Mark character. Use as an indication of non-breaking is deprecated as of Unicode 3.2. See U+2060 instead.|
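The properties in the table can be inspected programmatically. The following is a minimal sketch in Python, assuming only the standard `unicodedata` module; the helper name `describe` and the selection of sample code points are illustrative choices, not part of any standard.

```python
import unicodedata

# A few space-like code points from the table (the selection is illustrative).
SAMPLES = ["\u0020", "\u00A0", "\u2003", "\u2009", "\u200B", "\u3000"]

def describe(ch):
    """Return (code point label, Unicode name, general category, str.isspace())."""
    return (
        f"U+{ord(ch):04X}",
        unicodedata.name(ch),
        unicodedata.category(ch),
        ch.isspace(),
    )

for ch in SAMPLES:
    print(describe(ch))

# U+2000 En Quad is canonically equivalent to U+2002 En Space,
# so Unicode normalization maps one to the other.
print(unicodedata.normalize("NFC", "\u2000") == "\u2002")
```

Note that the zero-width space is reported as a format character rather than a space separator, so `str.isspace()` returns `False` for it even though its name contains the word "space".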
Unicode also provides some visible characters to stand in for space when necessary in the "Control Pictures" block: the Symbol For Space ␠ (U+2420), the Blank Symbol ␢ (U+2422), and the Open Box ␣ (U+2423). The interpunct · is also often used to represent a space in word processing programs such as Microsoft Word.
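As a rough illustration of how a program might make spaces visible, the sketch below substitutes the Open Box character for each ordinary space. The function name `show_spaces` is hypothetical, and real editors typically offer this as a display setting rather than by altering the text.

```python
OPEN_BOX = "\u2423"  # the Open Box character from the Control Pictures block

def show_spaces(text, marker=OPEN_BOX):
    """Replace each U+0020 space with a visible marker.

    Other space characters (for example U+00A0) are left untouched.
    """
    return text.replace(" ", marker)

print(show_spaces("left  right"))  # prints left␣␣right
```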
In programming language syntax, spaces are frequently used to explicitly separate tokens. Aside from this use, spaces and other whitespace characters are usually ignored by modern programming languages. Exceptions are Haskell, ABC, and Python, which use the amount of whitespace in indentation to indicate the bounds of a block, and a whimsical language called Whitespace, where whitespace is the only meaningful syntactical element.
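The difference between ignorable and significant whitespace can be seen directly in Python. The sketch below uses the built-in `compile` function to check whether small source strings are accepted; the helper name `compiles` and the sample strings are made up for this example.

```python
def compiles(source):
    """Return True if Python accepts the source, False on an indentation error."""
    try:
        compile(source, "<example>", "exec")
        return True
    except IndentationError:
        return False

# Extra spaces between tokens are ignored.
SPACED = "x   =   1\n"

# Leading spaces are significant: they mark the body of the block.
INDENTED = "if True:\n    x = 1\n"
FLAT = "if True:\nx = 1\n"

print(compiles(SPACED))    # True
print(compiles(INDENTED))  # True
print(compiles(FLAT))      # False
```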
Text editors, word processors, and desktop publishing software differ in how they represent whitespace on the screen, and how they represent spaces at the ends of lines longer than the screen or column width. In some cases, spaces are shown simply as blank space; in other cases they may be represented by an interpunct or other symbols. Many different characters (listed in the table above) can be used to produce spaces, and non-character functions (such as margins and tab settings) can also affect whitespace.
Special-purpose markup languages, however, may handle whitespace differently. In particular, web markup languages such as XML and HTML treat whitespace characters, including space characters, specially for programmers' convenience. One or more space characters read by conforming display-time processors of those markup languages are collapsed to 0 or 1 space, depending on their semantic context. For example, double (or more) spaces within text are collapsed to a single space, and spaces which appear on either side of the "=" that separates an attribute name from its value have no effect on the interpretation of the document. Element end tags can contain trailing spaces, and empty-element tags in XML can contain spaces before the "/>".
In XML attribute values, sequences of whitespace characters are treated as a single space when the document is read by a parser. Whitespace in XML element content is not changed in this way by the parser, but an application receiving information from the parser may choose to apply similar rules to element content. An XML document author can use the xml:space="preserve" attribute on an element to instruct the parser to discourage the downstream application from altering whitespace in that element's content.
In most HTML elements, a sequence of whitespace characters is treated as a single inter-word separator, which may manifest as a single space character when rendering text in a language that normally inserts such space between words. Conforming HTML renderers are required to apply a more literal treatment of whitespace within a few prescribed elements, such as the pre tag and any element for which CSS has been used to apply pre-like whitespace processing. In such elements, space characters will not be "collapsed" into inter-word separators.
In both XML and HTML, the non-breaking space character, along with other non-"standard" spaces, is not treated as collapsible "whitespace", so it is not subject to the rules above.
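The collapsing behaviour and the tolerance for spaces inside tags can be approximated in a few lines of Python. This is only a sketch under stated assumptions: `collapse_whitespace` is a hypothetical helper that imitates what an HTML renderer does with ordinary inline text (real renderers apply further rules, for example at the start and end of lines), and the XML sample is invented for the example. The parsing itself uses the standard `xml.etree.ElementTree` module.

```python
import re
import xml.etree.ElementTree as ET

# Characters treated here as collapsible: space, tab, line feed, form feed,
# carriage return. The no-break space (U+00A0) is deliberately not included.
_COLLAPSIBLE = re.compile(r"[ \t\n\f\r]+")

def collapse_whitespace(text):
    """Collapse each run of collapsible whitespace into a single space."""
    return _COLLAPSIBLE.sub(" ", text)

print(collapse_whitespace("left  \n\t right"))      # left right
print(collapse_whitespace("left\u00A0\u00A0right")  # no-break spaces survive
      == "left\u00A0\u00A0right")                   # True

# Spaces around "=" and before the closing ">" do not change the parsed result,
# and the parser leaves whitespace in element content as it found it.
elem = ET.fromstring('<item key = "v"  >two  spaces</item  >')
print(elem.attrib)  # {'key': 'v'}
print(elem.text)    # two  spaces

# An empty-element tag may have spaces before "/>".
empty = ET.fromstring('<item key="v"   />')
print(empty.attrib)  # {'key': 'v'}
```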