Definitions

# Numeral system

A numeral system (or system of numeration) is a mathematical notation for representing numbers of a given set by symbols in a consistent manner. It can be seen as the context that allows the numeral "11" to be interpreted as the binary numeral for three, the decimal numeral for eleven, or other numbers in different bases.

Ideally, a numeral system will:

• Represent a useful set of numbers (e.g. all whole numbers, integers, or real numbers)
• Give every number represented a unique representation (or at least a standard representation)
• Reflect the algebraic and arithmetic structure of the numbers.

For example, the usual decimal representation of whole numbers gives every whole number a unique representation as a finite sequence of digits, with the operations of arithmetic (addition, subtraction, multiplication and division) being present as the standard algorithms of arithmetic. However, when decimal representation is used for the rational or real numbers, the representation is no longer unique: many rational numbers have two numerals, a standard one that terminates, such as 2.31, and another that recurs, such as 2.309999999... . Numerals which terminate have no non-zero digits after a given position. For example, numerals like 2.31 and 2.310 are taken to be the same, except in the experimental sciences, where greater precision is denoted by the trailing zero.

Numeral systems are sometimes called number systems, but that name is misleading, as it could refer to different systems of numbers, such as the system of real numbers, the system of complex numbers, the system of p-adic numbers, etc. Such systems are not the topic of this article.

## Types of numeral systems

The most commonly used system of numerals is known as Hindu-Arabic numerals, and two great Indian mathematicians could be given credit for developing them. Aryabhatta of Kusumapura who lived during the 5th century developed the place value notation and Brahmagupta a century later introduced the symbol zero.

The simplest numeral system is the unary numeral system, in which every natural number is represented by a corresponding number of symbols. If the symbol / is chosen, for example, then the number seven would be represented by ///////. Tally marks represent one such system still in common use. In practice, the unary system is normally only useful for small numbers, although it plays an important role in theoretical computer science. Also, Elias gamma coding which is commonly used in data compression expresses arbitrary-sized numbers by using unary to indicate the length of a binary numeral.

The unary notation can be abbreviated by introducing different symbols for certain new values. Very commonly, these values are powers of 10; so for instance, if / stands for one, - for ten and + for 100, then the number 304 can be compactly represented as +++ //// and number 123 as + - - /// without any need for zero. This is called sign-value notation. The ancient Egyptian system is of this type, and the Roman system is a modification of this idea.

More useful still are systems which employ special abbreviations for repetitions of symbols; for example, using the first nine letters of our alphabet for these abbreviations, with A standing for "one occurrence", B "two occurrences", and so on, we could then write C+ D/ for the number 304. The numeral system of English is of this type ("three hundred [and] four"), as are those of virtually all other spoken languages, regardless of what written systems they have adopted.

More elegant is a positional system, also known as place-value notation. Again working in base 10, we use ten different digits 0, ..., 9 and use the position of a digit to signify the power of ten that the digit is to be multiplied with, as in 304 = 3×100 + 0×10 + 4×1. Note that zero, which is not needed in the other systems, is of crucial importance here, in order to be able to "skip" a power. The Hindu-Arabic numeral system, borrowed from India, is a positional base 10 system; it is used today throughout the world.

Arithmetic is much easier in positional systems than in the earlier additive ones; furthermore, additive systems have a need for a potentially infinite number of different symbols for the different powers of 10; positional systems need only 10 different symbols (assuming that it uses base 10).

The numerals used when writing numbers with digits or symbols can be divided into two types that might be called the arithmetic numerals 0,1,2,3,4,5,6,7,8,9 and the geometric numerals 1,10,100,1000,10000... respectively. The sign-value systems use only the geometric numerals and the positional system use only the arithmetic numerals. The sign-value system does not need arithmetic numerals because they are made by repetition (except for the Ionic system), and the positional system does not need geometric numerals because they are made by position. However, the spoken language uses both arithmetic and geometric numerals.

In certain areas of computer science, a modified base-k positional system is used, called bijective numeration, with digits 1, 2, ..., k (k ≥ 1), and zero being represented by the empty string. This establishes a bijection between the set of all such digit-strings and the set of non-negative integers, avoiding the non-uniqueness caused by leading zeros. Bijective base-k numeration is also called k-adic notation, not to be confused with p-adic numbers. Bijective base-1 the same as unary.

## Bases used

### Computing

Switches, mimicked by their electronic successors built originally of vacuum tubes and in modern technology of transistors, have only two possible states: "open" and "closed". Substituting open=1 and closed=0 (or the other way around) yields the entire set of binary digits. This base-2 system (binary) is the basis for digital computers. It is used to perform integer arithmetic in almost all digital computers; some exotic base-3 (ternary) and base-10 computers have also been built, but those designs were discarded early in the history of computing hardware.

Modern computers use transistors that represent two states with either high or low voltages. The smallest unit of memory for this binary state is called a bit. Bits are arranged in groups to aid in processing, and to make the binary numbers shorter and more manageable for humans. More recently these groups of bits, such as bytes and words, are sized in multiples of four. Thus base 16 (hexadecimal) is commonly used as shorthand. Base 8 (octal) has also been used for this purpose.

A computer does not treat all of its data as numerical. For instance, some of it may be treated as program instructions or data such as text. However, arithmetic and Boolean logic constitute most internal operations. Whole numbers are represented exactly, as integers. Real numbers, allowing fractional values, are usually approximated as floating point numbers. The computer uses different methods to do arithmetic with these two kinds of numbers.

### Five

A base-5 system (quinary) has been used in many cultures for counting. Plainly it is based on the number of fingers on a human hand. It may also be regarded as a sub-base of other bases, such as base 10 and base 60.

### Eight

A base-8 system (octal) was devised by the Yuki of Northern California, who used the spaces between the fingers to count, corresponding to the digits one through eight. There is also linguistic evidence which suggests that the Bronze Age Proto-Indo Europeans (from whom most European and Indic languages descend) might have replaced a base 8 system (or a system which could only count up to 8) with a base 10 system. The evidence is that the word for 9, newm, is suggested by some to derive from the word for 'new', newo-, suggesting that the number 9 had been recently invented and called the 'new number' (Mallory & Adams 1997).

### Ten

The base-10 system (decimal) is the one most commonly used today. It is assumed to have originated because humans have ten fingers. These systems often use a larger superimposed base. See Decimal superbase.

### Twelve

Base-12 systems (duodecimal or dozenal) have been popular because multiplication and division are easier than in base-10, with addition and subtracting being just as easy. 12 is a useful base because it has many factors. It is the smallest multiple of one, two, three, four and six. There is still a special word for "dozen" and just like there is a word for 102, hundred, there is also a word for 122, gross. Base-12 could have originated from the number of knuckles in the four fingers of a hand excluding the thumb, which is used as a pointer in counting.

There are 24 hours per day, usually counted till 12 until noon (p.m.) and once again until midnight (a.m.), often further divided per 6 hours in counting (for instance in Thailand) or as switches between using terms like 'night', 'morning', 'afternoon', and 'evening', whereas other languages use such terms with durations of 3 to 9 hours often according to switches at some of the 3 hour interval marks.

Multiples of 12 have been in common use as English units of resolution in the analog and digital printing world, where 1 point equals 1/72 of an inch and 12 points equal 1 pica, and printer resolutions like 360, 600, 720, 1200 or 1440 dpi (dots per inch) are common. These are combinations of base-12 and base-10 factors: (3×12)×10, (5×12)×10, (6×12)×10, (10×12)×10 and (12×12)×10.

### Twenty

The Maya civilization and other civilizations of Pre-Columbian Mesoamerica used base-20 (vigesimal), possibly originating from the number of a person's fingers and toes. Evidence of base-20 counting systems is also found in the languages of central and western Africa.

Remnants of a Gaulish base-20 system also exist in French, as seen today in the names of the numbers from 60 through 99. For example, sixty-five is soixante-cinq (literally, "sixty [and] five"), while seventy-five is soixante-quinze (literally, "sixty [and] fifteen"). Furthermore, for any number between 80 and 99, the "tens-column" number is expressed as a multiple of twenty (somewhat similar to the archaic English manner of speaking of "scores", probably originating from the same underlying Celtic system). For example, eighty-two is quatre-vingt-deux (literally, four twenty[s] [and] two), while ninety-two is quatre-vingt-douze (literally, four twenty[s] [and] twelve). In Old French, forty was expressed as two twenties and sixty was three twenties, so that fifty-three was expressed as two twenties [and] thirteen, and so on.

The Irish language also used base-20 in the past, twenty being fichid, forty dhá fhichid, sixty trí fhichid and eighty ceithre fhichid. A remnant of this system may be seen in the modern word for 40, daoichead.

Danish numerals display a similar base-20 structure.

### Sixty

Base 60 (sexagesimal) was used by the Sumerians and their successors in Mesopotamia and survives today in our system of time (hence the division of an hour into 60 minutes and a minute into 60 seconds) and in our system of angular measure (a degree is divided into 60 minutes and a minute is divided into 60 seconds). 60 also has a large number of factors, including the first six counting numbers. Base-60 systems are believed to have originated through the merging of base-10 and base-12 systems. The Chinese Calendar, for example, uses a base-60 Jia-Zi甲子 system to denote years, with each year within the 60-year cycle being named with two symbols, the first being base-10 (called Tian-Gan天干 or heavenly stems) and the second symbol being base 12 (called Di-Zhi地支 or earthly branches). Both symbols are incremented in successive years until the first pattern recurs 60 years later. The second symbol of this system is also related to the 12-animal Chinese zodiac system. The Jia-zi system can also be applied to counting days, with a year containing roughly six 60-day cycles.

### Dual base (five and twenty)

Many ancient counting systems use 5 as a primary base, almost surely coming from the number of fingers on a person's hand. Often these systems are supplemented with a secondary base, sometimes ten, sometimes twenty. In some African languages the word for 5 is the same as "hand" or "fist" (Dyola language of Guinea-Bissau, Banda language of Central Africa). Counting continues by adding 1, 2, 3, or 4 to combinations of 5, until the secondary base is reached. In the case of twenty, this word often means "man complete". This system is referred to as quinquavigesimal. It is found in many languages of the Sudan region.

### Base names

Number From Latin From Greek Mixed or Other
Cardinals Ordinals Distributives
1 unary primal singulary henadic Primary
3 tertial ternary, trinary triadic Tertiary
6 sextal senary hexadic heximal, hexary
8 octal octaval, octavary octonary ogdoadic octonal
9 nonary novenary enneadic novary, noval
13 tridecimal, tredecimal triodecimal
18 octodecimal decennoctal
20 vicesimal, vigesimal vicenary icosadic bigesimal, bidecimal
50 quinquagesimal quinquagenary pentagesimal
70 septuagesimal septuagenary
80 octogesimal octogenary
90 nonagesimal nonagenary
200 ducentesimal ducenary bicentesimal, bicentimal
300 trecentesimal trecenary tercentimal, tricentesimal
500 quingentesimal quingenary pentacentesimal, quincentimal
600 sescentesimal hexacentesimal, hexacentimal
700 septingentesimal septingenary heptacentesimal, heptacentimal
800 octingentesimal octingenary octacentesimal, octacentimal
900 noningentesimal nongenary

`   24 - quadrovigesimal / quadriovigesimal`
`   26 - hexavigesimal / sexavigesimal`
`   27 - heptovigesimal`
`   28 - octovigesimal`
`   29 - novovigesimal`
`   31 - unotrigesimal`
`        (...repeat naming pattern...)`
`   36 - hexatridecimal / sexatrigesimal`
`        (...repeat naming pattern...)`
`   41 - unoquadragesimal`
`        (...repeat naming pattern...)`
`   51 - unoquinquagesimal`
`        (...repeat naming pattern...)`
`   64 - quadrosexagesimal`
`        (...repeat naming pattern...)`
`  110 - decacentimal`
`  111 - unodecacentimal`
`        (...repeat naming pattern...)`
`  210 - decabicentimal`
`  211 - unodecabicentimal`
`        (...repeat naming pattern...)`
`  800 - octocentimal / octocentesimal`
` 2000 - bimillesimal`
`        (...repeat naming pattern...)`

## Positional systems in detail

In a positional base-b numeral system (with b a positive natural number known as the radix), b basic symbols (or digits) corresponding to the first b natural numbers including zero are used. To generate the rest of the numerals, the position of the symbol in the figure is used. The symbol in the last position has its own value, and as it moves to the left its value is multiplied by b.

For example, in the decimal system (base 10), the numeral 4327 means (4×103) + (3×102) + (2×101) + (7×100), noting that 100 = 1.

In general, if b is the base, we write a number in the numeral system of base b by expressing it in the form anbn + an − 1bn − 1 + an − 2bn − 2 + ... + a0b0 and writing the enumerated digits anan − 1an − 2 ... a0 in descending order. The digits are natural numbers between 0 and b − 1, inclusive.

If a text (such as this one) discusses multiple bases, and if ambiguity exists, the base (itself represented in base 10) is added in subscript to the right of the number, like this: numberbase. Unless specified by context, numbers without subscript are considered to be decimal.

By using a dot to divide the digits into two groups, one can also write fractions in the positional system. For example, the base-2 numeral 10.11 denotes 1×21 + 0×20 + 1×2−1 + 1×2−2 = 2.75.

In general, numbers in the base b system are of the form:


(a_na_{n-1}cdots a_1a_0.c_1 c_2 c_3cdots)_b = sum_{k=0}^n a_kb^k + sum_{k=1}^infty c_kb^{-k}.

The numbers bk and bk are the weights of the corresponding digits. The position k is the logarithm of the corresponding weight w, that is $k = log_\left\{b\right\} w = log_\left\{b\right\} b^k$. The highest used position is close to the order of magnitude of the number.

The number of tally marks required in the unary numeral system for describing the weight would have been w. In the positional system the number of digits required to describe it is only $k + 1 =$$log_\left\{b\right\} w$$+ 1$, for $k ge 0$. E.g. to describe the weight 1000 then 4 digits are needed since $log_\left\{10\right\} 1000 + 1 = 3 + 1$. The number of digits required to describe the position is $log_\left\{b\right\} k + 1 = log_\left\{b\right\} log_\left\{b\right\} w + 1$ (in positions 1, 10, 100... only for simplicity in the decimal example).

Position 3 2 1 0 -1 -2 ...
Weight $b^3$ $b^2$ $b^1$ $b^0$ $b^\left\{-1\right\}$ $b^\left\{-2\right\}$ ...
Digit $a_3$ $a_2$ $a_1$ $a_0$ $c_1$ $c_2$ ...
Decimal example weight 1000 100 10 1 0.1 0.01 ...
Decimal example digit 4 3 2 7 0 0 ...

Note that a number has a terminating or repeating expansion if and only if it is rational; this does not depend on the base. A number that terminates in one base may repeat in another (thus 0.310 = 0.0100110011001...2). An irrational number stays unperiodic (infinite amount of unrepeating digits) in all integral bases. Thus, for example in base 2, π = 3.1415926...10 can be written down as the unperiodic 11.001001000011111...2.

If b = p is a prime number, one can define base-p numerals whose expansion to the left never stops; these are called the p-adic numbers.

A simple algorithm for converting integers between positive-integer radices is repeated division by the target radix; the remainders give the "digits" starting at the least significant. E.g., 1020304 base 10 into base 7:
```1020304 / 7 = 145757 r 5
145757 / 7 =  20822 r 3
20822 / 7 =   2974 r 4
2974 / 7 =    424 r 6
424 / 7 =     60 r 4
60 / 7 =      8 r 4
8 / 7 =      1 r 1
1 / 7 =      0 r 1   => 11446435```

E.g., 10110111 base 2 into base 5:

```10110111 / 101 = 100100 r 11  (3)
100100 / 101 =    111 r  1  (1)
111 / 101 =      1 r 10  (2)
1 / 101 =      0 r  1  (1)  => 1213```

To convert a "decimal" fraction, do repeated multiplication, taking the protruding integer parts as the "digits". Unfortunately a terminating fraction in one base may not terminate in another. E.g., 0.1A4C base 16 into base 9:

```0.1A4C × 9 = 0.ECAC
0.ECAC × 9 = 8.520C
0.520C × 9 = 2.E26C
0.E26C × 9 = 7.F5CC
0.F5CC × 9 = 8.A42C
0.A42C × 9 = 5.C58C  => 0.082785...```

## Generalized variable-length integers

More general is using a notation (here written little-endian) like $a_0 a_1 a_2$ for $a_0 + a_1 b_1 + a_2 b_1 b_2$, etc.

This is used in punycode, one aspect of which is the representation of a sequence of non-negative integers of arbitrary size in the form of a sequence without delimiters, of "digits" from a collection of 36: a-z and 0-9, representing 0-25 and 26-35 respectively. A digit lower than a threshold value marks that it is the most-significant digit, hence the end of the number. The threshold value depends on the position in the number. For example, if the threshold value for the first digit is b (i.e. 1) then a (i.e. 0) marks the end of the number (it has just one digit), so in numbers of more than one digit the range is only b-9 (1-35), therefore the weight b1 is 35 instead of 36. Suppose the threshold values for the second and third digit are c (2), then the third digit has a weight 34 × 35 = 1190 and we have the following sequence:

a (0), ba (1), ca (2), .., 9a (35), bb (36), cb (37), .., 9b (70), bca (71), .., 99a (1260), bcb (1261), etc.

Note that unlike a regular base-35 numeral system, we have numbers like 9b where 9 and b each represent 35; yet the representation is unique because ac and aca are not allowed.

The flexibility in choosing threshold values allows optimization depending on the frequency of occurrence of numbers of various sizes.

The case with all threshold values equal to 1 corresponds to bijective numeration, where the zeros correspond to separators of numbers with digits which are nonzero.

### Properties of numerical systems with integer bases

Numeral systems with base A, where A is a positive integer, possess the following properties:

If A is even and A/2 is odd, all integral powers greater than zero of the number (A/2)+1 will contain (A/2)+1 as their last digit

If both A and A/2 are even, then all integral powers greater than or equal to zero of the number (A/2)+1 will alternate between having (A/2)+1 and 1 as their last digit. (For odd powers it will be (A/2)+1, for even powers it will be 1)

Proof of the first property:

Define $\left\{A over 2\right\} + 1 = x$ Then x is even, and all $x^p$ for p greater than 0 must be even. The property is equivalent to

$! x^p equiv x \left(mbox\left\{mod\right\} A\right)$

We first check the case for p=1

$! x equiv x \left(mbox\left\{mod\right\} A\right)$

x is less than A, so the result is trivial. We then check for p=2:

$! x^2 = xx$
$! x^2 = x\left(x-1\right) + x$

Since $x-1 = \left(\left\{A over 2\right\} + 1\right) - 1 = \left\{A over 2\right\}$, then for all even N:

$! \left\{NA over 2\right\} = N\left(x-1\right) equiv 0 \left(mbox\left\{mod\right\} A\right) \left(1\right)$

Because x is even, then $x\left(x-1\right)$ is congruent to zero modulo A. Therefore:

$! x^2 equiv x \left(mbox\left\{mod\right\} A\right)$

Using induction, assuming that the property holds for p-1:

$! x^p = \left\{x^\left\{p-1\right\}\right\}x = \left\{x^\left\{p-1\right\}\right\}\left(x-1\right) + x^\left\{p-1\right\}$

Since the case holds for p-1, then $\left\{x^\left\{p-1\right\}\right\} equiv x \left(mbox\left\{mod\right\} A\right)$. Since

$! \left\{x^\left\{p-1\right\}\right\}\left(x-1\right)$

is a case of Equation 1, then $\left\{x^\left\{p-1\right\}\right\}\left(x-1\right) equiv 0 \left(mbox\left\{mod\right\} A\right)$. This leaves, for all p greater than 0,

$! x^p equiv x \left(mbox\left\{mod\right\} A\right)$

Q.E.D.

Proof of the second property:

Define $\left\{A over 2\right\} + 1 = x$ Then x is odd, and all $x^p$ for p greater than or equal to 0 must be odd. The property is equivalent to

$! x^p equiv 1 \left(mbox\left\{mod\right\} A\right); mbox\left\{if\right\} p equiv 0 \left(mbox\left\{mod\right\} 2\right)$
$! x^p equiv x \left(mbox\left\{mod\right\} A\right); mbox\left\{if\right\} p equiv 1 \left(mbox\left\{mod\right\} 2\right)$

Since $x-1 = \left(\left\{A over 2\right\} + 1\right) - 1 = \left\{A over 2\right\}$, then for all odd E:

$! \left\{EA over 2\right\} = E\left(x-1\right) equiv \left\{A over 2\right\} \left(mbox\left\{mod\right\} A\right) \left(2\right)$

The case is first checked for p=0:

$! x^0 = 1$
$! 1 equiv 1 \left(mbox\left\{mod\right\} A\right)$

This result is trivial

Next, for p=1:

$! x^1 = x$
$! x equiv x \left(mbox\left\{mod\right\} A\right)$

This result is also trivial

Next, for p=2:

$! x^2 = xx = x\left(x-1\right) + x$

Because x is odd, then x(x-1) is a case of Equation 2,

$x\left(x-1\right) + x equiv \left\{\left\{A over 2\right\} + x\right\} \left(mbox\left\{mod\right\} A\right)$

$! \left\{A over 2\right\} + x = \left\{A over 2\right\} + \left\{A over 2\right\} + 1 = A+1$
$! A+1 equiv 1 \left(mbox\left\{mod\right\} A\right), \left(mbox\left\{so\right\} x\left(x-1\right) + x = x^2 equiv 1 \left(mbox\left\{mod\right\} A\right)$

Next, for p=3:

$! x^3 = \left\{x^2\right\}x = \left\{x^2\right\}\left(x-1\right) + x^2$

Because $x^2$ is odd, $\left\{x^2\right\}\left(x-1\right) + x^2$ is a case of Equation 2,

$! \left\{x^2\right\}\left(x-1\right) + x^2 equiv \left\{\left\{A over 2\right\} + x^2\right\} \left(mbox\left\{mod\right\} A\right)$

Since $x^2 equiv 1 \left(mbox\left\{mod\right\} A\right)$,

$! \left\{x^2\right\}\left(x-1\right) + x^2 equiv \left\{\left\{A over 2\right\} + 1\right\} \left(mbox\left\{mod\right\} A\right)$

$\left\{\left\{A over 2\right\} + 1\right\} = x$, so $x^3 equiv x \left(mbox\left\{mod\right\} A\right)$.

Using induction, assuming that the property holds for p-1:

$! x^p equiv \left\{x^\left\{p-1\right\}\right\}\left(x-1\right) + x^\left\{p-1\right\}$

If p is odd:

$! x^\left\{p-1\right\} equiv 1 \left(mbox\left\{mod\right\} A\right)$

Since $\left\{x^\left\{p-1\right\}\right\}\left(x-1\right)$ is a case of Equation (2), $\left\{x^\left\{p-1\right\}\right\}\left(x-1\right) + x^\left\{p-1\right\} equiv \left\{\left\{A over 2\right\} + 1\right\} \left(mbox\left\{mod\right\} A\right)$, so

$x^p equiv x \left(mbox\left\{mod\right\} A\right)$

If p is even:

$! x^\left\{p-1\right\} equiv x \left(mbox\left\{mod\right\} A\right)$

Since $\left\{x^\left\{p-1\right\}\right\}\left(x-1\right)$ is a case of Equation (2), $\left\{x^\left\{p-1\right\}\right\}\left(x-1\right) + x^\left\{p-1\right\} equiv \left\{\left\{A over 2\right\} + x\right\} \left(mbox\left\{mod\right\} A\right)$.

$\left\{A over 2\right\} + x = \left\{A over 2\right\} + \left\{A over 2\right\} + 1 = A+1$

$A+1 equiv 1 \left(mbox\left\{mod\right\} A\right)$, so

$x^p equiv 1 \left(mbox\left\{mod\right\} A\right)$

## References

• Georges Ifrah. The Universal History of Numbers : From Prehistory to the Invention of the Computer, Wiley, 1999. ISBN 0-471-37568-3.
• D. Knuth. The Art of Computer Programming. Volume 2, 3rd Ed. Addison-Wesley. pp.194–213, "Positional Number Systems".
• J.P. Mallory and D.Q. Adams, Encyclopedia of Indo-European Culture, Fitzroy Dearborn Publishers, London and Chicago, 1997.
• Hans J. Nissen, P. Damerow, R. Englund, Archaic Bookkeeping, University of Chicago Press, 1993, ISBN 0-226-58659-6.
• Denise Schmandt-Besserat, How Writing Came About, University of Texas Press, 1992, ISBN 0-292-77704-3.
• Claudia Zaslavsky, Africa Counts: Number and Pattern in African Cultures, Lawrence Hill Books, 1999, ISBN 1-55652-350-5.