A byte (pronounced "bite") is the basic unit of measurement of information storage in computer science. In many computer architectures it is a unit of memory addressing, most often consisting of eight bits. A byte is one of the basic integral data types in some programming languages, especially system programming languages.
A byte is an ordered collection of bits, with each bit denoting a single binary value of 1 or 0. The size of a byte can vary and is generally determined by the underlying operating system or hardware, although the 8-bit byte is the standard in modern systems. Historically, byte size was determined by the number of bits required to represent a single character from a Western character set: it depended on the number of possible characters in the supported character set and was chosen to be a divisor of the computer's word size. Historically, bytes have ranged from five to twelve bits.
The popularity of IBM's System/360 architecture starting in the 1960s and the explosion of microcomputers based on 8-bit microprocessors in the 1980s have made eight bits by far the most common size for a byte. The term octet is widely used as a more precise synonym where ambiguity is undesirable (for example, in protocol definitions).
There has been considerable confusion about the meanings of metric (SI) prefixes used with the word "byte", especially concerning prefixes such as kilo- (k or K) and mega- (M), as shown in the chart Prefixes for bit and byte. Since computer memory sizes come in powers of two rather than powers of ten, a large portion of the software and computer industry uses binary approximations of the SI-prefixed quantities, while producers of computer storage devices prefer the SI values. This is why a computer hard drive advertised with a decimal capacity of "100 GB" (100,000,000,000 bytes) is reported as only about 93 GB when the binary interpretation (1 GB = 1,073,741,824 bytes) is used. Because of the confusion, a contract specifying a quantity of bytes must define what the prefixes mean in terms of the contract (i.e., the alternative binary equivalents, the actual decimal values, or a binary estimate based on the actual values).
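The hard-drive figure can be checked with a few lines of arithmetic. The following is a minimal sketch; the constant and function names (GB, GIB, advertised_gb_to_gib) are illustrative choices, not part of any standard.

```python
# Convert a capacity advertised with decimal (SI) prefixes into binary units.
GB = 10**9    # gigabyte with the SI meaning: 1,000,000,000 bytes
GIB = 2**30   # binary gigabyte (gibibyte): 1,073,741,824 bytes


def advertised_gb_to_gib(gigabytes):
    """Return the same number of bytes expressed in binary gigabytes."""
    return gigabytes * GB / GIB


print(f"{advertised_gb_to_gib(100):.2f}")  # prints 93.13
```

The two interpretations differ by roughly 7% at the giga- level, and the gap widens with each larger prefix.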
To make the meaning of the table absolutely clear: a kibibyte (KiB) is made up of 1,024 bytes. A mebibyte (MiB) is made up of 1,024 × 1,024, i.e. 1,048,576, bytes. The figures in the column using 1,024 raised to powers of 1, 2, 3, 4 and so on are in units of bytes.
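As an illustration of how the binary and decimal columns diverge, the sketch below computes the first four rows of such a table. The function name prefix_table and the row layout are assumptions made for this example only.

```python
BINARY_PREFIXES = ["Ki", "Mi", "Gi", "Ti"]   # kibi, mebi, gibi, tebi
DECIMAL_PREFIXES = ["k", "M", "G", "T"]      # kilo, mega, giga, tera


def prefix_table():
    """Return rows of (binary unit, bytes, decimal unit, bytes, excess in %)."""
    rows = []
    pairs = zip(BINARY_PREFIXES, DECIMAL_PREFIXES)
    for power, (bin_prefix, dec_prefix) in enumerate(pairs, start=1):
        binary = 1024 ** power    # e.g. 1 KiB = 1,024 bytes
        decimal = 1000 ** power   # e.g. 1 kB  = 1,000 bytes
        excess = (binary / decimal - 1) * 100
        rows.append((bin_prefix + "B", binary, dec_prefix + "B", decimal, excess))
    return rows


for bin_unit, binary, dec_unit, decimal, excess in prefix_table():
    print(f"1 {bin_unit} = {binary:,} bytes; 1 {dec_unit} = {decimal:,} bytes "
          f"(binary unit is {excess:.1f}% larger)")
```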
In the C programming language, the char integral data type must contain at least 8 bits (clause 5.2.4.2.1), so a byte in C is capable of holding at least 256 different values (whether char is signed or unsigned does not matter). Various implementations of C and C++ define a "byte" as 8, 9, 16, 32, or 36 bits. The actual number of bits in a particular implementation is given by the CHAR_BIT macro defined in the limits.h header. Java's primitive byte data type is always defined as consisting of 8 bits and being a signed data type, holding values from −128 to 127.
Early microprocessors, such as the Intel 8080 (the successor of the 8008 and predecessor of the 8086), could perform a small number of operations on four bits, such as the DAA (decimal adjust) instruction and the "half carry" flag, which were used to implement decimal arithmetic routines. These four-bit quantities were called "nybbles," in homage to the then-common 8-bit "bytes."
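The value ranges and the four-bit halves described above can be sketched in a few lines. This example assumes an 8-bit byte and two's-complement representation (as Java's byte type uses); the helper names to_signed_byte and split_nibbles are hypothetical.

```python
def to_signed_byte(value):
    """Interpret the low 8 bits of value as a two's-complement signed byte."""
    value &= 0xFF                        # keep only the low eight bits
    return value - 256 if value >= 128 else value


def split_nibbles(byte):
    """Return the (high, low) four-bit halves of an 8-bit byte."""
    byte &= 0xFF
    return byte >> 4, byte & 0x0F


print(2 ** 8)                  # 256 distinct values fit in eight bits
print(to_signed_byte(0x7F))    # 127, the largest signed value
print(to_signed_byte(0x80))    # -128, the smallest signed value
print(split_nibbles(0x3C))     # (3, 12)
```

Decimal arithmetic routines of the kind mentioned above typically store one decimal digit (0 to 9) in each four-bit half.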
The unit symbol "kb" with a lowercase "b" is a commonly used abbreviation for "kilobyte". Use of this abbreviation leads to confusion with the alternative use of "kb" to mean "kilobit". IEEE 1541 specifies "b" as the symbol for bit; however, IEC 60027 and the Metric-Interchange-Format specify "bit" as the symbol (e.g. Mbit for megabit), achieving maximum disambiguation from byte.
French-speaking countries sometimes use an uppercase "O" for "octet". This is not consistent with SI, both because of the risk of confusion with the digit zero and because of the convention that capital symbols are reserved for units named after people, such as the ampere (symbol A) and the joule (symbol J), as opposed to the second (symbol s) and the metre (symbol m).
Lowercase "o" for "octet" is a commonly used symbol in several non-English-speaking countries, and is also used with metric prefixes (for example, "ko" and "Mo").
Today the harmonized ISO/IEC standard IEC 80000-13:2008 (Quantities and units, Part 13: Information science and technology) cancels and replaces subclauses 3.8 and 3.9 of IEC 60027-2:2005 (those related to information theory and prefixes for binary multiples). See Units of information for a detailed discussion of names for derived units.