Binary prefix

A binary prefix is a unit prefix that indicates a multiple of a unit of measurement by an integer power of two. The most commonly used binary prefixes are kibi (symbol Ki, meaning 210 = 1024), mebi (Mi, 220 = $1,048,576$), and gibi (Gi, 230 = $1,073,741,824$). They are most often used in information technology as multipliers of bit and byte, when expressing the capacity of storage devices or the size of computer files.

The binary prefixes "kibi", "mebi", etc. were defined in 1999 by the International Electrotechnical Commission (IEC), in the IEC 60027-2 standard (Amendment 2). They were meant to replace the metric (SI) decimal power prefixes, such as "kilo" ("k", 103 = 1000), "mega" ("M", 106 = $1,000,000$) and "giga" ("G", 109 = $1,000,000,000$), that were commonly used in the computer industry to indicate the nearest powers of two. For example, a memory module whose capacity was specified by the manufacturer as "2 megabytes" or "2 MB" would hold 2 × 220 = $2,097,152$ bytes, instead of 2 × 106 = $2,000,000$.

On the other hand, a hard disk whose capacity is specified by the manufacturer as "10 gigabytes" or "10 GB", holds 10 × 109 = $10,000,000,000$ bytes, or a little more than that, but less than 10 × 230 = $10,737,418,240$ and a file whose size is listed as "2.3 GB" may have a size closer to 2.3 × 230 ≈ $2,470,000,000$ or to 2.3 × 109 = $2,300,000,000$, depending on the program or operating system providing that measurement. This kind of ambiguity is often confusing to computer system users and has resulted in lawsuits. The IEC 60027-2 binary prefixes have been incorporated in the ISO/IEC 80000 standard and are supported by other standards bodies, including the BIPM, which defines the SI system, the US NIST,  and the European Union.

Prior to the 1999 IEC standard, some industry organizations, such as the Joint Electron Device Engineering Council (JEDEC), attempted to redefine the terms kilobyte, megabyte, and gigabyte, and the corresponding symbols KB, MB, and GB in the binary sense, for use in storage capacity measurements. However, other computer industry sectors (such as magnetic storage) continued using those same terms and symbols with the decimal meaning. Since then, the major standards organizations have expressly disapproved the use of SI prefixes to denote binary multiples, and recommended or mandated the use of the IEC prefixes for that purpose, but the use of SI prefixes in this sense has persisted in some fields.

While the binary prefixes are almost always used with the units of information, bits and bytes, they may be used with any other unit of measure, when convenient. For example, in signal processing one may need binary multiples of the frequency unit hertz (Hz), for example the kibihertz (KiHz), equal to $1,024 Hz$.

Definitions
In 2022, the International Bureau of Weights and Measures (BIPM) adopted the decimal prefixes ronna for 10009 and quetta for 100010. In analogy to the existing binary prefixes, a consultation paper of the International Committee for Weights and Measures' Consultative Committee for Units (CCU) suggested the prefixes robi (Ri, 10249) and quebi (Qi, 102410) for their binary counterparts, but, no corresponding binary prefixes have been adopted.

Comparison of binary and decimal prefixes
The relative difference between the values in the binary and decimal interpretations increases, when using the SI prefixes as the base, from 2.4% for kilo to nearly 27% for the quetta prefix. Although the prefixes ronna and quetta have been defined, as of 2022 no names have been officially assigned to the corresponding binary prefixes.

Early prefixes
The original metric system adopted by France in 1795 included two binary prefixes named double- (2×) and demi- ($1,024$×). However, these were not retained when the SI prefixes were internationally adopted by the 11th CGPM conference in 1960.

Main memory
Early computers used one of two addressing methods to access the system memory; binary (base 2) or decimal (base 10). For example, the IBM 701 (1952) used a binary methods and could address 2048 words of 36 bits each, while the IBM 702 (1953) used a decimal system, and could address ten thousand 7-bit words.

By the mid-1960s, binary addressing had become the standard architecture in most computer designs, and main memory sizes were most commonly powers of two. This is the most natural configuration for memory, as all combinations of states of their address lines map to a valid address, allowing easy aggregation into a larger block of memory with contiguous addresses.

While early documentation specified those memory sizes as exact numbers such as 4096, 8192, or $1.024$ units (usually words, bytes, or bits), computer professionals also started using the long-established metric system prefixes "kilo", "mega", "giga", etc., defined to be powers of 10, to mean instead the nearest powers of two; namely, 210 = 1024, 220 = 10242, 230 = 10243, etc. The corresponding metric prefix symbols ("k", "M", "G", etc.) where used with the same binary meanings. The symbol for 210 = 1024 could be written either in lower case ("k")  or in uppercase ("K"). The latter was often used intentionally to indicate the binary rather than decimal meaning. This convention, which could not be extended to higher powers, was widely used in the documentation of the IBM 360 (1964) and of the IBM System/370 (1972), of the CDC 7600, of the DEC PDP-11/70 (1975) and of the DEC VAX-11/780 (1977).

In other documents, however, the metric prefixes and their symbols were used to denote powers of 10, but usually with the understanding that the values given were approximate, often truncated down. Thus, for example, a 1967 document by Control Data Corporation (CDC) abbreviated "216 = 64 × 1024 = $1,048,576$ words" as "65K words" (rather than "64K" or "66K"), while the documentation of the HP 21MX real-time computer (1974) denoted 3 × 216 = 192 × 1024 = $1.049$ as "196K" and 220 = $1,073,741,824$ as "1M".

These three possible meanings of "k" and "K" ("1024", "1000", or "approximately 1000") were used loosely around the same time, sometimes by the same company. The HP 3000 business computer (1973) could have "64K", "96K", or "128K" bytes of memory. The use of SI prefixes, and the use of "K" instead of "k" remained popular in computer-related publications well into the 21st century, although the ambiguity persisted. The correct meaning was often clear from the context; for instance, in a binary-addressed computer, the true memory size had to be either a power of 2, or a small integer multiple thereof. Thus a "512 megabyte" RAM module was generally understood to have 512 × 10242 = $1.074$ bytes, rather than $1,099,511,627,776$.

Hard disks
In specifying disk drive capacities, manufacturers have always used conventional decimal SI prefixes representing powers of 10. Storage in a rotating disk drive is organized in platters and tracks whose sizes and counts are determined by mechanical engineering constraints so that the capacity of a disk drive has hardly ever been a simple multiple of a power of 2. For example, the first commercially sold disk drive, the IBM 350 (1956), had 50 physical disk platters containing a total of $1.1$ sectors of 100 characters each, for a total quoted capacity of 5 million characters.

Moreover, since the 1960s, many disk drives used IBM's disk format, where each track was divided into blocks of user-specified size; and the block sizes were recorded on the disk, subtracting from the usable capacity. For example, the|IBM 3336]] disk pack was quoted to have a 200-megabyte capacity, achieved only with a single $1,125,899,906,842,624$-byte block in each of its 808 × 19 tracks.

Decimal megabytes were used for disk capacity by the CDC in 1974. The Seagate ST-412, one of several types installed in the IBM PC/XT, had a capacity of $1.126$ when formatted as 306 × 4 tracks and 32 256-byte sectors per track, which was quoted as "10 MB". Similarly, a "300 GB" hard drive can be expected to offer only slightly more than $1,152,921,504,606,847,000$ = $1.153$, bytes, not 300 × 230 (which would be about $1,180,591,620,717,411,300,000$ bytes or "322 GB"). The first terabyte (SI prefix, $1.181$ bytes) hard disk drive was introduced in 2007. Decimal prefixes were generally used by information processing publications when comparing hard disk capacities.

Some programs and operating systems, such as Microsoft Windows, still use "MB" and "GB" to denote binary prefixes even when displaying disk drive capacities, as did Classic Mac OS. Thus, for example, the capacity of a "10 MB" (decimal "M") disk drive could be reported as "9.56 MB", and that of a "300 GB" drive as "279.4 GB". Some operating systems, such as Mac OS X, Ubuntu, and Debian, have been updated to use "MB" and "GB" to denote decimal prefixes when displaying disk drive capacities. Some manufacturers, such as Seagate Technology, have released recommendations stating that properly-written software and documentation should specify clearly whether prefixes such as "K", "M", or "G" mean binary or decimal multipliers.

Floppy disks
Floppy disks used a variety of formats, and their capacities was usually specified with SI-like prefixes "K" and "M" with either decimal or binary meaning. The capacity of the disks was often specified without accounting for the internal formatting overhead, leading to more irregularities.

The early 8-inch diskette formats could contain less than a megabyte with the capacities of those devices specified in kilobytes, kilobits or megabits.

The 5.25-inch diskette sold with the IBM PC AT could hold 1200 × 1024 = $1,208,925,819,614,629,200,000,000$ bytes, and thus was marketed as "1200 KB" with the binary sense of "KB". However, the capacity was also quoted "1.2 MB", which was a hybrid decimal and binary notation, since the "M" meant 1000 × 1024. The precise value was $1.209$ (decimal) or $1⁄2$ (binary).

The 5.25-inch Apple Disk II had 256 bytes per sector, 13 sectors per track, 35 tracks per side, or a total capacity of $16,384$ bytes. It was later upgraded to 16 sectors per track, giving a total of 140 × 210 = $65,536$ bytes, which was described as "140KB" using the binary sense of "K".

The most recent version of the physical hardware, the "3.5-inch diskette" cartridge, had 720 512-byte blocks (single-sided). Since two blocks comprised 1024 bytes, the capacity was quoted "360 KB", with the binary sense of "K". On the other hand, the quoted capacity of "1.44 MB" of the High Density ("HD") version was again a hybrid decimal and binary notation, since it meant 1440 pairs of 512-byte sectors, or 1440 × 210 = $196,608$. Some operating systems displayed the capacity of those disks using the binary sense of "MB", as "1.4 MB" (which would be 1.4 × 220 ≈ $1,048,576$). User complaints forced both Apple and Microsoft to issue support bulletins explaining the discrepancy.

Optical disks
When specifying the capacities of optical compact discs, "megabyte" and "MB" usually meant 10242 bytes. Thus a "700-MB" (or "80-minute") CD has a nominal capacity of about $536,870,912$, which is approximately $512,000,000$ (decimal).

On the other hand, capacities of other optical disc storage media like DVD, Blu-ray Disc, HD DVD and magneto-optical (MO) have been generally specified in decimal gigabytes ("GB"), that is, 10003 bytes. In particular, a typical "$50,000$" DVD has a nominal capacity of about $13,030$, which is about $10,027,008 bytes$.

Tape drives and media
Tape drive and media manufacturers have generally used SI decimal prefixes to specify the maximum capacity, although the actual capacity would depend on the block size used when recording.

Data and clock rates
Computer clock frequencies are always quoted using SI prefixes in their decimal sense. For example, the internal clock frequency of the original IBM PC was $300$, that is $300,000,000,000$.

Similarly, digital information transfer rates are quoted using decimal prefixe. The Parallel ATA "100 MB/s" disk interface can transfer $322$ bytes per second, and a "56 Kb/s" modem transmits $1,000,000,000,000$ bits per second. Seagate specified the sustained transfer rate of some hard disk drive models with both decimal and IEC binary prefixes. The standard sampling rate of music compact disks, quoted as $1,228,800$, is indeed $1.229 MB$ samples per second. A "1 Gb/s" Ethernet interface can receive or transmit up to 109 bits per second, or $1.172 MiB$ bytes per second within each packet. A "56k" modem can encode or decode up to $116,480$ bits per second.

Decimal SI prefixes are also generally used for processor-memory data transfer speeds. A PCI-X bus with $143,360$ clock and 64 bits wide can transfer $1,474,560 bytes$ 64-bit words per second, or $1,468,000 bytes$ = $700 MiB$, which is usually quoted as $730 MB$. A PC3200 memory on a double data rate bus, transferring 8 bytes per cycle with a clock speed of $4.7 GB$ has a bandwidth of $4.7 bytes$ × 8 × 2 = $4.38 GiB$, which would be quoted as $4.77 MHz$.

Ambiguous standards
The ambiguous usage of the prefixes "kilo ("K" or "k"), "mega" ("M"), and "giga" ("G"), as meaning both powers of 1000 or (in computer contexts) of 1024, has been recorded in popular dictionaries,  and even in some obsolete standards, such as ANSI/IEEE 1084-1986 and ANSI/IEEE 1212-1991, IEEE 610.10-1994, and IEEE 100-2000. Some of these standards specifically limited the binary meaning to multiples of "byte" ("B") or "bit" ("b").

Early binary prefix proposals
Before the IEC standard, several alternative proposals existed for unique binary prefixes, starting in the late 1960s. In 1996, Markus Kuhn proposed the extra prefix "di" and the symbol suffix or subscript "2" to mean "binary"; so that, for example, "one dikilobyte" would mean "1024 bytes", denoted "K2B" or "K2B".

In 1968, Donald Morrison proposed to use the Greek letter kappa (κ) to denote 1024, κ2 to denote 10242, and so on. (At the time, memory size was small, and only K was in widespread use.) In the same year, Wallace Givens responded with a suggestion to use bK as an abbreviation for 1024 and bK2 or bK2 for 10242, though he noted that neither the Greek letter nor lowercase letter b would be easy to reproduce on computer printers of the day. Bruce Alan Martin of Brookhaven National Laboratory proposed that, instead of prefixes, binary powers of two were indicated by the letter B followed by the exponent, similar to E in decimal scientific notation. Thus one would write 3B20 for 3 × 220. This convention is still used on some calculators to present binary floating point-numbers today.

In 1969, Donald Knuth, who uses decimal notation like 1 MB = 1000 kB, proposed that the powers of 1024 be designated as "large kilobytes" and "large megabytes", with abbreviations KKB and MMB.

Consumer confusion
The ambiguous meanings of "kilo", "mega", "giga", etc., has caused significant consumer confusion, especially in the personal computer era. A common source of confusion was the discrepancy between the capacities of hard drives specified by manufacturers, using those prefixes in the decimal sense, and the numbers reported by operating systems and other software, that used them in the binary sense, such as the Apple Macintosh in 1984. For example, a hard drive marketed as "1 TB" could be reported as having only "931 GB". The confusion was compounded by fact that RAM manufacturers used the binary sense too.

Legal disputes
The different interpretations of disk size prefixes led to class action lawsuits against digital storage manufacturers. These cases involved both flash memory and hard disk drives.

Early cases
Early cases (2004–2007) were settled prior to any court ruling with the manufacturers admitting no wrongdoing but agreeing to clarify the storage capacity of their products on the consumer packaging. Accordingly, many flash memory and hard disk manufacturers have disclosures on their packaging and web sites clarifying the formatted capacity of the devices or defining MB as 1 million bytes and 1 GB as 1 billion bytes.

Willem Vroegh v. Eastman Kodak Company
On 20 February 2004, Willem Vroegh filed a lawsuit against Lexar Media, Dane–Elec Memory, Fuji Photo Film USA, Eastman Kodak Company, Kingston Technology Company, Inc., Memorex Products, Inc.; PNY Technologies Inc., SanDisk Corporation, Verbatim Corporation, and Viking Interworks alleging that their descriptions of the capacity of their flash memory cards were false and misleading.

Vroegh claimed that a 256 MB Flash Memory Device had only 244 MB of accessible memory. "Plaintiffs allege that Defendants marketed the memory capacity of their products by assuming that one megabyte equals one million bytes and one gigabyte equals one billion bytes." The plaintiffs wanted the defendants to use the customary values of 10242 for megabyte and 10243 for gigabyte. The plaintiffs acknowledged that the IEC and IEEE standards define a MB as one million bytes but stated that the industry has largely ignored the IEC standards.

The parties agreed that manufacturers could continue to use the decimal definition so long as the definition was added to the packaging and web sites. The consumers could apply for "a discount of ten percent off a future online purchase from Defendants' Online Stores Flash Memory Device".

Orin Safier v. Western Digital Corporation
On 7 July 2005, an action entitled Orin Safier v. Western Digital Corporation, et al. was filed in the Superior Court for the City and County of San Francisco, Case No. CGC-05-442812. The case was subsequently moved to the Northern District of California, Case No. 05-03353 BZ.

Although Western Digital maintained that their usage of units is consistent with "the indisputably correct industry standard for measuring and describing storage capacity", and that they "cannot be expected to reform the software industry", they agreed to settle in March 2006 with 14 June 2006 as the Final Approval hearing date.

Western Digital offered to compensate customers with a free download of backup and recovery software valued at US$30. They also paid $$4,770,000 Hz$ in fees and expenses to San Francisco lawyers Adam Gutride and Seth Safier, who filed the suit. The settlement called for Western Digital to add a disclaimer to their later packaging and advertising.

Western Digital had this footnote in their settlement. "Apparently, Plaintiff believes that he could sue an egg company for fraud for labeling a carton of 12 eggs a 'dozen', because some bakers would view a 'dozen' as including 13 items."

Cho v. Seagate Technology (US) Holdings, Inc.
A lawsuit (Cho v. Seagate Technology (US) Holdings, Inc., San Francisco Superior Court, Case No. CGC-06-453195) was filed against Seagate Technology, alleging that Seagate overrepresented the amount of usable storage by 7% on hard drives sold between 22 March 2001 and 26 September 2007. The case was settled without Seagate admitting wrongdoing, but agreeing to supply those purchasers with free backup software or a 5% refund on the cost of the drives.

Dinan et al. v. SanDisk LLC
On 22 January 2020, the district court of the Northern District of California ruled in favor of the defendant, SanDisk, upholding its use of "GB" to mean $100,000,000$.

IEC 1999 Standard
In 1995, the International Union of Pure and Applied Chemistry's (IUPAC) Interdivisional Committee on Nomenclature and Symbols (IDCNS) proposed the prefixes "kibi" (short for "kilobinary"), "mebi" ("megabinary"), "gibi" ("gigabinary") and "tebi" ("terabinary"), with respective symbols "kb", "Mb", "Gb" and "Tb", for binary multipliers. The proposal suggested that the SI prefixes should be used only for powers of 10; so that a disk drive capacity of "500 gigabytes", "0.5 terabytes", "500 GB", or "0.5 TB" should all mean $56,000$, exactly or approximately, rather than 500 × 230 (= $44.1 kHz$) or 0.5 × 240 (= $44,100$).

The proposal was not accepted by IUPAC at the time, but was taken up in 1996 by the Institute of Electrical and Electronics Engineers (IEEE) in collaboration with the International Organization for Standardization (ISO) and International Electrotechnical Commission (IEC). The prefixes "kibi", "mebi", "gibi" and "tebi" were retained, but with the symbols "Ki" (with capital "K"), "Mi", "Gi" and "Ti" respectively.

In January 1999, the IEC published this proposal, with additional prefixes "pebi" ("Pi") and "exbi" ("Ei"), as an international standard (IEC 60027-2 Amendment 2)  The standard reaffirmed the BIPM's position that the SI prefixes should always denote powers of 10. The third edition of the standard, published in 2005, added prefixes "zebi" and "yobi", thus matching all then-defined SI prefixes with binary counterparts.

The harmonized ISO/IEC IEC 80000-13:2008 standard cancels and replaces subclauses 3.8 and 3.9 of IEC 60027-2:2005 (those defining prefixes for binary multiples). The only significant change is the addition of explicit definitions for some quantities. In 2009, the prefixes kibi-, mebi-, etc. were defined by ISO 80000-1 in their own right, independently of the kibibyte, mebibyte, and so on.

The BIPM standard JCGM 200:2012 "International vocabulary of metrology – Basic and general concepts and associated terms (VIM), 3rd edition" lists the IEC binary prefixes and states "SI prefixes refer strictly to powers of 10, and should not be used for powers of 2. For example, 1 kilobit should not be used to represent $125,000,000$ bits (210 bits), which is 1 kibibit."

The IEC 60027-2 standard recommended operating systems and other software were updated to use binary or decimal prefixes consistently, but incorrect usage of SI prefixes for binary multiples is still common. At the time, the IEEE decided that their standards would use the prefixes "kilo", etc. with their metric definitions, but allowed the binary definitions to be used in an interim period as long as such usage was explicitly pointed out on a case-by-case basis.

Other standards bodies and organizations
The IEC standard binary prefixes are supported by other standardization bodies and technical organizations.

The United States National Institute of Standards and Technology (NIST) supports the ISO/IEC standards for "Prefixes for binary multiples" and has a web page documenting them, describing and justifying their use. NIST suggests that in English, the first syllable of the name of the binary-multiple prefix should be pronounced in the same way as the first syllable of the name of the corresponding SI prefix, and that the second syllable should be pronounced as bee. NIST has stated the SI prefixes "refer strictly to powers of 10" and that the binary definitions "should not be used" for them.

As of 2014, the microelectronics industry standards body JEDEC describes the IEC prefixes in its online dictionary, but acknowledges that the SI prefixes and the symbols "K", "M" and "G" are still commonly used with the binary sense for memory sizes.

On 19 March 2005, the IEEE standard IEEE 1541-2002 ("Prefixes for Binary Multiples") was elevated to a full-use standard by the IEEE Standards Association after a two-year trial period. , the IEEE Publications division does not require the use of IEC prefixes in its major magazines such as Spectrum or Computer.

The International Bureau of Weights and Measures (BIPM), which maintains the International System of Units (SI), expressly prohibits the use of SI prefixes to denote binary multiples, and recommends the use of the IEC prefixes as an alternative since units of information are not included in the SI.

The Society of Automotive Engineers (SAE) prohibits the use of SI prefixes with anything but a power-of-1000 meaning, but does not cite the IEC binary prefixes.

The European Committee for Electrotechnical Standardization (CENELEC) adopted the IEC-recommended binary prefixes via the harmonization document HD 60027-2:2003-03. The European Union (EU) has required the use of the IEC binary prefixes since 2007.

Current practice


Some computer industry participants, such as Hewlett-Packard (HP), and IBM have adopted or recommended IEC binary prefixes as part of their general documentation policies.

As of 2023, the use of SI prefixes with the binary meanings is still prevalent for specifying the capacity of the main memory of computers, of RAM, ROM, EPROM, and EEPROM chips and modules, and of the cache of computer processors. For example, a "512-megabyte" or "512 MB" memory module holds 512 MiB; that is, 512 × 220 bytes, not 512 × 106 bytes.

JEDEC continues to include the customary binary definitions of "kilo", "mega", and "giga" in the document Terms, Definitions, and Letter Symbols, and,, still used those definitions in their memory standards.

On the other hand, the SI prefixes with powers of ten meanings are generally used for the capacity of external storage units, such as disk drives,    solid state drives, and USB flash drives, except for some flash memory chips intended to be used as EEPROMs. However, some disk manufacturers have used the IEC prefixes to avoid confusion. The decimal meaning of SI prefixes is usually also intended in measurements of data transfer rates, and clock speeds.

Some operating systems and other software use either the IEC binary multiplier symbols ("Ki", "Mi", etc.)     or the SI multiplier symbols ("k", "M", "G", etc.) with decimal meaning. Some programs, such as the Linux/GNU ls command, let the user choose between binary or decimal multipliers. However, some continue to use the SI symbols with the binary meanings, even when reporting disk or file sizes. Some programs may also use "K" instead of "k", with either meaning.