User:Locke Cole/IEC units are bad

This page outlines what IEC units are, how they came about, how infrequently they are used in the real world, and why they should be avoided in article content (with very rare exceptions).

What are IEC units, how are they bad, and what do we do instead?
The IEC units were something cooked up in 1997 to deal with a discrepancy between how hard disk drive manufacturers used units like "megabyte" and "gigabyte" and how the manufacturers of RAM used them. Hard drive manufacturers defined a gigabyte as one billion (1,000,000,000) bytes. RAM manufacturers defined it as 1,024 × 1,024 × 1,024 (1,073,741,824) bytes. This is the difference between the decimal and binary variations of the units. Because of this discrepancy (which becomes even more significant as you move up from gigabyte to terabyte to petabyte and so forth), the IEC units were created with similar names: instead of kilobyte or megabyte, you had kibibyte and mebibyte, and instead of GB or TB, you had GiB and TiB.
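To put rough numbers on how the gap widens with each prefix, here is a minimal Python sketch. It assumes only the two definitions given above (powers of 1,000 versus powers of 1,024); the function name `binary_excess_percent` is made up for illustration.

```python
# Prefixes in ascending order; each step multiplies by 1,000 (decimal)
# or 1,024 (binary).
PREFIXES = ["kilo", "mega", "giga", "tera", "peta"]


def binary_excess_percent(power: int) -> float:
    """Percent by which 1024**power exceeds 1000**power."""
    return (1024 ** power / 1000 ** power - 1) * 100


for power, name in enumerate(PREFIXES, start=1):
    print(f"{name}byte: binary meaning is {binary_excess_percent(power):.2f}% larger")

# kilobyte: binary meaning is 2.40% larger
# megabyte: binary meaning is 4.86% larger
# gigabyte: binary meaning is 7.37% larger
# terabyte: binary meaning is 9.95% larger
# petabyte: binary meaning is 12.59% larger
```

So a gap of under 3% at the kilobyte level grows to more than 12% by the time you reach petabytes.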

The problem is, the vast majority of sources (primary and secondary), software, manufacturers and scholarly writings still use the traditional metric-derived units for both meanings. Because our sources don't typically use these new IEC units, our articles end up with a single term (e.g., gigabyte/GB) having multiple meanings throughout the article. The solution, as devised in WP:COMPUNITS, is to disambiguate conflicting meanings using footnotes or precise definitions within parentheses.

We do this because the IEC units do not have widespread acceptance. Most people are unfamiliar with these units (heck, most people barely understand megabyte/gigabyte/terabyte to begin with), so burdening readers with learning the difference between a GB and a GiB does them a disservice. Further, it deviates from the sources, which already makes using IEC units a non-starter before you consider that the terms are generally unknown outside of people with a purely technical background, like programmers or engineers.

The Wall Street Journal summed it up best in a 2003 article on the issue (one of the two instances of a real world print publication ever referring to the IEC units):

Finally, it's worth noting the commentary of Donald Knuth, author of The Art of Computer Programming (from What is a kilobyte?; emphasis added):

Now to my astonishment, I learn that the committee proposals have actually become an international standard. Still, I am extremely reluctant to adopt such funny-sounding terms; Jeffrey Harrow says "we're going to have to learn to love (and pronounce)" the new coinages, but he seems to assume that standards are automatically adopted just because they are there. Surely a huge number of standards for other computer things, like networking protocols, have been replaced by better ideas when they came along. Thus I hope it still isn't too late to propose what I believe is a significantly better alternative, and I still think it unlikely that people will automatically warm to "mebibytes".

Unsurprisingly, Knuth was right.

IEC unit usage in sources
Detailed below are some examples of IEC unit usage vs. the common usage as found in various source types. For some, a sanity check is included in the form of a search for the unrelated word "fatberg", so you can see just how uncommon IEC units are in the wild.

IEC usage in print media

 * List taken from List of newspapers in the United States, List of newspapers in the United Kingdom by circulation, List of newspapers in Australia by circulation and List of newspapers in Canada by circulation.

IEC units in scholarly writings
At User:Thunderbird2/The case against deprecation of IEC prefixes there is a Google Scholar link that is used to determine how many articles use IEC units. I note that currently, for the 2020-2022 period, there are 582 hits for MiB/GiB. The same search run with MB/GB returns 44,900. Granted, some of those may be false positives (since MB/GB are more likely to occur as initials for other terms), so for clarity I ran the search using mebibytes/gibibytes and megabytes/gigabytes. There were 28 hits for the IEC units and 1,560 for the traditional metric units.

It would seem that, even among research papers, IEC units make up a small fraction of usage: about 1.76% of the spelled-out results (28 of 1,588), with the traditional metric units accounting for the other 98.24%. As I've already explained above, the wider media at large does not use the IEC units whatsoever, and their use in academic circles is vastly outnumbered by the traditional metric units. And just to continue the "fatberg" sanity check from above, that search returned 49 results for the same period, more than the 28 for mebibytes/gibibytes.
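For anyone who wants to check the arithmetic, here is a quick Python sketch using only the hit counts quoted above. The helper name `share_percent` is mine, and the counts are a snapshot of search results that will drift over time.

```python
def share_percent(count: int, total: int) -> float:
    """Return count as a percentage of total."""
    return count / total * 100


# Spelled-out search: mebibytes/gibibytes vs. megabytes/gigabytes.
iec_words, metric_words = 28, 1_560
words_total = iec_words + metric_words

# Abbreviation search: MiB/GiB vs. MB/GB (the latter may include false positives).
iec_abbrev, metric_abbrev = 582, 44_900
abbrev_total = iec_abbrev + metric_abbrev

print(f"IEC share, spelled out:    {share_percent(iec_words, words_total):.2f}%")
print(f"Metric share, spelled out: {share_percent(metric_words, words_total):.2f}%")
print(f"IEC share, abbreviations:  {share_percent(iec_abbrev, abbrev_total):.2f}%")

# IEC share, spelled out:    1.76%
# Metric share, spelled out: 98.24%
# IEC share, abbreviations:  1.28%
```

Either way you slice it, the IEC share stays below 2%.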

IEC units in our sources
To be clear, there are sources out there that one might choose to deliberately cite to give an article the appearance that it has more IEC units referring to it than it really does, so claims to the contrary should be met with significant skepticism until the true nature of sources on a topic can be sussed out.

Typically, where no shenanigans have taken place, our sources almost exclusively utilize the traditional metric units for both binary and decimal meanings, and often do so interchangeably with little in the way of disambiguation. Some companies, like Apple or Western Digital, will make clear that their storage products (or products with storage included in them) use terms like GB and TB to refer to products where a GB is one billion bytes and a TB is one trillion bytes, but that is typically the most they will do. They do not use GiB or TiB. When we force these IEC units into our articles, we are giving them undue weight and promoting something that the wider world simply has not adopted. This causes unnecessary confusion for our readers, and forces them to learn about this obscure unit so they can continue reading our article where a simple footnote or parenthetical explanation would have sufficed.

IEC units encourage bad behaviors in our editors
As IEC units are rarely used in our sources, using them in any widespread way may give new editors the impression that reliable sources can be deviated from in significant ways. Sometimes editors may deviate from the unit used in a source, believing they're "fixing" something, and perform a calculation on a value that was actually correct as it was. This can take whatever minor discrepancy exists between the binary and decimal values and amplify it. Using the units that are used in the wider world and in the vast majority of sources encourages good editing behavior by not introducing original research that could go undetected until a more experienced editor corrects the mistake.
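As a purely hypothetical illustration of that kind of "fix" gone wrong, suppose a source says a machine has 16 GB of RAM (binary meaning), and an editor assumes the figure is decimal and converts it to GiB. The 16 GB figure and the variable names below are invented for the example; this is a sketch of the arithmetic, not a record of any actual edit.

```python
GIB = 1024 ** 3        # bytes in one gibibyte (binary meaning)
GB_DECIMAL = 1000 ** 3  # bytes in one gigabyte (decimal meaning)

# Hypothetical source value: "16 GB" of RAM, which the source means in the binary sense.
source_value = 16
actual_bytes = source_value * GIB

# The mistaken edit: treat "16 GB" as decimal and convert it to GiB.
mistaken_gib = source_value * GB_DECIMAL / GIB
error_percent = (1 - mistaken_gib * GIB / actual_bytes) * 100

print(f"Source says:        {source_value} GB (binary), i.e. {actual_bytes:,} bytes")
print(f"Article after edit: {mistaken_gib:.2f} GiB")
print(f"Understated by:     {error_percent:.2f}%")

# Source says:        16 GB (binary), i.e. 17,179,869,184 bytes
# Article after edit: 14.90 GiB
# Understated by:     6.87%
```

The source was right all along; the "corrected" article now understates the memory by nearly 7%.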

WP:COMPUNITS annotated
Some editors appear to have difficulty parsing the meaning of the end of WP:COMPUNITS. This section attempts to annotate and explain how this applies to articles, and how the exceptions should be applied. First, the full text as it existed on 2021-07-05T05:31:06.

Text
The IEC prefixes kibi- (symbol Ki), mebi- (Mi), gibi- (Gi), etc., are generally not to be used except:
 * when the majority of cited sources on the article topic use IEC prefixes;
 * in a direct quote using the IEC prefixes;
 * when explicitly discussing the IEC prefixes; or
 * in articles in which both types of prefix are used with neither clearly primary, or in which converting all quantities to one or the other type would be misleading or lose necessary precision, or declaring the actual meaning of a unit on each use would be impractical.

Annotation
At the end of the day, if we follow our sources (absent any cherry picking) we shouldn't usually run into problems. Perhaps one day IEC units will be accepted by the computing and technology industry, and when that day arrives this discussion won't be nearly as controversial as it seems to be. But in the world as it exists as this is written, IEC units simply are not used to any significant degree, and we should not force readers who are already grappling with a topic they came to learn about to also decipher units they've never heard of.