Talk:Endianness/Archive 8

endianness refers to the order of bytes and not the order of bits
I removed the mentioning of endianness sometimes referring to bit order. This is not true. The name of the variable BYTES_BIG_ENDIAN seems to suggest that endianness is related to bit order. Note, however, that BYTES_BIG_ENDIAN is the name of a variable and not a definition of the term Endianness. Furthermore the meaning of the variable BYTES_BIG_ENDIAN has been defined as the order of bits in a bit field. Endianness refers to the order of digits. Even if we define a digit as a set of one bit, endianness still refers to the ordering of the digits and not of the bits. The bit order is defined by its weight (MSB, LSB).

95.172.177.37 (talk) 07:26, 14 October 2019 (UTC)


 * First, BYTES_BIG_ENDIAN is NOT "defined as the order of bits in a bit field". That's the order of bit fields within a structure, not the order of bits within a bit field, and it's BITS_BIG_ENDIAN, not BYTES_BIG_ENDIAN - the order of bytes within multi-byte values and the order of bit fields within a structure are not necessarily the same.


 * Second, nobody claimed that the name of the #define (it's a #define, not a variable) is "a definition of the term Endianness" - BYTES_BIG_ENDIAN is a #define the value of which, on a particular platform, indicates the byte-endianness of that platform, and BITS_BIG_ENDIAN is a #define the value of which, on a particular platform, indicates the bit-endianness of the code generated for that platform. The byte-endianness might be dictated by the instruction set architecture of the platform (for example, x86 doesn't easily support big-endian storage of data, and System/3x0 and z/Architecture don't easily support little-endian storage of data), or it might be an option (for example, PowerPC/Power ISA can run in little-endian or big-endian mode - I think a given Power ISA machine might be able to have one virtual machine/LPAR running AIX or IBM i in big-endian mode, another one running Linux in big-endian mode, and yet another one running Linux in little-endian mode). The bit-endianness might be less dependent on the hardware - if the machine has bit-field instructions, it might make the bit endianness match that, but if it doesn't, it's largely up to the compiler.


 * The bit-endianness of code can be thought of as "do bit fields go from MSB to LSB or do they go from LSB to MSB"?


 * In the VAX ISA, where the byte order is little-endian, the bits are numbered from the LSB to the MSB, with the LSB being bit 0, and the bit-field index in the bit-field instructions is relative to bit 0, i.e. relative to the LSB. That's little-endian, so, if you have a 6-bit bit field "A", followed by a 7-bit bit field "B", followed by a 3-bit bit field "C", within a 16-bit quantity, then the 16-bit quantity would, if we show the MSB on the left and the LSB on the right, be

 +---+-------+------+
 | C |   B   |   A  |
 +---+-------+------+


 * So the VAX could be considered as having a little-endian byte order (the least-significant byte has the lowest address) and a little-endian bit-field order (the least-significant bit has the lowest bit-index).


 * In the Motorola 68020 and later, where the byte order is big-endian, the bits, for the bit-field instructions, are numbered from the MSB to the LSB, with the MSB being bit 0, and the bit-field index in the bit-field instructions is again relative to bit 0 in that sense of "bit 0", i.e. relative to the MSB. That's big-endian, so, for the previous example of 3 bit fields, with the MSB on the left and the LSB on the right, that'd be

 +---+-------+------+
 | A |   B   |   C  |
 +---+-------+------+


 * Confusingly, in Motorola's manuals, the drawings show bit 0 as the LSB, which is the opposite of the convention for the bit-field instructions, and the single-bit operations use those bit numbers as the bit offset of the bit being set/tested/cleared. So there is no consistent bit order in the 68020 and later versions of the 68K ISA.


 * So for 68K, you have a big-endian byte order (the most-significant byte has the lowest address), a little-endian bit-order for the single-bit instructions (the least-significant bit has the lowest bit index), and a big-endian bit-order for the bit-field instructions (the most-significant bit has the lowest bit index).


 * In the x86 ISA, where the byte order is little-endian, they don't seem to explicitly say how the bits are numbered for bit and bit-field instructions, but the bits are numbered from the LSB to the MSB, with the LSB being bit 0, so I'm guessing that's the bit index used in the instructions. That'd be little-endian, like the VAX.


 * So the x86 could be considered as having a little-endian byte order (the least-significant byte has the lowest address) and a little-endian bit-field order (the least-significant bit has the lowest bit-index).


 * System/3x0 and z/Architecture are big-endian, but have no bit-field instructions. z/Architecture, at least, has ROTATE THEN XXX SELECTED BITS instructions, for various values of XXX, with bit position immediate fields. The bit positions are presumably relative to the MSB, because the documentation numbers bits with bit 0 as the MSB.


 * So z/Architecture could be considered as having a big-endian byte order and a big-endian bit-field order.


 * And:


 * "The bit order is defined by its weight (MSB, LSB)."


 * For bytes in a multi-byte integral value, there's a most significant byte and a least significant byte; on some processors, the most significant byte has the lowest address, and, on some processors, the least significant byte has the lowest address. (And on some, either form of addressing can be used, although that might be a mode bit that needs to be set early in the startup process.)


 * Similarly, if there are instructions that refer to a bit, or a set of bits, within a larger quantity, the bit index might be relative to the MSB or might be relative to the LSB. Guy Harris (talk) 17:41, 14 October 2019 (UTC)
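For illustration only, here is a minimal Python sketch of the two bit-field allocation orders described above, using the same field widths (A = 6, B = 7, C = 3 bits in a 16-bit quantity). The helper names `pack_lsb_first` and `pack_msb_first` are made up for this sketch; they model the layouts in the two diagrams, not any particular compiler or instruction set.

```python
# Field widths from the example above: A = 6 bits, B = 7 bits, C = 3 bits.
FIELDS = [("A", 6), ("B", 7), ("C", 3)]
WORD_BITS = 16


def pack_lsb_first(values):
    """Allocate fields starting at the LSB (the little-endian bit-field order)."""
    word, shift = 0, 0
    for name, width in FIELDS:
        word |= (values[name] & ((1 << width) - 1)) << shift
        shift += width
    return word


def pack_msb_first(values):
    """Allocate fields starting at the MSB (the big-endian bit-field order)."""
    word, used = 0, 0
    for name, width in FIELDS:
        used += width
        word |= (values[name] & ((1 << width) - 1)) << (WORD_BITS - used)
    return word


only_a = {"A": 0b111111, "B": 0, "C": 0}
print(f"{pack_lsb_first(only_a):016b}")  # 0000000000111111: A in the low-order bits
print(f"{pack_msb_first(only_a):016b}")  # 1111110000000000: A in the high-order bits
```

The same field values end up in different bit positions of the 16-bit quantity, independently of how the two bytes of that quantity are later laid out in memory.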


 * A few years ago, I went to a talk by David Patterson, about the RISC-V processor, at which he handed out reference cards. The first thing I noticed is that the RISC-V instruction formats have the Opcode on the right. I suppose for a processor with only one instruction length, and that requires instructions to be word aligned, the opcode could be anywhere, but it is interesting to see it on the right in human readable documentation. The variable instruction length of VAX makes it rare to write out a whole instruction in human readable form. In addition to any byte/bit ordering visible to programs, there is the byte/bit ordering visible in documentation. Gah4 (talk) 20:51, 14 October 2019 (UTC)


 * And if there's any bit ordering visible to machine-language programs, is it the same as the bit ordering used in the documentation (in the sense of "is the MSB bit 0 or is the LSB bit 0?")? In the 68020 and later, the answer is "it depends" - the documentation uses little-endian bit ordering (LSB = bit 0), the single-bit instructions use little-endian bit ordering, and the bit-field instructions use big-endian bit ordering. Guy Harris (talk) 21:30, 14 October 2019 (UTC)


 * VAX documentation uses the right to left notation, consistent with little-endian. Writing the diagrams right to left, makes the bit order consistent with byte order. I did just notice, though, that the addressing mode selector is the high half of the addressing mode byte. But also, hardware designers have to number bits in their logic diagrams. They might not always agree with the software documentation, though hopefully are consistent within the documentation.  I believe that both Verilog and VHDL allow for either bit numbering convention.  Gah4 (talk) 21:58, 14 October 2019 (UTC)


 * And Motorola 68000 family documentation numbers the bits with the LSB being 0 even though the ISA is big-endian, and uses those bit numbers in the single-bit instructions but not the bit-field instructions, so any expectation that bit order and byte order will always be the same should be dropped. Guy Harris (talk) 23:16, 14 October 2019 (UTC)
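As a rough sketch of the two bit-numbering conventions discussed in this thread (LSB = bit 0 versus MSB = bit 0), the following Python converts between them and reads a single bit under each. The function names and the 32-bit default width are assumptions made for the example; nothing here models an actual 68K or VAX instruction.

```python
def lsb0_to_msb0(bit, width=32):
    """Convert a bit index counted from the LSB to one counted from the MSB."""
    return width - 1 - bit


def get_bit_lsb0(value, bit):
    """Read a bit where index 0 is the least-significant bit."""
    return (value >> bit) & 1


def get_bit_msb0(value, bit, width=32):
    """Read a bit where index 0 is the most-significant bit."""
    return (value >> (width - 1 - bit)) & 1


value = 0x80000000  # only the most-significant bit of a 32-bit quantity is set
print(get_bit_lsb0(value, 31))  # 1: the MSB is bit 31 when counting from the LSB
print(get_bit_msb0(value, 0))   # 1: the same bit is bit 0 when counting from the MSB
```

The conversion is its own inverse, so the same helper maps an MSB-relative index back to an LSB-relative one.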

This is confusing to mix position ordering and memory ordering at the same time in an example
The removal, with the edit summary "This is confusing to mix position ordering and memory ordering at the same time in an example", seems right to me, but I thought it could be discussed. Among other things, there is the connection between left-to-right order and low-to-high addressing. Memory in computers doesn't have a left or right! It is convenient to number print columns increasing to the right, but note the VMS DUMP command, which prints hex data right to left, for easier reading. Gah4 (talk) 02:09, 15 May 2020 (UTC)


 * First, it may be helpful to clarify that this is about the deletion of the phrase "This is analogous to the lowest address of memory being used first." from the 2nd paragraph in the section Endianness, which introduces big and little endianness with the analogy of left and right.


 * Second, I agree that the removal is an improvement. Mixing real-world technical details in the middle of an analogy is likely to just add confusion. This should be done only after the analogy is complete.


 * On left and right in the analogy: I agree computers don't use left and right. But endianness is not a problem for computers by themselves; they work consistently one way or the other, and never notice any difference. The issues with endianness basically stem from the people around computers; people know all about right and left, as well as up and down, and impose this on their dealings with computers. Either when looking at computer dumps (as noted), or when hooking two different computers together, people are the ones that can get this "wrong", and have to invent some additional explanations or workarounds.


 * So, yes, analogies of the "endianness problem" with "computers" tend to have "left" and "right" in them (or even "up" and "down"). I can't think of any good analogy that would avoid employing these human perceptions, because that's where the problem actually lies. If a computer could roll its eyes, it would do so any time we mention the "endianness problem" to it. --A D Monroe III (talk) 21:57, 16 May 2020 (UTC)
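To make the "left and right are a human convention" point concrete, here is a small Python sketch that prints the same four bytes, stored in little-endian order, once with addresses increasing to the right and once with addresses increasing to the left. The two helper functions are hypothetical and only imitate the idea of a right-to-left dump; they do not reproduce the actual output format of the VMS DUMP command.

```python
import struct


def dump_left_to_right(data):
    """Lowest address printed leftmost."""
    return " ".join(f"{b:02x}" for b in data)


def dump_right_to_left(data):
    """Lowest address printed rightmost."""
    return " ".join(f"{b:02x}" for b in reversed(data))


# The 32-bit value 0x0A0B0C0D stored least-significant byte first.
memory = struct.pack("<I", 0x0A0B0C0D)

print(dump_left_to_right(memory))  # 0d 0c 0b 0a
print(dump_right_to_left(memory))  # 0a 0b 0c 0d
```

The bytes in memory are identical in both cases; only the direction in which a person chooses to print them changes whether the number "looks" reversed.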

Changes as of 29 May 2020 18:28
«logically most-significant byte of a word of digital data» is too weak as a specification. I indeed assume that endianness pertains to numerical data in place-value notation. E.g. "UNIX" is also digital data, and the "U" could be considered the most-significant byte. - Nomen4Omen (talk) 19:54, 29 May 2020 (UTC)
 * Endianness has nothing to do per se with numerical data or any other data type. The most significant part of a character string is arguably the first character, as typical text sorting illustrates, but this may not be true for RTL languages. I think Unicode has a solution for these issues. You are welcome to propose a more universally fitting term. Kbrose (talk) 20:08, 29 May 2020 (UTC)


 * You may be right in a “per se” abstract sense. But wrong in Danny Cohen's sense and also in normal computer specification. Otherwise all Intel architectures would be little-endian numerically and big-endian in the lexicographically ordered character strings. - Nomen4Omen (talk) 20:18, 29 May 2020 (UTC)

You mention RTL languages and that “Unicode has a solution for these issues”. Maybe. But Unicode is a software solution — and I assume that there is a software solution for all kinds of these problems. But Endianness is a hardware category. And NONE of your edits describes endianness as a software issue. Moreover, all mentioned switching instructions switch the hardware from left- to right-endianness or vice versa.
 * In the RTL article, there is no mention at all of endian, -endian, endianness. Certainly, on an extremely high level of consideration you could speak of the Hebrew alphabet as a little-endian alphabet. (But I never heard or read about such a definition and do not think that such a kind of naming helps anybody to reduce the confusion.)
 * In the end, there is great need to make «logically most-significant byte of a word of digital data» more precise, because I have no idea what logically most-significant means, e.g. for DJT. And: to restrict the issue of endianness explicitly and pronouncedly to hardware. - Nomen4Omen (talk) 10:54, 31 May 2020 (UTC)


 * All I really meant with the comment was that the first letter of a string is used for the first stage of sorting, then the second, etc., not some kind of storage sequence of character arrays in sorted lists. And with "first letter" I imply the common notion of the idea. In this sense, I'd assert that the first letter is the most significant character of a string, and the last the least-significant. Of course that may only be true for certain languages or scripts, but this is the English Wikipedia. So, it is a logical significance, not a hardware significance tied to storage. That appears to be somewhat equivalent to the common definition of 'most-significant' for numerical data, using the largest digit position (by value) as the most-significant. That seems to be the common sense. You still haven't provided a suggestion that brings you closer to satisfaction, and it is not clear what your intent is: a universal definition that applies to all situations, scripts, languages, etc.? Kbrose (talk) 14:19, 31 May 2020 (UTC)
 * I am not sure, but you appear to object to a definition tied to hardware sequencing. Isn't hardware sequencing the primary meaning of endianness? That has come to mind first whenever I have heard or used the term. I don't even feel tempted to associate it with eggs. The article ought to present first and foremost what is most commonly understood with the term, per WP:COMMON.  Perhaps a separate article is in order to deal with all the variations of the theme for speaking numbers, dates, whatever, in languages. I am sure one can fill volumes with extensions of the idea. Kbrose (talk) 14:36, 31 May 2020 (UTC)

Somehow I am completely helpless how it is possible that you misunderstand me so severely. Is it my poor English?

Didn't I say:
 * 1) ... restrict the issue of endianness explicitly and pronouncedly to hardware ?
 * 2) ... all Intel architectures are little-endian numerically and big-endian in the lexicographically ordered character strings ?

How is it possible that you conclude
 * 1) that I “object to a definition tied to hardware sequencing” ?
 * 2) that I do not assert “that the first letter is the most significant character of a string” ?

But I remain asking you how you can say in the most important section, namely Endianness: “The right-side graphic shows the little-endian definition. With increasing memory address, the logical significance of each byte increases, so that the most-significant byte is stored at the highest address of the storage region.” And not restricting this definition to numerical data (as it has been even in the lead of the article up to your edits) being handled by the instructions of the different hardware architectures. The point is NOT to “propose a more universally fitting term”, but a definition which pertains to endianness. - Nomen4Omen (talk) 17:55, 31 May 2020 (UTC)


 * Does endianness even apply to character data? I think the lead has it right: endianness is the ordering or sequencing of bytes of a word of digital data in computer memory storage or during transmission. I'm not sure that qualifying things with "logical" in the disputed text is helpful, but I don't think it's wrong, and an alternative needs to be suggested if we're to contemplate changing it. ~Kvng (talk) 14:42, 3 June 2020 (UTC)

Danny Cohen talks about character and numerical data (integer and floating point). And storage address and transmission. And of the consistency of ordering the bits within these coordinate systems. It is kind of a philosophical view and tries to touch almost all aspects of ordering.

Earlier, other computer people had found out that the addresses are a major coordinate axis which can be used as orientation to talk about the other issues. (As to Danny Cohen, it looks as if the left-to-right axis is more important for him as this anchoring major axis. He also brings the direction of writing, whether left-to-right or top-down, into the discussion and investigates the consistency of all these coordinates.) Moreover, they found out that within character data one wants the first chunk (lowest address) to be the most discriminant one. (I do not really know about RTL writing, but I am not able to find a contradiction to this assumption — nowhere.) So character data are settled and do not require further consideration.

The remaining important type of data are the numerical data, and of them the most important ones the integer data; and handling such data was in the beginning the most important task of the computing machines. As far as I know the computing machines started with the decimal number system coded in Binary Coded Decimal (BCD) digits with the most significant decimal digit first. This is the second major coordinate axis: significance. Later some clever computer scientists found out that three of the four basic arithmetic operations naturally run from low-order to high-order, namely addition, subtraction and multiplication. The less important division and comparison runs the other way around. After implementing in hardware the correlation of the two major coordinates (increasing numerical significance = increasing address) there was a need to distinguish this approach from the previous one. At that time this was done using the manufacturer name. And it appears that there have been some holy wars, but after Danny Cohen's publication of ON HOLY WARS AND A PLEA FOR PEACE the term big-endian was used for anticorrelation (increasing address = decreasing significance) and little-endian for correlation.

So I propose to focus the article mainly on this kind of essence of endianness, i.e. on numerical data related to the coordinate axis of increasing addresses, which is a hardware issue. I propose, however, to include (at least in theory) bit endianness, because (as already Danny Cohen mentions) the bit-shift instructions induce (both, with left and right shift) a consistent ordering of the bytes to the bits. Of course, an outlook on Danny Cohen's more philosophical considerations is not to be suppressed. (But everybody is free to read it, it is easily accessible in the internet and it is really funny.) - Nomen4Omen (talk) 16:58, 3 June 2020 (UTC)
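The remark above that addition naturally runs from low-order to high-order can be sketched as follows: a byte-by-byte addition over two little-endian byte sequences, where walking the addresses upward is the same as walking the significance upward, so the carry always moves to the next address. This is an illustrative Python sketch with a made-up function name, not a description of how any specific machine implements its adder.

```python
def add_little_endian(a: bytes, b: bytes) -> bytes:
    """Add two equal-length unsigned integers stored least-significant byte first."""
    if len(a) != len(b):
        raise ValueError("operands must have the same length")
    result = bytearray(len(a))
    carry = 0
    for i in range(len(a)):  # ascending address == ascending significance
        total = a[i] + b[i] + carry
        result[i] = total & 0xFF
        carry = total >> 8
    return bytes(result)  # a carry out of the top byte is discarded (wrap-around)


x = (0x00FF).to_bytes(2, "little")  # ff 00
y = (0x0001).to_bytes(2, "little")  # 01 00
total = add_little_endian(x, y)
print(total.hex(" "))                  # 00 01
print(int.from_bytes(total, "little")) # 256
```

With big-endian storage the same loop would have to start at the highest address and work downward, which is the anticorrelation of address and significance described above.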


 * So what exactly do you propose to change? ~Kvng (talk) 13:49, 7 June 2020 (UTC)


 * Yes, I'll make a try soon. - Nomen4Omen (talk) 20:24, 7 June 2020 (UTC)
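As a hedged illustration of the string-versus-integer point argued in this thread, the Python sketch below shows that the bytes of the text "UNIX" have one layout regardless of integer byte order, while the same integers sort differently when compared byte by byte from the lowest address, depending on which byte order they were stored in. The variable names are invented for the example.

```python
text = "UNIX".encode("ascii")
print(text.hex(" "))  # 55 4e 49 58: 'U' sits at the lowest address either way

numbers = [1, 256, 2]

# Compare the stored bytes lexicographically, lowest address first,
# the way character strings are compared.
sorted_as_big = sorted(numbers, key=lambda v: v.to_bytes(4, "big"))
sorted_as_little = sorted(numbers, key=lambda v: v.to_bytes(4, "little"))

print(sorted_as_big)     # [1, 2, 256]: matches numeric order
print(sorted_as_little)  # [256, 1, 2]: does not match numeric order
```

For unsigned integers of equal width, the big-endian images sort like the numbers themselves because the most discriminating byte comes first, which is the same property the thread attributes to character strings.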

segment descriptors
The 80386 (eventually IA32) segment descriptor is an upward extension of the 80286 descriptor, such that 80286 code still runs. Also, OSes such as OS/2 2.0 and up run OS/2 1.x (16 bit protected mode) code. The 80286 has a 24 bit segment origin, so the high bits ended up somewhere else. The 80286 has 16 bit segment length in units of bytes. As there weren't enough bits, the 80386 extends the length to 20 bits, but also has a bit indicating that the unit is 4K bytes. That allows lengths up to 4GB. Gah4 (talk) 06:29, 9 July 2020 (UTC)
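A minimal Python sketch of how the scattered base and limit bits can be reassembled from an 8-byte descriptor follows. It assumes the commonly documented IA-32 layout (limit bits 15:0 in bytes 0-1, base bits 23:0 in bytes 2-4, access byte in byte 5, limit bits 19:16 and the flags in byte 6, base bits 31:24 in byte 7); the function name and the returned dictionary are made up for the example, and the layout should be checked against Intel's manuals before relying on it.

```python
def decode_descriptor(raw: bytes) -> dict:
    """Reassemble base and limit from an 8-byte segment descriptor (assumed layout)."""
    if len(raw) != 8:
        raise ValueError("a segment descriptor is 8 bytes long")
    # The 80286 part: 16-bit limit and 24-bit base in the first five bytes.
    limit = raw[0] | (raw[1] << 8)
    base = raw[2] | (raw[3] << 8) | (raw[4] << 16)
    # The 80386 extension lives in the last two bytes.
    limit |= (raw[6] & 0x0F) << 16   # limit bits 19:16
    base |= raw[7] << 24             # base bits 31:24
    granular = bool(raw[6] & 0x80)   # G bit: limit counted in 4 KiB units
    last_offset = ((limit << 12) | 0xFFF) if granular else limit
    return {"base": base, "limit": limit, "granular": granular,
            "last_offset": last_offset}


flat = bytes([0xFF, 0xFF, 0x00, 0x00, 0x00, 0x9A, 0xCF, 0x00])
info = decode_descriptor(flat)
print(hex(info["limit"]))        # 0xfffff
print(hex(info["last_offset"]))  # 0xffffffff, i.e. a 4 GiB segment
```

A descriptor whose last two bytes are zero decodes to a 24-bit base and a byte-granular 16-bit limit, which is how the 80286 form remains a special case of the extended one.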

Grammar/meaning in Intro Section
In the opening section there's a sentence containing, "it is an attribute of a computing machine resp. its hardware architecture"

What does this even mean? Can someone who understands the original intent please clarify and fix the text? I'd gladly do the edit, but can't parse what it's trying to say. mikro2nd (talk) 09:25, 10 July 2020 (UTC)

 * It means that "Endianness is an attribute (or if you like: property) of a computing machine resp. its hardware architecture (or if you like: hardware architecture of a computing machine)". More specifically: big- resp. little-endianness is ... - Nomen4Omen (talk) 14:00, 10 July 2020 (UTC)
 * And what does "resp." mean? Guy Harris (talk) 19:13, 10 July 2020 (UTC)


 * To address this question, I went to edit the (poor) use of "resp."; it's an unencyclopedic abbreviation for "respectively". But fixing that didn't make the sentence any better, since that word is overly formal and vague.  Fixing the sentence wording to be more simple and clear, I found the sentence didn't really say very much anyway, other than "it's about hardware".  So I merged that sentence with the one before that was kind of saying the same thing.  Then I had to refactor the following sentence for flow, and then...


 * Anyway, I ended up refactoring the whole paragraph. It's now focused on why endianness matters. Please review, comment, tweak, improve, or rewrite as you see fit. --A D Monroe III (talk) 23:24, 14 July 2020 (UTC)


 * It seems correct and uses normal English and appears to be an improvement compared to what we had before. The level of detail and number of parenthetical interjections goes beyond what I expect from a lead paragraph. I will try to simplify it a bit when I get a chance. ~Kvng (talk) 13:52, 17 July 2020 (UTC)


 * I agree with this assessment; most of these issues are in the first sentence (the one I left most intact). Working to simplify this sentence, it began to duplicate the article's first sentence. So, I simply removed the problem sentence as redundant. Again, feel free to improve in any way. --A D Monroe III (talk) 21:46, 24 July 2020 (UTC)


 * Your change has lost the point that (in order to call it big-endianness) there have to be hardware instructions which operate exactly on these types of multi-byte objects. It is simple and easy to store the most significant byte of a word at the smallest memory address; every computer can do that, just as it can do the other way around. - Nomen4Omen (talk) 15:36, 25 July 2020 (UTC)
 * Existence of such machine instructions is only a convenient and efficient situation, but not necessary. We can install a memory, such as an EPROM, written on one architecture into another, and the endianness of the data in that memory does not change, and it could in principle read the data either way, in big- or little-endian manner. We do this purposely in reverse-engineering, for example. Of course an architecture of either type has instructions that automatically and atomically do the reading or writing correctly. Kbrose (talk) 16:25, 25 July 2020 (UTC)


 * There is no doubt that you can do that. (Didn't I say this above?) But what then characterizes a big-endian machine ? I claim that every big-endian machine can store the least significant byte at the smallest address whatsoever. So every big-endian machine is a little-endian machine, isn't it? - Nomen4Omen (talk) 16:49, 25 July 2020 (UTC)
 * If you consider byte addressability, then the concept of endianness loses relevance. The question is, how do word addressing operations store data. Kbrose (talk) 17:28, 25 July 2020 (UTC)

 * 1) I do(!) consider byte addressability and the concept of endianness does not loooose relevance! Many machines have byte addressability and have endianness, some of them have only big-endianness and some have only little-endianness, and some have bi-endianness.
 * 2) To your rolled-back Endianness: Can you explain us what is "the most-significant byte of a multi-byte data element" ? And from where you take your definition of significance ? - Nomen4Omen (talk) 18:24, 25 July 2020 (UTC)
 * This seems to be going a circular path. When you write 0x0a0b0c0d, then most people will call 0a the most significant byte. You can substitute 'word' for your multibyte object. I don't think I introduced the multi-byte construct. With byte-addressability data can be stored in any order, there is no relevance to endianness, unless such a machine intentionally stores bytes in one direction or the other, but that becomes a software issue of incrementing pointers. Endianness is defined by word operations. Kbrose (talk) 19:11, 25 July 2020 (UTC)

 * 1) There are even more (circular?) questions coming up: If I write 0x0d0c0b0a is then 0a the most significant byte ? Does significance depend on the data ? Or do you mean 0x0a0b0c0d is an integer in 0x hexadecimal notation and that 0d is the most significant byte of this integer ?
 * 2) Didn't you comment your change by "whether it is an integer or not is irrelevant" ?
 * 3) Doesn't everybody already know that with byte-addressability everybody can store data in any order ? The only thing could be: that it is understood differently.
 * 4) Doesn't most of the existing article describe machines which have byte-addressability ? And some of them have big-endianness and no little-endianness and some of them have little-endianness and no big-endianness ?
 * 5) You say "a machine intentionally stores bytes in one direction or the other" ? Which intentions do machines have ? Why do machines have which intentions ?

Questions over questions ! Are there other people who don't get your point ? - Nomen4Omen (talk) 19:40, 25 July 2020 (UTC)


 * Why would anything be different about 0x0d0c0b0a compared with the reverse order of values? They are just values, not ordering indexes. Why would such a question even be raised? Perhaps the example should just use random byte values in the quartet. And for the matter of endianness, it is completely irrelevant what they represent to a human or to a computer program. It could be a fixed point real number stored in a four-byte word. The hardware doesn't care what the data means, it only cares about the method by which it is stored, namely as a word. If the data is stored as a sequence of individual bytes, then endianness is not an issue; the software can store it whichever way the programmer instructed. Again, whether a computer architecture has byte addressability or not is irrelevant. Both little- and big-endian machines may have byte addressability. Kbrose (talk) 20:43, 25 July 2020 (UTC)
 * And the byte significance is defined from the integer when the processor interprets the word as an unsigned integer. Vincent Lefèvre (talk) 20:53, 25 July 2020 (UTC)
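The distinction argued over in this thread, between a word-sized store (where byte order matters) and reading back individual bytes (where it is just data), can be sketched in Python with the `struct` module. The value 0x0A0B0C0D is the one used in the discussion; the variable names are invented, and `struct` is only standing in for what a machine's word store would do.

```python
import struct

word = 0x0A0B0C0D

# A "word store" on a big-endian and on a little-endian machine.
big_image = struct.pack(">I", word)
little_image = struct.pack("<I", word)

print(big_image.hex(" "))     # 0a 0b 0c 0d: most-significant byte at lowest address
print(little_image.hex(" "))  # 0d 0c 0b 0a: least-significant byte at lowest address

# Reading the same memory image (say, a transplanted EPROM) either way:
as_big = int.from_bytes(big_image, "big")
as_little = int.from_bytes(big_image, "little")
print(hex(as_big))     # 0xa0b0c0d
print(hex(as_little))  # 0xd0c0b0a
```

The bytes in the image never change; what changes is the significance assigned to each address when the four bytes are interpreted as one unsigned integer.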

Change section Definition to Example?
The current article section Endianness has problems. (I think this is what the previous talk discussion about the lede has wandered off into, but I haven't been able to usefully follow it.)

A possible cause of the problems (and confusion with the lede) may come from the section heading itself: "Definition". The definition of any article subject belongs in the lede; why does this article have it in a separate following section? One good reason may be that the lede definition, though technically complete and accurate, isn't good for any layperson reader -- too full of jargon, implied context, etc. A good way to explain something like that is to give an analogy -- describing situations not precisely accurate or perfectly complete, but providing enough context to allow enough basic understanding to follow the rest of the subject information. It seems to me that's what this whole section is trying to do, with debatable success.

Before we focus on solutions to this section, can we agree on the goal? Should we change the name of this section to "Example" to do this? ("Analogy" might also accomplish this, but I think it's less engaging to a reader.)

Comments? --A D Monroe III (talk) 19:00, 27 July 2020 (UTC)


 * As far as I can see the § Endianness is a precise formulation of a precise use of the notion of Endianness.
 * Unfortunately, it has to be admitted that already Danny Cohen who has introduced the terminology of endianness into computer science discussed such vague things like
 * consistency of endianness (which has got no relevance at all),
 * endianness of human writings (such as Hebrew or Chinese) which pertains to computer science (but has been successfully handled completely outside the domain of endianness),
 * and endianness of floating point data which is already too complex for defining the notion of significance.
 * Because of these problems being caused by an early and suboptimal introduction of the notion, I propose to stick to a precise definition (which indeed has received great importance in computer science) and then leave the remainder of the article to the more philosophical considerations as already and partially discussed by Danny Cohen.
 * - Nomen4Omen (talk) 19:49, 27 July 2020 (UTC)
 * I don't think that there is a "precise formulation of a precise use of the notion of Endianness", mainly because there wasn't one at the beginning (as you say), and it seems that there was never a strict agreement, and it is not the goal of Wikipedia to define one. This mainly works by examples, IMHO. For floating point, this is particularly difficult, because for instance, with (64-bit) double-precision numbers on 32-bit machines, you have two 32-bit words, and their order is not specified by endianness. That's why ARM FPA used a different encoding from x86 (and later, ARM VFP), probably so that the software implementation did not depend on the endianness (note that in the past, ARM did not have hardware FP, hence the importance of a software implementation). Moreover, with the 80-bit x87 extended precision, you typically have padding bytes, but does their significance depend on the endianness? This has probably never been formally defined. And what about the double-double 128-bit format? It seems logical to have the most significant double always first in memory (as this is typically handled as an array of 2 double elements), but this may be against a general definition of endianness. — Vincent Lefèvre (talk) 21:07, 27 July 2020 (UTC)
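As a hedged illustration of the point that the order of the two 32-bit words of a double is a separate choice from the byte order within each word, the Python sketch below builds a little-endian image of 1.0 and then a word-swapped variant of it. The word-swapped layout is meant to resemble the kind of mixed ordering mentioned above for ARM FPA; treat that resemblance as an assumption rather than a statement of the exact FPA encoding. The helper name `swap_words` is made up.

```python
import struct


def swap_words(image: bytes) -> bytes:
    """Exchange the two 32-bit halves of an 8-byte image, leaving bytes within each half alone."""
    if len(image) != 8:
        raise ValueError("expected an 8-byte double image")
    return image[4:] + image[:4]


little = struct.pack("<d", 1.0)  # byte order and word order both little-endian
mixed = swap_words(little)       # little-endian words, most-significant word first

print(little.hex(" "))  # 00 00 00 00 00 00 f0 3f
print(mixed.hex(" "))   # 00 00 f0 3f 00 00 00 00

# Undoing the word swap recovers the original value.
print(struct.unpack("<d", swap_words(mixed))[0])  # 1.0
```

Both images hold little-endian 32-bit words, yet they differ, so "little-endian" alone does not pin down the layout of a multi-word format.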


 * I completely agree that it is not the goal of Wikipedia to define a strict agreement. But I have never found an objection to the graphics in Endianness. Maybe this is not everything that people think or imagine Endianness is or could be, but I always found the following evidence with the left graphic: if it has endianness then it has big-endianness and vice versa with the graphic to the right and little-endianness.
 * And btw, the holy war Danny Cohen is talking about is the war being mainly fired by the difference between exactly these two.
 * - Nomen4Omen (talk) 10:58, 28 July 2020 (UTC)
 * I think that the problem is not the graphic itself, but the section title "Definition". "Example" was suggested above. IMHO, "Description" or "Presentation" could be OK too. − Vincent Lefèvre (talk) 12:13, 28 July 2020 (UTC)


 * OK, "Example" is not too bad. Maybe also "Prime example" or "Perfect example" or "Classical example". – Nomen4Omen (talk) 13:44, 28 July 2020 (UTC)
 * "Prime example" or "Classical example" would be OK. I think that the word "perfect" should be avoided. Vincent Lefèvre (talk) 13:51, 28 July 2020 (UTC)
 * Having a "prime" or "classical" example would be great, but I don't think that exists. What we currently have in this section is a good enough example (or can be made so with a little work), but AFAIK it's just one that WP editors have put together, not the one-and-only widely recognized standard example for endianness, as the Hello world example is for programs. Unless we can find multiple sources all using the same example, we'd best stick with plain "Example" and admit that it can be improved over time. --A D Monroe III (talk) 22:06, 28 July 2020 (UTC)

Indeed, there is no example like "Hello world". But, IMHO, binary integer is classical enough. So I propose: ==The prominent example: binary integers== – Nomen4Omen (talk) 07:03, 1 August 2020 (UTC)
 * I agree that binary integers are fundamental to endianness -- in fact, it's largely redundant to state this. But describing this example as "The prominent" one isn't correct. It's not "The" example, as there are countless others, and is not "prominent" anywhere except here in WP. Again, just "Example" is accurate, complete, and informative; I see no benefit in trying to embellish the section heading beyond this. --A D Monroe III (talk) 00:21, 3 August 2020 (UTC)


 * As stated in the footnote [note 4], the addressing scheme of the binary digits in a binary integer was turned around at that time, in contrast to every earlier convention. This was «the» “revolution” which led to a kind of holy war. This was then dubbed little-endian by Cohen, and the term endianness has since been used for everything which has been turned around. So the binary integer is a kind of starting point of the terminology. And as you say, it is so basic that “in fact, it's largely redundant to state this”. – Nomen4Omen (talk) 09:17, 3 August 2020 (UTC)

Original publication / attribution error
The article states that "Danny Cohen introduced the terms big-endian and little-endian into computer science for data ordering, in an article published by the Internet Engineering Task Force (IETF) in 1980".

The IETF was only founded in 1986, so it could not be the original source of publication. — Preceding unsigned comment added by HeidenChristenseun (talk • contribs) 16:31, 20 September 2020 (UTC)


 * I changed it to just say it was in an Internet Experiment Note published in 1980, and left out the publisher. Anybody curious about how IENs were published can follow the link. Guy Harris (talk) 20:09, 20 September 2020 (UTC)

Numeric literals and shift directions in programming languages
I have never found a programming language in which the numeric literal “12” had the meaning of twenty-one, even if the programming language was used on a little-endian machine. Thus, this kind of “shift right” is a BIG-endian way of speaking. And this way of speaking is totally independent of the target machine for which the manual has been written. That's what the text wanted to say. –Nomen4Omen (talk) 10:58, 18 January 2021 (UTC)
 * 1) As far as I can see, the numeric literal “12” stands for the number twelve in almost all programming languages. Since programming languages are written left-to-right, the “1” stands before (to the left of) the “2”, so this notation is to be considered big-endian, because the more significant “1” comes before the “2”. The same is true for hexadecimal notation, where 0x102 is defined to stand for the number 258 (decimal): because the 1 stands before the 0, which stands before the 2, 0x102 = 1·256 + 0·16 + 2 = 258.
 * 2) Similarly, when the term “right” is used with bit-shift operations, it is written “>>” (a sort of arrow pointing right) in the manuals
 * (even in, which describes a little-endian machine)
 * and ALWAYS means division. E.g. 0x102 >> 8 = ⌊0x102 ÷ 256⌋ = 0x1 is a shift right, although the two bytes within the 2-byte integer in the Intel machine now contain (0x01, 0x00), so that it apparently has been shifted LEFT from its original 0x102 = (0x02, 0x01). (Please remember: the Intel machine is a little-endian machine.)
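The two points above can be checked with a minimal Python sketch. It assumes that `int.to_bytes` is an acceptable stand-in for how a 2-byte integer is laid out at increasing addresses; the variable names are illustrative only.

```python
value = 0x102                                  # literal written most significant digit first

little_before = value.to_bytes(2, "little")    # bytes at increasing addresses
big_before = value.to_bytes(2, "big")

shifted = value >> 8                           # defined on the number: floor(258 / 256)

little_after = shifted.to_bytes(2, "little")
big_after = shifted.to_bytes(2, "big")

print(little_before.hex(" "), "->", little_after.hex(" "))  # 02 01 -> 01 00
print(big_before.hex(" "), "->", big_after.hex(" "))        # 01 02 -> 00 01
```

The numeric result is 1 in both cases; only the byte layout in memory differs, which is where the apparent "left" movement on the little-endian layout comes from.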


 * Concerning 1, I agree about the analogy, though I think that the term "big-endian" is not used in practice for literals (IMHO, a source would be needed, otherwise this analogy could be regarded as WP:OR). Concerning 2, I agree that "<<" means left shift and ">>" means right shift (I suppose that this notation originally came from the C programming language), but I don't see any relation with big-endian; if there is such a relation, this needs to be clarified, and a source about the relation would be needed too. — Vincent Lefèvre (talk) 12:27, 18 January 2021 (UTC)

OK, if you don't see (“don't see any relation with big-endian”) that the right shift of the byte sequence (0x02, 0x01) >> 8 is a right shift (of the integer 513 resulting in 2) on a big-endian machine; and the right shift of the very same byte sequence (0x02, 0x01) >> 8 is a right shift (interestingly now of the integer 258, resulting in 1) on a little-endian machine, then I have to give up. And I'm absolutely sure that I am not the first observer of this fact. But nevertheless, I am not at all interested in looking up the literature for former observers. –Nomen4Omen (talk) 13:08, 18 January 2021 (UTC)
 * When IBM was designing big-endian System/360 over 50 years ago now, they knew that big-endian was the right way. Little endian is sometimes called wrong-endian. Since carry in addition (and subtraction) is LSD to MSD, it is slightly easier to do on little-endian systems where addresses are increasing. But division is done MSD to LSD, so any system that supports division in hardware might just as well be big-endian. In earlier days, debugging from hexadecimal dumps was not so rare, and reading numerical values is easier in big-endian order.  Though VAX/VMS has fixed this in their DUMP program, which for each line gives the ASCII value left to right, and hexadecimal value right to left, with the address column in between. The unix od program, with the -x option, prints out hexadecimal 16 bit words such that reading bytes come out in the wrong order.  This complicates, for example, reading ASCII data in hex dumps. Most written (non-programming) languages now use Arabic numerals in big-endian order, even when the underlying language is written in a different direction. Yes I am an endianist. Gah4 (talk) 13:11, 18 January 2021 (UTC)
 * Many early computers were word addressed such that this problem didn't occur. Well, many scientific machines use 36 bit words with a 6 bit character set. (Lower case letters hadn't been invented yet.) As well as I know, the software on those machines (with no help from the hardware) stored text data with leftmost characters in the more significant digits. That is, big-endian. Also, as well as I know, left and right shift operations trace back to such word addressed machines. Since as written on paper, << points left, and >> points right, the names of the C operators seem to make sense.  Mnemonics like SHL and SHR were commonly used in assemblers for left and right shift instructions long before C.  Yes in our world big-endian is much more natural. Gah4 (talk) 13:22, 18 January 2021 (UTC)
 * In any case, shifts are defined on numbers, so that things like "(0x02, 0x01) >> 8" do not mean anything. — Vincent Lefèvre (talk) 13:31, 18 January 2021 (UTC)
 * Well shifts can be applied to bit patterns that don't necessarily have a numeric value. Bits in a bit image aren't numbers, but can still be shifted. Ethernet MAC addresses also aren't numbers. Gah4 (talk) 13:47, 18 January 2021 (UTC)
 * The point is that shifts have standard definitions only on integers, via their binary representation. You may want to define shifts on other kinds of abstract data, but then you need to define them, which is not done in this article. — Vincent Lefèvre (talk) 14:33, 18 January 2021 (UTC)
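The hex-dump remark earlier in this thread (16-bit words making ASCII data harder to read) can be illustrated with a small Python sketch. It assumes a dump tool that prints each 16-bit word as one hexadecimal number in the host's byte order; the `struct` format strings here only simulate that, they are not the tool itself.

```python
import struct

data = b"ABCD"                          # ASCII bytes 41 42 43 44 at increasing addresses

byte_dump = data.hex(" ")               # byte-by-byte dump keeps address order

# Interpret the same bytes as two 16-bit words, as a word-oriented dump would.
le_words = struct.unpack("<2H", data)   # little-endian host
be_words = struct.unpack(">2H", data)   # big-endian host

le_dump = " ".join(f"{w:04x}" for w in le_words)
be_dump = " ".join(f"{w:04x}" for w in be_words)

print(byte_dump)  # 41 42 43 44
print(le_dump)    # 4241 4443
print(be_dump)    # 4142 4344
```

On the simulated little-endian host the two characters of each word appear swapped, while the byte-wise dump and the big-endian word dump both preserve the text order.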

Yes, you're right, shifts are defined on numbers. But numbers are represented as byte sequences, and have effects on the byte sequences which are well-known and caused by the definition. And there is no need to define the effects if one can look at them. It is pointless to define away the effects of such a definition as you appear to want to. So, if you have a 32-bit integer, e.g. containing  (as already mentioned in the article), and you shift it to the right on a big-endian then you get   (looking like a right shift), whereas on a little-endian you get   which is apparently a left-shift. –Nomen4Omen (talk) 15:03, 18 January 2021 (UTC)
 * I fully agree with Gah4. Although I would not use the expression wrong-endian. But it's obvious that little-endian is anti-correlated to the usual direction of writing, reading, thinking, which the programming languages adhere to.
 * OK, if I understand correctly, you have a 4-character text stored in memory, which you read as a 32-bit integer, then you perform an 8-bit right-shift, and you store the result back to memory, which is eventually observed as text. Then yes, on a big-endian machine, the text appears to be shifted one character to the right in a LTR script (but one character to the left in a RTL script), and on a little-endian machine, the text appears to be shifted one character to the left in a LTR script (but one character to the right in a RTL script). However, this is very different from what the current text in the article says (which just mentions bit-shifts without explanation). However, even when properly described, I don't think that programmers do such a kind of things in practice (if some do, a source is welcome), so that I'm not sure that this is worth mentioning, at least not at the very beginning. Still, this doesn't imply that bit-shifts are regarded as big-endian. — Vincent Lefèvre (talk) 21:05, 18 January 2021 (UTC)
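The scenario described in the comment above (a 4-character text read as a 32-bit integer, shifted right by 8 bits, and stored back) can be sketched in Python as follows. The helper name `shift_text_right_8` is hypothetical and exists only for this illustration.

```python
def shift_text_right_8(raw: bytes, byteorder: str) -> bytes:
    """Load 4 bytes as an unsigned 32-bit integer, shift right by 8 bits, store back."""
    value = int.from_bytes(raw, byteorder)
    return (value >> 8).to_bytes(4, byteorder)

text = b"ABCD"

big = shift_text_right_8(text, "big")        # characters move towards higher addresses
little = shift_text_right_8(text, "little")  # characters move towards lower addresses

print(big)     # b'\x00ABC'
print(little)  # b'BCD\x00'
```

The same arithmetic operation moves the characters in opposite directions in memory, depending only on the byte order used for the load and the store.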

It IS already mentioned in the article. As a problem – i.e. a trap into which programmers have already fallen. And it tells us that only extremely few programmers have to write programs which deal with the problem – and that they will notice extremely soon, namely during testing, that there is a problem. But this problem is not presented in a systematic way in the article. [By the way, do you know how RTL script is recorded in storage? On a big-endian machine? It is not easy to find that out. But there are good reasons for the assumption that the Hebrew Bible is kept in storage starting with the Torah at the low addresses. And because a right shift on a big-endian machine shifts into the high addresses, this remains a right shift – and only on screen does it appear as a left shift.] –Nomen4Omen (talk) 21:51, 18 January 2021 (UTC)
 * No, it is not mentioned in the article (BTW, your link is invalid, it seems that you left some control character). There is nothing about bit-shift operations in the article. LTR and RTL scripts are stored in the same way: one character after the other, the first character being stored first (the first character being the one on the left in a LTR script, and the one on the right in a RTL script), and such scripts can even be mixed. There are special Unicode characters to indicate the writing directions: Left-to-right mark and Right-to-left mark (see the note about the computer's memory there). Note also that the notion of left and right makes sense only when writing, e.g. on screen. — Vincent Lefèvre (talk) 22:56, 18 January 2021 (UTC)
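The statement above that RTL text is stored in reading order, first character first, can be inspected with a short Python sketch. The Hebrew word is an arbitrary example chosen for this illustration, and UTF-8 is assumed as the encoding.

```python
import unicodedata

word = "\u05e9\u05dc\u05d5\u05dd"     # Hebrew "shalom": shin, lamed, vav, final mem

first = word[0]                        # index 0 is the character that is read first
first_name = unicodedata.name(first)
first_bidi = unicodedata.bidirectional(first)   # "R" marks a right-to-left character

encoded = word.encode("utf-8")         # the first character occupies the lowest offsets

print(first_name, first_bidi)          # HEBREW LETTER SHIN R
print(encoded[:2].hex(" "))            # d7 a9
```

The character drawn rightmost on screen sits at the lowest index in the stored sequence; the display direction comes from the characters' bidirectional class, not from the storage order.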

Endianness is not just bytes in a word
I'm tired of fighting this edit battle, but endianness is not just "bytes in a word" but rather describes arbitrary sets of bits within a bit vector. "Bytes within a word" is the most common use but this arbitrarily narrow definition does a disservice to our users. J.Mayer (talk) 20:38, 13 November 2020 (UTC)
 * , do you want to offer a citation or two supporting this? ~Kvng (talk) 14:43, 16 November 2020 (UTC)
 * Please don't run to playing the citation card when the vast majority of this article remains unsourced.
 * 𝓦𝓲𝓴𝓲𝓹𝓮𝓭𝓲𝓪𝓘𝓼𝓝𝓸𝓽𝓟𝓮𝓮𝓻𝓡𝓮𝓿𝓲𝓮𝔀𝓮𝓭-𝓟𝓮𝓮𝓻𝓡𝓮𝓿𝓲𝓮𝔀𝓮𝓭𝓜𝓮𝓪𝓷𝓼𝓡𝓮𝓿𝓲𝓮𝔀𝓮𝓭𝓑𝔂𝓟𝓮𝓮𝓻𝓼𝓞𝓷𝓵𝔂 (talk) 22:08, 4 February 2021 (UTC)
 * This has been discussed before, with no consensus to add this. Even if current sources are weak, adding unsourced info makes it worse, not better. Without sources, this won't be added. --A D Monroe III (talk) 02:31, 5 February 2021 (UTC)
 * I guess, almost nobody disagrees: endianness is not just "bytes in a word" but rather describes arbitrary sets of bits within a bit vector.
 * But it is first "bytes in a word", and later extended to whatsoever. So for the purpose of discussion one may concentrate the first half to "order of bytes in a word and bits in a byte" (although "bits in a byte" could be placed in a separate article), then give an outlook to the so extremely open end, literally and philosophically. –Nomen4Omen (talk) 08:33, 5 February 2021 (UTC)
 * Well, it is at least bytes in a doubleword, quadword, and octoword on some processors. Not to mention where the order of bytes in a word is different from the order of words in a doubleword or quadword. Direct bit addressing is rare, but addressing bits in a word is not so rare. Gah4 (talk) 08:49, 5 February 2021 (UTC)
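As an illustration of the case mentioned above, where the order of bytes in a word differs from the order of words in a doubleword, here is a minimal Python sketch. The layout (16-bit words stored most significant word first, each word stored least significant byte first) is the one commonly described for 32-bit values on the PDP-11; the helper `store_mixed` is hypothetical.

```python
def store_mixed(value: int) -> bytes:
    """Store a 32-bit value as two little-endian 16-bit words, high word first."""
    high, low = value >> 16, value & 0xFFFF
    return high.to_bytes(2, "little") + low.to_bytes(2, "little")

value = 0x0A0B0C0D

mixed = store_mixed(value)
little = value.to_bytes(4, "little")
big = value.to_bytes(4, "big")

print(big.hex(" "))     # 0a 0b 0c 0d
print(little.hex(" "))  # 0d 0c 0b 0a
print(mixed.hex(" "))   # 0b 0a 0d 0c
```

The mixed layout matches neither pure convention, which is why a single "bytes in a word" ordering does not describe it fully.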

RTL script in RAM
As far as I can find out in Right-to-left mark, the first character of a Hebrew text appears on the right of the screen, but is recorded "first" (i.e. at the low address) in RAM. (See also the discussion one section higher with Vincent Lefèvre.) So the Hebrew Bible would be held in storage starting with the Torah at the low addresses. This would mean (as you say) "addresses increasing to the (screen-)left". But because the first char is the most significant one and is at the low RAM address, this resembles the big-endian convention.

This would mean that right-to-left (RTL) languages do not have an intrinsic conflict in the (RAM-)big-endian systems!? And consequently have an intrinsic conflict in the little-endian systems!?

Or, in your words: "This conflict between the memory arrangements of binary data and text is intrinsic to the nature of the little-endian convention", and ―as we now know― is a conflict for languages written left-to-right, such as English, but also for RTL scripts!? ―Nomen4Omen (talk) 06:13, 28 April 2021 (UTC)


 * First of all I want to clarify that I merely restored that last sentence in the section to a previous version (5 June 2020) where it looked more complete, and did not write it myself. I'm not sure that version accurately reflects what I wanted, and I've been thinking of rewriting it entirely, without getting into details of how text is stored in memory for different languages and focusing on how those languages represent numbers, but I was out of ideas on how to write it, so for the time being I just undid that part of your edit.
 * What I wanted to clarify was that the reason little endian "doesn't feel natural" is not due to little endian itself, but also to the fact that most Western scripts write numbers starting from the most significant place, but this is not a universal fact: for example, Arabic script is written from right to left, but numbers are still written with the most significant digit on the left (and thus in "little endian"), so the "natural" way to represent numbers would be in little endian for them; so it is wrong or misleading to say that this discrepancy between source code and little endian representation is "intrinsic to the nature of little endian" as the article stated: it's intrinsic to little endian AND the way some languages such as English write text and numbers.
 * Do you have any suggestions on how this could be written to better reflect this? Or maybe the paragraph could simply be removed, since it doesn't really contribute much. —Cousteau (talk) 21:25, 28 April 2021 (UTC)
 * Concerning Arabic, even though numbers are still written with the most significant digit on the left, they are read from left to right, like in English.
 * But comparing the way to store numbers and how numbers are written/read doesn't make much sense. If some people think that little endian isn't natural because numbers don't look like the way they are written, how about leading zeros obtained with big endian? Leading zeros are not natural either when writing numbers.
 * When a human reads a sequence of bytes written from left to right, visualizing the numbers is more difficult in little endian, but that's all. While numbers are written from left to right in Arabic, I'm wondering how they write a sequence of bytes. If they write the bytes from right to left, then little endian would be easier to handle than big endian for them. Vincent Lefèvre (talk) 22:12, 28 April 2021 (UTC)


 * That could still be a quirk of the language though. For example, in German, the ones are spelled before the tens, so for example 42 is called "two-and-forty" (zweiundvierzig).  (They still write it as "42" though ☺)  This also happens in English with the "-teen" numbers, e.g. "fourteen" ("four-ten").
 * Regarding storage in memory, I noticed that, while Arabic is written from right to left (and thus the first/rightmost character is stored on the lowest memory position), i.e., Arabic letters are right-to-left Unicode characters, Arabic-Indic numerals (Eastern Arabic numerals) seem to be left-to-right characters, so e.g. "١٠٠٠" (1000) is written with the "١" on the first memory position and the three "٠" on the following three positions, so it was probably a poor example for memory storage since the numbers would still be "stored in big endian in memory". (However, this is probably for consistency with the alternative Western Arabic numerals also in use, which are also left-to-right characters; if text encoding had been invented in an Arabic-speaking country, it would probably store everything scanning the characters from right to left, including numbers.) —Cousteau (talk) 12:24, 29 April 2021 (UTC)
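The observation above about Arabic-Indic digits can be checked against the Unicode character database with a short Python sketch. It only inspects the stored order and the bidirectional classes; how a renderer lays the text out on screen is outside its scope.

```python
import unicodedata

number_text = "\u0661\u0660\u0660\u0660"    # Arabic-Indic digits for 1000

first = number_text[0]                       # stored first, i.e. at the lowest index
first_name = unicodedata.name(first)
first_digit = unicodedata.digit(first)
as_int = int(number_text)                    # Python's int() accepts Unicode decimal digits

digit_bidi = unicodedata.bidirectional(first)       # "AN": Arabic number
letter_bidi = unicodedata.bidirectional("\u0627")   # "AL": Arabic letter (alef)

print(first_name, first_digit, as_int)  # ARABIC-INDIC DIGIT ONE 1 1000
print(digit_bidi, letter_bidi)          # AN AL
```

The most significant digit is stored first, and the digits carry a different bidirectional class from the letters, which is consistent with the numbers being laid out most significant digit on the left inside right-to-left text.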

Thank you both for your contributions. I guess that much of the confusion comes from mixing up some reading or writing direction with the sequence of addresses in RAM or memory. As far as I can see, there are about 3 coordinates which are relevant in this discussion:
 * 1) increasing RAM addresses (natural numbers, orderable: simple, but relevant for endianness)
 * 2) "significance":
 * 2a) numerics: most important digit in positional notations, which is almost the only relevant system of writing numbers, mostly decimal on paper, mostly binary in memory.
 * 2b) text strings: sortable one way or the other. But it is extremely important that they are sortable (telephone books etc.). When to be sorted, the first different character decides the order. (This way of sorting coincides with the sort order of (fixed length) numerics when the most significant digit is the first character.)
 * 3) reading direction on paper or screen: left-to-right or right-to-left. (Cousteau writes: "Arabic script is written from right to left", but ONLY on paper or on screen and NOT in memory! As far as I could find out, in RAM it is recorded ("written") with addresses increasing.)
As far as I could find out, the reading, writing or "visualizing" (as Vincent Lefèvre names it) direction has absolutely nothing at all to do with RAM addresses: texts stored in RAM have the character which is to be read first situated at the lowest RAM address (where it is impossible to visualize them), with the consequence that the coordinates relevant to endianness are totally unchanged by LTR or RTL scripting. (The direction in which things are displayed on screen is specified by extra Right-to-left marks, located also in the computer memory.) Whatever Cousteau wanted to express with his insertion of "(On the other hand, right-to-left languages have a complementary intrinsic conflict in the big-endian system.)", it is wrong, because RTL scripts fit very well with big-endian systems and do not have a conflict with them. (This kind of conflict arises only if one identifies the reading, writing or visualizing direction (on paper or on screen) with some recording direction in the RAM, an identification which seems to be very popular, but, in fact, is completely misguided.) −Nomen4Omen (talk) 08:02, 29 April 2021 (UTC)
 * I believe in Pugh's book there is a statement about big-endian being the right choice. In those days, it was common to debug from hex dumps. If you print out bytes in hex, in little endian order, the digits of each byte are (usually) still big-endian. Even more, the unix od program prints 16 bit words, such that the bytes of a word are in the wrong order. The VMS (hex) DUMP program prints the address in a column, with the ASCII values on the right, and read left to right, and the hex data on the left, read right to left. Some years ago, I went to a talk about RISC-V, where they handed out a (green) reference card. The instruction formats are written with the opcode on the right. It just looks weird to read instructions with the opcode on the right. (They are at lower address.) It is slightly easier to build a simple processor little-endian, such as the 6502 (look at the 6502 CALL instruction for some fun), as addition, and so carry, is done LSB to MSB. But multiply and divide are done MSB to LSB, so it doesn't help on those machines. The 8080 is little-endian, as that is the way CTC designed the 8008. I suppose reading hex dumps is less common now, but it is still inconvenient for little-endian systems. Gah4 (talk) 08:27, 29 April 2021 (UTC)
 * Actually, there is no requirement on order in RAM, though it is common to store data at increasing addresses. The first Fortran compiler (on the IBM 704) allocated arrays in decreasing addresses. As long as you do it consistently, there is no problem. I believe code was from the bottom of memory up, and data from the top down. The 704 has 36 bit words, with word addressing. It is common to store 6 characters per word, but there is no hardware to do that. (Actually, tape I/O probably does, but the card reader doesn't.) Many systems allocate the stack at decreasing addresses. But most don't do that. Also, there is often no connection between address order and physical order in hardware. Gah4 (talk) 08:34, 29 April 2021 (UTC)
 * I agree absolutely: it is possible to climb the hill up or slide it down the other way.
 * But this is NOT the issue of the article. The terms big- and little-endianness bind the order of "significance" of numeric digits in a multi-digit positional notation to the order of RAM addresses. And as you say "As long as you do it consistently, there is no problem." But the article deals with effects where consistency is lost, either by leaving the originating hardware or by having a guy/girl looking into a hexdump. So, regrettably, endianness has to be made explicit. (Whether the hexdump on its side again defines its own new order is not at all part of a DEFINITION in this article, although so many, many people seem to have a need to comment on that.) ―Nomen4Omen (talk) 09:23, 29 April 2021 (UTC)
 * Like addition, multiplication is done LSB to MSB. And even though division is done MSB first, it needs normalization. Anyway, endianness matters only if you perform the operation byte by byte in memory, and in such a case, this is possible only for addition and subtraction (in one pass) and multiplication (in multiple passes). Division is much more complex, and division algorithms use multiplication and subtraction at least, so that I think that there is still a benefit for little endian on low-end processors concerning division. Processors that can handle the whole integers in registers are not affected by endianness, since once a value is read in a register, endianness is no longer involved. Note that all this still matters for multiple precision (where one has arrays of words); the endianness of the arrays is chosen by the multiple-precision software and may be different from the word endianness, and little-endian is probably a better choice as explained. This is probably why GMP uses little-endian for its multiple-precision data types. — Vincent Lefèvre (talk) 11:52, 1 May 2021 (UTC)
 * For any processor complicated enough to do division, the difference is small enough to ignore. The 6502 is pretty funny, though. For most, the CALL instruction pushes the address of the next instruction on the stack. Seems obvious enough, but not for the 6502. It pushes one less than the address, as that is what is in the register at the time. The RET instruction then needs to add one. Presumably it simplified the logic, but makes it a little less obvious for programmers. If you want to do indirect jump, you push the address on the stack, and RET, but the address has to be one less. The instruction sets for S/360 and VAX were designed to make it easier for assembly programmers, even if it made it harder for processor designers. (Except that VAX is little endian for most things, except floating point.) But you can't get away from text, as you have to describe the processor to users, and even write Wikipedia articles about them. And even without text, as I noted, the instruction formats for RISC-V have the opcode on the right, though it would normally be on the left. VAX documentation puts the opcode on the left. Oh well. Gah4 (talk) 20:37, 1 May 2021 (UTC)
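The point made in this thread about byte-by-byte addition and multiple-precision arithmetic can be sketched in Python. This is a simplified illustration with 8-bit "limbs" and a hypothetical helper name, not the representation of any particular library.

```python
def add_little_endian(a: bytes, b: bytes) -> bytes:
    """Add two equal-length unsigned integers stored least significant byte first."""
    assert len(a) == len(b)
    result = bytearray()
    carry = 0
    for x, y in zip(a, b):           # lowest address first == least significant first
        total = x + y + carry
        result.append(total & 0xFF)  # low 8 bits stay in this position
        carry = total >> 8           # overflow propagates to the next higher address
    result.append(carry)             # final carry becomes the most significant byte
    return bytes(result)

a = (0x01FF).to_bytes(2, "little")
b = (0x0001).to_bytes(2, "little")

total_bytes = add_little_endian(a, b)
total = int.from_bytes(total_bytes, "little")

print(total_bytes.hex(" "))  # 00 02 00
print(hex(total))            # 0x200
```

With least-significant-first storage, the carry travels in the same direction as the address counter, so the sum can be produced in a single pass over increasing addresses. With most-significant-first storage the loop would have to start at the highest address and walk downwards.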


 * The thing is that there isn't a "left" or "right" in RAM, only low addresses and high addresses, with bytes at low addresses being normally meant to be read first. We simply choose what a low address means, and given the way text is read it makes sense to store the leftmost character on the first address, since it's the one that will be read first (for left-to-right scripts).  In this sense, you're actually equating the "increasing address" axis with the time axis, not with the horizontal direction (left/right) axis.  Usually, on little-endian systems you should think of the memory map as going right-to-left and not left-to-right.
 * Regarding sorting, it is true that the usual way to sort words is alphabetically, with the first character being the most significant, so if you want to use the same sorting function for binary (natural) numbers, they must be encoded in big endian for that to work. (Still, as you point out, this would only work if you assume fixed width in the numbers, otherwise you get 25 being collated between 1 and 3.)
 * Regarding significance, big endian presents the most important data first, which is probably something preferable in case you're receiving digits one by one and want to start processing as soon as possible, and an actual weakness inherent to little endian (but that is not what was discussed in the paragraph at hand); however, I don't think it is usual to start processing the information before you have read the whole number. (One case in which this is actually helpful is in UTF-8, which splits long character codes into multiple bytes in big endian, which allows sorting UTF-8 strings with bytewise sorting functions; this wouldn't be possible if it used little-endian.)
 * However, if we're strictly equating the address axis with the significance axis, semantically it would make sense that low address = low significance and high address = high significance, wouldn't it?
 * Regarding the "meaning" of each position, little-endian has the advantage that the first digit is always the ones, the second digit is always the tens (or twos, 16s, 256s, etc), and so on, so numbers are always "aligned" regardless of their length.
 * This alignment trait is exploited in some microprocessors so that they can store a byte, a short, or a long integer in a register and then use it directly: if it were big-endian, a byte would be loaded into the leftmost 8 bits of the register and would need to be shifted to the right, whereas on little-endian the representation is the same for any length, so you save one step.
 * In any case, we're deviating from the point, which is whether this conflict is "intrinsic to the nature of little-endian", as the article previously stated, or "intrinsic to the different natures of little-endian vs the way numbers are represented in plain text". If you don't have any complaints or further suggestions, I think I will use that sentence in the article (and erase the rest of the paragraph).  —Cousteau (talk) 12:24, 29 April 2021 (UTC)
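
The sorting and alignment points in the list above can be checked with a short Python sketch. The helper names (`bytewise_sorted`, `read_prefix`) and the sample values are chosen here for illustration. The prefix-reading part models memory rather than registers, so it is only an analogy to the register behaviour mentioned above.

```python
def bytewise_sorted(values, width, byteorder):
    """Sort integers by comparing their fixed-width byte encodings."""
    encoded = sorted(v.to_bytes(width, byteorder) for v in values)
    return [int.from_bytes(b, byteorder) for b in encoded]


values = [3, 25, 1, 256]

# Fixed-width big-endian: bytewise order equals numeric order.
big_sorted = bytewise_sorted(values, 4, "big")

# Fixed-width little-endian: 256 is encoded as 00 01 00 00,
# so it collates before 1 (01 00 00 00).
little_sorted = bytewise_sorted(values, 4, "little")

# Variable-width decimal text: "25" collates between "1" and "3".
text_sorted = sorted(["3", "25", "1"])

# UTF-8: bytewise order of the encodings matches code point order.
words = ["z", "\u00e9", "\u20ac", "a", "\U0001F600"]
utf8_sorted = sorted(words, key=lambda s: s.encode("utf-8"))
codepoint_sorted = sorted(words)  # Python compares str by code point


def read_prefix(buf, nbytes, byteorder):
    """Interpret the nbytes at the lowest addresses of buf as an integer."""
    return int.from_bytes(buf[:nbytes], byteorder)


small = 0x2A
little_buf = small.to_bytes(4, "little")  # 2A 00 00 00
big_buf = small.to_bytes(4, "big")        # 00 00 00 2A

# Reading 1, 2 or 4 bytes from the same start address:
little_reads = [read_prefix(little_buf, n, "little") for n in (1, 2, 4)]
big_reads = [read_prefix(big_buf, n, "big") for n in (1, 2, 4)]
```

Here `little_reads` is the same value at every width, while `big_reads` only yields the stored value when the full width is read.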

Let's expose the points pertaining to the article:
 * 1) (As you say:) There isn't a "left" or "right" in RAM, only low addresses and high addresses. (Although many, many people "visualize" the low addresses on the left and the high ones on the right. IMHO, this establishes the pitfalls and trapdoors of the article.)
 * 2) (As I say:) Little-endianness correlates the "address axis with the significance axis", whereas big-endianness anti-correlates them. (Nobody knows or is interested in what the "semantic sense" of that could be, although you clearly seem to attribute semantic sense to correlation.) The other endiannesses do neither.
 * 3) (Taken really seriously:) Endianness has nothing to do with human reading or writing, and thus nothing to do with "the way numbers are represented in plain text". Strangely enough, as it turns out, even right-to-left scripts are recorded starting at the low addresses. The consequence: with respect to RAM addresses (and that is the axis related to endianness), RTL is as big-endian (and not little-endian) as LTR is big-endian (and not little-endian).
So, as a concluding sentence of the paragraph Endianness, I find the previous
 * "This conflict between the memory arrangements of text and binary data is intrinsic to the nature of the little-endian convention."
as having more reality and relevance than your
 * "... to the different natures of little-endian vs the way numbers are represented in plain text."
But the truth is that plain text has nothing at all to do with endianness in its proper sense. And the interpretation of a hexdump cannot be the purpose of the article; everybody recommends taking a debugging course for that. ―Nomen4Omen (talk) 14:07, 29 April 2021 (UTC)
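
Points 2 and 3 above can be illustrated with a small Python sketch. The function names (`weights_by_address`, `rebuild`) and the sample Hebrew word are assumptions chosen for the example. The sketch shows which power of 256 each address offset carries under each byte order, and that a right-to-left string is stored with its first-read character at the lowest address.

```python
def weights_by_address(width, byteorder):
    """For address offsets 0..width-1, return the power of 256 each byte carries."""
    if byteorder == "little":
        return list(range(width))            # address and significance rise together
    return list(range(width - 1, -1, -1))    # significance falls as address rises


def rebuild(data, byteorder):
    """Recombine bytes into an integer using the per-address weights."""
    exponents = weights_by_address(len(data), byteorder)
    return sum(b * 256 ** e for b, e in zip(data, exponents))


value = 0x4A6F686E
rebuilt = {order: rebuild(value.to_bytes(4, order), order)
           for order in ("little", "big")}

# A right-to-left word (Hebrew "shalom"), written with escapes to avoid
# display-order confusion. Its first-read letter is U+05E9.
rtl_text = "\u05e9\u05dc\u05d5\u05dd"
encoded = rtl_text.encode("utf-8")

# Each of these letters takes two bytes in UTF-8; the two bytes at the
# lowest addresses decode to the first-read letter.
first_stored = encoded[:2].decode("utf-8")
```

Both entries of `rebuilt` equal the original value, and `first_stored` is the letter that is read first, even though it is displayed rightmost.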


 * The reason I don't like the previous phrasing is that it implies that the little endian convention is "inherently backwards" by itself, when it is simply "backwards with respect to text", i.e., this backwards property only appears when you compare it to another system, so the property isn't "intrinsic" to one or the other, but to their difference. (And, other than when writing numbers, I fail to see how regular text is "big endian" by nature, i.e., with the "most significant characters" first.)
 * Overall, the whole "Byte addressing" section seems to be built upon the premise that bytes are to be represented from left to right, which is a completely arbitrary choice of representation for hex dump that is consistent with how text is stored in memory, and the fact that "4A 6F 68 6E looks like 4A6F686E when you type the bytes in hex from left to right", but that is only due to the way we chose to represent numbers in text form (which happens to be in "big endian"). So basically, the argument being given for little endian "being backwards" is only "because it's different to how we represent numbers".  That is not "intrinsic to little endian" any more than it is intrinsic to the way we represent numbers.
 * Honestly I think that, if we don't reach a consensus, the best approach would be to simply delete that line. As I said, it doesn't really contribute any value to the article, it is more confusing than helpful, and seems to reflect an opinion or personal conclusion more than a fact. —Cousteau (talk) 23:09, 29 April 2021 (UTC)
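
The hex dump argument above can be made concrete with a short Python sketch. The `dump` helper and its `ascending` parameter are hypothetical names for this example. The same four bytes are printed with addresses increasing left to right and then right to left, and are interpreted under both byte orders.

```python
data = bytes.fromhex("4A 6F 68 6E")  # bytes at increasing addresses

as_text = data.decode("ascii")
as_big = int.from_bytes(data, "big")
as_little = int.from_bytes(data, "little")


def dump(data, ascending=True):
    """Hex dump with addresses increasing left-to-right, or right-to-left."""
    ordered = data if ascending else reversed(data)
    return " ".join(f"{b:02X}" for b in ordered)


left_to_right = dump(data)                    # matches the big-endian reading
right_to_left = dump(data, ascending=False)   # matches the little-endian reading
```

Printed left to right, the dump reads like the big-endian value; printed right to left, the same memory reads like the little-endian value. That suggests the "backwards" appearance depends on the direction chosen for the dump.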