Talk:IEEE 754

Exponent range
I reverted a recent edit regarding exponent range. The complication is that IEEE 754 binary has one significand bit before the binary point, whereas some other formats have the binary point on the left. If you don't take that into account, the exponent appears to be one higher, but actually it isn't. That is, if you need to compare to other formats, you will be off by one bit, or 0.30 decimal digits. But if you don't need to do that comparison, then it is fine. Note that, unless I forgot, the decimal formats have the decimal point on the right of the significand. Gah4 (talk) 07:33, 9 September 2023 (UTC)
 * Decimal E max is defined in the article as Emax × log10 base. For binary64, Emax = 1023, which gives Emax × log10 2 ≈ 307.95; so the table seems correct.
 * Concerning your last sentence on the decimal formats, the IEEE 754 standard describes 2 conventions: one with the exponent e, where there is one digit before the fractional point, and one with the exponent q, where the significand is regarded as an integer (i.e. the fractional point is on the right of the significand). The exponent e is useful for both binary and decimal, as it allows one to give the rule emin = 1 − emax. However, for the decimal formats (which are not normalized), the exponent q is used to select the representation in a cohort in case of exact result (i.e. no rounding needed), because a value may have several representations.
 * — Vincent Lefèvre (talk) 01:14, 10 September 2023 (UTC)
 * But the "true" Decimal Emax is larger. E.g. in binary64 some numbers are larger than 1·10^308, making the "true" Emax slightly larger than 308. Also, in the table in its current edit, the binary256 entry gives the "true" Decimal Emax (log10 MAXVAL) and not the approximation 262143·log10 2 that is listed for the other precisions. I would suggest that (what I refer to as) the true Decimal Emax is used in this table. See my suggested "Interchange format table adjustments" below. Nsmeds (talk) 22:01, 10 September 2023 (UTC)
 * I agree that the definition Emax × log10 base (which is probably WP:OR) is bad. Taking the maximum finite number as you suggest (or the next power of the base, which is very close) would be much better. — Vincent Lefèvre (talk) 23:17, 10 September 2023 (UTC)
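A quick numeric check of the two candidate definitions discussed in this thread, for binary64 only. This is a minimal Python sketch, not part of the original discussion; the variable names are illustrative, and it assumes that the Python float type is binary64 (as it is in CPython on common platforms).

```python
import math
import sys

# binary64 parameter quoted in the thread
emax = 1023

# Definition used in the article at the time: Emax × log10(base)
article_value = emax * math.log10(2)

# Alternative suggested above: log10 of the largest finite binary64 number
max_finite = sys.float_info.max            # (2 − 2**−52) × 2**1023
log10_max = math.log10(max_finite)

print(f"Emax * log10(2)   = {article_value:.4f}")
print(f"log10(max finite) = {log10_max:.4f}")
print(f"difference        = {log10_max - article_value:.4f}  (close to log10(2))")
```

The two values differ by almost exactly log10(2), which is the one-bit offset mentioned at the start of the thread.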

Interchange format table adjustments
In an exercise I was conducting to learn more about the more novel GPU FP formats, I stumbled upon some inconsistencies in the table describing some properties of IEEE-754 FP formats and submitted an edit. I found that some entries in the column Decimal Emax were not consistent with the values stated in the respective web pages for the individual formats binary128, binary64, binary32 and binary16, but the binary256 entry was consistent. My edits were reverted by @Gah4, hence I initiate this discussion topic (on Gah4's suggestion).

I created a spreadsheet page to redo the calculations as a fun pastime and came to the same result as the binaryXX pages in Wikipedia so I made the edit.

As an example, the binary128 page states the largest regular number that can be represented is 2^16383 × (2 − 2^−112) ≈ 1.1897314953572317650857593266280070162 × 10^4932. The log10 of this value is ≈ 4932 + log10(1.18973), which is larger than 4932. The table on the page I edited states a number smaller than 4932.

Since my spreadsheet used the same formulas for each binary format and agreed with the binaryXX pages as well as one of the entries on the IEEE-754 page I felt encouraged that my calculations were right.

Images of the current version of the table, my suggested edited table, and a screenshot of my sheet are provided. The spreadsheet itself is provided as a URL link. Love to hear your comments. Nsmeds (talk) 16:17, 10 September 2023 (UTC)
 * As I noted above, it is the 2^16383 part. So, it is either 2^16383 or 10^4931.77. That is then multiplied by the significand, which is between 1 and (almost) 2. You don't take part of the significand and move it to the exponent. Gah4 (talk) 21:41, 10 September 2023 (UTC)
 * One that they explain in detail is decimal32, which has Emax of 96, which allows for values up to 9.999999 times 10^96. The largest value is almost 10^97 but Emax is 96. The actual explanation is an integer between 0 and 9999999 multiplied by a power of 10 between −101 and 90, because the bias is 101, and not the more obvious 96. In any case, Emax is what it is defined to be, independent of the actual largest value combining the exponent and significand. Gah4 (talk) 06:30, 11 September 2023 (UTC)
 * @Gah4 I think you are mixing up Emax, which is the largest exponent in the base of the floating point format in question, and Decimal Emax which should be the base 10 logarithm of the largest number representable in the floating point format. The integer part of this is the largest exponent you will see if printing a value. The fractional part tells you how far into this decade the fp format reaches
 * Just because this particular article mentions its own(?) definition of Decimal Emax doesn't make it. Emax for decimal32 is either 96, or 89 depending if you want to view the largest mantissa to be 1.1111111 or 11111111. Decimal Emax will regardless be (in my opinion) 96.999..
 * (base 10 log of the largest number). The current definition used in this article (with the exception of binary256) appears to be "base 10 log of the positive number with the largest exponent and the smallest possible mantissa". Is this a useful measure? And why is it then not used in the pages for binary16,32,64,128,256 ? Nsmeds (talk) 08:12, 11 September 2023 (UTC)
 * So my challenge to the Emax × log10(base) definition used is that for binary FP the "correct" value should be log10(2^(Emax+1) − 2^(Emax−p)), where the +1 comes from binary significands lying in the half-open interval [1,2) and p is the number of mantissa bits, not counting the assumed leading 1. This is approximately equal to log10(2 × 2^Emax) = Emax × log10(2) + log10(2). Decimal Emax in the table for binary FP is off by log10(2) except for binary256, in my opinion. Nsmeds (talk) 09:03, 11 September 2023 (UTC)
 * No. This article is about the IEEE 754 standard, and so follows the way it defines things. They define Emax, and we use that definition. Now, there are some cases where they don't define things, or in general where a standard doesn't define something, and in those cases we have a little leeway in how we do it. The way they define it, with bias and such, is a little strange, but that doesn't change the actual Emax. There are some (non-IEEE) floating point systems that use a different position of the radix point. IEEE 754 binary is very close to that used by VAX, but the position of the binary point is different. Now, if the question is for the log10 of the largest or smallest floating point value, then it is different, as you note. But that isn't the question. Gah4 (talk) 19:46, 11 September 2023 (UTC)
 * @Gah4 That is indeed the question. So
 * a) The table is currently not using the same interpretation of Decimal Emax for all entries so at least one entry should need to be edited
 * b) Is there any reference suggesting that "Decimal Emax" should be defined as the value used in this article?
 * c) Is there any practical use of a definition of Decimal Emax in the form used here ( Largest exponent value in the format x LOG10(base) )? The current value is neither representative of the highest value the format can store, nor the largest exponent printed when printing the format's largest value in decimal form.
 * Sure, Emax may be defined in the standard and should not be used differently from the standard in that case. Decimal Emax is a property of the binary format (not just the definition of Emax) and due to the nature of this particular format it has the value LOG10(2^(Emax+1)-2^(Emax-p)). For decimal FP the computation of Decimal Emax will be different due to the inherent properties of that format (and it will be different from, but not unrelated to, the highest possible encoded exponent). In decimal floating point the value of Emax is ambiguous since it will depend on where you want to define the position of the decimal point to be (after the first digit in the mantissa, after the last digit or at any other point). Nsmeds (talk) 23:35, 11 September 2023 (UTC)
 * There could be a column named: log10 of the largest representable value, but there isn't. There are some who actually worked on the standard, unlike me, who often post here. For the decimal forms, decimal Emax is the actual Emax. For binary, it is log10(2)*Emax, which makes a lot of sense. You would want to call it something else if it wasn't Emax. Gah4 (talk) 05:04, 12 September 2023 (UTC)
 * @Gah4 It would be lovely to hear from anyone of those who worked on the standard on this. I will not repeat myself, but the current column is not self-consistent and I believe has no solid basis for its claimed definition of "Decimal Emax". The IEEE decimal32 standard has an Emax of 96 (or 91 depending on your preferred position of the arbitrary decimal point) and I have seen no listing of "Decimal Emax" to compare to but I still argue it would be ≈ 97(1 − 5·10^−13). Nsmeds (talk) 07:43, 12 September 2023 (UTC)
 * The emax defined in the standard is designed to be used only with the associated base β (because the factor between the maximum finite number and β^emax depends on the base β). So formulas like log10(2)×emax do not make much sense. — Vincent Lefèvre (talk) 10:36, 12 September 2023 (UTC)
 * @Vincent Lefèvre, thanks. I can volunteer to rename the column to LOG10(maxval) and enter the correct values. Nsmeds (talk) 19:09, 12 September 2023 (UTC)
 * Personally I don't mind the way it is, but then maybe that is because I can figure it out fast enough. I did have to go through some years ago, the difference between VAX and IEEE, in the way things are defined. VAX has 0 bits before the binary point.
 * I do believe that the article could better explain the positions of the radix points in the different formats. Along with that, the way the bias is defined. Gah4 (talk) 20:37, 12 September 2023 (UTC)
 * @Gah4 let's work together on making those important concepts clearer. I haven't used the VAX format, but it sounds as if it got its revenge, for its loss in the fight over the binary format, in the decimal format 😄: encoding [0..1) vs [1..B), B being the base. Nsmeds (talk) 06:23, 13 September 2023 (UTC)
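The formula proposed in this thread, log10(2^(Emax+1) − 2^(Emax−p)), can be evaluated for all the binary interchange formats at once. The following is a minimal Python sketch under stated assumptions: the names `BINARY_FORMATS` and `log10_maxval` are made up for illustration, and p follows the thread's convention (stored fraction bits, not counting the implicit leading 1).

```python
import math

# (format, stored fraction bits p as used in the thread, emax)
BINARY_FORMATS = [
    ("binary16", 10, 15),
    ("binary32", 23, 127),
    ("binary64", 52, 1023),
    ("binary128", 112, 16383),
    ("binary256", 236, 262143),
]

def log10_maxval(p: int, emax: int) -> float:
    """log10 of the largest finite value, 2**(emax + 1) - 2**(emax - p)."""
    # Exact big-integer arithmetic avoids overflow in the wider formats;
    # math.log10 accepts arbitrarily large Python integers.
    maxval = (1 << (emax + 1)) - (1 << (emax - p))
    return math.log10(maxval)

for name, p, emax in BINARY_FORMATS:
    by_emax = emax * math.log10(2)       # definition used in the table at the time
    by_maxval = log10_maxval(p, emax)    # definition proposed in this thread
    print(f"{name:10s} emax*log10(2) = {by_emax:12.4f}   log10(MAXVAL) = {by_maxval:12.4f}")
```

For every format the two columns differ by roughly log10(2) ≈ 0.30, which is the offset argued about above.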

Possible modifications to the "ranges" table
Here is a playground where I intend to suggest a slightly modified version of the table. As it stands right now there are some repetitions that can be avoided and some info that could be added. Have a bit of patience and I will have a suggestion in a few days. (Editing tables is a pain)

I want to add information about sub-normal numbers and compact some information. I will try to not make too many changes here, but instead make small edits in a personal sandbox and then larger updates here when I think a discussion could be useful.

I need to work around the unfortunate wrapping in some places and that some columns are unnecessarily wide.

Here is the current table:

Note that in the table above, the minimum exponents listed are for normal numbers; the special subnormal number representation allows even smaller numbers to be represented (with some loss of precision). For example, the smallest positive number that can be represented in binary64 is 2^−1074; contributions to the −1074 figure include the E min value −1022 and all but one of the 53 significand bits (2^(−1022 − (53 − 1)) = 2^−1074).
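The arithmetic in the note above can be checked directly. This is a small Python sketch, assuming the Python float type is binary64 with subnormals enabled and the default round-to-nearest mode; the variable names are illustrative.

```python
import math
import sys

p = 53         # binary64 precision in bits, including the implicit leading bit
emin = -1022   # minimum exponent for normal numbers

smallest_normal = math.ldexp(1.0, emin)                # 2**-1022
smallest_subnormal = math.ldexp(1.0, emin - (p - 1))   # 2**-1074

print(smallest_normal == sys.float_info.min)   # sys.float_info.min is the smallest normal
print(smallest_subnormal)
print(smallest_subnormal / 2)                  # nothing smaller is representable
```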

Decimal digits is the precision of the format expressed in terms of an equivalent number of decimal digits. It is computed as digits × log10 base. E.g. binary128 has approximately the same precision as a 34 digit decimal number.

log10 MAX is a measure of the range of the encoding. Its integer part is the largest exponent shown on the output of a value in scientific notation with one leading digit in the significand before the decimal point (e.g. 1.698·10^38 is near the largest value in binary32, 9.999999·10^96 is the largest value in decimal32) Nsmeds (talk) 19:00, 13 September 2023 (UTC)
 * The binary log10 MAX values are rounded, or maybe truncated, to two digits after the decimal point. I think the decimal values should also do that. Gah4 (talk) 00:45, 14 September 2023 (UTC)
 * Yes, it is strange that the "log10 MAX" values for the decimal formats are much more accurate than the ones for the binary formats, but I'm not sure how this could be presented in a good way. — Vincent Lefèvre (talk) 10:51, 14 September 2023 (UTC)
 * Seems that the choices are 96.99 and 97.00. Either one is fine with me. For those who understand floating point enough to ask the question, either one will be fine. For those that don't, no value will help. As above, though, I think the article still needs to explain better the position of the radix point in the different formats. I had out, for another question, the "Alpha Architecture Handbook", which has the VAX formats in it. VAX uses 0 bits before the binary point, but 128 and 1024 bias. And the highest exponent value isn't special. Gah4 (talk) 03:46, 15 September 2023 (UTC)
 * It seems that some of the binary format values are rounded up, so rounding up to 97.00, etc., seems fair. Gah4 (talk) 03:54, 15 September 2023 (UTC)
 * When one doesn't specify, one generally rounds to the nearest. — Vincent Lefèvre (talk) 08:51, 16 September 2023 (UTC)
 * Yes. I am also wondering how many digits they should have. Three after the decimal point might be too many. Gah4 (talk) 09:39, 16 September 2023 (UTC)
 * The reason why the decimal formats have a higher accuracy in the table is simply because it is easy to express their values. I thought it better to write 97 − 2.2·10^−15 than round it to 97.00. For the binary values there is no other way when expressing them in decimal format than rounding. But I will insert my suggested edited table now. Nsmeds (talk) 20:18, 20 September 2023 (UTC)

Suggestion for a revised table:

Note that in the table above, the min exponent value listed is for normal binary numbers; the special subnormal number format allows values of smaller magnitude to be represented, but at a loss of precision. The decimal format does not define a "subnormal" form of values as such, but numbers with a leading 0 in the mantissa and an exponent with the minimal value of the format can be seen as an analog to the subnormals of the binary formats.

Decimal digits is the precision of the format expressed in terms of an equivalent number of decimal digits. It is computed as digits × log10 base. Eg binary128 has approximately the same precision as a 34 digit decimal number.

log10 MAXVAL is a measure of the range of the encoding. Its integer part is the largest exponent shown on the output of a value in scientific notation with one leading digit in the significand before the decimal point (eg 1.698·10^38 is near the largest value in binary32, 9.999999·10^96 is the largest value in decimal32). The value in the table is rounded towards zero.
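The "Decimal digits" rule quoted above (digits × log10 base) is simple to tabulate. A minimal Python sketch follows; the dictionary and function names are hypothetical, and the precisions are the usual significand widths of the binary interchange formats, including the implicit bit.

```python
import math

# Precision in bits (including the implicit leading bit) of each binary format
BINARY_PRECISION = {
    "binary16": 11,
    "binary32": 24,
    "binary64": 53,
    "binary128": 113,
    "binary256": 237,
}

def decimal_digits(digits: int, base: int) -> float:
    """Equivalent number of decimal digits: digits × log10(base)."""
    return digits * math.log10(base)

for name, p in BINARY_PRECISION.items():
    print(f"{name:10s} {decimal_digits(p, 2):6.2f} decimal digits")

# For a decimal format the rule is the identity, e.g. decimal32 with 7 digits:
print(f"{'decimal32':10s} {decimal_digits(7, 10):6.2f} decimal digits")
```

The binary128 row comes out just above 34, matching the statement that binary128 has about the same precision as a 34-digit decimal number.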


 * I would remove the column "Bias" for 2 reasons: 1) the bias is useful only when the encoding is described, while the encoding is ignored here; 2) its meaning depends on the radix: for the binary formats, the bias is related to the exponent e, and for the decimal formats, it is related to the exponent q (so, without detailed information, this is confusing). Also note that MOS:ABBR says that one writes "e.g." (with periods, and not italicised). — Vincent Lefèvre (talk) 22:52, 20 September 2023 (UTC)
 * I agree with you, Vincent. I kept it to not make too many changes from the original table, but happy to remove it. Nsmeds (talk) 09:51, 21 September 2023 (UTC)
 * Better would be one that has actual meaning. But the way it is, it suggests to people that they don't understand it, so they should read the article more carefully. (At least that is what I did.) I suppose we should see what the standard says, though. Gah4 (talk) 23:09, 21 September 2023 (UTC)
 * It could be explained, but in a specific section on the encoding of the binary and decimal formats. Having the bias in this table is misleading (in addition to being useless for most readers), because its definition is different for the binary and the decimal formats (the standard gives it in two different tables: a table for the binary formats and a table for the decimal formats). — Vincent Lefèvre (talk) 23:21, 22 September 2023 (UTC)
 * Seems that it is worse than that. For binary, it is fine. There is one (hidden) bit before the binary point, and the bias gives the right value for the exponent. For decimal, the min/max work if there is one digit before the decimal point. But instead, it is defined with the decimal point to the right of the significand, and the different bias. Two different definitions at the same time. People reading the table now, will notice the inconsistent bias, and then read the article to find out why. (That is what I did a few days ago, even though I read it all before.) Since the standard allows for either the densely packed decimal or pure binary significand, it probably makes sense for the bias to be defined that way. It would help a lot, if the article just came out and said that. Until DFP gets more popular, though, there might not be so many interested in reading about it. Gah4 (talk) 21:58, 23 September 2023 (UTC)
 * The biased exponent depends on the unbiased exponent (e or q). For the decimal formats, the representation is not normalized, and for a given operation, the choice of the member of the set of the representations that give the considered value (this set is called a cohort) is done using the exponent q (because this is simpler and more natural). That's why the definition of the bias uses the exponent q for the decimal formats. — Vincent Lefèvre (talk) 23:17, 23 September 2023 (UTC)
 * For IBM S/360 and successors, HFP, prenormalization for add and subtract is done based on the exponents. Unnormalized values can be surprising. The Fortran AINT function works by adding 0 with a biased exponent of 7. At prenormalization, the other value is shifted to match the exponents, shifting the digit before the hexadecimal point into the guard digit. The post normalization shifts back. Digits past the hexadecimal point are lost, just as AINT requires. But not all do that. Multiply and divide prenormalize, shifting out left zeros. Gah4 (talk) 11:01, 24 September 2023 (UTC)
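The two exponent conventions (e with one digit before the decimal point, q with an integer significand) and the notion of a cohort can be illustrated with Python's `decimal` module. This is only a sketch: that module is not an implementation of the decimal32 interchange format, but as far as I know it uses the same (integer coefficient, exponent q) model, so it is used here as a stand-in.

```python
from decimal import Decimal

# Largest finite decimal32 value: 9.999999 × 10**96 (7 significant digits)
largest = Decimal("9.999999E+96")
sign, digits, q = largest.as_tuple()   # q: exponent with an integer significand
e = q + len(digits) - 1                # e: exponent with one digit before the point
print(f"coefficient digits = {digits}, q = {q}, e = {e}")

# A cohort: several representations (different q) of the same value
cohort = [Decimal("1"), Decimal("1.0"), Decimal("1.00")]
print([d.as_tuple().exponent for d in cohort])   # distinct exponents q
print(all(d == cohort[0] for d in cohort))       # but equal values

# For an exact result, the member of the cohort is chosen from the operands' q
product = Decimal("1.0") * Decimal("1.00")
print(product, product.as_tuple().exponent)
```

With q = 90 for the largest value, a bias of 101 on q corresponds to Emax = 96 on e, which is the relation discussed in this thread.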

sortability
There is a recent edit noting that IEEE-754 values are sortable as sign-magnitude. I believe this is true for most sign-magnitude floating point formats, at least for normalized values when they can be unnormalized. (I am not sure about denormals, though.) The PDP-10 floating point format uses two's complement on the whole word for negative values, such that they are comparable using integer compare instructions. Not many processors supply a sign-magnitude compare operation, though. Gah4 (talk) 20:38, 10 November 2023 (UTC)
 * Do you mean that the PDP-10 two's complement also applied on the exponent field, i.e. changing the sign of the FP number would also change the encoding of the exponent? That's important to make the FP numbers comparable using integer compare instructions. — Vincent Lefèvre (talk) 00:33, 11 November 2023 (UTC)
 * Yes the whole word, including exponent. I suspect that hardware uncomplements it before using it. Maybe harder for humans to read, though. I am not sure what you mean by encoding of the exponent, but I believe that there can be carry into the exponent. Gah4 (talk) 03:14, 11 November 2023 (UTC)
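The sortability claim can be tried out on binary64 by reinterpreting the bit patterns as sign-magnitude integers. A minimal Python sketch follows, with hypothetical helper names; it checks a handful of sample values (including subnormals and infinities) rather than proving the property, and it leaves out NaNs, which have no place in the numeric order.

```python
import struct

def bits(x: float) -> int:
    """Raw 64-bit pattern of a binary64 value, as an unsigned integer."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]

def sign_magnitude_key(x: float) -> int:
    """Interpret the bit pattern as a sign-magnitude integer."""
    b = bits(x)
    magnitude = b & 0x7FFF_FFFF_FFFF_FFFF   # exponent and fraction fields
    return -magnitude if b >> 63 else magnitude

values = [3.5, -0.75, 5e-324, -2.0, 1e308, -1e-310, 0.0,
          float("inf"), float("-inf")]

by_value = sorted(values)
by_bits = sorted(values, key=sign_magnitude_key)
print(by_value == by_bits)
```

Note that +0 and −0 both map to the key 0 here, consistent with their comparing equal as floating-point values.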

It's really old
It lacks many details for professionals.

It lacks simplicity as well.

They can't decide even audience after that many years.

It is really important. Boh39083 (talk) 05:03, 19 November 2023 (UTC)


 * This is hard to make sense of. Can you elaborate a bit, with more context and more specific details? –jacobolus (t) 03:43, 20 November 2023 (UTC)

decimal exponent
I did a revert to a change on decimal exponent values. I believe it is right, because of the way they are defined, but I start this in case someone wants to discuss it, as I noted in the edit summary. Gah4 (talk) 17:20, 19 January 2024 (UTC)
 * The change was correct. The decimal exponent values got wrong in 1179557460 (but before that, there were already errors in some values for decimal64, which were introduced in the previous change 1179553657 by Nsmeds). I've corrected another value in 1197287634. — Vincent Lefèvre (talk) 22:17, 19 January 2024 (UTC)
 * OK, I am confused. What I see now isn't what I remember from the differences I saw before. There is always the question of the position of the decimal point, and I thought it was just that. In any case, we are discussing them, which is what I wanted. Gah4 (talk) 03:49, 20 January 2024 (UTC)
 * Concerning the position of the decimal point, this can make a difference of 15 or 16 in the exponent for decimal64, but here, this was a factor 2. — Vincent Lefèvre (talk) 13:53, 20 January 2024 (UTC)
 * Yes, I thought it was closer to 15, and now I see that it was a factor of 2. That is why I said I was confused about it. In any case, it is good to discuss here. Gah4 (talk) 23:22, 20 January 2024 (UTC)
 * I am sorry for not keeping in touch on this issue. There have been some other things keeping me occupied lately. Looking at the table as it stands today https://en.wikipedia.org/w/index.php?title=IEEE_754&oldid=1210204290 I am happy with the way it looks as regards the table on some useful properties of the IEEE-754 floating point formats.  I went through my spreadsheet to compare FP formats and found a bug in the computations of estimated log10(MAXVAL) and have a public (LibreOffice) copy uploaded to my Google Drive share
 * I will update the table with adjusted values of the Decimal format log10(Maxval) estimates. I am not sure, but I think that is the issue you discussed in this thread? Nsmeds (talk) 17:13, 25 February 2024 (UTC)
 * This is what I think the best approx is to log10(MAXVAL)
 * $$\alpha = 1 + \mathrm{maxexp}, \quad \beta = 1 + \mathrm{maxexp} - \mathrm{digits}$$
 * $$\log_{10}(10^\alpha - 10^\beta) = \alpha + \log_{10}(1 - 10^{\beta-\alpha}) = \alpha + \frac{\ln(1 - 10^{\beta-\alpha})}{\ln(10)} =$$
 * $$= \lbrace \alpha > \beta > 0 \rbrace = \alpha - \frac{10^{\beta-\alpha}}{\ln(10)} + O(10^{2\cdot(\beta-\alpha)}) \approx$$
 * $$\approx \alpha - \log_{10}(e)\cdot 10^{\beta-\alpha} = \alpha - 10\cdot\log_{10}(e)\cdot 10^{\beta-\alpha-1} \approx \alpha - 4.34\cdot 10^{\beta-\alpha-1}$$
 * $$= (1 + \mathrm{maxexp}) - 4.34\cdot 10^{-(1+\mathrm{digits})}$$ Nsmeds (talk) 18:44, 25 February 2024 (UTC)
 * Yes, if there are p digits, then the correction term is
 * $$\ln(1-10^{-p})/\ln(10) \approx -10^{-p}/\ln(10) \approx -4.34 \cdot 10^{-p-1}$$
 * — Vincent Lefèvre (talk) 03:21, 26 February 2024 (UTC)
 * Thanks for confirming. If you want to (and if you have access to LibreOffice/OpenOffice and dare to enable my macros), have a look at the spreadsheet. Some of the formats there are not officially accepted and/or necessarily correctly described, but I find it illuminating to feed in various research formats and see what comes out. :-) Nsmeds (talk) 08:39, 26 February 2024 (UTC)
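The correction term derived in this thread can be compared against a high-precision evaluation. The sketch below uses Python's `decimal` module with an arbitrarily chosen working precision of 60 digits; the names `DECIMAL_FORMATS`, `exact_correction` and `approx_correction` are made up for illustration, and (p, emax) are the usual parameters of the decimal interchange formats.

```python
from decimal import Decimal, getcontext

getcontext().prec = 60   # working precision, well beyond the 34 digits of decimal128

# (p = number of digits, emax) for the decimal interchange formats
DECIMAL_FORMATS = {
    "decimal32": (7, 96),
    "decimal64": (16, 384),
    "decimal128": (34, 6144),
}

def exact_correction(p: int) -> Decimal:
    """log10(1 - 10**-p), so that log10(MAXVAL) = (emax + 1) + correction."""
    return (1 - Decimal(10) ** -p).log10()

def approx_correction(p: int) -> Decimal:
    """First-order approximation from the thread: -4.34 × 10**(-p-1)."""
    return Decimal("-4.34") * Decimal(10) ** (-p - 1)

for name, (p, emax) in DECIMAL_FORMATS.items():
    exact = exact_correction(p)
    approx = approx_correction(p)
    print(f"{name:11s} log10(MAXVAL) = {emax + 1} + ({exact:.6E})   approx = {approx:.2E}")
```

The remaining difference between the two columns comes almost entirely from rounding log10(e) ≈ 0.43429 to 0.434.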

Introduction to History
Though I agree that an introduction in IEEE 754 would be useful, there are several issues with what has been added in 1210457800, so that I'm going to revert this change (mainly because of the first point below):
 * First, this should be an introduction to the history of the standardization of floating-point arithmetic, not a history on FP arithmetic (there is an article Floating-point arithmetic on this larger subject, with its own history). So, here, it should just be explained what was before IEEE 754 and why standardization was needed.
 * In the added text, "non-enumerable" is pointless and misleading: The issue compared to integers is that the considered numbers cannot be represented exactly in general. This is the case even if you restrict to the computable real numbers (which form the subset of $R$ that really matters in computing), which are enumerable (countable), but also just rational numbers.
 * Binary representation is not always used.
 * The last two sentences are not clear, but anyway, they are not related to the standardization (like the whole paragraph).
 * The image is missing, but anyway, it is not related to the standardization either.
 * Note that "mantissa" and "de-normal" are not the correct terms (there are also English and typographic mistakes).

BTW, in section History of Floating-point arithmetic, there is a paragraph on the standardization, which could serve as a basis: "Initially, computers used many different representations for floating-point numbers. The lack of standardization at the mainframe level was an ongoing problem by the early 1970s for those writing and maintaining higher-level source code; these manufacturer floating-point standards differed in the word sizes, the representations, and the rounding behavior and general accuracy of operations. Floating-point compatibility across multiple computing systems was in desperate need of standardization by the early 1980s, leading to the creation of the IEEE 754 standard once the 32-bit (or 64-bit) word had become commonplace. This standard was significantly based on a proposal from Intel, which was designing the i8087 numerical coprocessor; Motorola, which was designing the 68000 around the same time, gave significant input as well."

— Vincent Lefèvre (talk) 01:54, 27 February 2024 (UTC)


 * @Vincent Lefèvre I agree with your comments. Hopefully there is someone with us able to volunteer a more suitable introductory paragraph? Nsmeds (talk) 12:50, 27 February 2024 (UTC)