Talk:Undefined behavior

Untitled
I got rid of the todo-box comment that "UB is not a feature", because clearly it is a feature of many programming languages. (It's not usually a "feature" in the marketingspeak sense, of course.)

I also subst'ed the todo template, so it would be around to comment on. I don't mind if it's removed. (The anonymous editor makes a good point about the #pragma paragraph, though; it really is out of place.) --Quuxplusone 02:47, 6 December 2006 (UTC)


 * Hi :-) I'm not sure whether the term "feature" applies… native speakers are in a much better position to decide, and perhaps someone can come up with a wording which avoids the term altogether. BTW, I don't think subst'ing the template was a good idea. Why did you do that? (You made me suspect that I wasn't logged in when inserting my comments but I was :-) Not sure why you say I'm "anonymous". I didn't put a signature after the comments, if that's what you mean, as I thought it wasn't important). -- Gennaro Prota • Talk 12:06, 6 December 2006 (UTC)

Sorry, I saw after commenting that you were logged in; I tend to assume that if a Talk-page comment isn't signed, it's "anonymous". :) I subst'ed the template so that it would be here on the page in case anyone wondered in three months what I was talking about... except that I just noticed(!) that it didn't actually substitute in the text, which is what I was trying to do. So I've reverted my subst'ing.

Regarding "feature": UB is a "feature" of C and C++ in the same way that currying is a "feature" of ML, or funny operators are a "feature" of APL: it's something a C or C++ programmer is going to have to deal with, because it's a feature of the language. If there's a better word, I'd support it; the only other word I can think of right now is "element", and that has even more confusing overloads in a programming context. :) --Quuxplusone 23:37, 6 December 2006 (UTC)

I took out the nasal demons part of the todo-box because it is just "[r]ecognized shorthand on the Usenet group comp.std.c for any unexpected behavior of a C compiler on encountering an undefined construct.", not actual demons coming out of your nose. In other words, unexpected results are a perfectly possible outcome of undefined behavior. Also, maybe the todo-box should be taken out altogether because the article currently mentions that undefined and implementation-defined behavior are different. -- Daniel 00:19, 13 July 2007 (UTC)


 * Done. --Quuxplusone 06:59, 13 July 2007 (UTC)

Emacs first, or NetHack first?
So, the only "verifiable" paper source we have, Unix Review, gives the wrong information! --Quuxplusone (talk) 21:47, 2 November 2008 (UTC)
 * http://blog.djmnet.org/2008/08/05/a-pragmatic-decision/ quotes Unix Review from March 1988, which claims that GCC 1.17 tried to execl emacs, then hack, then rogue. This is the wrong order, at least if we assume that the order didn't change between 1.17 and 1.21.
 * http://www.oldlinux.org/Linux.old/gnu/gcc-1/gcc-1.21.tar.bz2 shows that the actual order in GCC 1.21 was hack, rogue, emacs.
 * http://www.oldlinux.org/Linux.old/gnu/gcc-1/gcc-1.40.tar.bz2 shows that by GCC 1.40 the code had been #ifdef'ed out, but the order remained hack, rogue, emacs.
 * http://everything2.com/e2node/%2523pragma gets the order right: nethack, rogue, emacs. (Note that e2 writes "nethack" instead of "hack"; "hack" is the executable GCC actually tried to execute, but by 1988 "NetHack" was the name of the game.)

Unspecified behavior
It would be great if this article explained the difference between undefined behavior and unspecified behavior. --Abdull (talk) 09:05, 4 December 2009 (UTC)

Optimizations
Should any examples be given for compilers exploiting undefined behavior for optimization purposes? For example GCC optimizes away a lot of comparisons involving signed integers by ignoring the possibility of overflow. Hiiiiiiiiiiiiiiiiiiiii (talk) 20:18, 18 May 2011 (UTC)


 * Done -- Martinkunev (talk) 14:26, 25 January 2016 (UTC)
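For readers of this thread, here is a minimal sketch of the kind of comparison being discussed. The function names are made up for illustration, and whether a given compiler actually folds the signed comparison depends on the compiler and its optimization settings.

```c
#include <limits.h>

/* Signed: x + 1 overflows (undefined behavior) only when x == INT_MAX,
   so an optimizer is allowed to assume the comparison is always true.
   Hypothetical name, for illustration only. */
int signed_increment_is_greater(int x)
{
    return x + 1 > x;
}

/* Unsigned: wraparound is well defined, so the comparison must really
   be evaluated; it is false for UINT_MAX. */
int unsigned_increment_is_greater(unsigned int x)
{
    return x + 1u > x;
}
```

Calling the signed version with INT_MAX is itself undefined, which is exactly why the compiler may ignore that case; the unsigned version returns 0 for UINT_MAX on every conforming implementation.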

#pragma joke
I think the #pragma joke section does not belong here. #pragma is not undefined, it is implementation-defined behavior. And even though we do not have the specific article, I believe unspecified behavior is a much better match. --Mormegil (talk) 21:23, 21 October 2012 (UTC)
 * I agree, and thus removed the section. Sebastian (talk) 14:43, 24 April 2014 (UTC)

Undefined behaviour resulting from pointer arithmetic
To User:Namezero111111 and other editors who are changing the example which uses pointer arithmetic (via array subscripting) to produce undefined behaviour, please refer to ISO/IEC 9899:1999 §6.5.6 ¶8: "When an expression that has integer type is added to or subtracted from a pointer, the result has the type of the pointer operand… If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined." As you can see, the mere evaluation of such an expression results in undefined behaviour; no dereferencing or assignment need have occurred. --Psychonaut (talk) 09:59, 9 September 2013 (UTC)
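A small sketch of the boundary described in that paragraph may help. The helper name is hypothetical; the point is only that forming the one-past-the-end pointer is permitted, while anything further is not.

```c
#include <stddef.h>

/* Sums the first len elements of arr by walking a pointer up to the
   one-past-the-end position. Hypothetical helper, for illustration. */
int sum_elements(const int *arr, size_t len)
{
    /* arr + len points one past the last element: it may be formed and
       compared against, but must not be dereferenced. */
    const int *end = arr + len;
    int total = 0;

    for (const int *p = arr; p != end; ++p)
        total += *p;

    /* By contrast, merely evaluating arr + len + 1 would already be
       undefined behaviour, even if the result were never used. */
    return total;
}
```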

gets
There's an entire function in C that has undefined behaviour: gets. Nothing guarantees that the string from stdin fits into the buffer passed to the function. --88.113.189.17 (talk) 15:47, 1 November 2013 (UTC)
 * Just because a function can be used to invoke undefined behaviour doesn't mean that the "entire function… has undefined behaviour". gets is pretty useless (not to mention a horrible security risk) in the general case, but its behaviour is perfectly well-defined when the size of the input is known to be less than or equal to the size of the buffer. —Psychonaut (talk) 15:56, 1 November 2013 (UTC)
 * When is the size of the user input known? You can't trust the user to always input the correct amount of data. If the user happens to write a correctly sized string, the function will behave as expected, but the concept of undefined behaviour doesn't exclude expected behaviour. A program which writes to freed memory can work as expected too, with luck. --88.113.189.17 (talk) 22:37, 2 November 2013 (UTC)
 * Again, you are confusing actual and potential invocation of undefined behaviour. Integer division by zero is also undefined behaviour, but that doesn't mean that the "entire division operator in C has undefined behaviour".  —Psychonaut (talk) 16:14, 4 November 2013 (UTC)

Why is the integer overflow thing listed as a benefit?
There is no advantage at all to this example. All it means is that the "naïve" method of detecting overflow cannot work due to being UB, therefore developers have to bend over backwards to somehow detect the overflow before it happens. It's easy to understand why it's UB when one remembers the C standard does not in any way guarantee that a platform uses two's complement and assuming it does is bad, but I don't see why one would spin that as a good thing. Medinoc (talk) 14:06, 2 September 2017 (UTC)


 * I agree it's a confused passage. The positive point its author is making is that the optimizer can optimize away code, saving space and time. The problem is that the chosen sample could instead be read as "here's how you can use C's efficiency (it doesn't try to detect overflow) and undefined behavior on a known local architecture (therefore defined behavior) to detect overflow if you insert the code appropriate for your semantics", which illustrates the positive side of C's "close to the metal" nature while also illustrating the pitfalls of undefined behavior and how certain optimizations can break that approach. It would be a perfectly good example if it were all explained like that. Also, as a small quibble, the two's complement arithmetic you mention is identical to unsigned binary arithmetic, which means it doesn't actually require processor support except for the overflow and sign status flags; on the other hand, other representations for numbers and negative numbers (BCD, sign bit, or ones' complement) require their own specialized status flags. 66.65.118.87 (talk) 14:24, 2 February 2020 (UTC)
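For reference, detecting the overflow "before it happens", as mentioned above, can be sketched as follows. This is one common approach, not necessarily what the article's author had in mind, and the function name is made up for illustration.

```c
#include <limits.h>
#include <stdbool.h>

/* Returns true if a + b would overflow int. The check uses only
   subtractions that cannot themselves overflow, so it never executes
   the undefined operation it guards against. */
bool add_would_overflow(int a, int b)
{
    if (b > 0)
        return a > INT_MAX - b;   /* a + b would exceed INT_MAX */
    if (b < 0)
        return a < INT_MIN - b;   /* a + b would fall below INT_MIN */
    return false;                 /* adding zero never overflows */
}
```

A caller would test `add_would_overflow(a, b)` first and perform `a + b` only when it returns false.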

Outdated example: i = i++ + 1 is no longer undefined behavior in C++17.
The article states that

The following example will however cause undefined behavior in both C++ and C.

This is not true in C++17, because the right operand of = is now sequenced before the left operand, so the right side is fully evaluated first. See https://stackoverflow.com/q/47702220 for details and sources. A better example might be i++ + i++. -- Preceding unsigned comment added by 2003:D8:DBC9:8700:9D1E:F4B2:5AC6:B463 (talk) 06:57, 9 April 2019 (UTC)
 * ✅ Done. Thanks for pointing this out. Please consider being bolder next time. You could have fixed this 2 years ago. BernardoSulzbach (talk) 18:36, 9 July 2021 (UTC)

Undefined behaviour caused by dereferencing a null pointer
The example below is included in the text, but if this is C code, it is incorrect, as the NULL pointer is implementation-defined. This means that it is not necessarily zero.

 int arr[4] = {0, 1, 2, 3};
 int *p = arr + 5;  // undefined behavior for indexing out of bounds
 p = 0;
 int a = *p;        // undefined behavior for dereferencing a null pointer

I believe this is, however, valid C++ code.

In order to make clear the distinction I would recommend replacing:

with

for C or:

for C++. — Preceding unsigned comment added by MaxCampman (talk • contribs) 19:17, 10 April 2019 (UTC)


 * "as the NULL pointer is implementation defined" This is a popular misconception: Yes, the integer value of a null pointer is implementation-defined (i.e. what you get if you cast it to an integer type), but the C standard says that assigning the constant 0 to a pointer creates a null pointer (and that the macro NULL is just the integer 0). See C99 standard, section 6.3.2.3: "An integer constant expression with the value 0, or such an expression cast to type void *, is called a null pointer constant. If a null pointer constant is converted to a pointer type, the resulting pointer, called a null pointer, is guaranteed to compare unequal to a pointer to any object or function." Confusing maybe, but it's the standard :-). Sebastian (talk) 07:30, 11 April 2019 (UTC)
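The guarantee quoted from section 6.3.2.3 can be checked with a short sketch like the one below. The function and variable names are hypothetical; only the comparisons themselves come from the standard's wording.

```c
#include <stdbool.h>
#include <stddef.h>

/* Checks what C99 6.3.2.3 promises about the constant 0 converted to a
   pointer type, regardless of how the platform represents null
   pointers internally. Hypothetical name, for illustration. */
bool zero_constant_yields_null(void)
{
    int *p = 0;        /* null pointer constant converted to int * */
    int object = 42;   /* any object, to compare addresses against */

    return p == NULL       /* compares equal to NULL */
        && !p              /* tests as false in a condition */
        && p != &object;   /* unequal to a pointer to any object */
}
```

None of this dereferences p, so the function itself stays well defined; `*p` would still be undefined behaviour, as in the article's example.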

Using uninitialized objects of automatic storage duration
The article asserts that "In C the use of any automatic variable before it has been initialized yields undefined behavior".

However, this does not seem to be the case. See this post by Jens Gustedt, for example, as well as various WG14 discussions. Worse yet, this appears to be rather underspecified, and different to how it is treated by C++.

I'm struggling to think of the best way to update the article to reflect this information, without unnecessarily confusing people.

Deltax64 (talk) 06:14, 11 March 2021 (UTC)