Talk:C (programming language)/Archive 15

This is an archive of past discussions. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current talk page.

Footnote b says that main has 2 arguments but there are 3

int main(int argc, char *argv[], char **env)

where char **env is a pointer to pointers of variable definitions in the actual environment.

This nice feature, among others, was previously published in another book by Kernighan and Ritchie, which presented several utilities written in FORTRAN, such as grep, easier file-handling functions, and a rational FORTRAN preprocessor called RatFor. That work, whose title I do not remember at this time, was reused in the C library and in Unix.

I did not read the article in detail, but at first glance I did not find a reference to this previous work, whose title I do not remember (I have it on the tip of my tongue).

Since I have not programmed in C for a long time, some things may have changed in newer versions of C; for that reason I have not modified the article. — Preceding unsigned comment added by 189.178.35.11 (talk) 05:18, 23 April 2014 (UTC)

That third parameter for main() is a non-standard extension primarily present on Windows and originating in old Borland C compilers, thus it's not interesting for this article; see here for more details. — Dsimic (talk | contribs) 05:30, 23 April 2014 (UTC)

You are wrong: the env parameter is present on UNIX since forever. It is however rarely used because there is also a global variable environ with the same content. Schily (talk) 09:30, 23 April 2014 (UTC)

I stand corrected, it's available on Unices as well. — Dsimic (talk | contribs) 04:00, 24 April 2014 (UTC)

You may be right, because I programmed in C before Linux existed, but I used the K&R book as a reference, and The Unix Programming Environment, too. The link that you posted does not say anything about the origin of the third parameter. I do not have any of those books at hand now. However, see this part of the gcc man page:

-Wmain

Warn if the type of main is suspicious. main should be a function with external linkage, returning int, taking either zero arguments, two, or three arguments of appropriate types. This warning is enabled by default in C++ and is enabled by either -Wall or -pedantic.

You may try to compile this program with gcc/linux:

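#include <stdio.h>

/* The third parameter of main is a common extension, not standard C. */
int main(int argc, char *argv[], char **env)
{
    (void)argc;   /* unused */
    (void)argv;   /* unused */

    /* env is a NULL-terminated array of "NAME=value" strings. */
    while (*env != NULL)
    {
        printf("%s\n", *env);
        env++;
    }
    return 0;
}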

Just to be sure, a look at those books may dispel the doubt. I came to this article because I do not have those books at hand; maybe you or another Wikipedian have them.

Anyway, this issue is not fundamental to the article, unless it is rewritten to include the influence of the book that I mentioned, where RatFor and grep were presented, because many of the functions in C were previously written to ease the use of FORTRAN. I still have the name of the book on the tip of my tongue; it is something like "Programming Tools". A search may find it, but I leave that to a volunteer who wants to modify the article in this sense. — Preceding unsigned comment added by 189.178.35.11 (talk) 06:35, 23 April 2014 (UTC)

Programming Tools in Fortran, one of a series along with Programming Tools in Pascal and Programming Tools in C. The Fortran one was the only one to include RatFor.

I see no relevance of RatFor to this article, nor any reason to add a 3rd parameter that isn't part of standard or commonplace C practice. Andy Dingley (talk) 09:38, 23 April 2014 (UTC)

I fully agree. While a third parameter pointing to the environment block is a fairly common extension, it's certainly not one sanctioned by the standard. It's also not used all that often, even when it is available. A note to the OP: all sorts of extensions are *allowed* by the standard (including this one), but that doesn't make them standard or universal. Rwessel (talk) 16:27, 23 April 2014 (UTC)
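For readers following this thread, the three ways of reaching the environment mentioned above can be compared in a minimal sketch. Of the three, only getenv is specified by the C standard; the environ global is a POSIX feature, and the third parameter of main is a common extension. The helper names count_env_entries and lookup_or_default are hypothetical, chosen only for illustration.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: counts the entries of a NULL-terminated
 * "NAME=value" vector, which is the layout used both by the
 * non-standard third parameter of main and by POSIX environ. */
size_t count_env_entries(char **env)
{
    size_t n = 0;
    while (env != NULL && env[n] != NULL)
        n++;
    return n;
}

/* Hypothetical helper: portable lookup through getenv from <stdlib.h>,
 * the only one of the three interfaces that ISO C itself specifies. */
const char *lookup_or_default(const char *name, const char *fallback)
{
    const char *value = getenv(name);
    return value != NULL ? value : fallback;
}
```

On a POSIX system, declaring extern char **environ; and passing environ to count_env_entries should walk the same vector that the third parameter of main points to at program startup.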

@

I am so confused. This page says that the at sign (@) is part of the C character set, but I cannot find any information on what it is used for. Does anybody know? 5.150.218.51 (talk) 10:13, 14 May 2013 (UTC)

An "@" would be legal in quoted strings, for instance. The point of the comment is that C source uses the POSIX character set (the same as US-ASCII). TEDickey (talk) 10:38, 14 May 2013 (UTC)

Then why does the list not include dollar sign and backtick? 5.150.218.51 (talk) 10:58, 14 May 2013 (UTC)

The ANSI committee purposely limited the character set to a small superset of ISO 646; specifically, it started with the "invariant" ISO 646 subset of ASCII and added the few additional characters C already used (such as brackets). The backtick (`) and US dollar sign ($) are not part of the invariant ISO 646 subset, nor were they in use by existing C (as of 1988), so that's probably why they were not included. Later, when the ANSI proposal became an ISO proposal, trigraphs and then digraphs were added in order to support compilers in environments having only the ISO 646 subset available. Some of this is discussed in section 5.2.1 of the Rationale document, but not these two characters specifically. While it's true that the at-sign (@) is also not part of that subset, it could be argued that it is a character required for network protocols (e.g. RFC 822 [1]), so perhaps that's why it was included. — Loadmaster (talk) 17:09, 14 May 2013 (UTC)

Section 6.4.3 Universal character names in ISO/IEC 9899:1999 (E) mentions '$'. VAX C for one did allow dollar signs in identifiers (Apollo's C compiler did also, I recall - but I can cite VAX C more readily). TEDickey (talk) 20:32, 14 May 2013 (UTC)

Good answer, thanks. Also, GNU cpp apparently allows dollar signs in identifiers as an option. 5.150.218.51 (talk) 20:12, 15 May 2013 (UTC)

I don't think @ is part of the C character set. That it is "legal within quoted strings" does not seem relevant, because there are many other characters that can be included within quoted strings, but they are not listed as part of the C character set. And I don't think it matters that @ is a part of network protocols, because that would still relate to @ being in a quoted string, unless I am missing something. Bottom line: what is an example where @ is used as part of C syntax, is not in a quoted string, and compiles? 71.212.102.14 (talk) 03:05, 12 November 2013 (UTC)

I have to agree. Leave it out. Nasnema Chat 06:06, 12 November 2013 (UTC)

Yes, @ is simply not part of the C character set. The standard is quite explicit about what is. Rwessel (talk) 08:03, 12 November 2013 (UTC)

What does the C book say?

The statement "@ is a valid character" is one thing; "@ is a keyword" is another! — Preceding unsigned comment added by 189.178.35.11 (talk) 04:53, 23 April 2014 (UTC)

It's easy to get confused if you don't realize that the C Standard intentionally allows implementations to support features beyond those required for all Standard-conforming implementations, provided that they don't conflict with the specified set. Of course, if a program uses such extensions then it is limited to only those platforms supporting the extensions. The commercial-at and dollar-sign characters are allowed in implementations that choose to support them, but support for them is not required. In the latest (2011) version of the C Standard, "universal character names" (e.g., \u0040 for commercial-at) must be supported somehow by conforming implementations, but some C implementations haven't yet implemented this new feature. — DAGwyn (talk) 16:27, 7 June 2014 (UTC)
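The distinction drawn in this thread, between @ appearing in a literal and @ being part of the language's syntax, can be illustrated with a small sketch. It assumes a compiler that supports C99 universal character names and an ASCII-compatible execution character set; the function names count_at_signs and ucn_matches_literal are hypothetical.

```c
#include <string.h>

/* '@' inside string and character literals: accepted by essentially
 * every implementation, although '@' is not a member of the basic
 * character set that the standard requires. */
int count_at_signs(const char *s)
{
    int n = 0;
    for (; *s != '\0'; s++)
        if (*s == '@')
            n++;
    return n;
}

/* The same character written as a universal character name.
 * \u0040 is one of the few values below 00A0 that C99 and C11
 * permit in a universal character name. */
int ucn_matches_literal(void)
{
    return strcmp("\u0040", "@") == 0;
}
```

Outside of literals and comments, a bare @ is not a token in standard C, which is why no conforming program can use it as syntax unless the implementation offers an extension.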

unsourced argument regarding encoding

Essentially each sentence in that comment lacks a WP:RS; some part of it is perhaps valid, but it is essentially an opinion. TEDickey (talk) 18:10, 7 June 2014 (UTC)

Fixed. — DAGwyn (talk) 19:12, 7 June 2014 (UTC)

Added References

I have added a few citations to the Related Languages section. Please verify them so that we can get the "no references" tag removed. Regards !! SlimShadyLFC (talk) 05:29, 9 August 2014 (UTC)

Blogs are unfortunately not reliable sources. Rwessel (talk) 09:14, 9 August 2014 (UTC)

OK Thank you for letting me know! SlimShadyLFC (talk) 17:05, 9 August 2014 (UTC)

The "standard-conforming" hello-world example is NOT standard-conforming!

It does not return zero. — Preceding unsigned comment added by Dannyniu (talk • contribs)

https://en.wikipedia.org/wiki/Special:Search?search=Hello+world&prefix=Talk%3AC+%28programming+language%29%2F&fulltext=Search&fulltext=Search - Richfife (talk) 03:09, 14 November 2014 (UTC)

This is an idea of my own. My question is: whatever we type, we should get the output of that problem; is that possible? — Preceding unsigned comment added by 117.216.235.230 (talk) 14:59, 2 December 2014 (UTC)

Sorry, but I'm having trouble understanding your question. Any chance you could elaborate a bit, please? — Dsimic (talk | contribs) 10:39, 5 December 2014 (UTC)

There was a discussion about this before. The reasoning was that the program is perfectly conforming, except its return value is undefined. That's probably true from a pure conformance standpoint, but it's still a bug. QVVERTYVS (hm?) 17:09, 11 January 2015 (UTC)

In the absence of an explicit return, the return value is not undefined. It is 0. "int main(int argc, char *argv[]) {}" sets the exit code to 0. Always. - Richfife (talk) 23:56, 11 January 2015 (UTC)

In C89 the value is undefined (although the actual exit from the program does not otherwise invoke undefined behavior), in later versions, it's zero. As discussed earlier, there are a number of improvements one might make to that program, but there's something to be said for maintaining the minimum distance from the classic version. And there is *no* justification for altering the classic version. Rwessel (talk) 03:56, 12 January 2015 (UTC)

A bug? As described in earlier discussions, and as Richfife explained it above, it isn't a bug. — Dsimic (talk | contribs) 23:58, 11 January 2015 (UTC)

Whether or not it is a bug depends on which standard the implementation conforms to. I suspect that most current implementations don't yet fully conform to C11, although probably most of them conform to C99. I had fixed this in my edit of 2015-03-29T00:06:07, but somebody reverted it. Omitting the explicit return is a debatable kludge that is required only for the special case of the main function, and therefore is pedagogically a bad idea when trying to explain how things work. It is appropriate for the original "hello, world!" code only because it was actually done that way in the original 'C Programming Language' book. —DAGwyn (talk) 10:37, 29 May 2015 (UTC)

retargetable C compiler

My understanding is that all early compiler implementations, including all early C compiler implementations, were custom-written to generate machine code for only one particular machine (the target machine).

What was the first retargetable compiler for the C language? --DavidCary (talk) 13:51, 8 April 2015 (UTC)

Possibly the Portable C Compiler, although I'm not completely sure. I hope this helps. --I8086 (talk) 16:08, 10 April 2015 (UTC)

Yes, PCC is generally considered the first readily retargetable C compiler, and it was specifically intended to be so. However, some people did adapt Ritchie's PDP-11 C compiler to other target architectures that weren't too different in general design. My current version of the Ritchie compiler generates only PDP-11 code but runs on non-PDP-11 hosts (along with a C port of the assembler). PCC was followed inside AT&T by QCC and RCC which accommodated a wider range of targets. Of course, GCC is the most widely used retargetable C compiler these days. —DAGwyn (talk) 10:48, 29 May 2015 (UTC)

puts should be used instead of printf for the hello world example.

printf is not the appropriate function to use in this scenario. The string "hello, world\n" is not a formatted string, so there is no need to use printf. Instead I suggest the puts function for this situation. The two main advantages of using puts instead of printf are: (1) puts prints a newline after the text, removing the need for the newline at the end of the string; and (2) puts introduces less overhead than printf. I understand that in this instance the end result will be the same, and in fact even old versions of gcc (and recent ones, of course) will optimize the printf into a call to puts when appropriate (see http://www.ciselant.de/projects/gcc_printf/gcc_printf.html). Just because gcc can optimize this example does not mean that all compilers can do so, so it is important that we have a correct example. The fact that gcc turns calls to printf into calls to puts when appropriate also furthers my argument that the examples could be better optimized. Another reason to use puts instead of printf is that beginners may see the example and make the same mistake that this article does. Remember, the goal of an encyclopedia is to teach, and teaching suboptimal programming goes against that goal. Sonic12228 (talk) 23:47, 20 July 2014 (UTC)

This comes up constantly. Everybody has a conflicting opinion as to the "correct" example. Best to just leave it alone. - Richfife (talk) 01:24, 21 July 2014 (UTC)

It makes no difference except that printf is familiar and puts comparatively less so. There is no need to optimize how "hello, world" is printed here, and the historical significance of "which appeared in the first edition of K&R" overrides other considerations. Johnuniq (talk) 01:28, 21 July 2014 (UTC)

Yeah, it's best to leave the example unchanged. However, we might consider adding a note somewhere that using puts() would produce the same results without potentially introducing additional overhead etc. — Dsimic (talk | contribs) 02:28, 21 July 2014 (UTC)

That does seem like a good idea, Dsimic. How about I add something like this: The reason printf was used in favor of the puts function was its historical significance, not that printf is the most suitable function. The puts function prints a newline character after the string has been printed, thus removing the need for the newline character at the end of the string. In addition, using printf introduces overhead, owing to the fact that variable arguments have to be accounted for and the string checked for formatting. Some compilers, such as GCC, can replace calls to printf with calls to puts when this would produce the same end result. Sonic12228 (talk) 03:03, 21 July 2014 (UTC)

That sounds good to me, though I'd suggest it to be just a bit less wordy. In addition, the reference you've already provided for the optimization performed by GCC should also be included. However, let's see first what Richfife, Johnuniq and other editors think about adding such a note, if you agree. — Dsimic (talk | contribs) 03:31, 21 July 2014 (UTC)

I disagree. There are many ways you could write "Hello, world". We don't need a deep analysis of what the alternatives are. It's here because that's the way it was introduced in K&R (as noted in the article). So as a first issue, there is no justification whatsoever for changing the "original" version. And the second version is rightfully a minimally modified version conforming to modern C standards. Nor is the example "suboptimal" in any meaningful way. Wikipedia is not a programming tutorial, and notions of performance (and to what extent they might actually be real for this example - and if we actually cared about a form with the largest probability of being well optimized, a series of putc() macro invocations would probably be best), are far outside the scope of the article. "Hello, world" is meant to be illustrative of a fairly minimal program in a language, and it meets that requirement, and manages to illustrate a notable bit of history at the same time. That being said, if K&R had used puts(), it would have been slightly better, but they didn't. Rwessel (talk) 07:52, 21 July 2014 (UTC)

K&R used printf because it is the correct function for a beginner—their next example uses the same printf with "%d\t%d\n" to print two integers. Telling a beginner to use puts for a string and printf for something else would generate pointless confusion. This article is not a text book with how-to advice and there is no need for a note—if a note were desirable, all code examples would need a similar note with optimization information. Johnuniq (talk) 09:36, 21 July 2014 (UTC)

As others have pointed out, using <printf> is not a mistake; it was a deliberate choice. The book used it because it is more versatile and is needed for subsequent examples. There is no requirement, especially in an introductory text, for most code to be optimized by the programmer. —DAGwyn (talk) 10:55, 29 May 2015 (UTC)
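For reference, the two forms compared in this thread can be written side by side. This is only a sketch of the trade-off under discussion, not a proposal to change the article's example; the wrapper names hello_with_printf and hello_with_puts are hypothetical.

```c
#include <stdio.h>

/* The classic K&R form: the newline is part of the format string.
 * printf returns the number of characters written (13 here). */
int hello_with_printf(void)
{
    return printf("hello, world\n");
}

/* The alternative raised in this thread: puts appends the newline
 * itself and returns a non-negative value on success. */
int hello_with_puts(void)
{
    return puts("hello, world");
}
```

Both calls write the same 13 characters to stdout. The difference only matters once the string contains a % character or values need formatting, which is the case in the examples that follow "hello, world" in K&R.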

Remove/De-merge C Intermediate Language redirect

C Intermediate Language was (inappropriately, IMO) merged into this article in January 2014. I've started a discussion to either delete the redirect or revert the old CIL article back to its prior (non-redirect) form. Wikipedia:Redirects_for_discussion/Log/2015_May_29#C_Intermediate_Language Rwessel (talk) 04:59, 31 May 2015 (UTC)

IBM 310

The paper at http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.138.35&rep=rep1&type=pdf bafflingly contains *two* references to IBM 310 systems; the second, "Some issues, like addressability on the IBM 360 and 310 series", can't be dismissed as a typo. Unfortunately, no such system exists (List_of_IBM_products is hardly a RS, but its list of computers is pretty complete). The S/360 (and its successor, the S/370), quite popular at the relevant time, *was* the target of an early port of C. This is even implied by some of the other mentions in the same article. Almost all of the pre-S/360 IBM systems still in use at that time had 700/7000/14xx names, although there were some exceptions. OTOH, all of those naturally used characters smaller than 8 bits, so any C port would have been a pretty clumsy fit. IIRC, K&R1 (and 2) mention S/360 as an early port, but I'm not around my copies of those at the moment. So anyway, I'm still certain that this is an obvious typo, although the reference above clearly confuses the issue. Rwessel (talk) 19:36, 21 June 2015 (UTC)

Hello! As I've noted in my edit summary, my knowledge of IBM's early systems is really weak. However, we need to follow the references, which turns this into a somewhat rough situation, as even a Google search on "IBM 310" returns next to nothing usable. Furthermore, I've checked the second edition of The C Programming Language, and it mentions System/370; here's a quotation:

C was originally designed for and implemented on the UNIX operating system on the DEC PDP-11, by Dennis Ritchie. The operating system, the C compiler, and essentially all UNIX applications programs (including all of the software used to prepare this book) are written in C. Production compilers also exist for several other machines, including the IBM System/370, the Honeywell 6000, and the Interdata 8/32. C is not tied to any particular hardware or system, however, and it is easy to write programs that will run without change on any machine that supports C.

Hm, this seems to introduce even a bit more confusion as S/370 is the S/360's successor? — Dsimic (talk | contribs) 19:57, 21 June 2015 (UTC)

Actually, S/370 may be just as likely for the first port as S/360, perhaps even more so. You've definitely got the quote I was remembering, and I'll take your word for it that the quote is "S/370" in K&R. The sentence "Some issues, like addressability on the IBM 360 and 310 series" makes perfect sense if it's a typo for "...360 and 370 series"; the two machines have basically identical issues for addressing in user-mode programs. Perhaps this is a transcription error from sloppy handwriting, or something like that, and that would explain both mentions in the ref. This paper was also originally published in 1978, so it's quite possible this started as an OCR scan of "The Bell System Technical Journal". C was created in 1972, a couple of years after S/370 was introduced (although virtual memory/machine support was not introduced until 1972), and K&R1 was published in 1978, at which point many thousands of S/370s had shipped, but many thousands of S/360s were still in use. So the port could have been to either, but if K&R say S/370, I'd say that's definitive. The paper above mentions an IBM/370 (VM based) port of Unix in 1976. The K&R quote also aligns with the above paper in mentioning the Honeywell and Interdata machines. Rwessel (talk) 20:22, 21 June 2015 (UTC)

Let me just add that the above quote is from the second edition's included preface to the first edition. With all you've described, and the fact that the IBM 310 really seems to be non-existent, I'd say that we should correct "IBM 310" to "IBM S/370" while using The C Programming Language as the reference. — Dsimic (talk | contribs) 20:45, 21 June 2015 (UTC)

DMR's Bell page https://www.bell-labs.com/usr/dmr/www/portpapers.html also links to this paper, and mentions that it is, in fact, an OCR scan of the original:

In 1976-1977 the Unix system was rendered portable, thus starting a continuing industry. The account by Steve Johnson and me, `Portability of C Programs and the UNIX System,' was published in the Bell System Technical Journal; it is now on-line as PDF, Postscript, or HTML formats.

This is rendered via OCR from BSTJ v57 #6 part 2 (Jul-Aug. 1978; pp. 2021-2048). Johnson and I seem to have misplaced the original source.

So I certainly want to call the 310/370 thing an OCR glitch. There's also a link to a 1975 version of the "C Reference Manual" https://www.bell-labs.com/usr/dmr/www/cman.pdf (the "May 1975" is on the homepage https://www.bell-labs.com/usr/dmr/www/ ) which includes the 370 reference. To complicate things, just below that is a reference to a contemporaneous "Programming in C - A Tutorial" by Kernighan ( https://www.bell-labs.com/usr/dmr/www/ctut.pdf ), which mentions a (partial) compiler running on OS/360 (OS/360 did run on S/370s, although the "360" was removed from the name before too long) and the text mentions 360s a few times (so it may have been using OS/360, but running on a S/370). I don't mind going to the S/370 reference, but maybe going with S/360 and the Kernighan paper is better. Rwessel (talk) 22:06, 21 June 2015 (UTC)

Agreed, it's almost certain that "370" has been misinterpreted as "310" by the OCR software, while unfortunately nobody reviewed the digital version in detail before it was published. IMHO, we should go with S/370, just because we have two references mentioning S/370 and only one with S/360. Might not be fair or absolutely correct, but at least it would resemble the principle of triple modular redundancy. :) — Dsimic (talk | contribs) 16:17, 22 June 2015 (UTC)

I've gone ahead and made the change, and tossed in an inline comment about the 310/370 thing to hopefully avoid this issue in the future. Rwessel (talk) 20:38, 22 June 2015 (UTC)

Looks good to me, thank you for bringing it up in the first place! — Dsimic (talk | contribs) 07:20, 23 June 2015 (UTC)

Lead paragraph stuffed with "mentions"

This final paragraph of the lead section seems to be stuffed with "mentions" of other things:

Many later languages have borrowed directly or indirectly from C, including C++, D, Go, Rust, Java, JavaScript, Limbo, LPC, C#, Objective-C, Perl, PHP, Python, Verilog (hardware description language),[4] and Unix's C shell. These languages have drawn many of their control structures and other basic features from C, usually with overall syntactical similarity to C that sometimes includes identical simple control structures.[9][10][11] C is also used as an intermediate language for other languages,[12] and for building standard libraries and runtime systems for higher-level languages, such as CPython.[13]

Is this really a summary of the article's content? I don't think it is. The mention of fairly obscure languages like Limbo or Verilog in particular seems like it has been stuffed in there. Realistically, this could be drastically reduced and most of the "mentions" moved into the body of the article. NotYourFathersOldsmobile (talk) 01:38, 6 October 2015 (UTC)

Agreed. I've removed that poorly-sourced and non-lede compliant paragraph. Material can be added back to the body as appropriate AND reliable sources can be provided to support it. The Dissident Aggressor 01:53, 6 October 2015 (UTC)

There's an article about this phenomenon... somewhere. Can't find it right now. Basically, newcomers see one thing in an article and add another. And then another newcomer adds another. And another, and so on. It's tolerated because it's a way for people to ease into making bigger edits, but eventually you get an article where every sixth character is a comma. Anyway, you can prune it, but it will grow back in a month or so. - Richfife (talk) 01:57, 6 October 2015 (UTC)

Well, the lead section should be expanded so it better sums up the article. Plus, we should resolve the referencing issues elsewhere in the article, such as in the C (programming language) § Relations to other languages section, as the lead section should be only a summary of the article that's this long. — Dsimic (talk | contribs) 02:45, 6 October 2015 (UTC)