Talk:C (programming language)/Archive 12

This is an archive of past discussions. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current talk page.

I hate to say it...

But I'm starting to wish you could semi-protect an article only from IP addresses in a particular country. Check the history if you don't know what I mean. - Richfife (talk) 20:12, 29 October 2010 (UTC)

Things don't work like that on the internet. 109.240.89.13 (talk) 20:52, 31 October 2010 (UTC)

Actually, they often do. Regional IP blocking is quite common. I was not making a serious suggestion, though. Just expressing frustration. - Richfife (talk) 23:33, 31 October 2010 (UTC)

Nah, we just need to stop allowing anonymous IPs to edit the encyclopedia... Sebastian Garth (talk) 15:23, 4 November 2010 (UTC)

Yes. And we should only allow experts to post. And they should be peer reviewed. Why hasn't that been tried? Oh, it has. - Richfife (talk) 16:53, 4 November 2010 (UTC)

I just meant that most people who should be banned from any given website probably also know their way around such bans. 109.240.208.115 (talk) 22:56, 2 November 2010 (UTC)

So you want to ban an entire country because several editors are vandalizing pages? I doubt this is a good idea. 1exec1 (talk) 20:14, 3 November 2010 (UTC)

As I said, "I was not making a serious suggestion, though. Just expressing frustration." - Richfife (talk) 21:56, 3 November 2010 (UTC)

POVvy style

Some small fixups needed: C is not one of the most popular programming languages of all time, it is one of the most used languages of all time (I use it extensively but many aspects of it are really annoying); it is used for producing "fairly portable" programs, not just "portable"; there are also other signals that indicate fan club edits:

A standards-compliant and portably written C program can be compiled for a very wide variety of computer platforms and operating systems with little or no change to its source code.

is not well balanced: a port usually needs some recoding because of the unstandardized signed/unsigned and wchar_t policies of different compilers, and needs no change only for rare, trivial code. Rursus dixit. (^mbork³!) 10:29, 18 December 2010 (UTC)

Thing is, the sources cited purport to measure "popularity", not amount of use (whatever that means in this context). To phrase otherwise would be to misconstrue the sources. Portability issue addressed. --Cybercobra (talk) 00:34, 19 December 2010 (UTC)

The distinction between "popular" and "used" is lost on me. If some aspects are "annoying", that seems a different issue. 97.126.54.135 (talk) —Preceding undated comment added 07:56, 29 December 2010 (UTC).

I think it is fair to say that C is well suited for writing portable programs. Yes, some changes are usually needed to the source code. Obviously UNIX does not have drive letters, Windows does. There is the matter of '/' vs '\'. And there are surely other differences that need to be taken into account. Some other languages are also rather portable, but in a different kind of way. Java can run in any JVM (and use JIT), Javascript can run in any browser, perl can run in any perl interpreter. There are surely other examples of portable languages. However, a difference seems to be that the low-level nature of the C language and close relationship to assembly language makes it easier to write a C compiler for any cpu, and then C is portable as a machine-language executable to that architecture. Also, I would say the C preprocessor makes it much easier to write portable programs. 97.126.54.135 (talk) 08:31, 29 December 2010 (UTC)
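The point about the preprocessor and path separators can be illustrated with a minimal sketch. It assumes the `_WIN32` macro (predefined by Windows compilers) is enough to tell the two families apart; `PATH_SEP` and `join_path` are hypothetical names chosen for this example, not part of any standard library.

```c
#include <assert.h>
#include <stdio.h>

/* Pick the path separator at compile time. _WIN32 is predefined by
   Windows compilers; every other platform is assumed to use '/'. */
#if defined(_WIN32)
#define PATH_SEP '\\'
#else
#define PATH_SEP '/'
#endif

/* Hypothetical helper: writes "dir<sep>name" into buf.
   Returns 0 on success, -1 if the result did not fit in size bytes. */
int join_path(char *buf, size_t size, const char *dir, const char *name)
{
    assert(buf != NULL && dir != NULL && name != NULL);
    int n = snprintf(buf, size, "%s%c%s", dir, PATH_SEP, name);
    return (n >= 0 && (size_t)n < size) ? 0 : -1;
}
```

The rest of the program calls `join_path` and never mentions a separator directly, so the platform difference stays in one place.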

If you have porting problems with signed/unsigned or wchar_t, you're not using them correctly. I can see that that would annoy you, but it's not inherent in C. — DAGwyn (talk) 12:49, 23 January 2011 (UTC)
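One concrete case of the signed/unsigned issue: whether plain `char` is signed is implementation-defined, and passing a negative value other than `EOF` to the `<ctype.h>` functions is undefined behavior. The sketch below shows the usual portable idiom of converting to `unsigned char` first; `count_alpha` is a hypothetical helper written for this illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: counts alphabetic characters in s.
   The cast to unsigned char keeps the argument to isalpha() in range
   even on platforms where plain char is signed. */
size_t count_alpha(const char *s)
{
    assert(s != NULL);
    size_t n = 0;
    for (; *s != '\0'; ++s) {
        if (isalpha((unsigned char)*s)) {
            ++n;
        }
    }
    return n;
}
```

Written this way, the function behaves the same whether the compiler treats `char` as signed or unsigned.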

I feel like using "popular" would be more biased because popularity isn't exactly measured by a quantitative factor, whereas "used" is provable in statistical data 71.75.97.45 (talk) 01:32, 4 February 2011 (UTC)

Related ... kind of ... not ...

Section Related languages shoots itself in the foot regarding Python:

has a different sort of C heritage. While the syntax and semantics of Python are radically different from C, the most widely used Python implementation, CPython, is an open source C program. This allows C users to extend Python with C, or embed Python into C programs ...

which is nonsense. The text claims that it has a heritage, then no heritage at all (confusing "implemented in" with "inheriting from"), then alleging a diversity of connections that has nothing to do with "heritage" as proof of heritage.

The fact is that Python has no heritage from C at all. ~~I propose removing it.~~ Rursus dixit. (^mbork³!) 15:10, 14 June 2011 (UTC)

No, I propose rewriting it, making a subsection of C-implemented languages. They don't inherit, they peruse. Rursus dixit. (^mbork³!) 15:23, 14 June 2011 (UTC)

As Python has been removed from the related languages section, should it also be pulled from the "influenced" section of the infobox (and the back link from the Python infobox)? Rwessel (talk) 01:15, 15 June 2011 (UTC)

"Standard conforming"? What standard

To what lengths are we going to implement "standard"s? Somewhere in comments about various entries in the Underhanded C Contest (website http://underhanded.xcott.com/) there is the point that the "printf" of "hello, world" should be:

   printf("%s", "hello, world\n");
or
   printf("%s\n", "hello, world");

Old_Wombat (talk) 11:37, 25 April 2011 (UTC)

Gonna need a more specific citation than an entire website. --Cybercobra (talk) 14:05, 25 April 2011 (UTC)

That claim is made occasionally, but it's not correct. It *is* a very bad idea to pass an unknown format string to printf(). Thus ps="hello, world"; ... printf(ps); is risky if you don't know for sure that ps points to a valid format string. And if the string to be printed contains any conversion specifiers ("%") that printf will try to interpret, it requires special care to pass as the first parameter. But obviously neither of those applies in this case. Rwessel (talk) 15:57, 25 April 2011 (UTC)

Let's just use "Hello World" as it first appeared and go about our day. I'm sure we can all write hundreds of valid variants ("Let's find a Taylor Series that generates those ASCII codes!", "I can do it in 27 characters!", etc.) and arguing over which one is the best is no more likely to get anywhere than arguing over bracket placement or byte-ordering. - Richfife (talk) 17:35, 25 April 2011 (UTC)

It's funny just how often "Hello World" debates crop up, isn't it? Sort of like debating about which direction to put the shirt on the hanger—should we *really* spend so much time arguing about such trivial things? Jeez. Sebastian Garth (talk) 03:07, 26 April 2011 (UTC)

I'm a toilet-paper-rolls-to-the-front kind of person myself. The string "Hello, world\n" is not an unknown format string; the programmer can see every character in it in the call to printf(). Let's just leave the original Hello, world example the way it is in the original white book, shall we? — Loadmaster (talk) 02:42, 13 August 2011 (UTC)
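The distinction drawn in this thread, between a format string the programmer can see and one that arrives as data, can be sketched as follows. The example uses `snprintf` instead of `printf` so the result lands in a buffer; `format_as_data` and `format_literal` are hypothetical names used only here.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: copies text into buf as plain data.
   Because the format is the fixed string "%s", any '%' inside text
   is never interpreted as a conversion specifier. */
int format_as_data(char *buf, size_t size, const char *text)
{
    assert(buf != NULL && text != NULL);
    return snprintf(buf, size, "%s", text);
}

/* A literal format string with no '%' in it is fully visible at the
   call site, so using it directly as the format is safe. */
int format_literal(char *buf, size_t size)
{
    assert(buf != NULL);
    return snprintf(buf, size, "hello, world\n");
}
```

The risky pattern is passing a string of unknown origin as the format argument itself. Neither function above does that, which is why the original "hello, world" call is fine as written.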

Strong typing vs. weak typing

In the Wikipedia article on strong typing, C is classified as a strongly typed language, and the examples there seem to illustrate that; so why in the infobox to this article is C tagged as a weakly typed language? —Avstin (talk) 05:12, 7 October 2011 (UTC)

This has been discussed several times; you may want to search the talk page archives. Strong vs. weak typing is not that consistently defined, variously meaning the inability to mix variables of different types or relating to the degree of type safety. Nor is it a binary attribute. C is more type safe than some languages, and less so than others. And frankly Strong typing is a pretty poor article. The example there really shows strong vs. weak typing in only one of the common uses - it describes it in terms of allowing combinations of different types of values to be used in an expression, but while the examples shown are weakly typed in that sense, they are not unsafe in the sense that you can really muck things up if you convert pointers poorly, or pass incorrect types to printf(). The description in Type safety is important too. Depending on your preferences, it would be fair to describe C as moderately strongly typed, or moderately weakly typed. There's definitely some cleanup needed, but I think the main problem is at Strong typing. Rwessel (talk) 06:16, 7 October 2011 (UTC)
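As a small illustration of the "mixing types in an expression" sense of weak typing, the sketch below relies only on implicit conversions that C defines. The helper names `to_unsigned` and `to_int` are hypothetical. It deliberately avoids the unsafe cases mentioned above (bad pointer conversions, wrong argument types to printf()), since those have undefined behavior and cannot be demonstrated reliably.

```c
#include <assert.h>
#include <limits.h>

/* int -> unsigned int happens implicitly on return; the result is
   well defined (the value is reduced modulo UINT_MAX + 1). */
unsigned int to_unsigned(int value)
{
    return value;
}

/* double -> int also happens implicitly; the fractional part is
   discarded (truncation toward zero). */
int to_int(double value)
{
    /* An out-of-range conversion would be undefined, so guard it. */
    assert(value > INT_MIN && value < INT_MAX);
    return value;
}
```

Both functions compile without a cast, which is the behavior one camp calls weak typing. A compiler may still warn about them if conversion warnings are enabled.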

An even more pedantic point

There is NO SUCH THING as a "curly brace" or for that matter "square bracket". There are:

parentheses: ( )

brackets: [ ]

braces: { }

Old_Wombat (talk) 11:43, 25 April 2011 (UTC)

Basis for your claim? --Cybercobra (talk) 14:07, 25 April 2011 (UTC)

Here's "a" basis for such a claim: The C Programming Language (Brian W. Kernighan, Dennis M. Ritchie, Prentice-Hall, 1978). In the description of the first "hello, world" program, (p4) "The braces { } enclose the statements that make up the function; ..." First mention of "bracket" in connection with use in a C program. Not "first ever", but likely the first in book form. Also, in the REAL standard prior to 1998, C A Reference Manual (Samuel P. Harbison, Guy L. Steele, Prentice-Hall, 1984) Section 7.3.4(p.146) begins, "A subscripting expression consists of a primary expression, a left bracket, an arbitrary expression, and a right bracket."

Also, in the standards: Nowhere in the C99 (draft as of 1998) or C++ 1998 ISO 14882 standards does the word "curly" occur in any context, much less as "curly brace". Braces is braces. The term "square bracket" occurs once only in the C99 draft, in the context previously mentioned (below) at 6.5.2.1. I don't have a copy of C98 to look at, but it's curious that the corresponding paragraph in C++1998 reads the same...but that makes a bit of sense because < > are used as "angle brackets" in C++. Nowhere else in either document does "square bracket" occur.

As for the Unicode Database, "left/right bracket" and "left/right brace" are also listed names for [ ] and { } respectively. I don't know about modern standards, but the mathematical use in secondary school in the 1960s and college in the 1970s was "parenthesis", "bracket" and "brace", in order of preference as a grouping symbol. Math usage in those days influenced more computer language than did typography, which is the basis for the Unicode Character Database. We're discussing C, which has "functions", not "procedures", after all. — Husoski (talk) 00:44, 12 July 2011 (UTC)

You're arguing for the common (programmer's) informal names of the characters, and against the international standard names for them. In a source like Wikipedia, it's more appropriate to use the standard nomenclature. The informal names can be mentioned, but they obviously don't carry the same authoritative weight that the official terminology does. — Loadmaster (talk) 19:28, 13 July 2011 (UTC)

No, I am not. In fact, I supported the original assertion with references for the names of these symbols as used in C. The Wikipedia entry on poker describes certain card holdings as "straights" or "straight flushes", even though the same group of cards in other card games is called a "run" or "run in suit".

I am arguing for using C sources for C terminology. I do not consider the Unicode Character Database as a definition of symbol names as used in C. The UCD has no authority regarding the naming of symbols as used in C. For example, the primary UCD name for the \ character is "REVERSE SOLIDUS". (Yes, these are all caps in the UCD.) However, in C it is a backslash. If the UCD were to disagree, it would be the UCD that were in error--just as it would be if a Wikipedia entry said the same thing. The correct approach, IMHO, is to mention other names for the symbols as used in other contexts--if that's required at all--rather than to obscure the C usage in an article about C. "Square bracket" has some justification, just barely, in the C literature. "Curly brace", on the other hand, does not appear to have serious support. UCD notwithstanding. — Husoski (talk) 22:49, 13 July 2011 (UTC)

Bracket points out that all three terms are ambiguous depending on the speaker (for example, British common usage of "bracket" is equivalent to US parenthesis), and lists "square bracket" and "curly brace" as known variants. In fact "bracket", in terms of punctuation, properly refers to the whole class of symbols (( ), [ ], { }, < >...).

Note that the C standard itself uses “bracket” in the general sense in places (6.7.8 “30 Note that the fully bracketed and minimally bracketed forms of initialization are, in general, less likely to cause confusion.” - referring to the use of curly braces in initializations), and qualifies “bracket” in other places (6.5 “71 … subscripting brackets []”, “6.5.2.1 Array subscripting … 2 A postfix expression followed by an expression in square brackets []”). Rwessel (talk) 16:12, 25 April 2011 (UTC)

That's because the term "braced" would convey the wrong connotations. Just because a noun can be verbed, doesn't mean it should be. — DAGwyn (talk) 20:36, 14 October 2011 (UTC)

"Curly" and "square" are valid descriptors modifying the official nouns, and can be used for redundancy, especially given the use of "angle brackets" for <...>. Perhaps parens should be called "rounded parentheses" for the benefit of the Brits? — DAGwyn (talk) 20:40, 14 October 2011 (UTC)

The ISO standard name for the "{" character is left curly bracket; that is also the standard Unicode name for it, as well as opening curly bracket, and left brace. Likewise, the standard name for "[" is left square bracket, as well as opening square bracket. See the Unicode code chart. The Hacker's Dictionary entry for ASCII is also illuminating on this point. — Loadmaster (talk) 16:33, 28 April 2011 (UTC)

I happen to believe that parenthesis, bracket, brace is the One True Naming Convention for these glyphs, but since the British don't even know what a parenthesis is, there's no way that Wikipedia will ever be the place for it. The terms that best enable the full spectrum of international and non-technical users here to immediately grasp what's being talked about are what should be used here. —chaos5023 (talk) 20:50, 13 July 2011 (UTC)

Terminating strings with a NUL

The Criticism section has a new "terminating strings with a NUL" criticism with a bare link to a blog (which I only skimmed very quickly). My feeling is that the criticism is undue given C's history, and poorly expressed (the extra compiler cost is related to some cases where the compiler may be able to optimize the code, so the current wording is a bit off). The key point about the original terminate-with-NUL decision was that C has no built-in limit to the length of a string, something that I did not see discussed in the blog in my quick skim—the blog seems to suggest that adding a single extra byte to make a byte count field of two bytes would have been a better choice (i.e. have a two-byte length instead of a one byte terminator). That is just misguided, and four bytes is far too many for embedded controllers or indeed many applications at the time C was developed. A better criticism would be that C's strings cannot contain arbitrary data. I don't feel strongly enough to remove the new text myself, but wonder if anyone has comments, or a better source. Johnuniq (talk) 00:06, 13 August 2011 (UTC)

I found the source to be extremely biased, though valid on a few points, namely security issues. I trimmed the comments somewhat, but I don't think the portions about the hardware, compiler, or performance costs are actual issues; they read more like sensationalism. The problem with the compiler costs argument is twofold:

The article never identifies the vast number of optimizations that supposedly need to be done due to NUL-terminators.
The article doesn't quantify development cost or really say anything other than that NUL-terminators increase compiler development costs.

As for performance costs:

The article didn't quantify this other than to say bcopy/memcpy can be implemented more efficiently by copying word size chunks as opposed to byte size. This may well be true, but it is stated without any proof.

As for hardware costs:

Doesn't quantify hardware costs.
The article basically says that older systems using the addr+len model had ISA support for it. Then gives an example of another system having to add string NUL-terminator HW support. How is the latter more costly than the former?

More or less, if you examine what the article says, none of it really holds up. The problem here is that it appears to be mostly an opinion piece. I was troubled with removing it completely because the article is ACM though and appears to be a valid source that would probably be controversial to remove without discussion. Personally, I feel we should clobber it though if other editors agree. Or rewrite it into criticisms of string security w.r.t. buffer overruns. snaphat ► 01:17, 13 August 2011 (UTC)

This issue is generally known as "counted strings versus terminated strings" and has been the subject of numerous debates, analyses, and experiments. Counted strings have length limited by the counter's representation, but in most applications each string tends to not exceed a few rows of characters on a page or screen. Counted strings make some operations (e.g. concatenation and determining length) faster, but others slower. At the time C was invented, zero-termination was widespread practice in DEC assembly-language programs. If you want to invent your own counted-string package, I highly recommend representing the string objects by pointers to their first characters (just after the count) and null-terminating them also, so the pointers can be used directly with the standard C library functions. — DAGwyn (talk) 20:59, 14 October 2011 (UTC)

Ugh. I see this just got pulled, but NUL-terminated strings *are* one of the major criticisms of C (as mentioned, they've "been the subject of numerous debates, analyses, and experiments"). Whether one agrees with that position or not, it *is* a major criticism (and this is not the place to debate whether it's a valid criticism or not). I think this should be put back, but the criticism section should be moved out of the syntax section. There's even some justification for leaving the NUL-terminated string issue under syntax, as the compiler *does* explicitly support those. Rwessel (talk) 00:15, 15 October 2011 (UTC)
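The hybrid layout recommended earlier in this thread (a count stored just before the characters, plus a terminating null so the pointer still works with the standard library) might look like the minimal sketch below. The names `cstr_new`, `cstr_len`, and `cstr_free`, and the choice of a `size_t` count, are assumptions made for this illustration and are not taken from any existing package.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Assumed layout: [size_t count][characters...]['\0'].
   The returned pointer addresses the first character, just after the count. */
char *cstr_new(const char *src, size_t len)
{
    assert(src != NULL);
    char *block = malloc(sizeof(size_t) + len + 1);
    if (block == NULL) {
        return NULL;
    }
    memcpy(block, &len, sizeof len);      /* store the count */
    char *s = block + sizeof(size_t);
    memcpy(s, src, len);                  /* store the characters */
    s[len] = '\0';                        /* keep it usable as a C string */
    return s;
}

/* Constant-time length: read the count stored before the characters. */
size_t cstr_len(const char *s)
{
    size_t len;
    memcpy(&len, s - sizeof(size_t), sizeof len);
    return len;
}

/* Release the whole block, starting at the count. */
void cstr_free(char *s)
{
    if (s != NULL) {
        free(s - sizeof(size_t));
    }
}
```

A pointer returned by `cstr_new` can be passed straight to functions such as `strlen` or `strcmp`, while `cstr_len` avoids scanning for the terminator. Pointers that did not come from `cstr_new` must not be passed to `cstr_len` or `cstr_free`.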

language with extensions to C

"C has greatly influenced many other popular programming languages, most notably C++, which began as an extension to C." Is the C++ the only language with extension to C? or there are some other small languages (not notable) that I don't know? It looks to me that Ch has many extensions to C. The significant feature is Numerical computing (vs matlab/mathemetica), shell programming (vs C shell/Bash), embedded scripting and C++ class. However, ch is not an open source software and the number of users should not so significant comparing with C++. I would like to see wikipedia has a list of such languages. — Preceding unsigned comment added by 64.134.222.4 (talk) 15:28, 12 October 2011 (UTC)

If I understand your question, you're looking for languages that are a superset of C, or started that way. Note that C++ may have started that way, and retains a high degree of compatibility with C, but is no longer a proper superset. A few languages have definitely taken that route, Objective-C being perhaps the most prominent besides C++. There have been other serious revampings of C (D, for example) that still retain much of C's flavor. Others have retained little but a bit of the syntactical appearance (C#, Java). Also, many implementations of C actually implement a superset language. GCC, for example, has dozens of extensions to standard C, and is, in effect, a superset of C. So the question is somewhat vague.

There have also been subsets: Embedded C and the MISRA standards, as well as many subset implementations (many compilers for embedded systems, things like Tiny C).

Perhaps [List of C-based programming languages] would be a start. Rwessel (talk) 16:56, 12 October 2011 (UTC)

Java, C#, Objective-C, and D are not C compatible. They all upgraded C (rather than extending it) for the purpose of doing something better while keeping the C-like language syntax or style. If you write code in those languages, your existing C skill set cannot be applied. I mean that all the C functions you know cannot be used.

GCC, the Intel C compiler, and Microsoft Visual Studio have C extensions, but those extensions are minor. They mainly serve the purposes of the C compiler.

I read [List of C-based programming languages] and think it would be more appropriate to call it [List of C-like programming languages]. I thought it was a [list of C-written programming languages].

If we are talking about languages that extend C with significant features, we have only C++ and Ch (C compatible, though not 100%).

Also, it is easy to think of C as a compiled language, in contrast to scripting languages like Perl. Maybe it is a good idea to introduce Cint and Ch here as C scripting options? — Preceding unsigned comment added by 64.134.222.4 (talk) 15:30, 13 October 2011 (UTC)

Objective-C is a pretty strict superset of C. I'm still not quite getting what you want to do. Rwessel (talk) 18:24, 13 October 2011 (UTC)

Thanks. I think the page can be made clearer to the reader. 1) Move "C has greatly influenced many other popular programming languages, most notably C++, which began as an extension to C." to the section "Related languages", and add something like: "C has three superset languages: C++, Objective-C and Ch. They all serve different purposes". (Are there any other supersets of C?) 2) Under the section "Related languages", "C has directly or indirectly influenced many later languages such as Java, Perl, Python, PHP, JavaScript, LPC, C# and Unix's C Shell." Add C++, Objective-C, the D programming language, Limbo, and Go to the above list. 3) Under the section "Related languages", move the detailed introduction of the programming languages C#, D, Limbo, Go and Perl, starting with "C# was designed in order to ...", to the page [List of C-based programming languages]. Too much space is wasted introducing derived languages that are neither compatible with nor supersets of C. 4) "When object-oriented languages became popular, C++ and Objective-C were two different extensions of C that provided object-oriented capabilities." It is well known that scripting languages such as Perl, Python, and PHP are popular nowadays, and people tend to do more things in scripting. Can we add: "When scripting languages became popular, Cint and Ch offered C scripting capabilities"? However, neither Cint nor Ch is as popular as Perl. — Preceding unsigned comment added by 64.134.222.4 (talk) 05:07, 14 October 2011 (UTC)

I think the lede is OK as it is. It's supposed to provide an introduction to the topic, and should avoid going into too much detail. I'd agree with removing any specific influenced language reference from the lede, except for the huge importance of C++, so that's reasonably there.

FWIW, Unified Parallel C, Split-C, Cilk and C* are other examples of (not necessarily perfect) extensions to C. And that's hardly an exclusive list - those all focus on parallel programming.

I agree that the Related Languages section is a bit of a mess. It's disorganized, and while it's appropriate to talk a bit about the important related languages, I think a few of those (Go, Ch, Limbo) deserve at most minimal mentions. Perl doesn't need to be mentioned twice (its inclusion in the "weakly" related list is enough). I think just deleting some of the detail is in order; there is no real need to move it to the list page, since those sorts of things should really be covered on the language's page. Again, exceptions for the really important ones. We also can't mention all the languages that C influenced, since there are so many.

As a first pass, what do you think about User:Rwessel/c-related? Feel free to edit that, BTW. Rwessel (talk) 07:05, 14 October 2011 (UTC)

I made a slight modification. It looks good to me. — Preceding unsigned comment added by 64.134.222.4 (talk) 07:32, 14 October 2011 (UTC)

I'm not sure that pointing C++ and Objective-C out as supersets (and neither is a pure superset, just to confuse things) really adds much over the prior paragraphs. But why mention Ch? I've never actually encountered it in the wild; is there enough usage to consider it worthy of mention? Why not Unified Parallel C too, for example? Rwessel (talk) 07:58, 14 October 2011 (UTC)

I read about Unified Parallel C. Yes, I think it is good to mention it. That is the kind of language I want to know about when I learn C. The development of Split-C stopped in 1996, so it is not necessary to list it. However, the number of users of both Unified Parallel C and Ch is probably not significant compared with C++ or Objective-C. If popularity is a concern for being mentioned, they can be removed. This is a good place to learn about those supersets of C, though. No language will be a 100% superset of C. Maybe it is good to point that out. 64.134.222.4 (talk) 14:35, 14 October 2011 (UTC)

I went ahead and made that change (with a minor revision). Rwessel (talk) 18:44, 14 October 2011 (UTC)

It used to be the case that many programming languages were modeled after ALGOL, and now many are modeled after C. I think less should be said about related languages, with the sole exception of C++ which has close ties to C (several people attend both C and C++ ISO Working Group meetings), perhaps just an unannotated list (with Wikilinks). — DAGwyn (talk) 21:08, 14 October 2011 (UTC)

I don't necessarily disagree, but the new version does remove a bunch of the old fluff, and most of the related languages *are* now just in a list. I do think that Objective C is important enough to merit a mention. Rwessel (talk) 00:25, 15 October 2011 (UTC)

Return 0 vs Return 1 Status

Should we add that 'return 1' indicates a failure, for example bad user input? It would give some contrast to the 'return 0' line of code and explain the difference between returning 0 and 1 (or any nonzero integer) more clearly. --SaurabhKB (talk) 07:05, 23 August 2011 (UTC)

It's not a C language convention. 220.225.67.36 (talk) 07:26, 23 August 2011 (UTC)

You could use EXIT_FAILURE, but '1' indicating failure is not the case on all operating systems. I think (not confirmed) that exiting with 0 is always the same as exiting with EXIT_SUCCESS according to the C standard. IMHO, it doesn't belong on the page because it's an OS thing, not a language thing. strcat (talk) 19:17, 23 August 2011 (UTC)

I agree with strcat. The meaning of return value 0 vs. non-zero isn't part of the language. In some cases, it's the other way round. A non-zero value could indicate a user choice from a menu and zero could indicate failure, for instance. - Richfife (talk) 19:42, 23 August 2011 (UTC)

C99 states that the definition of EXIT_SUCCESS is implementation-defined. It's nonzero on OpenVMS. TEDickey (talk) 23:38, 23 August 2011 (UTC)

C99 (7.20.4.3) also specifies that for an exit() with either EXIT_SUCCESS or zero, "an implementation-defined form of the status successful termination is returned." This clearly does not preclude there being two different "success" returns to the environment (one for zero and one for EXIT_SUCCESS), but they must both imply success. C89 contains basically the same language. Rwessel (talk) 03:57, 24 August 2011 (UTC)

Yes, there can be multiple "success" return values. IIRC, the low bit of the 32-bit value returned by VAX VMS programs designated whether the return was "success" (bit=1, or odd result values) or "fail" (bit=0, or even result values). — Loadmaster (talk) 15:44, 17 October 2011 (UTC)
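To make the portable option from this thread concrete, here is a minimal sketch that reports bad input with `EXIT_FAILURE` and good input with `EXIT_SUCCESS` from `<stdlib.h>`, instead of hard-coding 1 and 0. The function `status_from_input` is a hypothetical helper invented for this example. A real program would return its result from `main()` or pass it to `exit()`.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: returns EXIT_SUCCESS if text is a complete
   decimal integer that fits in a long, EXIT_FAILURE otherwise. */
int status_from_input(const char *text)
{
    assert(text != NULL);
    char *end;
    errno = 0;
    long value = strtol(text, &end, 10);
    if (end == text || *end != '\0' || errno == ERANGE) {
        return EXIT_FAILURE;   /* nothing parsed, trailing junk, or overflow */
    }
    (void)value;               /* the parsed value is not needed here */
    return EXIT_SUCCESS;
}
```

The numeric values behind the two macros are left to the implementation, which is what lets the same source work on systems such as OpenVMS where success is not zero.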