Wikipedia:Reference desk/Archives/Language/2009 October 12

= October 12 =

Term for scholarly or literary "detective work"
More specifically, what I have in mind is the art/science of tracking a fact, claim, argument or idea through various texts back to its source. For instance, let's say I read on some website that tigers like cheddar cheese. The website mentions a book as the source, and in turn the book mentions another book, and so on until-- following the citations-- I get to a firsthand account. What would be the term for what I was doing? 69.224.112.30 (talk) 00:46, 12 October 2009 (UTC)


 * Epistemology? Establishing provenance? Finding primary sources? Historiography? (This is specific to history, rather than, for example, science) Perhaps one of those articles has the specific term you're looking for. Mitch Ames (talk) 11:53, 12 October 2009 (UTC)


 * Library research. -- Hoary (talk) 14:35, 12 October 2009 (UTC)
 * In literary research, the activity of identifying the sources of specific stories or other elements found in literary works is often denoted by the German word Quellenforschung ("source research"), but I've never heard the term used for the more general sort of investigation you seem to have in mind. Deor (talk) 15:49, 12 October 2009 (UTC)
 * Quellenforschung is also in the Shorter Oxford English Dictionary, 6th edition. The headword is in italics, meaning "...although used in English, is still regarded as essentially foreign". Mitch Ames (talk) 09:23, 13 October 2009 (UTC)


 * I would call this "checking sources." John M Baker (talk) 17:19, 12 October 2009 (UTC)

What is a word?
Uh... my sister and I have been discussing agglutinative, isolative, and inflectional languages, and it got me thinking... how do you define a word anyways? I already knew that linguists have basically no perfect way to define a "word". Various imperfect description methods are described on the word page, which concludes that "the exact definition of a word is often still very elusive", even after linguists do their best to consider the semantics, phonetics, and pragmatics of the utterance. So you don't always know whether a certain word/phrase is a single word or many. My question is, why don't we say that Mandarin Chinese is strongly agglutinative instead of strongly isolative? After all, if each word is represented by a single character, why can't you read a sentence and say that it is all one word instead of six little words? And why can't Japanese be considered strongly isolative, with each of the affixes that make up a long word being its own word?

My initial guess is that, although there is no exact definition in a formulaic sense for a word, there is some kind of more subtle mental definition, so that even if you can't follow simple rules you can "just tell" whether something is one word or many. But what if I'm wrong there too? Apparently orthography has a lot to do with how we think of words, especially when we live in a world where orthography is so fundamental (compared to people in the Amazon rain forest who never learn to spell yet speak languages that can be just as complicated as Latin). According to the word page, "ice cream" is supposed to be a single compound word linguistically, and is only spelled with a space in the middle because of its derivation from two separate words. Is it possible that the idea of a "word" is just an abstraction that came with the invention of orthography? Maybe French could have been classified as agglutinative if it had a different orthographical history ("Il y est allé" vs. "ilyestallé")?

In short, how, how, how do we know that Mandarin is isolative while Japanese is agglutinative? thx for any ideas Jonathan talk 04:08, 12 October 2009 (UTC)


 * See English compounds for how prosody in spoken English distinguishes a compound word from the phrase consisting of the same components as free words. Shanghainese has also come to have word stress in recent times (the first syllable's tone spreads over the multisyllable word and the subsequent syllables' tones are disregarded) but Mandarin and Cantonese do not, sticking with the syllable (= character in writing) as the unit. Nevertheless it is said to be an "ideographic myth" (search on that phrase) that Chinese is a monosyllabic language.


 * Japanese particles after a word are pronounced as part of the word, i.e. they are effectively case suffixes not independent words. Nevertheless romanized Japanese writes the particles as separate words. Japanese, Chinese and Thai are the only major languages that are written with no spaces in their native scripts.


 * Japanese verb/adjective endings usually are not independent words (though some that are written as part of the word in romaji descend from independent words in Old Japanese) and go in a fixed order. Chinese independent words with the same functions have some flexibility in word order.


 * French has indeed evolved - another suggestion by Johanna Nichols is that spoken, colloquial French is now at least partly head-marking, in contrast to traditional dependent-marking in European languages other than Basque. --JWB (talk) 06:04, 12 October 2009 (UTC)


 * Orthography definitely plays a role. And Chinese has lots of set phrases, which need to be learned almost like words. But there's more to it than that. In Chinese (well, Classical Chinese, anyway), you can use each meaningful unit on its own. In Japanese you cannot. A basic rule of thumb is: If you make a mistake while speaking, how far back do you have to go to correct yourself? That unit that you have to repeat in its entirety is the word. In Inuit, where "I bought some fish last night" is a single word, you really do have to repeat the whole thing if you mess up just the end. kwami (talk) 06:13, 12 October 2009 (UTC)


 * Orthography indeed. Would a word counter count "alot" as one word, or two words misspelled as one?  Maybe it's in that twilight zone where it's very commonly seen, but not yet accepted (afaik) by any dictionary as a legitimate word.  --  JackofOz (talk) 08:15, 12 October 2009 (UTC)


 * I think you'll find that most linguistic investigations of this kind of thing take orthography as little more than a first stage toward a definition of word. If we make the (dubious) assumption that "a" is a full-blown word (rather than a mere clitic), then "a lot", however (mis)spelt, will be two words, as these could easily be instead "an awful lot". More interesting is "a helluva lot"; clearly it's now so written because of convention (originally of the representation of macho dialogue in second-rate novels, I suppose), but if we were free to overturn convention, might it better be "ahelluv alot"? On reflection, I don't think it should: it seems to me that on the contrary "hell of a" is edging toward the status of indivisible adjective. -- Hoary (talk) 09:55, 12 October 2009 (UTC)


 * Your "basic rule of thumb" looks interesting, but I'm not sure it always works. "I'm going to the theater (doh!) cinema", could be accepted, but how about the following sentence: "This time you won't get away with beat (doh!) it!"... HOOTmag (talk) 09:09, 12 October 2009 (UTC)


 * This rule of thumb looks very interesting indeed. Hootmag's tentatively proposed counterexample strikes me as odd for unrelated reasons that I'm too lazy to spell out fully; I'd say that "Next time you won't get away with murder (doh!) it" is OK though borderline (I'd guess because "it" seems phonologically inadequate to the task), while "Give the sandwich to it (doh!) him" is fine. In my idiolect, at least. -- Hoary (talk) 09:55, 12 October 2009 (UTC)
 * How about: "We started together, but he beat me to him (doh!) it!..."? HOOTmag (talk) 16:25, 12 October 2009 (UTC)


 * There are certainly intermediate cases like a lot, which are in the process of lexicalizing. There are also phonologically weak words which are half-way to being clitics. But it's at least enough to show the general typology of the language.


 * As for the counter example, if you get the part of speech wrong, you will normally need more context to clarify which word you're correcting. But if both words are grammatically correct ("beat him--I mean it"; "beat--I mean show--him"), then it usually works. kwami (talk) 16:46, 12 October 2009 (UTC)


 * The problems are the circularities and similarities that sometimes undermine entire meanings, though they are necessary for continuum (like kwami points out).


 * On the question of why a language is said to be agglutinative instead of strongly inflectional, the distinguishing factor is the ‘syntheses’ in a morphosyntactic element. That is, few lexemes can be a synthetic element. An inflection, on the other hand, is the morphological affixation that is determined by the rules of syntax in a given lexeme.


 * On the idea that 'a word is just an abstraction that comes with the invention of orthography', yes; it seems to be so, or the other way around, as in ‘An orthography is the abstraction of a word’. That is, a word can still consist of a sign and its referent without its orthography; for example, protowords. Nevill Fernando (talk) 21:24, 12 October 2009 (UTC)


 * You probably want to avoid basing your definition on a language's orthography, since a language may lump together or separate glyphs for any number of reasons. I prefer to think in terms of "lexemes" rather than "words", a lexeme being a part of the language that is discrete and has its own meaning, such that it is worthy of inclusion in a dictionary. To take the Japanese example, particles might be bound to nouns in the strictest grammatical sense, but they still have distinct meanings and will be listed in the dictionary. Paul Davidson (talk) 13:33, 13 October 2009 (UTC)
 * Small point, but I think this discussion has conflated two kinds of particle in Japanese. Kara is an adposition, o is a case-marker, ni can be either. Even o will appear in a dictionary but this says less about its semantic content than it does about lexicographic conventions. -- Hoary (talk) 14:00, 13 October 2009 (UTC)
 * It seems the glyphs are simply the representations of time in many languages; the duration of the time of a grapheme as to its actual phonemic representation. It usually occurs between an onset and a rhyme or in a coda in general but for the vowels in particular. However, I am not sure on this.
 * About particles: They are just different lexemes in compound words and have varying syntactic rules in orthography than the regular compound words usually. Nevill Fernando (talk) 16:30, 13 October 2009 (UTC)

A word is the lexicographical or verbal representation of a discrete idea, concept, or reference. For instance, 'black' represents what we know as the color black; 'people' represents the discrete class of entities known as human beings; and Dave is a reference to the very complicated idea of 'Dave Smith from accounting'. Since Dave is such a huge concept, beyond anyone's reckoning, the best we can do is reference everything we know about Dave. It's like a pointer in computer science. Dave is walking around in the real world: we cannot fully internalize everything about him, but we can certainly fill out some brain patterns that account for who he is and why he does the things he does. Vranak (talk) 00:51, 14 October 2009 (UTC)
 * Whew. ¶ You're conflating more or less regular words and names, a conflation that is at least controversial; and while, say, "black" and "blackest" or "Dave" and "Dave Barry" are discrete in some ways, they may not be discrete in others. I'm not at all sure that "Dave" is used in a natural language by referencing everything we know about Dave, or that words in a natural language are like pointers in computer science. ¶ For that matter, while "do" (say) is generally taken to be a word of English, I'm not at all sure that it represents any discrete idea. -- Hoary (talk) 04:59, 14 October 2009 (UTC)

Off
I'm looking through a library copy of the book After the Off, photos by Bruce Gilden, story by Dermot Healy. Perhaps the dust cover explains the title but the former was junked before the book was ever lent out. The (remarkable) photographs show the spectators at one or more horse races in Ireland. Perhaps in part because of gimmicky typography, I have great trouble making any sense of the story, but as far as I've noticed it has merely a single, unexplained mention of "off", in "before the off". I know squat about horse racing and normally wouldn't much care, but I'm troubled by my inability even to understand the apparently simple title of a book in English. So: What would a/the "off" be in an Irish horse race? -- Hoary (talk) 12:44, 12 October 2009 (UTC)
 * "The off" refers to the start of the race, e.g. some betting companies allow you to bet "after the off", or check out the usage here Tinfoilcat (talk) 12:55, 12 October 2009 (UTC)


 * [ec] The search results I'm getting seem to indicate that "before the off" and "after the off" refer to bets placed before and after the race starts ("and they're off!"). None of the results I looked at gave a specific definition, which is why I haven't provided a link, but here's my Google search.  -- LarryMac  | Talk  12:57, 12 October 2009 (UTC)
 * Even in the United States, of course, racetrack announcers invariably say "And they're off …" to signal the start of races. Deor (talk) 13:02, 12 October 2009 (UTC)
 * Sometimes they say, "They're off and running!" Like what else would they be doing? The fox-trot? But this business of betting after the race starts is curious. (1) Why would they allow it? and (2) How would you even have time to get the bet down? →Baseball Bugs What's up, Doc? carrots 05:27, 14 October 2009 (UTC)

Ah yes, of course. Thank you all! -- Hoary (talk) 13:07, 12 October 2009 (UTC)

"0.57 second" or "0.57 seconds"?
See "0.57 second" or "0.57 seconds"?. _ _ _ A. di M. 23:34, 12 October 2009 (UTC)


 * In English there are few absolute rules. That said, to me as an American, "0.57 second" looks more "correct", but "0.57 seconds" looks and sounds more "natural".  I would say that neither form is clearly incorrect, so you can use either form.  Marco polo (talk) 00:17, 13 October 2009 (UTC)


 * Typically an American would say "1 second" and "n seconds" where n is any number besides exactly 1. Even "0 seconds". Why? I don't know. He's on third. :) →Baseball Bugs What's up, Doc? carrots 01:02, 13 October 2009 (UTC)
 * Style guides universally recommend "0.57 second" (for the same reason that one writes "one-half mile", for instance, rather than "one-half miles"). Unfortunately, our template convert doesn't see things that way; if one enters 0.5 mile into it, it produces "0.5 mi". Deor (talk) 01:10, 13 October 2009 (UTC)
 * "One-half mile" would be the formal way to say it, but you're more likely to hear "half a mile", which does use the singular. →Baseball Bugs What's up, Doc? carrots 01:29, 13 October 2009 (UTC)
 * This seems to be a case where the spoken and written forms vary. Style guides may well recommend we write "0.57 second".  But reading that out loud literally sounds, well, wrong.  In my experience, people are much more likely to say "point five seven seconds".  Possibly because ".... seven second" goes against the grain, even if the "seven" here represents only .07. OTOH, they'd say "point five seven of a second" (not "of seconds").   Would style guides recommend "zero second"?  I'd be surprised.  -- JackofOz, masquerading as 202.142.129.66 (talk) 02:11, 13 October 2009 (UTC)
 * FWIW, as a speaker of Commonwealth/International English, 0.57 seconds both looks and sounds more natural (I don't know of anyone who would naturally say or write "0.57 second"), though neither looks or sounds as natural as 0.57 of a second (which is grammatically and mathematically accurate - it is part of one second after all). This is also in line with "half a mile", which is likely a natural contraction of "half of a mile". The formal way to say that would be "one half mile", with no hyphen, which again is logical, since two half miles equals one mile, i.e., a "half mile" is an acceptable base unit of measure in itself. Once you include decimal places, it becomes clear that you are not using the "half" as the unit of measurement - 0.5 miles is therefore more appropriate. As such, "one half mile" is not really a fair analogy to the "0.57 seconds" case. Grutness...wha?  06:00, 13 October 2009 (UTC)
 * It's fine to call it "Commonwealth English", even though the English of Canada, part of the Commonwealth, is much closer to American English than to the British standard that is really the basis of so-called Commonwealth English. However, please don't call it "International", implying that American English is merely a national variety limited to the United States. Not only is a variety of American English used in Canada (if with mostly British spellings), but American English is the preferred form in Latin America as well as much of East and Southeast Asia.  The British standard is a perfectly acceptable standard, but it is arrogant and incorrect to imply that it is the international standard.
 * Style guides for written text do not always reflect spoken English. On Wikipedia we write 13 October 2009 (or October 13, 2009) whereas I actually say "the thirteenth of September two thousand and nine". -- Александр Дмитрий (Alexandr Dmitri) (talk) 13:57, 13 October 2009 (UTC)
 * Doesn't it get confusing if you say "September" for the month we write as "October" on Wikipedia? +Angr 16:55, 13 October 2009 (UTC)

I don't know why this was posted both here and at WT:Manual of Style (dates and numbers), but it's giving me vertigo. See my reply there, and let's decide where we want to have this discussion. —— Shakescene (talk) 06:19, 13 October 2009 (UTC)


 * In English, we don't really have a plural, we have a non-singular: anything other than unity is "plural". "One-X mile" is an exception, possibly because of the word "one". kwami (talk) 06:27, 13 October 2009 (UTC)
 * Dammit, I've just wasted many 0.57 secondses reading this. Clarityfiend (talk) 09:00, 13 October 2009 (UTC)

There was a thread on this subject fairly recently in alt.usage.english (under the subject line "Zero points?"). Summary: usage varies. At least one poster said it made a difference whether the number between 0 and 1 was expressed as a fraction or a decimal. --Anonymous, 10:33 UTC, October 13, 2009.
 * That's pretty much my point above. Grutness...wha?  12:47, 13 October 2009 (UTC)

The inconsistency is actually pretty consistent in [Californian] English: half a second, but 0.57 seconds; half a mile, but 0.8 miles; half an hour, but 0.75 man hours (a measure of labor required). DOR (HK) (talk) 05:32, 14 October 2009 (UTC)


 * There may be some exceptions, but my understanding is that what you've just described is the norm across all varieties of English, all over the world. Which is why I was so surprised to hear Deor say above that "Style guides universally recommend 0.57 second".  His/her analogy of "one-half mile" doesn't ring true with me.  I'd be more comfortable with "one half-mile", but even that looks stilted and unnatural to my ears (and sounds so to my eyes).  "He walked along for (a) half a mile ..." would be the usual way of saying it, just as "This company earned $1 million every 0.75 man hours last year" would be the norm.  --  JackofOz (talk) 12:20, 14 October 2009 (UTC)
 * ... and from a British English point of view, I agree 100% with DOR and JackofOz. Colloquial British English seems to have "one pence", but this is frowned upon by most of us (it happened because of a need to distinguish between old and new pennies, and is not supported by the legend on the coin). What is this mysterious style guide that contradicts universal usage?    D b f i r s   08:29, 15 October 2009 (UTC)

To be fair to Feynman and the style guides, the singular probably fits no worse than the plural in the formal, highly specialized scientific and technical prose they're using. "0.557 second" is a little stiff, stilted and unnatural in such contexts (say an article about the speed of light, ballistics or technical astronomy), but then so is "0.557 seconds". I wouldn't really want the Manual of Style to pronounce either way, except perhaps to say that the singular is uncolloquial, unidiomatic and therefore probably a bit awkward in non-technical articles directed towards the general reader or treating non-technical subjects like history, music or sports. —— Shakescene (talk) 08:46, 15 October 2009 (UTC)