User talk:Popcornfud/Thoughts on definite articles in names

Interpreting search data
User:Popcornduff, I followed the link from the MOS/Music Rfc to find your essay; thanks for writing this.

I wanted to draw your attention to the section "A highly unscientific aside" (great section title choice), as I think this section will need to be removed, or undergo serious rework, to avoid invalid conclusions. This is not surprising, because drawing the right conclusions from an examination of search engine results, and of Trends results in particular, is a very tricky business.

Your comments at the Rfc also remind me that I've been threatening to get off my duff and write that essay I've been meaning to, about how to properly interpret search results, which has far more pitfalls than most people are aware of.

In the meantime, fragments of my proto-essay are scattered all over the place, as applied to particular cases, for example at this Rfc, much of which is not applicable here. (Another piece of it, this one specifically about misinterpreting Trends, is at the same Rfc, here.) Another pitfall in interpreting Google Trends data is in this Rfc subsection; see the part about "Elvis is alive".

A bit of the proto-essay I haven't written yet concerns the use of stop words in search. In web search (or general information retrieval situations) this is pretty well understood, and generally not problematic. But it's more problematic in databases of user queries, which is what Trends is. Because users are somewhat aware of the "meaninglessness of the" in search (a kind of first-level awareness of stop word concepts, without knowing the term itself or the theory), they tend to drop "the" from searches. After all, humans are lazy, and why type "the eiffel tower", when you know damn well that "eiffel tower" is going to get you the exact same results? ("Laziness" also answers the question of why many people don't bother capitalizing Eiffel Tower. Confusion about how search works, and the knowledge that some things, notably passwords, do have to be capitalized, explains why many people do capitalize it.) But many people are generally aware that "Who" is not going to get them the results they want, if they are looking for a music group.

The links above were to analyses for particular cases, none of which match yours exactly; but trust me, there are potholes (more like a minefield) all over the place awaiting you in your example as well. If you'd like to collaborate on how to get some meaningful information from search data, I'm willing to do that, though I don't have a ton of time right now. Maybe you can track down some search experts to help in the meantime, but I'll try to respond to questions here as I'm able. Thanks once again for starting this essay. This now joins the ranks of "proto-essay bits all over the place". Hopefully, I'll start that darn "Interpreting search results" essay before too long! Mathglot (talk) 06:03, 21 July 2019 (UTC)


 * Mathglot: Thanks for this thoughtful and useful bit of feedback.
 * Yes, I was aware of how dubious all that amateurish data interpretation was when I added it, and to be honest I added it mainly with the intent of seeing how I felt about it after a few days, with a view to possibly removing it. (It might interest you to know that in one draft I even mentioned the same thing you say here about user laziness.)
 * The reason I added the data in the first place came after I showed the essay (pre-data) to a writer friend, who reckoned that people naturally consider "the" part of band names and would be more likely to search for "the Beatles" than "Beatles". That same friend then checked Google and found that it wasn't true, so I thought it would be interesting to add.
 * But I'm persuaded by your concerns so I'll wipe it. Popcornduff (talk) 06:13, 21 July 2019 (UTC)
 * Not saying that it's impossible to find useful information here. Not trying to discourage you, but to let you know it's tricky, and so requires a bit of extra thought. If you have some search-savvy editors you can tap to opine here, I'd like to hear what they have to say as well. Mathglot (talk) 06:21, 21 July 2019 (UTC)

The boys, the chicks, the insects
Ha, looking up, totally forgot I was here once before! (But that explains why it's on my watchlist; one Wiki-mystery solved, 999 to go...)

Anyway, I think you possibly missed one consideration in why it's the Beastie Boys, the Beatles, the Dixie Chicks, but Run DMC, Nirvana, and so on. Generally, in a series of nouns, the earlier ones act as nominal adjectives, and only the last one governs pluralization and the use of the definite article. Examples are simpler than that gobbledygook, so think of bands called "The Boys" (check; [also covers "Beach Boys"]), "The Chicks" (check), "the Beetles" (check), but "the DMCs" (bzzzt!), "the Nirvanas" (bzzzt!), and so on. Things (i.e., nouns) that are countable can take an -s, and I would predict that the group's name, if the group consists of more than one person, would take -s and the definite article; and if the noun is uncountable (sand, water, rain, Air, Bread, Traffic, knowledge, Stress; more here), then it probably won't take pluralization or the definite article. There are always exceptions. I can't think of any good reason why it's Slayer, but the Clash. Mathglot (talk) 07:07, 26 August 2021 (UTC)
 * Mathglot, thanks for this! It makes sense, so I've incorporated this idea into the essay. However, I'm not convinced that it matters whether the band name consists of a series of nouns, only that the nouns are countable. People argue over (the) Pixies, for example.
 * So, in English, when we're talking about a specific countable noun (like a particular band), there's a tendency to add the definite article, just as we would for common nouns. Maybe. Is that it?
 * Tell me what you think - have I got this right? I don't think I'm quite on the money yet, so feel free to edit the essay directly if you want.
 * "I can't think of any good reason why it's Slayer, but the Clash"
 * I would guess that's just like the Eiffel Tower/Tokyo Tower (etc) - at some point this just becomes arbitrarily set by common use, usually by the band. (Pink Floyd, once the Pink Floyd, are an interesting case of a band name migrating away from the definite article...) Popcornfud (talk) 21:29, 26 August 2021 (UTC)
 * I think you're right on the money citing Eiffel/Tokyo towers as analogous to Slayer/Clash. And then there's the St. Louis Cardinals, and the Stanford Cardinal, although everybody pretty much finds the latter awkward, until they get used to it. Personally, I prefer the Santa Cruz Banana Slugs, such scary and ferocious beasts. Best, Mathglot (talk) 23:31, 26 August 2021 (UTC)