User:Chocolateboy/Dashes

From Wikipedia talk:Manual of Style (dates and numbers):

(The file normalized.txt referenced below is the 20040727 cur table dump with talk pages removed; the script is available on request. Because of stack overflows in Perl's recursive regular expression engine, a few longer articles are also excluded from these statistics.)

Globally, spaced hyphens are at least 15 times more common than ndashes and mdashes combined:


 * grep '&ndash;' normalized.txt | perl -pe '$_ = join ($/, /&ndash;/g) . $/' | wc -l


 * > 14663


 * grep '&mdash;' normalized.txt | perl -pe '$_ = join ($/, /&mdash;/g) . $/' | wc -l


 * > 16526


 * grep ' - ' normalized.txt | perl -pe '$_ = join ($/, / - /g) . $/' | wc -l


 * > 494155
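
Each pipeline above counts occurrences rather than matching lines: grep keeps only the lines that contain the pattern, the perl step puts every match on a line of its own, and wc -l counts those lines. A minimal sketch of a shorter way to get the same kind of count, assuming a grep that supports the -o option (GNU and BSD grep do; POSIX does not require it). The count_matches helper and the sample file are hypothetical and used only for illustration:

```shell
# count_matches PATTERN FILE (hypothetical helper, not from the original post):
# print the number of non-overlapping occurrences of PATTERN in FILE.
count_matches() {
    # grep -o prints each match on its own line; tr strips the padding
    # that some wc implementations put around the number.
    grep -o -e "$1" "$2" | wc -l | tr -d ' '
}

# Tiny sample: two spaced hyphens on the first line, one on the third.
sample=$(mktemp)
printf '%s\n' 'a - b - c' 'no match here' 'x - y' > "$sample"

count_matches ' - ' "$sample"    # occurrences, as in the pipelines above
grep -c -e ' - ' "$sample"       # matching lines only, for comparison

rm -f "$sample"
```

On the sample this prints 3 and then 2: three occurrences spread over two matching lines, which is why a plain grep -c would undercount.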

Likewise, between adjacent wiki links (the usual form of a date range), hyphens are approximately 40 times more common than ndashes:


 * grep '\]\] &ndash; \[\[' normalized.txt | perl -pe '$_ = join ($/, /\]\] &ndash; \[\[/g) . $/' | wc -l


 * > 2698


 * grep '\]\]&ndash;\[\[' normalized.txt | perl -pe '$_ = join ($/, /\]\]&ndash;\[\[/g) . $/' | wc -l


 * > 2599


 * grep '\]\]-\[\[' normalized.txt | perl -pe '$_ = join ($/, /\]\]-\[\[/g) . $/' | wc -l


 * > 59366


 * grep '\]\] - \[\[' normalized.txt | perl -pe '$_ = join ($/, /\]\] - \[\[/g) . $/' | wc -l


 * > 160911

As those stats also show, spaced hyphens are used approximately 3 times more often than unspaced hyphens. (The patterns miss some date ranges and catch some non-date-ranges: patches welcome!)
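
The three ratios quoted on this page follow directly from the counts listed above. As a quick check, here is a sketch that uses awk for the floating-point division; the ratio helper is a hypothetical name for illustration, and the numbers are the command outputs given above:

```shell
# ratio A B (hypothetical helper): print A / B to one decimal place.
ratio() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", a / b }'
}

# Spaced hyphens vs. ndashes and mdashes combined.
ratio 494155 $((14663 + 16526))

# Hyphens vs. ndashes between links (spaced and unspaced forms added together).
ratio $((59366 + 160911)) $((2698 + 2599))

# Spaced vs. unspaced hyphens between links.
ratio 160911 59366
```

This prints 15.8, 41.6 and 2.7.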

chocolateboy 23:39, 16 Sep 2004 (UTC)