Talk:Ditto mark

Needs expansion
This article needs expansion. Notably, the history/origins of the ditto mark are missing. tedder (talk) 22:30, 29 October 2009 (UTC)

In legal documents
In legal documents, the use of ditto marks or the abbreviation do. or the word ditto is often forbidden by law or regulations. This statement will soon be backed up by inline citations. --DThomsen8 (talk) 23:30, 30 October 2009 (UTC)
 * When is "soon"? 86.143.54.10 (talk) 23:46, 18 February 2010 (UTC)

Unicode
Why is it in the CJK section in Unicode? It has nothing to do with it; is that a mistake? --Squidonius (talk) 02:08, 27 September 2010 (UTC)

What?
"In Unicode it is encoded [...] markably in block CJK Symbols and Punctuation."

What? It seems like the article is trying to make the point that it's encoded in that block, but it doesn't say anything further about this. It's like saying "The hills have trees on them. That's important to know." in a lecture, then never speaking of either the hills or the trees again. -- TORTOISE WRATH 18:27, 28 December 2012 (UTC)


 * Yeah, the German article states that it's actually meant for CJK typography, not for German texts. It looks utterly strange to my eye and on my machine so that seems plausible to me. But I'm really not used to English typography so if you find it looks just fine, maybe it's ok to use it as the current article does then. This text is linked to as a proof that 〃 is not suitable for Latin typography in general but I couldn't figure out why exactly. --Mudd1 (talk) 12:55, 6 May 2013 (UTC)


 * The file ScriptExtensions.txt of the Unicode standard itself specifies that this character is intended to be used with the following scripts: Bopo Hang Hani Hira Kana, i.e. the CJK scripts (search for DITTO MARK in the file). More precisely, this file "indicates which characters are commonly used with more than one script, but with a limited number of scripts". Frédéric Grosshans (talk) 15:04, 6 May 2013 (UTC)
 * So probably Unicode does not have a ditto mark for Western script(s). That would mean we can delete the Unicode reference. Still, it could very well be a typographic mark in Latin script. Apple has. I use it in handwriting. -DePiep (talk) 18:37, 19 May 2013 (UTC)
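For anyone who wants to check the character data discussed in this thread, here is a minimal sketch using Python's standard `unicodedata` module. The `CANDIDATES` table and the `describe` helper are names made up for this illustration; the reported values depend on the Unicode version bundled with the Python build, and the lookup says nothing about which character is typographically appropriate in Latin text.

```python
import unicodedata

# Characters mentioned on this page as possible ditto marks.
# The labels are informal descriptions, not official Unicode terms.
CANDIDATES = {
    "ditto mark (CJK Symbols and Punctuation)": "\u3003",
    "double prime": "\u2033",
    "typewriter quote": "\u0022",
    "right double quotation mark": "\u201d",
}


def describe(ch):
    """Return a few Unicode properties of a single character."""
    return {
        "code_point": f"U+{ord(ch):04X}",
        "name": unicodedata.name(ch),
        "category": unicodedata.category(ch),
        "east_asian_width": unicodedata.east_asian_width(ch),
    }


for label, ch in CANDIDATES.items():
    print(label, describe(ch))
```

The East Asian Width property of U+3003 is the part most relevant to the question of why the glyph looks out of place in Latin text on some machines.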

Character U+3003 has the Unicode note: see also . I think this should solve it. I have changed the character in the article into this one (and in the sidebar template), and tried to explain+source it in text. Note: the block CJK Symbols and Punctuation, despite the CJK in its name, is not unified. -DePiep (talk) 08:30, 20 May 2013 (UTC)
 * If and when this talk is stable on the outcome, I'll propose a move: ditto mark → ditto. -DePiep (talk) 16:58, 20 May 2013 (UTC)


 * I think your interpretation of Unicode standard is not correct. The see also note in Unicode only points to an alternative (and very common) representation when U+3003 is not available. Please see:
 * http://www.oxforddictionaries.com/definition/english/ditto-marks
 * http://webdesignledger.com/tips/common-typography-mistakes-apostrophes-versus-quotation-marks
 * http://www.cs.sfu.ca/~ggbaker/reference/characters/#char3003
 * I am going to replace the representation of this symbol with the proper Unicode character for it. Diego.nanu (talk) 15:27, 23 July 2014 (UTC)

Ditto mark and CJK

 * block=CJK Symbols and Punctuation

As the reference -already in there- says: 3003         ; Bopo Hang Hani Hira Kana # Po       DITTO MARK i.e., only used in these five scripts. It is a CJK character. In Latin script, we use
 * , block=General Punctuation.

-DePiep (talk) 09:36, 7 August 2014 (UTC)
 * Five years later and the books listed below seem to suggest that this is no longer true, that 3003 is being proposed for Latin texts too. A quick check on the fonts bundled with Windows 10 suggests that only MS Gothic and Arial Unicode have this glyph, it is not in Lucida Unicode. One to check again in five years time, I suggest, meanwhile definitely don't hold your breath!
 * But yes, some of the sources suggest that the inch sign, aka double prime, is the one to use for ditto. But others disagree. Sigh.--John Maynard Friedman (talk) 19:01, 31 December 2019 (UTC)
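The ScriptExtensions.txt entry quoted at the top of this section follows a simple layout: code point, semicolon, script codes, then a comment holding the general category and the character name. Below is a rough sketch of reading such a line. The function name `parse_script_extensions_line` is hypothetical, the parser handles only the single-code-point comment layout shown in the quote, and the sample line is the one quoted in this thread rather than the current contents of the file.

```python
def parse_script_extensions_line(line):
    """Split one data line into code point range, scripts, category, name.

    Assumes the layout quoted on this page:
    '<code or range> ; <script codes> # <category> <name>'.
    """
    data, _, comment = line.partition("#")
    code_field, _, scripts_field = data.partition(";")
    code_field = code_field.strip()

    # A field may be a single code point or a 'first..last' range.
    if ".." in code_field:
        first, last = code_field.split("..")
    else:
        first = last = code_field

    category, _, name = comment.strip().partition(" ")
    return {
        "first": int(first, 16),
        "last": int(last, 16),
        "scripts": scripts_field.split(),
        "category": category,
        "name": name.strip(),
    }


# The line as quoted in this thread (not re-checked against the live file).
QUOTED_LINE = "3003         ; Bopo Hang Hani Hira Kana # Po       DITTO MARK"

entry = parse_script_extensions_line(QUOTED_LINE)
print(entry)
```

Running the same function over a freshly downloaded copy of the file would be one way to do the "check again in five years" suggested above.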

Shape
MwGamera tagged the statement "The mark is the simple straight quotation mark." as "failed verification" because, he says, "None of the three describe it as a quotation mark. Cambridge doesn't describe shape at all. Parenthetical examples taken at face value are conflicting with each other but there's no reason to assume they were intended to be typographically sound in the first place."

It would have been more appropriate to use this talk page to discuss how to improve the article rather than just add an fv tag. But to make progress, let's work on the basis that the fv note is the challenge. [For convenience in this discussion, I have made Cambridge a simple ref instead of efn.]

First we must take these highly reliable sources at face value and not make subjective value judgements. If they say X, then it is WP:OR for us to question did they really mean X.

Second, let's look at the sources in turn. All quotations are verbatim, exactly as written except that I have highlighted the critical words or symbols:
 * Oxford says "the mark is a symbol made using ". This is not currently reflected in the article, an omission which I will rectify. Visually, the difference is slight but I accept that it must be given.
 * Collins says "two small marks placed under something to indicate that it is to be repeated". The mark between the parentheses is indeed a simple quotation mark.
 * Cambridge
 * "Advanced Learner's Dictionary & Thesaurus" says "a symbol  that means "the same" and is used in a list to avoid writing again the word written immediately above it". The symbol shown is a simple quote.
 * Cambridge Business English Dictionary says "the symbol that means 'the same' and is written immediately under a word or phrase that you are repeating so you do not have to write it out again". The symbol shown is the CJK one (Unicode ditto mark).

Third, are they conflicting with each other? In forensic detail, yes, but visually no. All the marks are simple ones.

Finally, though not in MwGamera's challenge, have I replaced WP:OR with WP:SYN? That is more arguable: I don't believe I have but I will comment out the section pending further discussion. --John Maynard Friedman (talk) 12:00, 31 December 2019 (UTC)


 * Ok, sorry, I guess I should've taken this to the talk page from the start.
 * Clearly we shouldn't make value judgements about sources, but I believe you interpreted them as making more precise claims than I understand they do.
 * Being lexicographic sources rather than typographic guides, they provide only enough detail to uniquely identify the concept, not to precisely describe the shape of the mark. Accordingly, neither of Cambridge dictionaries say anything at all about its shape only stating its purpose and usage, Collins says “two small marks” from which you can't really infer that it must be a “quotation mark”, and Oxford says “two apostrophes” which is a different thing than a quotation mark, although it should probably look similar to a closing one (”) and it might probably be taken to describe that shape. So as I see it, none of them supports the statement about it being a “straight quotation mark”.
 * It's only when you try to interpret the particular way they were set for the web “in forensic detail” as a part of what they claim, that Collins and Cambridge Learner's seem to support the statement because they graphically realized the mark with a typewriter quote. You've highlighted visual representation given in parenthesis so I understand this is how you interpreted it, but such interpretation necessarily leads to conclusion that sources are contradictory, because Oxford says apostrophes and Cambridge Business used different Unicode character (one that doesn't make a sense to use, but graphically it's similar enough).
 * No, I don't think they are actually conflicting, I think that's reading more into them than they actually state. I think they just don't state anything to support the statement about it being a quotation mark.
 * By the way, I like vague description used in Collins, because it encompasses typographical variants used in other languages like French or German (or Polish for which I can provide citations stating it consists of two commas, which incidentally looks similar to opening quotation mark).
 * As for the right shape for use in English, it would be great to find typesetters' or typographic guide stating what should it look like. I, honestly have no idea. It seems that The Chicago Manual of Style mentions something about the ditto mark, but I don't have a copy or subscription to check what. —MwGamera (talk) 18:25, 31 December 2019 (UTC)

Typography sources
Searching Google Books, I find the following. Are these any better? --John Maynard Friedman (talk) 13:27, 31 December 2019 (UTC)
 * Exploring Typography (Tova Rabinowitz, 2015), page 267, which uses the CJK mark but the font chosen is one that shortens it to be consistent in size with 'curly quotes'.
 * Quick & Easy English Punctuation: A Modern Punctuation and Style Handbook ... (Richard De A'Morelli, 2017), section 13.35, which also uses the CJK mark without comment but they appear to my eye to be shortened as in Rabinowitz.
 * The Typographic Style-book: a Manual of Rules for Preparers of Copy, Compositors, and Proof-readers (Whitney Byron McDermut, 1900) page 29 which says 'two inverted commas or an mdash'. It may be relevant that this reference is nearly 120 years old.
 * Design Principles for Desktop Publishers (Tom Lichty, 1994) page 80 "The inch mark might also be appropriate for use as a ditto mark, but that's it: Printer's quotes belong everywhere else! "


 * Yes, someone should judge if they make sense but at least they directly deal with how that symbol looks. I wouldn't be surprised if there is more than one style accepted in different circles. I wonder how you can tell if they use the CJK one, though; it looks like a double prime to me. —MwGamera (talk) 18:25, 31 December 2019 (UTC)

I have just found a really interesting one: Unicode Explained (Jukka K. Korpela, 2006) p 398 which remarks that the ASCII quote 'is often used' (subtext: incorrectly) for other things like inch mark (should use double prime) or as a ditto mark. It may be noticed that the ditto mark shown is the shortened one consistent with anglophone orthography rather than the long CJK version. --John Maynard Friedman (talk) 13:49, 31 December 2019 (UTC)


 * That is fascinating indeed. It also says that “APL quote” is identical with the quotation mark on the same page, which is a blatant error. I don't think there exist any particularly well established convention for which Unicode character to use for ditto. A lot of people still can't agree how to write apostrophes, and ditto mark is much rarer in modern typesetting. It's not like there's any rule that would say using U+3003 in Latin text is wrong just because of its provenance (look at emoji), but it's impractical as there doesn't seem to be many western fonts including it either (I know of one). —MwGamera (talk) 18:25, 31 December 2019 (UTC)
 * Fair comment, but we only have these WP:RSs to support anything at all. (I accept now that it was WP:SYN of me to read a typographic ruling in the dictionary definitions but the Shape text I replaced was even more wrong, because it was just unsupported opinion). So it is best to just lose that section completely. --John Maynard Friedman (talk) 19:15, 31 December 2019 (UTC)

Double prime
We don't have any citation for 'use double prime' (although it makes sense) and only a side-swipe reference to using the inch symbol (aka double prime). So unless anyone can find such, I can't see how it can go in the article. --John Maynard Friedman (talk) 19:15, 31 December 2019 (UTC)

Failed verification
I questioned 's "failed verification" challenge because I misread it as challenging the CJK ditto mark. I see now that question was about their use (or otherwise) of curly quotes. The sources are definitely using "typewriter quote" (straight double quote, ") marks today. I reverted my reversion of DT's edit but the text still needs to be updated to reflect the new normal. --John Maynard Friedman (talk) 21:12, 3 September 2020 (UTC)
 * FWIW, my paper Concise OED definitely uses curly quotes. --John Maynard Friedman (talk) 21:16, 3 September 2020 (UTC)

Proposed new lead
I would like to propose a new version of the lead. We have a small problem in that there is a ditto mark glyph defined in Unicode – we just don't use it (much!) with Latin scripts. So how about this? (Proposed here first as I suspect it may be a little controversial.) "The ditto mark is a typographic convention indicating that the word(s) or figure(s) above it are to be repeated. Latin script (as used for Western and Central European languages) does not have a unique or dedicated glyph for this purpose: other symbols are employed. Chinese, Japanese and Korean scripts do have a dedicated glyph,, that may sometimes be used elsewhere.

In anglophone countries, the mark is made using 'a pair of apostrophes'; a 'pair of marks'; the symbol " (quotation mark); or the symbol ” (right double quotation mark). Other languages have similar conventions."

Comments? --John Maynard Friedman (talk) 11:13, 3 November 2020 (UTC)
 * A great improvement. I do have a few reservations:
 * "Western languages" is ambiguous. Dominant languages in the Western world? The map in that article excludes all of South America except for French Guiana, and that's a big chunk of the world that uses the Latin script. Another big chunk is sub-Sahara Africa. "Central European" is somewhat better, but still vague: does Central Europe include Romania?
 * The use of 〃 by the Cambridge Dictionary of Business English is surely an outlier. I suggest dropping the clause "that may sometimes be used elsewhere". If it is to be kept, note belongs right after.
 * Cheers, Peter Brown (talk) 17:21, 3 November 2020 (UTC)
 * Thank you, appreciated.
 * Yes, I've already wriggled on that hook. I could say "the languages of Europe and its colonies" but it might upset a few people 😁. It can't stay as it is because the Finns and their Balkan neighbours will get even more upset. More suggestions welcome: right now, I'm tending towards dropping it and let people read the Latin script article if they really don't understand.
 * If the world were entirely logical, Western Europe would be those countries that use Western European Time (WET, UTC+00, GMT), Central Europe would use CET and Eastern Europe those that use EET. But only UK, Ireland and Portugal use WET: Spain, France, Luxembourg, Belgium, Netherlands are all west of 7.5°E but use CET. (Yes, Romania is Central Europe by any measure). Finland, Estonia, Latvia and Lithuania use EET but use Latin script. Bulgaria, Greece and Cyprus are all in the EU and EET but don't use Latin script. And of course a lot of people are still in Iron Curtain mode: Eastern Europe to them starts at the Brandenburg Gate and Checkpoint Charlie! So this is a dead end.
 * I am inclined to agree about the Cambridge Dictionary of Business English as it is most likely a typo. I have deliberately done a Nelson on the infobox for now but we will have to come back to that discussion (one option is delete it?) because there is a Unicode code point called Ditto Mark and it is the CJK one. Is that information useful to en.wiki or more likely to mislead readers in a hurry?
 * Position of footnote moved as suggested, thank you.
 * --John Maynard Friedman (talk) 18:19, 3 November 2020 (UTC)


 * On one hand I don't like that Unicode-centric approach, on the other I understand your attempt at making it less confusing given how Unicode names it. But the one thing that you already changed, where the first sentence says that the mark is a convention, makes no sense and I can't see how it relates to Unicode at all (your edit summary made no sense to me). It clearly is a kind of a symbol, even though it displays a considerable variation in its shape (but much less than, say, ampersand). The typographic convention is what says what shape should it have. Conventions differ between languages and English seems to have multiple competing ones. Maybe calling it a typographic symbol wasn't strictly correct as the shape of ditto is usually described by equating it to some other typographic symbol or to a sequence of such, but this has absolutely nothing to do with Unicode. Mapping between orthography, typography, and Unicode are all separate matters.
 * And while we're changing the lead, I think it's a good opportunity to drop these “(s)” and maybe change figures to numbers (although it's probably unlikely to be misinterpreted). So I suggest the first sentence be:
 * "The ditto mark is a graphical symbol indicating that one or more words or numbers above it are to be repeated."


 * Further, I'm not sure how to fix the latter part, but your usage of the word glyph seems incorrect as it appears that what you mean is simply a Unicode character rather than an actual unique shape or even an abstract shape. Also, to my non-native ears the phrase that may be used elsewhere sounds as if it were okay to do so rather than a matter-of-fact statement that it is what sometimes happens, but it may just be me.
 * By the way, as a general comment regarding the shape, I think that it's the usage that makes a sign a ditto mark rather than its shape. Apparently someone at Unicode held similar view at some point because there was a character called (see CJK Symbols and Punctuation) indistinct and later unified with the regular character 仝. Rightly so, because this is simply an abbreviation saying “the same”, which, like Western ditto mark (but unlike other kinds of iteration marks like 々, which, by the way, also comes from the cursive form of the same character!) can be used to repeat a previous line or a previous cell in a table. (I wonder if it should be mentioned in this article.) The  is by all means the Western ditto which like digits and some other Western punctuation made it into CJK orthographies. The difference is that they needed a separate full-width character to fit their conventions while in the West it was set using different symbols, and this is simply carried over to Unicode. Ｕｓｉｎｇ　ｔｈｅ　〃　ｃｈａｒａｃｔｅｒ　ｉｓ　ｅｘａｃｔｌｙ　ａｓ　ｗｒｏｎｇ　ａｓ　ｍｅ　ｕｓｉｎｇ　ｆｕｌｌ−ｗｉｄｔｈ　ｃｈａｒａｃｔｅｒｓ　ｔｏ　ｔｙｐｅ　ｔｈｉｓ　ｓｅｎｔｅｎｃｅ，　ｎｏ　ｌｅｓｓ，　ｎｏ　ｍｏｒｅ． –MwGamera (talk) 16:26, 14 November 2020 (UTC)
 * Thank you for this considered comment. Yes, it is difficult! The biggest problem is to find the right words. So first let's copy some definitions to see if they can help us any:
 * "In typography, a glyph is an elemental symbol within an agreed set of symbols, intended to represent a readable character  for the purposes of writing. Glyphs are considered to be unique marks that collectively add up to the spelling of a word or contribute to a specific meaning of what is written, with that meaning dependent on cultural and social usage."
 * In graphemics, the term allograph denotes any glyphs that are considered variants of a letter or other grapheme, like a number or punctuation. An obvious example in English (and many other writing systems) is the distinction between uppercase and lowercase letters.
 * "A character is a semiotic sign or symbol, or a glyph – typically a letter, a numerical digit, an ideogram, a hieroglyph, a punctuation mark or another typographic mark."
 * "In semiotics, a sign is anything that communicates a meaning that is not the sign itself to the interpreter of the sign."
 * "A symbol is a mark, sign, or word that indicates, signifies, or is understood as representing an idea, object, or relationship."


 * I don't agree that it is a "Unicode-centric approach". The Unicode Consortium records glyphs that have existed for sometimes centuries and gives them code points. As far as Wikipedia is concerned, Unicode is a wp:reliable source where it is evident that a lot of thought and global perspective has gone into their choices of names. A genuinely "Unicode-centric approach" would have us report only the CJK character! But this is en.wiki and we have to wp:think of the reader, so I can't see any consensus emerging that we should do that. It may well be, though, that Unicode channels our understanding of the words defined above, because we have come to use them the way the Unicode standards do. Despite that, it does give us a shared vocabulary which does help us communicate.
 * I'm happy that my earlier use of the word "glyph" is correct but it was your question that prompted me to hunt actively for alternative words: I wonder if we might use "sign"?

"The ditto mark is a sign indicating that the words or figures above it are to be repeated."


 * "Ampersand" is a single glyph, though it has many allographs. "Ditto" is the reverse: other than the CJK mark, it does not have a unique glyph – a variety of different glyphs (and combinations thereof) are used to represent it. This is why I now think that "sign" is a better word than "convention" and am annoyed that I didn't think of that first. [I was trying to wiggle around having to say that the ditto mark is not really a mark unless it is the CJK mark, which is true but useless].
 * "May" is a word in English that is somewhat ambiguous: it can mean "you have permission" or "it is possible". Would this make it clearer:

"Chinese, Japanese and Korean scripts do have a double width glyph,, that may sometimes be seen elsewhere."


 * which kind of conveys "among the lower orders" and "not in polite company" sense :-)
 * I didn't appreciate that the CJK ditto mark is another double-width character. I think you have just volunteered to revise that section of the body!
 * --John Maynard Friedman (talk) 17:55, 14 November 2020 (UTC)


 * I absolutely don't agree with your conclusions regarding the terminology, but I have no objections to the wording you proposed now. Calling it a sign is IMO just as correct as graphical symbol. “May be seen” also works for me.
 * Unicode definitions are not very strict and slightly different from the usual ones so I always try to remember to call it Unicode characters if I mean those and not some other kind of characters. But Unicode absolutely does not define any glyphs. It's explicitly stated right in the introduction section of The Unicode Standard, and elaborated in subsequent chapter about design principles. It deals exclusively with characters in the computing sense. The distinction between glyphs and characters as described in Unicode is virtually the same as in typography. And it doesn't define glyphs.
 * Unicode also doesn't try to document any kind of reality, be it typographic, orthographic, or paleographic reality. It's just a means to an end which is the interchange of a text data for some definitions of text. And although Unicode doesn't state this anywhere explicitly, the character names, while good hints, aren't reliable for anything else than uniquely identifying the Unicode characters. They are stable identifiers and multiple cases of misleading or outright wrong names are known (sometimes aliases are added so, for example, we don't have to keep using the misspelled name for, but canonical names aren't retracted, deprecated, or marked as wrong in any way).
 * Allographs are glyphs. Specifically one glyph is said to be an allograph of another glyph if they are both used to represent the same grapheme. This is what definition you pasted says. Sadly grapheme doesn't have a single good definition. In many contexts, upper- and lowercase letters wouldn't be considered allographs, but single- and double-storey letter g would be.
 * Ampersand is a single sign, symbol, character in generic sense (unless considered ligature), character in computing sense, and a grapheme which has many allographs, i.e. can be realized with multiple different glyphs. Ditto is the same, except it doesn't have a single dedicated way to encode it.
 * –MwGamera (talk) 21:37, 14 November 2020 (UTC)
 * Yes, I think I've muddled my glyphs with my graphemes. Thanks.
 * All's well that ends well, I'll apply this version. --John Maynard Friedman (talk) 22:41, 14 November 2020 (UTC)
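As a side note on the full-width sentence in MwGamera's comment of 16:26, 14 November 2020 above: the sketch below shows how Unicode compatibility normalization (NFKC) treats full-width Latin letters compared with U+3003. The helper name `fold_widths` is made up for this illustration. It only shows what the character data does; it is not evidence for or against any typographic convention.

```python
import unicodedata


def fold_widths(text):
    """Apply NFKC, which maps full-width forms to their ordinary counterparts."""
    return unicodedata.normalize("NFKC", text)


# Full-width "Using", an ideographic space, then the ditto mark U+3003.
FULLWIDTH = "\uff35\uff53\uff49\uff4e\uff47\u3000\u3003"

folded = fold_widths(FULLWIDTH)
print(folded)

# An empty decomposition string means there is no mapping to another character.
print(repr(unicodedata.decomposition("\u3003")))
```

The full-width letters and the ideographic space fold to ASCII, while U+3003 passes through unchanged because it carries no compatibility decomposition. In other words, the character data does not record it as a width variant of any Latin-script punctuation mark, whatever its historical relationship to the Western ditto may be.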

German notation
The equivalent article in de.wiki confirms that German usage is as stated in the article. They cite (but not in a way that can be reused to en.wiki standards), a German orthography textbook. So we still need a citation. Interestingly, the article says that German-speaking Switzerland uses the French convention. --John Maynard Friedman (talk) 01:04, 17 November 2020 (UTC)

In Deseret alphabet
I would like to mention that 𐐜 𐐔𐐯𐑅𐐨𐑉𐐯𐐻 𐐝𐐯𐐿𐐲𐑌𐐼 𐐒𐐳𐐿 The Deseret Second Book (1868) used "𐐼𐐬" (same as "doe"; it didn't have period at the end) for ditto in its table of contents—paralleling the usage of "do." in Latin-based orthography. It's probably not a sufficiently notable piece of trivia to be included in the article, though. – MwGamera (talk) 11:35, 25 September 2022 (UTC)