User talk:Jan Hidders/HTML-free mark-up

That would certainly make the work of the parser (and mine;) a lot easier. But it would also mean automatically replacing HTML markup (which people will use no matter what) with wiki format upon saving, which will be
 * 1) very tough, especially with tables
 * 2) a reason for people to cry out loud (I am thinking especially of The Cunctator;)

Also, some HTML things are nice, font tags, for example. Labelling an image is quite neat if the label is the same color as the object in the image.

Magnus Manske


 * You only have to translate the complete contents of Wikipedia to the new mark-up once. After that you always replace the tag delimiters with the entities &lt; and &gt;. (That's what PhpWiki does, for example.) People can then type all the HTML they like; it won't work. I agree about the font color, but you can probably invent some mark-up for that too. Jan Hidders
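A minimal sketch of the delimiter replacement described above, in Python. The function name neutralize_tags is made up for illustration, and this is not PhpWiki's actual code. It only touches the two tag delimiters, so character entities typed by an author still pass through.

```python
def neutralize_tags(text: str) -> str:
    """Replace the tag delimiters with entities so typed HTML shows up literally."""
    # Ampersands are deliberately left alone, so entities such as &beta;
    # keep working; only "<" and ">" are rewritten.
    return text.replace("<", "&lt;").replace(">", "&gt;")


if __name__ == "__main__":
    print(neutralize_tags('<font color="red">&beta;</font>'))
```

With this in place, a font tag typed into the edit box would be displayed as text rather than interpreted by the browser.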

Jan, why do you think it desirable to banish all HTML markup? Wouldn't it be better to keep the threshold for contributing as low as possible for new users? AxelBoldt


 * I believe firmly that using HTML actually heightens that threshold. (FWIW, I actually teach XML but still find that it doesn't make sense as a human-readable format.) Remember that the complexities of HTML were exactly the reason that WikiWiki was invented (See "The Wiki Way" by Ward Cunningham, the originator of the concept). The HTML table-syntax, for example, is much more involved and harder to read in ASCII form than PhpWiki/MoinMoin table-syntax. Having two ways to do the same thing (e.g., '' and <i>) also doesn't make things simpler. Also remember that accessible does not just mean that it should be easy for people to write something new, but also that it should be easy to adapt something old. The latter becomes more difficult if a previous writer used some nifty HTML stuff. ... I guess I could go on about this but I have to get back to work now.  Jan Hidders

FWIW, I agree, especially about the table syntax - take a look at my (still incomplete) list of food additives and think about why my first run is generated by a Python script from a space-separated file on my local machine. It'd be nice not to have to carefully filter HTML, too, so that things like clicking here aren't possible.

I do have some notes on your proposal, though:

Carey Evans
 * We'll still need to be able to enter entities, such as "&beta;" for β. It'd be nice to be able to enter hexadecimal entities like "&#x2019;" and have them converted to "&#8217;" on output for older browsers too.
 * Recognising a "_" or "/", etc., that's supposed to be rendered as itself might be tricky. Maybe a double-underscore?
 * I'd like --- to do em-dash, "&#8212;", myself. I wonder how many people use strike-out?
 * I will never remember which is superscript or subscript. How about something more mnemonic like {^superscript^} and {_subscript_}?
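The suggestions in this list could be sketched roughly as follows, in Python. The function names are hypothetical, and the {^...^} and {_..._} forms are only the notation floated above, not an agreed syntax.

```python
import re

# Hexadecimal character references such as &#x2019;
HEX_ENTITY = re.compile(r"&#[xX]([0-9a-fA-F]+);")


def hex_entities_to_decimal(text: str) -> str:
    """Rewrite hexadecimal character references as decimal ones."""
    return HEX_ENTITY.sub(lambda m: "&#%d;" % int(m.group(1), 16), text)


def render_inline(text: str) -> str:
    """Apply the proposed em-dash, superscript and subscript mark-up."""
    text = text.replace("---", "&#8212;")                          # --- becomes an em-dash
    text = re.sub(r"\{\^(.+?)\^\}", r"<sup>\1</sup>", text)        # {^superscript^}
    text = re.sub(r"\{_(.+?)_\}", r"<sub>\1</sub>", text)          # {_subscript_}
    return text


if __name__ == "__main__":
    print(hex_entities_to_decimal("it&#x2019;s"))
    print(render_inline("x{^2^} --- H{_2_}O"))
```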


 * I agree that the entities et cetera should stay; it's only the tags that I don't like. The problem of escaping special mark-up symbols is usually solved by a special escape symbol like "\". I would advocate that here too. I also agree about the em-dash and, yes, I don't think strike is used very much. I also agree that my symbols for sub and superscript are not very intuitive, but {_sub_} looks a bit too much like _sub_.  -- Jan Hidders
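A rough Python sketch of the backslash escape idea. It assumes, purely for illustration, that _text_ marks emphasis and is rendered as an em element; the real meaning of the underscore in the proposal may differ. A backslash makes the following character literal, both outside and inside an emphasised span.

```python
import re

# Either an escaped character, or an emphasised span that may itself
# contain escaped characters. The first alternative is tried first.
TOKEN = re.compile(r"\\(.)|_((?:\\.|[^_\\])+)_", re.DOTALL)
ESCAPED = re.compile(r"\\(.)", re.DOTALL)


def render(text: str) -> str:
    """Render _emphasis_ while honouring backslash escapes."""
    def repl(match: re.Match) -> str:
        if match.group(1) is not None:
            return match.group(1)  # escaped character, emitted as itself
        inner = ESCAPED.sub(r"\1", match.group(2))  # drop escapes inside the span
        return "<em>" + inner + "</em>"

    return TOKEN.sub(repl, text)


if __name__ == "__main__":
    print(render(r"_sub_ and \_sub\_"))
```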

- I have to say I like your proposal. Although I'm generally very comfortable editing HTML by hand in Vim, wiki editing with wiki tags seems very appropriate. I like having different level headings indicated by the number of = signs before and after, for instance. However, that particular convention leads to lots of typos: people forget to leave a space between the section heading's text and the equals signs on either side, or they don't balance the number of equals signs on either side so we see a dangling = on the page. Regarding tables, could there be a way to specify/enforce the number of columns in a table at the beginning? I think pages like List of saints would be much easier to edit using the syntax you suggest. Wesley


 * Thanks for agreeing with me. Enforcing the number of columns given the first line of the table is possible but not easy to implement; the parser then has to remember the number of columns. -- Jan Hidders
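To make the bookkeeping concrete, here is a small Python sketch in which the parser remembers the column count of the first row and rejects rows that differ. The one-row-per-line, "|"-separated syntax is an assumption made for the example, not the notation from the proposal, and colspan is ignored.

```python
def parse_table(lines):
    """Split rows into cells, enforcing the column count of the first row."""
    rows = []
    expected = None  # number of columns, remembered from the first row
    for number, line in enumerate(lines, start=1):
        cells = [cell.strip() for cell in line.split("|")]
        if expected is None:
            expected = len(cells)
        elif len(cells) != expected:
            raise ValueError(
                f"row {number} has {len(cells)} cells, expected {expected}"
            )
        rows.append(cells)
    return rows


if __name__ == "__main__":
    print(parse_table(["a | b | c", "d | e | f"]))
```

The extra state is just one integer, but the parser can no longer treat each line independently, which is the cost mentioned above.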

- One problem with the proposal: How will the new table syntax represent border/borderless cells and rowspan/colspan (necessary for the depiction of the roulette board)? -- Damian Yerrick


 * Good question. I only gave a notation for colspan. It is inevitable that if you are going to forbid the liberal use of HTML some things will no longer work. On the other hand, if you do want to allow HTML (or a safe subset of it), then you should write a small parser for that if you always want to guarantee correct HTML output and make sure that Magnus's table layout isn't messed up. -- Jan Hidders

Just wondering: why would I care? --The Cunctator