User talk:Diberri/HTML-to-wiki converter

Bug reports
Ideally, please report bugs via. If you're not comfortable with that, please report the bug below. The more details, the better. Please add a new subsection for each report.

Feature requests
Feature requests are very welcome. Please start each request with a new subsection heading.

Paragraph Inclusion
Hi Diberri, great tool. Is there anything you can do about paragraph spacing? If I have a piece of text that goes like this:

"this is something i have just written now. It is not that informative but i had to think of something to write and so to the next paragrapth

This is a new paragraph. Notice the line gap above. The tool pulls everyting together. And so it comes out very unreadable"

it comes out of the converter like

"this is something i have just written now. It is not that informative but i had to think of something to write and so to the next paragrapth This is a new paragraph. Notice the line gap above. The tool pulls everyting together. And so it comes out very unreadable"


 * Glad you're finding the tool useful. I'd like to help you, but first I need some more information:
 * What dialect are you using? (I'm assuming it's MediaWiki judging from your question below, but I'm not 100% certain.)
 * What is the exact HTML you are providing to the converter? (Writing it on a Wikipedia talk page can sometimes be difficult: you can wrap tags in &lt;nowiki&gt; and &lt;/nowiki&gt; to make them appear in your wiki markup.)
 * What is the exact wiki markup you expect to get?
 * What is the exact (and presumably erroneous) wiki markup you're actually getting?
 * The more information, the better. Thanks! --David Iberri (talk) 20:43, 18 March 2009 (UTC)
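
One possible explanation, not confirmed in this thread, is that the text was submitted as plain text rather than HTML. HTML collapses runs of whitespace, so a blank line without paragraph markup does not survive as a paragraph break. The sketch below is a hypothetical pre-processing step (the function name `text_to_html_paragraphs` is illustrative and not part of the converter) that wraps blank-line-separated blocks in paragraph tags before the text is handed to any HTML-to-wiki tool.

```python
import html
import re


def text_to_html_paragraphs(text: str) -> str:
    """Wrap each blank-line-separated block of plain text in <p>...</p>.

    Hypothetical helper for illustration; it is not part of the converter.
    """
    # Split wherever one or more blank (or whitespace-only) lines occur.
    blocks = re.split(r"\n\s*\n", text.strip())
    paragraphs = []
    for block in blocks:
        # Collapse internal line breaks and repeated spaces, as HTML would.
        collapsed = " ".join(block.split())
        if collapsed:
            # Escape &, <, > and quotes so the text stays literal in HTML.
            paragraphs.append(f"<p>{html.escape(collapsed)}</p>")
    return "\n".join(paragraphs)


if __name__ == "__main__":
    sample = "This is the first paragraph.\n\nThis is a new paragraph."
    print(text_to_html_paragraphs(sample))
```

With explicit paragraph tags in the input, a converter has an unambiguous signal for where to place the blank line in the wiki markup.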

Fonts
Hi Diberri, I found another issue. It is OK in the web interface, but not if you call the application programmatically. If you have a font larger than size 3, it comes out really badly in MediaWiki, since MediaWiki only deals with headline fonts, whereas HTML can have many fonts.

Thanks, Paul
 * You're right that the MediaWiki dialect doesn't handle fonts particularly well. But there shouldn't be a difference in handling between the web interface and the Perl module. What difference(s) are you seeing when you use the web interface vs. using the Perl module directly? --David Iberri (talk) 20:43, 18 March 2009 (UTC)

TikiWiki
Hi, is the TikiWiki conversion part of this tool working correctly? When I try to convert an HTML table to TikiWiki, it doesn't spit out the correct TikiWiki table code. Thanks, 122.57.135.147 (talk) 02:38, 15 June 2009 (UTC)


 * The TikiWiki dialect is currently rather limited. It does not handle tables at all. I'm not sure when I'll have a chance to code this, but it's on my to-do list. --David Iberri (talk) 21:33, 16 June 2009 (UTC)

problems converting German Umlaute
It appears that the converter does not convert the German Umlaute ä ö ü and ß properly. (I don't know how the converter deals with other special characters.)

To test: convert to DokuWiki syntax which will give you many of these

Berlin - Drei Buchstaben: HRE. Das K�rzel steht l�ngst nicht mehr allein f�r den Bankenkonzern Hypo Real Estate. Es steht f�r die gr��te Rettungsaktion eines privaten Unternehmens durch den Staat und

instead of

Berlin - Drei Buchstaben: HRE. Das Kürzel steht längst nicht mehr allein für den Bankenkonzern Hypo Real Estate. Es steht für die größte Rettungsaktion eines privaten Unternehmens durch den Staat und

Username1204 (talk) 09:40, 15 July 2009 (UTC)
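
The replacement characters in the broken output are what typically appears when bytes in a single-byte encoding such as ISO-8859-1 are decoded as UTF-8. Whether that is what happens inside the converter is an assumption; the thread does not confirm the cause. The sketch below reproduces the symptom and shows one common mitigation, trying UTF-8 first and falling back to Latin-1. Both function names are hypothetical.

```python
def simulate_mojibake(text: str) -> str:
    """Encode text as Latin-1, then wrongly decode the bytes as UTF-8.

    Bytes that are not valid UTF-8 become U+FFFD, the replacement character.
    """
    raw = text.encode("latin-1")
    return raw.decode("utf-8", errors="replace")


def decode_with_fallback(raw: bytes) -> str:
    """Decode as UTF-8 if possible, otherwise assume Latin-1.

    This is a heuristic, not a full charset detector: Latin-1 accepts any
    byte sequence, so the fallback never fails but may guess wrong.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return raw.decode("latin-1")


if __name__ == "__main__":
    print(simulate_mojibake("Das Kürzel steht längst nicht mehr allein"))
    print(decode_with_fallback("größte".encode("latin-1")))
```

Running the first function on "größte" yields two replacement characters in a row, which matches the pattern in the report above. A more robust fix is to honour the charset declared by the source page rather than guessing.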

Security Paranoia
The html2wiki tool page works quite well for my needs, but one concern has come up. Since the HTML code is uploaded to your server, there's the possibility that confidential information might "escape" if you log all the stuff that goes by. I strongly doubt you do this (the volume would be large, and there's really nothing to gain), but a statement that you're not recording anything might help keep corporate security types from making some folks' lives miserable. 198.95.226.224 (talk) 16:43, 14 August 2009 (UTC)
 * The queries are done via POST requests, so I don't think they get logged in the Apache log files. I certainly don't log any of the HTML that comes in, or any of the wiki markup that comes out of the tool. Thanks for bringing it up though. --David Iberri (talk) 03:11, 15 August 2009 (UTC)

HTML &lt;tt&gt; to JSPWiki?
I suggest parsing the &lt;tt&gt; HTML tag and converting it to {{ }} pairs in JSPWiki. Similarly, &lt;pre&gt; and &lt;code&gt; could appear as code blocks, so {{{ }}} in the case of JSPWiki.

Thanks for the great tool BTW

+1 for this from me. Devilgate (talk) 11:01, 8 July 2011 (UTC)

And me too. Also, the online version for JSPWiki screws up non-ASCII chars.

Pihentagy (talk) 16:03, 9 April 2010 (UTC)
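
As a rough illustration of the mapping requested in this thread: JSPWiki writes monospaced text as {{text}} and preformatted blocks as {{{text}}}. The sketch below applies that mapping with regular expressions. It is not how the converter is implemented; the function name `tt_and_pre_to_jspwiki` is hypothetical, and the patterns assume tags without attributes and without nesting.

```python
import re

# Tags are matched without attributes; nested tags are not handled.
_BLOCK = re.compile(r"<(pre|code)>(.*?)</\1>", re.IGNORECASE | re.DOTALL)
_INLINE = re.compile(r"<tt>(.*?)</tt>", re.IGNORECASE | re.DOTALL)


def tt_and_pre_to_jspwiki(html_text: str) -> str:
    """Map <tt> to {{...}} and <pre>/<code> to {{{...}}} (JSPWiki markup).

    Hypothetical helper for illustration only.
    """
    # Preformatted and code content becomes a triple-brace block.
    out = _BLOCK.sub(lambda m: "{{{" + m.group(2) + "}}}", html_text)
    # Teletype text becomes a double-brace monospace span.
    out = _INLINE.sub(lambda m: "{{" + m.group(1) + "}}", out)
    return out


if __name__ == "__main__":
    print(tt_and_pre_to_jspwiki("Run <tt>ls</tt> to list files."))
    print(tt_and_pre_to_jspwiki("<pre>line one\nline two</pre>"))
```

A real implementation would more likely hook into an HTML parser than use regular expressions, since markup such as &lt;pre&gt; wrapping &lt;code&gt; or tags carrying attributes is not covered here.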

Download
Related to the security paranoia above, is there any way we can download the tool to use it in-house? --Robinson weijman (talk) 09:17, 5 November 2010 (UTC)

CPAN web host pretty broken
Could we get this stuff onto another provider before it dies? Thanks! SChalice 18:55, 2 December 2013 (UTC)