Wikipedia talk:WikiProject Punctuation/Round1

This is an archived talk page for Project Punctuation. For the current talk page, see Wikipedia talk:WikiProject Punctuation.

Archived talk from Round 1
You could make the link from the listings page to the article that needs editing an edit link, to save users the bother of that extra mouse click. (Or add another link so users can decide whether to visit the article in read or edit mode.) Your filenames, though, suggest there are 625k errors to correct, which is disheartening until you realise that is not the case. In my experience, people like to know how far through a project they are, and I'm not sure your files make this very easy. Ah. And if you cannot edit the listing files, then people are going to repeat the same work over and over again. Still. Kudos on doing the work. --Tagishsimon (talk)
 * 652k is the total number of articles, not errors. I can make that more clear in the text. Unfortunately, the dumps total 5 megabytes, so I don't think it's appropriate to host them on wp where they can be edited. The idea is that you do a dump and then remove it from the list, so that the work isn't repeated. Are the dumps too long for that? I can easily make them shorter, at the expense of a much longer list on the project page. Brighterorange 00:06, 20 May 2005 (UTC)
 * Thanks for the comments. I made the dump files much smaller, listed fewer of them, and added direct edit links. I'm also showing fewer on the project page, so it doesn't look so intimidating.
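
A minimal sketch of how a direct edit link like the ones mentioned above could be built from an article title. It assumes the standard MediaWiki `index.php?title=...&action=edit` URL form; the function name `edit_url` and the default base URL are illustrative, not taken from the project's actual tooling.

```python
from urllib.parse import quote


def edit_url(title: str, base: str = "https://en.wikipedia.org/w/index.php") -> str:
    """Return a URL that opens the article `title` directly in edit mode."""
    # MediaWiki titles use underscores in place of spaces.
    page = title.replace(" ", "_")
    # Percent-encode everything else so characters like '&' do not break the query.
    return f"{base}?title={quote(page, safe='')}&action=edit"


print(edit_url("Qing Dynasty"))
# https://en.wikipedia.org/w/index.php?title=Qing_Dynasty&action=edit
```

A listing page could then show both the plain article link and this edit link, so that users choose between read and edit mode.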

I'm curious to know how big a deal this period thingy is. --Infobacker 19:19, May 24, 2005 (UTC)
 * I'm not sure I know what you're asking. Do you mean, "is this a very common problem in Wikipedia articles?" The answer is surely yes; the mistake is made thousands and thousands of times. Do you mean, "Does this make Wikipedia unusable?" Probably not; of course the text is understandable even without punctuation. But Wikipedia desires to have high quality articles that are written in grammatical English, so its articles need punctuation to meet that goal. Plus, IMO, missing punctuation at the end of a sentence makes an article look shoddy. (PS. thanks for helping!) Brighterorange 19:53, 24 May 2005 (UTC)

Qing Dynasty - fragment
Could someone check out how I handled Qing Dynasty, i.e. by adding a in the talk page, and let me know if that's the right way to handle it? It's actually out of scope for this project, to my understanding, but I don't mind learning enough about the Qing Dynasty to fix their fragment. However, it'll be quicker for someone else to do it. -- PhilipR 14:03, 27 May 2005 (UTC)
 * Yes, I think that's a nice solution. This project is really only to fix superficial and obvious mistakes, so if you want to skip over problems like this, that's just fine. In the past, I've yanked such incomplete sentences if I couldn't make sense of them (explaining in my edit summary), or just left them incomplete for someone with more expertise to find. Brighterorange 15:18, 27 May 2005 (UTC)


 * OK, thanks.  Here's another question that I don't feel merits another section, from Rugby football: I changed,


 * and a plaque at the school commemorates the 'achievement'


 * to,


 * and a plaque at the school commemorates the "achievement."


 * Obviously it needs a period, but inside the scare quotes, right? And the quotes should be doubled? (They were for scare quotes earlier in the same paragraph.) Thanks, PhilipR 17:36, 27 May 2005 (UTC)
 * The style guide says that periods go outside scare quotes. (They only go inside quotes if they are part of the quote.) This is contrary to standard American typographic conventions, but it is the Wikipedia style, so we should use it when we introduce periods. Brighterorange 17:40, 27 May 2005 (UTC)

False Positives
I've noticed a number of false positives with the pattern


 * lots of properly punctuated interesting text. [[Image:AFascinatingImage.jpg|with a great description]]

I have just been moving the image onto a new paragraph which doesn't affect the layout any, but it would be nice if these were detected by the bot. -- pcr 23:25, 3 Jun 2005 (UTC)

I have noticed that also. It is a little distracting. --Phroziac 00:03, 4 Jun 2005 (UTC)
 * Thanks. I have seen many of these too. You needn't bother moving them to another paragraph, since when I run the analysis for the next release of the database, I'll filter these out automatically. Brighterorange 18:00, 13 Jun 2005 (UTC)

I have noticed some box false-hits and also in Image Captions ending in .]] without a trailing period. -Poli 2005 July 4 15:08 (UTC)
 * Images should no longer be a problem. What should I filter out for boxes? I'm not familiar enough with them. Brighterorange 8 July 2005 13:51 (UTC)
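
A rough sketch of the kind of filter described above, which skips paragraphs whose last element is an image tag. This is not the project's actual code: the function name is hypothetical, and the pattern assumes the caption contains no nested [[links]], which real articles sometimes have.

```python
import re

# Matches a paragraph whose final element is an image link such as
# [[Image:Name.jpg|caption]]. Assumption: no nested [[...]] inside the caption.
_TRAILING_IMAGE = re.compile(r"\[\[Image:[^\[\]]*\]\]\s*$", re.IGNORECASE)


def ends_with_image_tag(paragraph: str) -> bool:
    """Return True if the paragraph ends in an image tag and should be skipped."""
    return _TRAILING_IMAGE.search(paragraph) is not None
```

Applied to the example pattern quoted at the top of this section, the function returns True, so that paragraph would be left out of the listings.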

Paragraphs ending with "Jr."
Ran across many pages ending with a reference to a town named Decatur and the towns are usually named after Stephen Decatur Jr. and that's the last phrase in the paragraph. The paragraph ends with ...town is named after Stephen Decatur Jr.
 * Given that the project rules say to not change these, I think it would be prudent to filter out Jr., Sr. and D.C. (I see that one a lot). I'll update the program, but we still have to wait for another dump to come out. Brighterorange 14:21, 18 Jun 2005 (UTC)
 * They are filtered out now. Brighterorange 8 July 2005 13:54 (UTC)
 * Wouldn't it be better to filter out .]] ? -- Phroziac (talk) 16:44, 18 Jun 2005 (UTC)
 * I don't think so, since we catch some errors where the period is inside a named link. or red. -- I think it's usually an error to have .]] Brighterorange 19:43, 18 Jun 2005 (UTC)

Perhaps User:Humanbot can help?
Could User:Humanbot help by bringing you to the right articles, marking them done if you save, etc? r3m0t talk 16:19, Jun 17, 2005 (UTC) Might be cool. -- Phroziac (talk) 16:44, 18 Jun 2005 (UTC)
 * I'll look into it for my next project. I'm afraid if I started adding it to this one, the project would complete before I got it working. ;) Brighterorange 29 June 2005 02:04 (UTC)

CORRECTION ERROR
I'm not sure if this is an operator error or an error with the project code, but this edit shows an example where the change led to the failure of Image tags to close properly.

Dragons flight 02:08, Jun 24, 2005 (UTC)
 * These edits are done manually, so it was a mistake of the person who made the change. As soon as the next database comes out, the listings will also avoid including things that end in image tags, which would have kept this from even showing up in the "potential problems". PS. most of the changes we make are good! Brighterorange 12:51, 24 Jun 2005 (UTC)

Suggestion for improvement to policy
Could it be made clearer, please, in the list of examples, that the computer is likely to pick up examples of missing punctuation when the article is in fact including a quotation, such as:

Brequinda quoted the style guide which held that


 * it is not necessary always to fix the punctuation in WikiPedia articles

because that is what he believed as well. He made a comment to the people at the WikiPedia punctuation project in the belief that it would help them adjust the policy guidelines.

Brequinda 29 June 2005 15:53 (UTC)


 * Well, anything that begins with a colon is filtered out, so those should never show up. What is the specific thing that you're asking us to avoid changing? Also, is that really a quote from WP:STYLE? I don't think it says that. Brighterorange 29 June 2005 17:17 (UTC)

"Smart" or "curly" quotes
For other punctuation besides periods, it might be worthwhile to filter for "smart" or "curly" quotes, which I have found and fixed occasionally on some pages. These are where people have directly entered these characters ( ‘ ’ “ ” ) on the pages rather than using HTML entities. When used, they should be entered as their appropriate HTML entities:
 * ‘ – &lsquo; or &#8216;
 * ’ – &rsquo; or &#8217;
 * “ – &ldquo; or &#8220;
 * ” – &rdquo; or &#8221;


 * Shouldn't these not be used, according to WP:STYLE? If not, they should certainly be filtered out. Brighterorange 30 June 2005 20:29 (UTC)


 * Looks like you are right according to the style page. Where I have found them already in use, I have replaced them with the HTML entities rather than replacing them with straight quotes. (I made the above comment but forgot to sign my user name.) DanMS 30 June 2005 21:03 (UTC)
 * Okay, but it seems to me that the style guide explicitly says to change them to straight quotes. (Personally, I like curly quotes, but I like consistency even more...) Brighterorange 8 July 2005 13:54 (UTC)
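
A small sketch of the straight-quote replacement discussed in this thread, assuming the goal is to turn both literal curly quotes and their HTML entities into straight quotes. The function name and the exact set of entities handled are illustrative choices, not something the project specified.

```python
# Literal curly quote characters mapped to their straight equivalents.
_CURLY_TO_STRAIGHT = str.maketrans({
    "\u2018": "'",   # left single quote
    "\u2019": "'",   # right single quote
    "\u201c": '"',   # left double quote
    "\u201d": '"',   # right double quote
})

# The named and numeric HTML entities for the same four characters.
_ENTITY_TO_STRAIGHT = {
    "&lsquo;": "'", "&#8216;": "'",
    "&rsquo;": "'", "&#8217;": "'",
    "&ldquo;": '"', "&#8220;": '"',
    "&rdquo;": '"', "&#8221;": '"',
}


def straighten_quotes(text: str) -> str:
    """Replace curly quotes (literal or as HTML entities) with straight quotes."""
    text = text.translate(_CURLY_TO_STRAIGHT)
    for entity, straight in _ENTITY_TO_STRAIGHT.items():
        text = text.replace(entity, straight)
    return text
```

Only these eight entities are touched, so other entities in the article text are left as they are.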

Reword?
IMO, there's a problem with this (one of the exceptions):


 * Paragraphs that end in a parenthetical remark, with internal punctuation. This is bad style, but we are only attempting to fix incontrovertible mistakes...

This says we're "only attempting" something. I believe what is meant is "...we are working to fix incontrovertible mistakes only."

Yea or nay?
 * I don't really see the difference, but I changed it. Brighterorange 8 July 2005 13:54 (UTC)
 * I for one agree with you wholeheartedly, but I think it's a lost cause. I correct such errors when I find them, but it's a losing battle. DanMS 2 July 2005 21:24 (UTC)

Standards
I have some AlMac observations about possible similar interests between this project and the usability project. AlMac

For example, for the usability project, I suggested that there might be value in adding to the Tool box.

I just edited this page, please run some standard software to identify common typing errors, that I could fix right now. AlMac 4 July 2005 18:56 (UTC)
 * I don't really understand what you're proposing, but I would love it if wikipedia did some basic checks when you edited an article. Detecting missing punctuation is really simple, and if there were thorough WP:STYLE guidelines about when to use punctuation, it would be even easier. Brighterorange 4 July 2005 20:23 (UTC)
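
As a hedged illustration of how simple such a check can be, the sketch below flags a paragraph that does not end in a period, question mark, or exclamation mark. The rules are assumptions made for this example (which leading characters mark non-prose lines, which closing characters to ignore), not the project's actual analysis, and the function does not handle trailing image tags.

```python
# Closing characters that may legitimately follow the final punctuation mark.
_CLOSERS = "\"')]"


def lacks_terminal_punctuation(paragraph: str) -> bool:
    """Return True if a prose paragraph appears to be missing its final mark."""
    text = paragraph.strip()
    if not text:
        return False
    # Skip wiki markup that is not running prose: colon-indented lines,
    # list items (* and #), headings (=), and tables or templates ({ | !).
    if text[0] in ":*#={|!":
        return False
    # Ignore trailing quotes and brackets when looking for the final mark.
    core = text.rstrip(_CLOSERS)
    return not core.endswith((".", "!", "?"))


print(lacks_terminal_punctuation("and a plaque at the school commemorates the 'achievement'"))
# True
```

A check like this only finds candidates; as the discussions above show, a person still has to decide whether each one is a real mistake.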

Database dumps
Although I was able to import the database dumps from 16 May successfully, the new ones crash MySQL when I import them:

STOPPING server from pid file /usr0/src/mysql-standard-5.0.7-beta-linux-i686/data/host.pid
ERROR 2013 (HY000): Lost connection to MySQL server during query
Command exited with non-zero status 1

Does anybody have any clues? I'd like to run this analysis, or something like it, again. Brighterorange 7 July 2005 15:54 (UTC)
 * I tried a few things and now the import seems to be underway (but it takes ages). If successful, I'll post the solution here and at Database dump import problems. Brighterorange 7 July 2005 21:15 (UTC)
 * I fixed it. My configuration file can be found at Database dump import problems. Brighterorange 8 July 2005 16:26 (UTC)

number of articles
I pulled this discussion from comments in the project page.


 * Note: There are fewer articles in Wikipedia than 627k. Plus, add up the numbers in between in the status section, because not many people seem to update the status section that often. We currently have 570,998 done. If we divide that by 626,037, that certainly gives us somewhere around 91%, not 86%. (as of July 8, 2005 12:12 (UTC) by WB)


 * Because I re-ran the analysis on the 23 June data set, it now goes up to id 628k. The main problem with using is that these are article ids, whereas the template counts the number of active (non-deleted) articles, which is necessarily smaller. However, I think that the obsession over an accurate progress percentage is misplaced because: (1) the dumps are completed so quickly that the number is almost always out of date, and (2) the articles and errors are not evenly spaced throughout the article id space (they become less frequent as the article ids get higher), and (3) we are almost done with round 1. Brighterorange 8 July 2005 13:42 (UTC)
 * True. I wrote that it's not accurate on Tagishsimon's talk page. Progress is just an estimation, not an accurate update thing... lol -- WB July 9, 2005 03:03 (UTC)
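
For reference, the division quoted in this thread can be worked out as below. The function name is illustrative; the two figures are the ones given by WB above.

```python
def percent_done(done: int, total: int) -> float:
    """Return progress as a percentage of the total."""
    return 100.0 * done / total


print(round(percent_done(570_998, 626_037), 1))
# 91.2
```

As noted in the reply, the denominator is a count of article ids rather than active articles, so the result is only an estimate.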

Round 1 Over
Congratulations to all the Project Contributors! The last dump file is being worked on and Round 1 will be over soon. I am really itching to get Round 2 going... ;) But I agree with Brighterorange that waiting one month might be a good idea. Poli (talk • contribs)

That's it. Round 1 is over! Thanks Budgiekiller for the last dump file! -Poli (talk • contribs) July 9 2005 19:30 (UTC)
 * Thanks, everyone, and congratulations! I'll do some project cleanup tomorrow, but I am rather busy today ;) Brighterorange 02:46, 10 July 2005 (UTC)
 * Congrats to all who contributed! -- WB 11:35, July 10, 2005 (UTC)