User:Periglio/Persondata

For my own personal use, I used the Persondata template to create my own database of celebrity birth and death days. I have carried on developing my software in order to validate birth and death data in Wikipedia's biographical articles. To be honest, this is mainly for my own benefit, to maintain my own accurate database. However, as an ex-Wiki editor I feel I should give something back to the community, so I am actively updating articles where anomalies are found.

I have not put the software into the public domain, but if anyone shows any interest I could. I have also thought about making the error lists available hoping to get some help in fixing articles. Again, I am waiting for feedback.

Below I am listing the error messages, to give some idea of what I am searching for, and the criteria I am using. To be honest again, this is mainly for my own benefit, but if anyone shows an interest, I am willing to collaborate.

As of 22 February 2014, there are 1,116,575 entries in my database which should account for all articles that contain a Persondata template. This does vary with articles being created, deleted and edited.

Complete (living)
This indicates an article containing a complete birth date, no validation errors and the subject is still alive.

22 February 2014 - 289,389 records

Complete (non-living)
This indicates an article containing complete birth and death dates, plus no validation errors.

22 February 2014 - 131,337 records

Validated
This indicates an article where a birth or death date is incomplete, but otherwise validated. A relevant category will be present to confirm that the date is genuinely missing and is not, for example, just a typing error.

22 February 2014 - 21,659 records

Error Messages
W errors are general Wikipedia errors. The explanations assume the Wikipedia class has been used by the Persondata class. P errors are specific to the Persondata template.

W001-Cannot contact Wikipedia
This error occurs when the Internet connection is lost, but can also occur if Wikipedia returns an error page such as "server busy". If the error is generated multiple times, the software will terminate.

W002-%1 template not found
This error will result in the article being removed from the database.

The requested page does not contain the Persondata template.

W003-Template %1 contains unmatched link brackets
There is a broken Wikilink within the template, i.e. an extraneous ]].
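As an illustration only, a check of this kind could be sketched in Python as follows. This is not the actual software; the function name link_brackets_balanced is hypothetical.

```python
def link_brackets_balanced(template_text: str) -> bool:
    """Return True when every [[ has a matching ]] and no ]] appears early."""
    depth = 0
    i = 0
    while i < len(template_text) - 1:
        pair = template_text[i:i + 2]
        if pair == "[[":
            depth += 1
            i += 2
        elif pair == "]]":
            depth -= 1
            if depth < 0:
                # A closing ]] with no opening [[ before it.
                return False
            i += 2
        else:
            i += 1
    # Any remaining depth means an opening [[ was never closed.
    return depth == 0
```

For example, "London]], England" fails the check because of the extraneous ]], while "[[London]], England" passes.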

W004-Cannot convert year of %1
The date supplied is a single number, assumed to be a year, but it cannot be converted into a 1st January date. The software checks for this error condition, but there is no feasible way the conversion would fail.

W005-Cannot extract year of %1
The supplied date field appears to be a small piece of text. This is often someone typing NA into the death date of a living person.
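A minimal Python sketch of how the W004 and W005 conditions could be separated is given below. It is an assumption about the logic, not the real code, and the name year_to_date is made up for the example.

```python
from datetime import date
from typing import Optional, Tuple


def year_to_date(field: str, text: str) -> Tuple[Optional[date], Optional[str]]:
    """Convert a year-only field to 1 January of that year.

    Returns (date, None) on success or (None, message) on failure.
    """
    text = text.strip()
    if not text.isdigit():
        # Short text such as "NA" rather than a number.
        return None, f"W005-Cannot extract year of {field}"
    try:
        return date(int(text), 1, 1), None
    except (ValueError, OverflowError):
        # In Python this is only reached for years outside 1..9999.
        return None, f"W004-Cannot convert year of {field}"
```

In this sketch "1945" becomes 1 January 1945, while "NA" in a death date gives the W005 message.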

W006-Invalid %1 date - no year
The date field appears to be complete (i.e. it contains 3 fields) and successfully converts into a date. However, the resulting year does not appear in the original date. This happens when the conversion is fooled for whatever reason and uses the wrong year, for example when a 2-digit year is supplied.

W007-Invalid %1 date
The date field appears to be complete, but fails to convert. The entry can be invalid for various reasons, normally a misspelt month or extra text. N.B. The software does not yet handle circa, about, between etc. Watch out for out-of-range dates such as 31st April, or 29th February in non-leap years.

W008-There are not 3 fields in %1 date
Before date conversion, the software checks there are 3 fields - day, month and year. This error indicates additional text, or a missing field.
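The three date checks above (W006, W007 and W008) could be sketched in Python roughly as follows. The accepted formats in FORMATS and the name check_date are assumptions for the example; the parser used by the real software is not described here. The last format deliberately accepts a 2-digit year so that the W006 case can be shown.

```python
from datetime import datetime

# Hypothetical list of accepted formats, tried in order.
FORMATS = ("%d %B %Y", "%B %d, %Y", "%d %B %y")


def check_date(field: str, text: str) -> str:
    """Return "OK" or an error message modelled on W006 to W008."""
    # W008: there must be exactly day, month and year.
    parts = text.replace(",", " ").split()
    if len(parts) != 3:
        return f"W008-There are not 3 fields in {field} date"
    for fmt in FORMATS:
        try:
            parsed = datetime.strptime(text.strip(), fmt)
        except ValueError:
            continue
        # W006: the converted year must appear in the original text.
        if str(parsed.year) not in text:
            return f"W006-Invalid {field} date - no year"
        return "OK"
    # W007: no format could convert the text (bad month, 31 April, etc.).
    return f"W007-Invalid {field} date"
```

With these formats "12 March 45" converts to the year 2045, which does not appear in the original text, so W006 is reported. "31 April 2011" fails every format and gives W007.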

W009-Unmatched category brackets
This error indicates a broken category within the article e.g. [[Category:2011 deaths

W010-Unmatched template brackets
This indicates a broken template within the article. Note that this applies to all templates within the article.

W011-Cannot handle %1 date modifier
This is a temporary kludge to flag acceptable date modifiers such as circa, about etc. These entries should not need fixing; the message exists solely to prevent false errors in the date conversions.
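A possible way to spot such modifiers before date conversion is sketched below in Python. The word list only covers the modifiers named on this page (circa, about, between), and find_date_modifier is a hypothetical name.

```python
import re
from typing import Optional

# Only the modifiers mentioned on this page; a real list would be longer.
MODIFIER_PATTERN = re.compile(r"\b(circa|about|between)\b", re.IGNORECASE)


def find_date_modifier(text: str) -> Optional[str]:
    """Return the first date modifier found in the text, lower-cased, or None."""
    match = MODIFIER_PATTERN.search(text)
    return match.group(1).lower() if match else None
```

A date field for which this returns a value would be flagged with W011 rather than passed to the conversion checks.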

W012-Unbalanced HTML comment
Somewhere in the article there is a rogue HTML comment start or end.
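As a rough Python sketch (not the actual software, and with a made-up function name), the comment markers can be walked in order so that a closing marker before any opening one is also caught:

```python
def html_comments_balanced(wikitext: str) -> bool:
    """Return True when every <!-- is closed by --> and no --> is stray."""
    pos = 0
    while True:
        start = wikitext.find("<!--", pos)
        end = wikitext.find("-->", pos)
        if start == -1:
            # No more openers: any remaining closer is rogue.
            return end == -1
        if end != -1 and end < start:
            # A closer appears before the next opener.
            return False
        close = wikitext.find("-->", start + 4)
        if close == -1:
            # The opener is never closed.
            return False
        pos = close + 3
```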

W013-Unbalanced template brackets on page
The software was unable to extract the template because it could not find the closing brackets. That is, the template was found but is broken: there is a rogue {{ after {{Persondata.

P001-Persondata template contains a template
As per WP:PERSON, do not use templates as these can interfere with data extraction. Normally these are date templates, but disambiguation and country flag templates also appear. Occasionally the error may be triggered if there are rogue brackets in the text.

22 February 2014 - 1573 articles found
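The P001 check could be approximated in Python as below. This is a sketch under the assumption that the template starts with {{Persondata; the function name is hypothetical and the real extraction may work differently.

```python
import re


def persondata_contains_template(wikitext: str) -> bool:
    """Return True if another {{...}} opens inside the Persondata template."""
    start = re.search(r"\{\{\s*Persondata", wikitext, re.IGNORECASE)
    if start is None:
        return False
    depth = 0
    i = start.start()
    while i < len(wikitext) - 1:
        pair = wikitext[i:i + 2]
        if pair == "{{":
            depth += 1
            if depth > 1:
                # A second {{ before Persondata has closed.
                return True
            i += 2
        elif pair == "}}":
            depth -= 1
            if depth == 0:
                # Persondata closed without containing a template.
                return False
            i += 2
        else:
            i += 1
    return False
```

Note that a rogue {{ in the text triggers the same result as a genuine nested template, which matches the occasional false trigger described above.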

P002-No year of birth and no explanation category
These articles lack any birth information and are not in a category that would explain the lack of information. The normal fix would be to add Category:Year of birth missing (living people).

P003-No name in Persondata
This error message occurs when the NAME parameter is left blank. It can also occur if the NAME parameter appears twice, even if one of the two has an entry.

23 February - 3 records (cleared)

P004-Unrecognised Persondata parameter
This is where someone has added their own parameters to Persondata, such as eye colour, spouse, etc. It can also indicate a rogue | character, left behind when delinking.
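A simple Python sketch of this check follows. The set of standard parameter names is given from memory of the Persondata template and should be treated as an assumption, as should the function name. Splitting on | is naive and would be confused by piped links, so it only illustrates the idea.

```python
from typing import List

# Assumed list of standard Persondata parameters.
KNOWN_PARAMETERS = {
    "NAME",
    "ALTERNATIVE NAMES",
    "SHORT DESCRIPTION",
    "DATE OF BIRTH",
    "PLACE OF BIRTH",
    "DATE OF DEATH",
    "PLACE OF DEATH",
}


def unrecognised_parameters(template_body: str) -> List[str]:
    """Return parameter names in the template that are not standard ones."""
    unknown = []
    # The first chunk is the template name itself, so skip it.
    for chunk in template_body.split("|")[1:]:
        name = chunk.split("=", 1)[0].strip()
        if name.upper() not in KNOWN_PARAMETERS:
            unknown.append(name)
    return unknown
```

A rogue | shows up here as a "parameter" whose name is the stray text after it.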

P005-Death category does not exist
These are entries where a full death date exists, but there is no (year) deaths category. Note that a different error is triggered if a deaths category for a different year exists. This error means there is no (year) deaths category at all.

P006-Death category does not match DOD
This is where a death category exists, e.g. 2013 deaths, but the death date in Persondata gives a different year. On the assumption that the dates in the article are visible for review, it is normal to make Persondata and/or the category match the dates contained within the actual article.

8 March 2014 - 2639 records
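The difference between P005 and P006 can be illustrated with a short Python sketch. The regular expression and the name check_death_category are assumptions for the example, not the real implementation.

```python
import re
from typing import Optional

# Matches [[Category:2013 deaths]] with an optional sort key after a pipe.
DEATH_CATEGORY = re.compile(r"\[\[Category:(\d{1,4}) deaths\s*(?:\|[^\]]*)?\]\]")


def check_death_category(wikitext: str, death_year: int) -> Optional[str]:
    """Compare the (year) deaths categories with the Persondata death year."""
    years = [int(y) for y in DEATH_CATEGORY.findall(wikitext)]
    if not years:
        return "P005-Death category does not exist"
    if death_year not in years:
        return "P006-Death category does not match DOD"
    return None
```

The same pattern with "births" in place of "deaths" would serve for the corresponding birth category checks.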

P007-Birth category does not exist
This is where at least a birth year is known, but there is no category nnnn births. Sometimes this is due to a more generic version being used, such as a decade category like 1950s births, but in the main it is simply missing.

9 March 2014 - 6290 records

P008-Birth category does not match DOB
This occurs when there is a complete date of birth, but the nnnn births category indicates another year. Normally we assume that the article contains the correct information as it is visible to everyone.

8 March 2014 - 11205 records

P009-No comma found in name
The format for the name field is surname, forename. This error indicates where the convention has not been followed. However, there will be many false positives, because there are many articles where the surname, forename format does not apply. Removing the false positives is an ongoing project.

8 March 2014 - 118091 records

P010-No short description
This occurs if the template short description field is left blank.

8 March 2014 - 47602 records

P011-Birth date is in the future
This error occurs when the full date of birth is later than today's date. The error has a reject status.

P012-Accurate Date of Birth - category says no
This error indicates that the software was able to extract an accurate date of birth, but there is a category that indicates that an accurate date is not available.

8 March - 1258 records

P017-Death date is in the future
The date of death is later than today's date. This is often caused by vandalism.

22 February 2014 - 1 record (cleared)

P021-Death year is in the future
This error is normally associated with vandalism. A year (not a complete date) has been found in the death date and it is greater than the current year. Other errors will normally be generated as well, as a strange figure will cause other validations to fail.

22 February 2014 - 10 records (cleared)

P022-NAME parameter missing
The Persondata template has no NAME parameter, often a sign that the template is broken.

22 February 2014 - 1 record (cleared)

P029-Date of death is before date of birth
This error will invalidate the database record.

This error occurs when the subject appears to have died before he was born.

P032-Longevity too great
These are biographies where the subject appears to be over 120 years old. This can be caused by a date of birth before 1800 and no death information. If there is death information, it is likely that either the birth date or death date is incorrect.

22 February 2014 - 8 records (cleared)

P036-Life span too great
This error occurs if the person appears to have lived for over 120 years. There are two main causes, an incorrect birth/death date or the article is missing death information.

22 February 2014 - 176 records
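The date comparisons behind P029, P032 and P036 could be sketched in Python as follows. The names and the single "over 120" result are assumptions for the example; the sketch does not try to reproduce how the software decides between P032 and P036.

```python
from datetime import date
from typing import Optional

MAX_AGE_YEARS = 120


def age_in_years(birth: date, end: date) -> int:
    """Whole years between birth and end, as for a birthday age."""
    years = end.year - birth.year
    if (end.month, end.day) < (birth.month, birth.day):
        years -= 1
    return years


def check_lifespan(birth: date, death: Optional[date] = None,
                   today: Optional[date] = None) -> Optional[str]:
    """Return a flag for impossible or implausible life spans, else None."""
    if death is not None and death < birth:
        return "P029-Date of death is before date of birth"
    # With no death date the subject is measured against today.
    end = death or today or date.today()
    if age_in_years(birth, end) > MAX_AGE_YEARS:
        return "over 120 years (P032/P036)"
    return None
```

For example, a birth date of 1 January 1800 with no death information is flagged when checked against 22 February 2014.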

P039-Value in death data and still in Living People category
These errors are mainly due to someone typing "living" or "NA" into the death date parameter. Occasionally they are due to vandalism. The error is also triggered when a death date is correctly added and the Living people category is left behind.

23 February 2014 - 2234 records
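In Python the P039 condition reduces to a very small test, sketched here with a hypothetical function name and assuming the article's categories have already been collected as plain names:

```python
from typing import Iterable


def death_value_but_living(death_field: str, categories: Iterable[str]) -> bool:
    """True when the death date field has any value and the article is
    still in the Living people category."""
    has_value = bool(death_field.strip())
    return has_value and "Living people" in set(categories)
```

Because any non-blank value counts, "NA" and a genuine death date both trigger the flag, which matches the two causes described above.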

P042-Missing template not using (living people) version
There are various "missing" information templates such as "Year of birth missing". If the subject is still alive, these category titles also include the additional text (living people), i.e. "Year of birth missing (living people)". This error flags when the (living people) suffix has been incorrectly omitted.

22 February 2014 - 1864 records
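A possible Python sketch of the P042 check is shown below. It assumes the categories are available as plain names and that the unsuffixed titles end in the word "missing"; the function name is made up for the example.

```python
from typing import Iterable, List


def missing_suffix_categories(categories: Iterable[str]) -> List[str]:
    """For a living subject, list "... missing" categories that lack the
    (living people) suffix."""
    names = list(categories)
    if "Living people" not in names:
        # The suffix is only expected for living subjects.
        return []
    return [name for name in names if name.endswith(" missing")]
```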