User:Periglio/Persondata Analysis

Introduction
Many years ago, I was watching an interview with John Peel in which he said he had calculated the date on which he would have outlived his father. Ever since then I have had a morbid fascination with knowing who I have outlived. A few years ago, I had a sudden urge to create a website that would display that information. Wikipedia was an obvious source of information, and this is how I became involved with Persondata.

My first discovery was that I had to write all my own software to do the extraction. That work developed into validation, as I was seeing lots of weird data in my database. Then along came my first encounter with the Persondata deletion brigade! I managed to fight them off at the time, but made the switch to Wikidata as I realised Persondata's days were numbered.

Currently, I have my own personal database of over a million notable names found on Wikidata. Each one has been checked for sanity, then compared against the Wikipedia article, highlighting discrepancies.

Overview of Persondata
These are my thoughts on how useful the data contained within Persondata is. They are based on using the data in an actual application, i.e. my birth/death date analysis website, combined with my data validation programming and with fixing the data that the validation highlights.

Name
The data in this field should be in the same format as the DEFAULTSORT template: "surname, forename". In reality, it is a mixture of "forename surname" and "surname, forename". Add in the other variations, like kings and queens, double-barrelled names and the occasional middle name, and it becomes impossible to extract into something useful. For my own purposes, I stopped using the field. Instead I used the article title and DEFAULTSORT to display names.
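To illustrate why the mixed formats are hard to untangle, here is a minimal sketch of the kind of heuristic one might try. It is not the code I actually used; the function name normalise_name and the rule "the last word is the surname" are assumptions made purely for this example.

```python
def normalise_name(raw: str) -> str:
    """Best-effort conversion of a Persondata name to 'surname, forename'.

    Hypothetical helper for illustration only.
    """
    # Collapse runs of whitespace so "John  Peel" and "John Peel" match.
    name = " ".join(raw.split())

    if "," in name:
        # Already looks like "surname, forename": just tidy the spacing.
        surname, _, forename = name.partition(",")
        surname, forename = surname.strip(), forename.strip()
        return f"{surname}, {forename}" if forename else surname

    # No comma: assume "forename surname" and treat the last word as the
    # surname. This assumption is exactly where the heuristic breaks down.
    parts = name.rsplit(" ", 1)
    if len(parts) == 1:
        return name  # single-word name, nothing to reorder
    forename, surname = parts
    return f"{surname}, {forename}"


print(normalise_name("Whistler,James Abbott"))
print(normalise_name("John  Peel"))
print(normalise_name("Elizabeth II"))
```

The first two calls give sensible results, but the third turns a monarch into "II, Elizabeth". Double-barrelled surnames without a hyphen are split in the wrong place in the same way, which is why falling back to the article title and DEFAULTSORT was the more reliable choice.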

Alternative names
There was a defined format, e.g. "Whistler, James Abbott (birth name)", but it was rarely used. The contents were too random to extract anything useful, so I ignored the field. Alternative names should be visible within the article anyway; hiding them in Persondata makes them invisible to search engines.

Short description
The short description is the most useful Persondata field.