User:Crouch, Swale/Bot tasks/Civil parishes (current)/Simple

This is for civil parishes in England. Example: Rattlesden.

Rattlesden is a civil parish in the Mid Suffolk district, in the county of Suffolk, England. It is located approximately 15 miles north west of Ipswich. In 2011 the parish had a population of 959. The parish touches Buxhall, Shelland, Gedding, Drinkstone, Brettenham, Felsham and Woolpit.

Features
There are 59 listed buildings in Rattlesden.

Process

 * See User:Crouch, Swale/Bot tasks/Civil parishes (current) for more.
 * The bot goes through the regions of England: South East, London, North West, East of England, West Midlands, South West, Yorkshire and the Humber, East Midlands and North East. In the infobox the word "England" is added after "South East", "North West", "South West" and "North East" (thus "South West England", for example), and the region is not linked in the infobox since it links automatically.
 * It doesn't create the entry for the district itself, namely the entries that have "District", "Unitary District", "Unitary County", "Metropolitan District", "Borough" or "City" in the "Status" column. However, it uses those entries to show which district the parish is in and its status. For parishes under "Unitary District" and "Unitary County", "unitary_england" should be used instead of "shire_district" in the infobox. For those under "Metropolitan District", "metropolitan_borough" should be used instead of "shire_district" and "metropolitan_county" instead of "shire_county".
 * Entries labeled "Unparished area" should also be skipped, since the names given there are those of the district and not of the unparished area; for example, "Rother unparished area" is actually the Bexhill unparished area.
 * For stating the district and county see List of English districts; for example, Southend-on-Sea is in Essex and Ashfield is in Nottinghamshire. Note that for Durham the county is called just "Durham" while the district is called "County Durham", so the county needs to be linked as Durham and the district as County Durham. However, do we need separate categories for these? Note that the City of London (though the district is called "City and County of the City of London"), Northumberland, Herefordshire (district is called "County of Herefordshire"), Rutland, Bristol (district is called "City of Bristol" and possibly the county is too) and the Isle of Wight are districts that are also counties. For these the district category is omitted; thus Putley, for example, would only be put in Category:Civil parishes in Herefordshire and not also in Category:County of Herefordshire, and the text "in the County of Herefordshire district" is omitted. However, County of Herefordshire is still included in the "unitary_england" parameter in the infobox. The other districts that are also counties, namely Cornwall, County Durham, Shropshire, Wiltshire and the East Riding of Yorkshire (all are unitary districts), should use "(district)" to disambiguate (as explained for County Durham) and should have the district included in the text. Thus we would have "Morville is a civil parish in the Shropshire district, in the county of Shropshire, England.", and the article would be in Category:Civil parishes in Shropshire and Category:Shropshire (district). However, for Cornwall (and possibly the other districts that aren't coterminous with their counties) separate categories may not be required, in which case (as with Putley) only the CP category would be used; thus Morville would only go in Category:Civil parishes in Shropshire, but the district would still be specified in the text.
 * The bot checks for articles that already exist, as explained in the "Examples" section, by looking for a WP article with the same name whose coordinates are in roughly the same area. A difference of possibly 5 or 10 miles could be allowed. If the WP article with the same name and the coordinates for the parish on City Population differ by more than that, then it's likely either that they are 2 totally different places, in which case 2 articles are needed, or that the existing WP article has bad coordinates, which requires manual fixing and then merging of the bot-created article.
 * When checking for existing articles the bot doesn't make a distinction between names that differ only in capitalization, spaces (such as Penselwood/Pen Selwood), hyphens (such as Clapton in Gordano/Clapton-in-Gordano), apostrophes, commas and periods (such as Preston St Mary/Preston St. Mary), or any other similar things others can think of. Other cases such as Hamstead Marshall/Hampstead Marshall and Great Cheverell/Cheverell Magna are also listed at User:Crouch, Swale/CP blacklist. Even if some extras are created I can merge them myself and add them to the list so that on the next bot run they are also skipped.
 * Some missing parishes are redirects. Unless the name is similar (as explained above), the bot still creates an article when the base name is a redirect elsewhere; thus for Ellesmere Urban it would create the article at Ellesmere Urban, Shropshire (since "Ellesmere Urban" is already taken, even though it's a redirect). In this case I would then need to see whether the new article should also redirect to the same place as the base name or be moved to the base name. In the case of Ellesmere Urban there should be a separate article, since it's not an alternative name for "Ellesmere", so I would file a request at WP:RMT to move Ellesmere Urban, Shropshire to Ellesmere Urban. In other cases the base name might have been merged (such as Wolverley and Cookley) when there actually should be a separate article; in this case I would ask for a history merge of the bot-created article into the existing one and restore any old content as needed. Note that if the bot stumbles upon a title that it thinks should exist (say Ellesmere Urban) but that already exists as an article or redirect, and the title Ellesmere Urban, Shropshire also exists, it can put the article at Ellesmere Urban, Shropshire (parish) (which would probably require manual cleanup); see User:Crouch, Swale/Bot tasks/Civil parishes (current) for more. Note that the bot does not overwrite (or alter) redirects; instead it follows the disambiguation process as explained.
 * For the coordinates (and determining the distance from the county town) it uses the centre of the area as shown at City Population. If the bot isn't able to obtain coordinates from City Population then it uses the coordinates from the Ordnance Survey (which should probably be trimmed to something like 6 decimal places). There shouldn't be mix-ups between different parishes with the same name in the same district, because City Population adds some info in brackets (as part of the name), such as "Flixton (Lothingland Ward)" (while the OS just calls it "Flixton"). This only appears to affect 3 CPs, since the others exist and have been noted on the blacklist. The 3 that don't exist (the 2 Newtons in Herefordshire and the Bringsty Linton) will be created when the whitelist is run.
 * To determine the links for the "touching parishes" it checks the coordinates on the Ordnance Survey linked data for those parishes and matches them up with the WP articles. If it can't determine the correct WP article it just links to the base name; thus if it couldn't determine which Brettenham to link to it would just link to Brettenham. To find the parish that is the subject of the article on Ordnance Survey linked data (in the example, "Rattlesden") after finding it on City Population, it goes to the district (Mid Suffolk), finds "Rattlesden" and checks the coords against City Population to confirm that it has found the correct one (since there are a few cases of multiple parishes with the same name in 1 district).
 * A rate limit of 6 articles a day (over, say, 150 days), not including redirects. Example: the bot runs from the beginning of August to mid-December and detects that it is going to create 770 articles (not including redirects) over approximately 130 days. 770 divided by 130 is about 5.923, so we round up to 6 articles a day. When the bot runs on a particular day it creates a total of 6 articles from the database. This number doesn't include redirects, so the rate limit of 6 a day is only triggered once it creates the 6th article.
 * Titles on Wikipedia that have a comma with a name after it (such as Marden, West Sussex) or a term in brackets (such as Corfe Castle (village)) are treated as being just "Marden" and "Corfe Castle". However, some, like Horwood, Lovacott and Newton Tracey, have the comma as part of their name, which can be seen from the fact that the full name shows up on City Population. This doesn't apply the other way round, though: the bot treats the entry "Horwood, Lovacott and Newton Tracey" on City Population as being "Horwood, Lovacott and Newton Tracey", and because there is a Wikipedia article at Horwood, Lovacott and Newton Tracey it knows that that's the correct article.
 * A few names on City Population also have an alternative name in brackets after the name; for example, "Preston" in the Dover district has "Preston-next-Wingham" in brackets after the blue link. Be sure to use only the name that is linked, thus just "Preston", not "Preston (Preston-next-Wingham)".
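
Three of the rules above (name comparison, the coordinate tolerance and the daily rate) can be sketched roughly as follows. This is only an illustration, not the bot's actual code: all function names are hypothetical, the 10 mile default is just one of the "5 or 10 miles" options mentioned above, and the blacklist cases (such as Hamstead Marshall/Hampstead Marshall) are deliberately not handled here.

```python
import math
import re

# Characters ignored when comparing names: whitespace, hyphens,
# apostrophes, commas and periods (capitalization is handled by casefold).
_IGNORED = re.compile(r"[\s\-'’,.]")
_TRAILING_BRACKETS = re.compile(r"\s*\([^()]*\)$")


def name_key(name: str) -> str:
    """Reduce a name to a key that ignores case and minor punctuation."""
    return _IGNORED.sub("", name).casefold()


def strip_disambiguator(title: str) -> str:
    """Drop a trailing ' (Foo)' and anything after the first comma."""
    title = _TRAILING_BRACKETS.sub("", title)
    return title.split(",", 1)[0].strip()


def same_name(cp_name: str, wp_title: str) -> bool:
    """Compare a City Population name with a Wikipedia title.

    The full title is tried first, so a comma that is part of the name
    still matches; only then is the disambiguator stripped from the
    Wikipedia title (never from the City Population name).
    """
    key = name_key(cp_name)
    return key == name_key(wp_title) or key == name_key(strip_disambiguator(wp_title))


def miles_between(a, b) -> float:
    """Great-circle (haversine) distance in miles between two (lat, lon) pairs in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 3958.8 * math.asin(math.sqrt(h))  # 3958.8 = mean Earth radius in miles


def within_tolerance(a, b, max_miles: float = 10.0) -> bool:
    """True if two coordinate pairs are close enough to be the same place (assumed limit)."""
    return miles_between(a, b) <= max_miles


def articles_per_day(total_articles: int, days: int) -> int:
    """Daily creation limit, rounded up; redirects are not counted."""
    return math.ceil(total_articles / days)
```

With these, `same_name("Marden", "Marden, West Sussex")` and `same_name("Pen Selwood", "Penselwood")` both hold, and `articles_per_day(770, 130)` gives the 6 a day from the rate limit example. A name match on its own is not enough; the coordinate check still decides whether the article is the same place.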

Examples

 * 1) The bot looks at the "Name" column and then at the "Status" column for that name. It skips entries for districts (as noted above) and unparished areas.
 * 2) The bot then checks every entry that has the status of "Parish", looks at the coordinates given and scans Wikipedia for articles with that name. Thus upon checking "Coombes" in South East it looks on Wikipedia for articles titled "Coombes" and finds our article Coombes. It then checks the coordinates of that article and, finding that they match the City Population coords, realizes that there is already an article here, takes no further action and goes on to the next name that has the status of "Parish".
 * 3) For "Marden", for example, it finds that it's a parish and thus checks Wikipedia. It searches for articles titled "Marden" (including articles titled "Marden, Foo" and "Marden (Foo)") and finds Marden, West Sussex, which matches the coordinates. It then (as with Coombes) realizes that the article exists and moves on.
 * 4) For "Wallington Demesne", upon checking it and finding that it's indeed a parish, it searches on Wikipedia. Even after trying similar titles it finds that there is no article here, so it creates the article at Wallington Demesne.
 * 5) For "Morville", upon checking and finding it's a parish, it searches on Wikipedia. It finds that there is a DAB page at Morville but no article titled "Morville, Foo" or "Morville (Foo)" with the same coordinates, so it concludes that there is no article on this Morville. It finds that there are no other OS settlements or places called "Morville" on Wikipedia, so following User:Crouch, Swale/Bot tasks/Civil parishes (current) it creates the article at Morville, Shropshire. There is already a DAB entry for it, but even if there weren't the bot wouldn't need to add one since I will do that manually.
 * 6) For "Normanton on the Wolds", upon checking it's a parish, it looks on Wikipedia and finds that although Normanton on the Wolds doesn't exist, Normanton-on-the-Wolds does. Because Normanton-on-the-Wolds matches the parish coordinates, it knows that it doesn't need to create an article: "Normanton on the Wolds" is so similar to "Normanton-on-the-Wolds" that it's likely just an alternative name. It therefore creates Normanton on the Wolds as a redirect to Normanton-on-the-Wolds and moves on.
 * 7) For "Shipton Thorpe", upon checking it's a parish, it looks on Wikipedia and finds that Shipton Thorpe redirects to Shiptonthorpe. It knows that this is likely to be an alternative name (as with Normanton on the Wolds), but it moves on without creating anything because the redirect already exists.
 * 8) For "Grimston" (the one in the Selby district) it finds it's a parish and searches on Wikipedia. It finds that there is a DAB page at Grimston and no article on the one in the Selby district. It also finds that there is another one in the York district (also in North Yorkshire, although it doesn't yet have an article here); to find this it checks for "other settlement" when searching the name at http://www.geograph.org.uk/search.php in the "near" box, in the same county or district. It therefore creates the article at Grimston, Selby per User:Crouch, Swale/Bot tasks/Civil parishes (current). In this case there is no population data, so only the area is included.
 * 9) Note that "Shepway" district is now "Folkestone and Hythe" district.
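
The examples above boil down to a small decision table, which could be sketched like this. It is a simplified illustration under stated assumptions, not the real bot: `Action`, `decide` and all parameter names are hypothetical, the Wikipedia lookup itself is left to the caller, and the sketch assumes that a matching article found while the base title is still free always gets a redirect (the examples only show this for a similar spelling, as in example 6).

```python
from enum import Enum
from typing import Optional, Tuple


class Action(Enum):
    SKIP = "skip"          # district or unparished area row (example 1)
    EXISTS = "exists"      # matching article already reachable (examples 2, 3, 7)
    REDIRECT = "redirect"  # create a redirect to the matching article (example 6)
    CREATE = "create"      # create a new article (examples 4, 5, 8)


def decide(name: str, status: str, matching_article: Optional[str],
           base_title_taken: bool, qualifier: str = "") -> Tuple[Action, Optional[str]]:
    """Return the action for one City Population row and the title it applies to.

    matching_article: title of an existing article whose name and coordinates
        match the parish, or None if the search found nothing.
    base_title_taken: whether a page (article, DAB page or redirect) already
        exists at the plain parish name.
    qualifier: county or district used to disambiguate a new title; chosen by
        the caller following the rules on the parent task page.
    """
    if status != "Parish":
        return Action.SKIP, None
    if matching_article is not None:
        if base_title_taken:
            return Action.EXISTS, matching_article
        return Action.REDIRECT, matching_article
    if not base_title_taken:
        return Action.CREATE, name
    if not qualifier:
        raise ValueError("a qualifier is needed when the base title is taken")
    return Action.CREATE, f"{name}, {qualifier}"
```

For instance, `decide("Morville", "Parish", None, True, "Shropshire")` yields a creation at "Morville, Shropshire", while `decide("Normanton on the Wolds", "Parish", "Normanton-on-the-Wolds", False)` yields a redirect. Existing redirects are never overwritten, which is why a taken base title always leads to either no action or a disambiguated new title.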