User talk:Kotniski/Geoboxes

Geobox
I've replied to you at my talk page; I'm leaving this note only because I don't know if you're watching it. – Caroig (talk) 18:27, 15 October 2007 (UTC)


 * If I may, one more note. It would be nice if all data in the Geoboxes used the same format. I see you're running a bot that parses other infoboxes. The Geoboxer does something similar for e.g. Slovak settlements. Transforming the data could be pretty easy if it were always formatted the same; unfortunately it isn't always so, and so any parser must be set up to deal with a lot of possible data formats. For all Geoboxes within the Czech Republic and Slovakia we use the following format of administrative units: [[Žilina Region|Žilina]], the article name being before the pipe, a short name (without Region, District etc.) on the right. In your treatment of Lower Silesia you've been using e.g. Gm. Dzierżoniów . I think it can be just Dzierżoniów , you get the administrative unit name as the text on the left so there's no need to repeat it even in an abbreviated form. Similarly for counties. Any future parsing would be much easier if the text after the pipe were always the proper name without any generic word. I also think that having either County or Powiat only (without the second language version) as the field label would be sufficient, given they get linked to an explanatory article on the administrative unit. Just imagine someone wants to convert the data into a database and finds a mix of formats: English name, English name (Polish name), Polish name. It might be better to have only English names (with the Polish name when no English one exists) or only Polish names. I can set up the Geoboxer tool to make the appropriate fixes. – Caroig (talk) 22:03, 15 October 2007 (UTC)
 * I see what you mean, but there is a problem - namely that there are often two separate units at the same level and with the same name, except that one has the prefix/suffix and one doesn't. For example, there is Poznań (a town with the status of a county), and - surrounding it - powiat poznański, for which the English name adopted in WP is Poznań County. The same with, for example, Augustów (which has the status of a gmina) and Gmina Augustów (surrounding villages). These are quite common situations. This is why I decided not to drop the words County and Gmina in the displayed form - I think it could be misleading (particularly to someone who doesn't know the full details of the administrative system). As far as the County (powiat) field label is concerned, maybe it would be better to go for just Powiat, then at least we avoid the repeat of the word County. --Kotniski 05:17, 16 October 2007 (UTC)
 * Yeah, actually we have many such units here as well: Žilina. There's the Žilina Region and the Žilina District, but we only get Žilina displayed; the type of the administrative unit is always on the left, and the word Žilina links to the appropriate administrative unit article. The same applies to the Czech Republic. I guess it's quite a common situation that the name of a settlement is also the name of an administrative unit. See e.g. here: Valencia, Spain; Valencia is not only the city but also a Province and an Autonomous unit. Nothing I write here is a rule, but I believe it would be beneficial for Wikipedia if at least the Geoboxes were used in a unified way; that's what they were designed for, after all. – Caroig (talk) 08:13, 16 October 2007 (UTC)
 * I certainly agree that the Geoboxes should be used in a unified way, but maybe it isn't so critical with piped links - what comes before the pipe should be uniform (for parsing purposes), but what comes after should be whatever is most helpful for the (not necessarily expert) reader in the particular case. Perhaps the terminology chosen is unfortunate, but in articles on Poland towns with the status of a gmina or powiat are often referred to as 'urban gminas' and 'urban powiats/counties'; this certainly means a reader could be misled if he sees that a settlement is in Powiat: Poznań (rather than Powiat: Poznań County). We can't expect readers to click or hover on every link they see. And if some automatic parser were looking at the display text, it too might be misled into thinking that Poznań is "another name for" Poznań County or Koło another name for Gmina Koło (which in many contexts they certainly are not).--Kotniski 08:28, 16 October 2007 (UTC)
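The distinction drawn above between the text before the pipe and the text after it can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the function name `split_wikilink` and the regular expression are hypothetical, cover only simple `[[Target]]` and `[[Target|Display]]` links, and are not taken from the Geoboxer or from any bot mentioned here.

```python
import re

# Matches [[Target]] or [[Target|Display]]; a simplification of real wikilink syntax.
WIKILINK = re.compile(r"\[\[([^\[\]|]+)(?:\|([^\[\]]*))?\]\]")

def split_wikilink(text):
    """Return (target, display) for the first wikilink in text, or None if there is none."""
    match = WIKILINK.search(text)
    if match is None:
        return None
    target = match.group(1).strip()
    display = match.group(2)
    # With no pipe, the displayed text is the target itself.
    display = target if display is None else display.strip()
    return target, display

# Two different county-level units that would display identically:
land_county = split_wikilink("[[Poznań County|Poznań]]")
city = split_wikilink("[[Poznań]]")
```

Here `land_county` and `city` share the display text "Poznań" but differ in their link targets, which is the situation both sides of this thread are arguing about: a parser reading only the display text cannot tell the two units apart, while one reading the target can.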

I'll start from the left as I'm going to write more comments this time. The broken code for parts has been fixed. But … I'm afraid I disagree even more now that I've given some thought to this topic.

First, the Geoboxes are designed to fulfill a certain purpose, and that is to summarize in a standard way the geographical data from an article. I emphasize summarize, as Infoboxes in general don't need to contain data as verbose as the article but rather a database-like, structured way of presenting it. As their layout is unified across all Geoboxes for all geographical subjects, it helps the ordinary reader a lot to find their way around them. They always find the same data in the same place, formatted in the same way. I'm not saying every infobox must work this way, but it's the primary design of the Geoboxes. There exist countless other Infoboxes which allow users to enter the data in any way they like, but I don't think these are highly useful. I've spent many hours and days improving and speeding up the code, so I'm kind of disappointed to see the Geoboxes used in a way which is even less standard than in most other Infoboxes.

There's quite a lot of information on the template's documentation pages about how the data should be used and entered. There are blank templates which always contain the most up-to-date field names. These don't get changed often, but it happens, always for the sake of more standardization: e.g. every additional field (i.e. those starting with the underscore) that somehow relates to a date is now named _date, whereas before there were also _as_of and _year, and this differed even between the same fields within the various Geoboxes in version one. The new version still supports the old names (the backward compatibility should be 100%), but we're trying to bring them up to date, which can be done by the Geoboxer tool. It can parse the data, restructure the fields, change the renamed ones, reformat some data into the recommended format and finally print out the data again in a standardized format. We say in the documentation that empty fields shouldn't be removed, so that other editors can easily update them or add new data without having to look up where and how they should be entered. I'm afraid the code your bot produces is rather messy. But this can be easily fixed by running the page through the Geoboxer tool.
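The field renaming described above can be sketched as follows. This is a hedged illustration in Python, not the Geoboxer's actual code: the function name `normalize_field_names` and the sample field names (`population_as_of`, `established_year`) are assumptions made for the example; only the suffixes _as_of, _year and _date come from the comment above.

```python
# Legacy suffixes mentioned in the discussion; anything else here is hypothetical.
LEGACY_DATE_SUFFIXES = ("_as_of", "_year")

def normalize_field_names(fields):
    """Return a copy of fields with legacy date suffixes renamed to _date."""
    normalized = {}
    for name, value in fields.items():
        for suffix in LEGACY_DATE_SUFFIXES:
            if name.endswith(suffix):
                name = name[: -len(suffix)] + "_date"
                break
        # If an old and a new name both map to the same key, keep the first value seen.
        normalized.setdefault(name, value)
    return normalized

# Hypothetical input, for illustration only.
old_style = {"population_as_of": "2006", "established_year": "1250", "area": "12.5"}
new_style = normalize_field_names(old_style)
```

A tool working this way can keep accepting the old names on input while always writing the new ones on output, which matches the backward compatibility described above.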

But my major concern is how the data is entered, because that's the main reason why the Geoboxes are here: to contain data in a unified way. Not only is the data you enter not unified in relation to other Geoboxes, it's somewhat unorthodox compared with any geography-related article on Wikipedia. Most Infoboxes I've stumbled upon use the format you object to:
 * Rajecké Teplice: Region: Žilina, District: Žilina
 * Valencia, Spain: Autonomous community: Valencia, Province: Valencia
 * Wrocław: Gmina: Wrocław
 * Loja, Ecuador: Province: Loja, Canton: Loja

So what should be confusing about Powiat: Poznań when this is exactly the format most Infoboxes have been using and nobody seems to have any problem with it? This seems to be the preferred format on Wikipedia.

Every administrative field in your Geoboxes is treated differently: why does the Voivodeship field use this most common format, while the Powiat field mixes the English and Polish names and the Gmina field places the Gm. prefix there? When I first saw this I thought the prefix was something like the St. prefix in English. I believe this must be highly confusing for most readers. For the same reason we don't use academic titles with names, which in the Czech Republic and Slovakia are an official part of the name: they are highly confusing for English speakers, as academic titles are rarely used with names in this context ("is the Ing. in front of Czech names something like Mac in Scottish?" I was asked many times).

I don't quite understand your remark about what a parser might "think" of a field formatted in the way I suggest, when most Infoboxes contain data in exactly that format. Most Infoboxes have a field which contains the administrative unit's category and a field which contains the administrative unit's name, either as plain text, provided the article on that unit bears the same name (Bavaria, Illinois, Andalusia), or as a wikilinked text, [[Valencia (autonomous community)|Valencia]], where the part before the pipe is the name of the appropriate Wikipedia article while the text after it is the actual data, data that can be entered into any database. The text before the pipe matters for Wikipedia, but the text after it is even more important, as this is the piece of information we are after. It's just as in speech: "Which Commune does the town of Bielawa lie in?" - "Bielawa". "And which County?" - "Dzierżoniów". "And what about the voivodeship?" - "Lower Silesia". If anything should confuse the parser it would be, I'm afraid, your code, which is non-standard in terms of how such data is entered in most (all?) Infoboxes/Geoboxes.

Many Polish pages use the Infobox Settlement template, but even those have no problem with having Gmina:Wrocław (at Wrocław). Only the Powiat field is entered on other pages in that Polish - English form.

This is the English Wikipedia, so what can be easily translated to English should be. Even Voivodeship has its own English equivalent. I don't speak Polish, but it seems Powiat can easily be translated as county, which is an administrative division most English speakers would have no problem with. Similarly Gmina/Commune. The Geoboxes make it possible to "retype" any label where that would be useful, which might be the case for Voivodeship, but why confuse readers by entering the Polish name? (Pages on the Czech Republic and Slovakia use both Region and District, not their Czech/Slovak equivalents, and similarly for most other countries.)

I've read the wiki articles on Polish administrative divisions and came to this summary and recommended usage in Geoboxes:

 region = Lower Silesia // no need to add a wikilink, the geobox code does that for you
 region_type = Voivodeship
 region_label = Equivalent to region or province // optional line, this gets displayed on hover anywhere on the line where there's no linked text

 district = Bolesławiec
 district_type = County // this field could optionally be City County for e.g. Wrocław
 district_label = Powiat in Polish // optional

 municipality = Bolesławiec
 municipality_type = Commune
 municipality_label = Gmina in Polish // optional

Similarly, I don't quite understand why you're using a different naming system for the heading section of the pages you created, such as here: Gmina Bolesławiec, Lower Silesian Voivodeship. So is it Commune, Gmina or Municipality? Given that I suggest using Commune for Gmina, the heading should be formatted this way:

 {{Geobox | Region // the comma at the end is not necessary and can cause problems
 | name = Bolesławiec
 | category = Commune
 | native_category = Gmina
 …
 | official_name = Gmina Bolesławiec // should you want the full official name someplace

The other_name field should be used for variants of the name, names in other languages etc., not for the native name; that has a dedicated field, which should be used when the English name is different from the native one. See Prague.

Sorry for this lengthy text, but the Geobox project aims at creating a unified system for adding parseable geodata to geography-related articles; it doesn't replace the article itself but rather summarizes the data as if it were some sort of database. The main target of the system is indeed uniformity, so that any parser can fetch any data about any country/feature without having to be adjusted for every region/country or whatever. If you disagree with the idea the geoboxes were created for, you can still use a lot of other infoboxes. After all, I think Infobox Settlement is the recommended one for Polish settlements and is used for many of them. Please do not hurt this project by adding data in a non-standard way (non-standard not only in terms of Geoboxes but in terms of Infoboxes generally). – Caroig (talk) 21:34, 16 October 2007 (UTC)


 * Thanks for your comments. In principle I certainly support your goal of creating a uniform database. I hope that the various inconsistencies that have crept in can quickly be put right, either using Geoboxer or with my bot if necessary. But I would like to clarify a few things (I'll try to deal with the points in the order in which you raise them):

--Kotniski 07:56, 17 October 2007 (UTC)
 * 1) You say empty fields shouldn't be removed - does that mean every possible field has to appear in every instance of a Geobox? Wouldn't that simply increase server/memory load unnecessarily? Or do you only mean certain principal fields (those which are likely to be filled in)?
 * 2) I don't think you fully appreciate the problem with the Powiat: Poznań (et al) issue. If Poznan were in Poznan County (as presumably Valencia is in Valencia province) then that would be fine. But there are actually two powiat/county-level entities involved: 1) Poznan County and 2) Poznan. Poznan is not in Poznan County. Nor would it be in 'Gmina Poznan' (if such existed) - Poznan has the status of both a gmina and a powiat in its own right, and is sometimes referred to as such. The answer to 'What county is Mosina in?' is not Poznan but must be (to use the terminology adopted in wikipedia) 'Poznan County'. You give the example of Gmina: Wroclaw: this illustrates the problem perfectly. Wroclaw here is a gmina; it is not an abbreviation for Gmina Wroclaw. If the geobox displays the same for (the gmina called) Wroclaw as for (a hypothetical) Gmina Wroclaw, then the reader is automatically confused. If it is done differently in other countries' articles then either the situation is slightly different (so that there is no possible confusion, as in the case of Valencia) or otherwise I would propose that those articles be changed to a system similar to mine, rather than the other way round.
 * 3) I thought it was obvious that Gm. stands for Gmina - sorry, it obviously isn't. So we can write the whole word instead. But not drop the word completely, for the reasons given above.
 * 4) Whether to label the district field 'Powiat' or 'County' - it could just as well be County if you think that's better. I chose Powiat to avoid repeating the word (since I insist on displaying Xxx County in full). And some English speakers with some familiarity with Poland might know the word powiat but not instantly associate it with county (though I guess it should be obvious in this context).
 * 5) Gmina/Commune has been discussed on various Poland-related pages - commune was felt not to be a particularly helpful translation (in contrast to county and voivodeship). I would have used district, but that doesn't seem to be popular in Polish contexts for some reason. In any case there is no real uniformity issue here, as the names of subdivisions differ widely between countries anyway.
 * 6) Content of the region field: you suggest Lower Silesia, but wouldn't this link to Lower Silesia rather than Lower Silesian Voivodeship? If I write Lower Silesian would that link correctly? (if not I think a piped link is necessary after all).
 * 7) The Gmina Boleslawiec article was created before I incorporated some of your previous suggestions - that's why it (and some others) do not conform to the latest ones. When we reach agreement on how exactly these things should look, we can do a repair.

Well, I'll put all my comments into one section because I'm going to arrange my points differently. First of all, your first question illustrates the major problem here well. The answer to it can be found in the documentation of this template, which I'm afraid you hadn't read when you wrote the previous post, because you would have found the answer in the very first section of it, in the red box. Similarly, you'd definitely have found out if you had looked at how this (and actually any other) template is used. And it is definitely not by selecting several fields and printing just those.


 * >[OK, I see now. I had read it but not carefully enough. Sorry. K.]

Secondly, you seem to be fairly new to the Wikipedia project, which is nothing bad at all and I believe you'll find the community generally very helpful and outgoing. But on the other hand, you can't start by editing pages in whatever way you like, without taking into account any guidelines, without looking at how other pages and the topic you want to participate in are dealt with.

We're not discussing how articles on Polish settlements should be named but just how the data is to be presented in the Geobox template. The templates are generally a summary of the page contents in a tabular form. I emphasize these two words as this suggests the data will be formatted in a little bit different way than in the article body. This doesn't concern just Geoboxes, but any Infoboxes and generally any data presented in tabular or database derived form.

Your major objection is that the data format ADMINISTRATIVE UNIT TYPE:ADMINISTRATIVE UNIT NAME without the unit type (i.e. just the PROPER NAME) is confusing. If you were right, then thousands or rather tens of thousands of wiki pages would be confusing, because they use just this format and nothing like the one you're proposing for Polish settlements. This is the work of countless editors over many years of Wikipedia's existence; no one else seems to have a problem with this standard tabular form of data presentation, so you can hardly state that it is bad and everything should be formatted in the way you like.


 * >[So maybe the situation I've described with Polish administrative units and their names has not been encountered before; maybe it's unique to Poland. It certainly seems to be a genuine issue. But I think it can be solved - see my suggestion below. K.]

If anything is confusing it's, I'm afraid, your way of formatting because it is non-standard. Especially that Powiat:XXX County.


 * >[That wasn't my original idea - it was just an attempt to improve things after one of your previous comments. Powiat can easily be replaced by County here. K.]

All other geobox/infobox templates have the administrative unit type on the left side of the table and the proper name (just the proper name, without the repeated category) on the right, with the wiki link set to the actual article while the displayed text is just the proper name. I've tried to show you on some pages, and there are thousands more, that exactly this way is Wikipedia's standard.


 * >[I've also tried to explain why the Polish situation seems to be different from the pages you've shown me. You don't seem to address this. K.]

Yes, sometimes you find the unit type repeated, but that's just because the user didn't bother to format it in the standard way. Sooner or later, someone puts in the right format. Because users are used to getting the administrative unit type on the left, in English unless the name is untranslatable, they will assume the Polish Powiat is something highly unusual and that in XXX County both words are Polish, because there's no other infobox which formats the data this way. To put it in a nutshell, it is, I'm afraid, your format which is highly confusing, because it is not used in any other wiki articles.


 * >[OK, Powiat needn't appear, see above. K.]

Even the Polish Wikipedia uses this system, just look at your two examples there: Mosina or Wrocław. Gmina:Mosina and Gmina:Wrocław. —Preceding unsigned comment added by Caroig (talk • contribs) 22:00, 18 October 2007 (UTC)


 * >[Just as you rightly criticise me for not reading your documentation properly, I think I can criticise you for not reading my arguments thoroughly. Gmina:Mosina and Gmina:Wrocław are perfectly in line with what I would suggest. To support your argument you should have chosen something like pl:Białobrzegi (województwo podlaskie) - this has what I don't like - Gmina: Augustów (when for example Augustów has just Gmina:Augustów, and both display the same - except that in the latter case a clarifying note is used, similar to my new suggestion below). But what (arguably) works on Polish WP isn't necessarily the best solution for English WP - Poles can perhaps be expected to know that a village could be in Gmina Augustów but not in Augustów, English WP readers can't. K.] —Preceding unsigned comment added by Kotniski (talk • contribs) 23:33, 18 October 2007 (UTC)

From all existing infoboxes you've decided to use Geobox, which more than any other infobox tries to organize and display the data in a highly standard way; it's what we've been working on very hard. Yet it seems you haven't read its documentation and have decided to use it in your own way. You've decided to join a project but without respecting its rules, I'm afraid. This also applies to your treatment of Regions, especially the way you've filled in the header section. There are separate fields for the name and category, which every Geobox respects; some of them, e.g. Bratislava or Plunketts Creek (Loyalsock Creek), have recently reached featured article status using just this formatting, i.e. splitting the proper name and the generic word from the article title, so it's obviously a generally accepted and approved format.


 * >[Hmm, I don't really like that Plunketts / Creek display with Creek in smaller letters, when it seems to be just as much a part of the name as Plunketts. If that's how people like it, then OK. But please understand why I think this would be unsuitable for Poznań County or Gmina Augustów. By taking the generic part out of the name, you change the name into the name of something else which exists at the same level - potentially confusing both human readers and parsers. K.]
 * There's a rationale, and it's to tell readers which part of the location's name is the proper name and which is just a generic word. While it's fairly obvious to anyone what a creek is, there are many English words not all English-speaking readers are familiar with; I usually give Parish as an example. And there's the official_name field to list the name in full. There were many debates on other pages about whether an article should be named e.g. River XXX or XXX River, so our solution makes this obsolete. This is again one of our standards, to split the name. The parser will always work with both fields and also with the Geobox type (Settlement, Region etc.). Of course, we can discuss the font size and boldness, but so far there have been no objections to this.

There's quite a lot of work that was put into creating this template, which has come a long way for over a year of its development.


 * >[Which doesn't mean it is now perfect. Maybe the rules can be changed sometimes to adapt to previously unencountered situations. K.]
 * Definitely, but I don't think this is the case. After all you managed to find (later on this page) a solution which is both in line with our ideas and with your need to add some extra information using existing Geobox fields.

It is fairly complex but has quite extensive documentation which says how it is to be used. The main target of this template is to create a means of standard, machine-parseable data input and presentation. No one's forced to use this template; however, if anyone joins an existing project they should respect the rules and guidelines set up by its creators. They can set up their own templates, or use other ones which don't have such a clear aim and strict rules.


 * >[Don't you think you sometimes need to be flexible to expand? If this project is to be a success, as I understand it, then it ought to incorporate as many geo articles as possible, and therefore as many editors as possible using the Geobox template. I liked your template because it seemed to be potentially universal, unlike the ones specific to Poland. But if you tell people to go off and use other templates rather than possibly find slightly different ways of using yours, you won't achieve this universality - the data you have may be very uniform, but it will cover only a limited range of subject matter. K.]
 * Actually, countless proposals, suggestions and criticisms have been received during the development and have been addressed, and the template is way different from the original proposal. And it's something I spend most of my wiki time on, while I would be happier actually editing the pages. See e.g. the debates at {{tl|Geobox River}}. Today it's not a start-up project but something that has existed for over a year, so it's quite mature.

As I wrote before, the Polish users decided some time ago to use {{tl|Infobox Settlement}}, but even that splits the names of units into the proper name and the generic word. The Geoboxes don't have, as of yet, support for Polish locator maps, which the other template does. No one's probably going to add any in the near future, as no one participating in this works on Polish locations.

This project is supported by various parsers which read the templates' data or, as in the case of Geoboxer, clean it up to comply with the recommended format. There are many more requirements for how the data is to be formatted, and everything is described in the documentation. The basic principle of this template is to enter the data in a raw form and separate the actual data from notes and various other stuff. While the formatting you've been using doesn't cause any problems yet, there are some upgrades under way which may mean that it will. But even now it's already breaking various parsers that put the data into other-language Wikipedias or back-up databases. I'm afraid you'll either have to conform to the rules here or stop using this template altogether if you find the data format unsatisfactory. We're definitely not going to change anything which is the community's standard.

So if you're going to use this template in future, certain rules will have to be respected; otherwise the data will be automatically corrected by the Geoboxer or AWB.

I'm certainly no expert on Polish administrative divisions, but there's not much to be changed about my previous examples. First of all, as all Counties and Gminas/Communes bear the names of their administrative units, the appropriate article is in the XXX County or Gmina YYY form, so the field has to be formatted as [[XXX County|XXX]] and [[Gmina YYY|YYY]], i.e. with the short name after the pipe. This is simply the basic and only supported format here (as well as in most other infoboxes).


 * >[I can't believe this format is forced. Are you saying that giving a county as 'XXX County' and failing to make a piped link actually breaks anything? If the county name was any other string there wouldn't have to be a pipe - and if a parser insists on not having a county name ending "County" then it can simply truncate the string itself. K.]


 * The parser can be set to do anything, surely, but what we're working on is to have the same format for every place on earth, so that the parser or anything else doesn't have to be set to treat each and every country differently, which is the problem with the various existing templates, each using a different format. There are more things we have to enforce, such as using an unformatted, raw number format for all unit-related fields, as these are used for calculations and putting commas into a number causes a parser error.
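The point about raw numbers can be illustrated with a short Python sketch. It is an analogy only: the Geobox calculations happen in template code, not in Python, and the function names `parse_raw_number` and `strip_formatting` are hypothetical. What the sketch relies on is that Python's built-in `float()` likewise rejects a value containing a thousands comma.

```python
def parse_raw_number(value):
    """Parse a raw numeric field value; raise ValueError for formatted input such as '1,234'."""
    try:
        return float(value.strip())
    except ValueError:
        raise ValueError(f"not a raw number: {value!r}") from None

def strip_formatting(value):
    """Remove commas and spaces used as thousands separators, assuming '.' is the decimal mark."""
    return value.replace(",", "").replace(" ", "")

# A cleanup pass would normalize the value once, so every later reader can parse it directly.
area = parse_raw_number(strip_formatting("1,234.5"))
```

Storing the value in raw form and leaving the display formatting to the template means the cleanup step is needed at most once, rather than in every parser that reads the field.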

The Lower Silesia/Lower Silesian issue was simply my typo. The examples I chose were just there to illustrate the data format on some real names, which I chose haphazardly, so they were not meant as rules for those divisions per se. As for the Voivodeship, this should certainly go into the region field; a state is a largely independent unit, which is not the case for Voivodeships.


 * >[Already changed (for new articles - old ones will be brought into line). K.]

The district field should be retyped to County not Powiat as it is the standard English translation and besides, most counties have articles in the form of XXX County so it doesn't make any sense to use a different word here.


 * >[Agreed - see above. K.]

And as I wrote before, for those special counties such as Poznan, the district_type should be set to City County for the city and to County for the county outside the town, clearly setting apart their different statuses (as in this list: List of counties in Poland).


 * >[That wouldn't be the best solution in my opinion, since someone looking at one article wouldn't necessarily have seen others (it's different with a list where you can see everything all at once). My suggestion if you really can't tolerate seeing the word County twice on one line is that we use district_note in cases where there might be ambiguity. So something like this: district_type = County|district = Poznan|district_note = (land county), as opposed to district_type = County|district = Poznan|district_note = (city county). Similarly with gminas (using municipality_note). Is this acceptable? In the majority of cases there is no ambiguity, so the note could be omitted. K.]
 * This is not about me being able or not being able to tolerate something; what we're talking about is standardization, using the same format everywhere and not confusing the readers, which is the main purpose of the Geobox. The case of Gmina is the same as the case of Poznan County and Poznan City County, which I did address in my post; I realize there are two county-level administrative units of the name Poznan. Your suggestion to use the _note fields is perfect, that's exactly what the _note field is for. Another solution would be the one I suggested for Poznan: for the city the district_type could read City County, as the List of counties in Poland has it, and for the Poznan county outside the town it can read just County. Similarly there could be Gmina and Urban Gmina or any other type that might exist. But both suggestions are OK and in line with Geobox standards.
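The _note solution agreed above can be pictured with a small Python sketch. The field names district, district_type and district_note are the ones used in this thread; the `AdminUnit` type and the `read_unit` helper are hypothetical names introduced only for the illustration, and the sketch assumes the fields have already been extracted into a dictionary.

```python
from typing import NamedTuple, Optional

class AdminUnit(NamedTuple):
    unit_type: str               # generic word, e.g. "County"
    name: str                    # proper name only, e.g. "Poznan"
    note: Optional[str] = None   # optional disambiguating note

def read_unit(fields, prefix):
    """Build an AdminUnit from the <prefix>, <prefix>_type and <prefix>_note fields."""
    return AdminUnit(
        unit_type=fields.get(prefix + "_type", ""),
        name=fields[prefix],
        note=fields.get(prefix + "_note"),
    )

land = read_unit(
    {"district_type": "County", "district": "Poznan", "district_note": "(land county)"},
    "district",
)
city = read_unit(
    {"district_type": "County", "district": "Poznan", "district_note": "(city county)"},
    "district",
)
```

The type and name of `land` and `city` are identical, so the data keeps the uniform type/proper-name split, while the note keeps the two units distinguishable for both readers and parsers.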

If e.g. Poznan is not divided into gminas, this field won't be set. As for Gmina I'm not that sure; many of them have articles in the Gmina XXX format and this seems to be used consistently, so it might be OK. But if the Polish name is to be used, the same applies to the heading of the Region subtype of the Geobox template, and it shouldn't be named Commune and Municipality at the same time. In the Regions for Gminas the text Sołectwos should probably be just Settlements, i.e. the default. A non-Polish reader again gets confused here and expects the Sołectwo to be a sort of further administrative subdivision, while it is in fact, in most cases, just a Settlement (I read the article and clicked on many of them and haven't found an exception; there seem to be rather few). This is the English wiki, after all.


 * >[A sołectwo is the next administrative unit down (and a comparatively unimportant one) after gmina. Normally 1 sołectwo=1 village, but there are exceptions. I only list them where I have the information. Lists of villages (regardless of their solectwo status) appear in the articles and in the navigation boxes. K.]


 * I'd prefer the Settlements label, possibly linking to sołectwo, but that's not a must.

I'm still willing to provide any help if my time allows, but please do read the documentation, do have a look at those examples of other Infoboxes, and check how the various administrative fields are used in them. Above all, either respect the rules of this project, in which quite a lot of users have participated, all working towards standardization, or, if you insist on your data format (which is undoubtedly meant in good faith, though I don't fully comprehend it), don't use this template and don't break our work. – Caroig (talk) 20:19, 18 October 2007 (UTC)


 * My comments are interspersed above. Kotniski 23:13, 18 October 2007 (UTC)


 * Thanks for your further comments above. I think we've now reached agreement about everything specific to the Polish articles, so perhaps we can consider discussion on this page closed. However I still have quite a serious concern about the way the top section of the Geobox is organized - I will raise this at Template talk:Geobox. Kotniski 08:23, 19 October 2007 (UTC)


 * Great, just let me know which version you decide on in the end, and I'll set up the Geoboxer tool to make the appropriate fixes on existing pages. Should you be interested, I can send you the code (PHP). Basically it parses the page, extracts the Geobox/Infobox data and splits it into key-value pairs, and you can then easily make any changes to them. For Slovak settlements the tool can even extract missing data (or copy it all if the page has no Geobox/Infobox at all) from the corresponding page on the Slovak Wikipedia, including websites, images, flags … – Caroig (talk) 08:58, 19 October 2007 (UTC)
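The key-value extraction described above might look roughly like the following Python sketch. It is not the Geoboxer code (which is PHP and is not shown on this page); the function names `split_top_level` and `parse_geobox` are hypothetical, the sample wikitext is made up for the illustration, and the sketch assumes the input contains a single, complete `{{Geobox ...}}` call.

```python
def split_top_level(body, sep="|"):
    """Split body on sep, ignoring separators nested inside [[...]] or {{...}}."""
    parts, current, depth, i = [], [], 0, 0
    while i < len(body):
        pair = body[i:i + 2]
        if pair in ("[[", "{{"):
            depth += 1
            current.append(pair)
            i += 2
        elif pair in ("]]", "}}") and depth > 0:
            depth -= 1
            current.append(pair)
            i += 2
        elif body[i] == sep and depth == 0:
            parts.append("".join(current))
            current = []
            i += 1
        else:
            current.append(body[i])
            i += 1
    parts.append("".join(current))
    return parts

def parse_geobox(wikitext):
    """Return (subtype, fields) for wikitext holding one complete {{Geobox ...}} call."""
    start = wikitext.index("{{Geobox") + len("{{Geobox")
    end = wikitext.rindex("}}")
    positional, fields = [], {}
    for part in split_top_level(wikitext[start:end]):
        key, equals, value = part.partition("=")
        if equals:
            fields[key.strip()] = value.strip()
        elif part.strip():
            positional.append(part.strip())
    subtype = positional[0] if positional else None
    return subtype, fields

# Made-up sample input; the link target is illustrative only.
sample = """{{Geobox | Region
| name = Bolesławiec
| category = Commune
| district = [[Bolesławiec County|Bolesławiec]]
| district_type = County
}}"""
subtype, fields = parse_geobox(sample)
```

Splitting only on top-level pipes matters here because a piped wikilink inside a field value contains a pipe of its own; a naive split on every pipe would cut the district value in two.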