User talk:Velasco622

Help me!
Please help me with...

"simplewiki-20140605-externallinks.sql" data dump.

I downloaded this dump with wp-download (https://pythonhosted.org/wp-download/) and am now trying to interpret it. I believe the dump is organized by Wikipedia. I am wondering what the method to the madness is in how the links are organized. The file starts with a number of comments and then lists each link in the following form:

(2,916,'http://meta.wikipedia.com/','http://com.wikipedia.meta./')

Each element (external URL) is in parentheses along with two numbers. The first seems to be the link's position within the .sql file, so the link above is the second link in the file. My first question is: what is the second number? Then we have two versions of the link. The first is the link as it would be entered in a browser; the second has the same domain, but with the meta. and the com swapped. My second question is: why do we have two copies of the same link here?

Thank you for your help.

Velasco622 (talk) 14:20, 20 June 2014 (UTC)
 * I doubt anyone passing through will be able to offer much help. The best I can advise is to either read Data dumps, or ask the peeps over at WP:VPT. Thanks, -- Mdann 52 talk to me! 14:29, 20 June 2014 (UTC)

Thanks for your help, Mdann 52. By looking at the file and the pages-articles dump, I was able to answer my first question.

Question 1 answer: The second number is the page ID. It can be looked up in the pages-articles.xml dump, and it identifies the source page from which the external link was harvested.

Question 2 answer: I still don't know.
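For anyone reading the file the same way, here is a minimal Python sketch of the row layout as I understand it so far. It only handles the simple sample row quoted above, and the field names (row_number, page_id, url, reversed_url) are my own labels, not names taken from the dump. The reverse_host helper is hypothetical and is inferred from that single sample row: it shows how the second copy of the link appears to be derived from the first, not why it is stored.

```python
import ast
from typing import NamedTuple
from urllib.parse import urlsplit


class LinkRow(NamedTuple):
    # Field names are illustrative labels, not taken from the dump itself.
    row_number: int    # first number: seems to be the row's position in the file
    page_id: int       # second number: ID of the page the link was harvested from
    url: str           # the link as it would be entered in a browser
    reversed_url: str  # the same link with the host labels in reverse order


def parse_row(text: str) -> LinkRow:
    """Parse one parenthesized row, assuming it is a plain tuple of
    integers and simply quoted strings, as in the sample row."""
    return LinkRow(*ast.literal_eval(text))


def reverse_host(url: str) -> str:
    """Rebuild the second form of a link from the first, as inferred from
    the sample row: reverse the host labels and append a trailing dot."""
    parts = urlsplit(url)
    labels = parts.hostname.split(".")
    return f"{parts.scheme}://{'.'.join(reversed(labels))}.{parts.path}"


row = parse_row("(2,916,'http://meta.wikipedia.com/','http://com.wikipedia.meta./')")
print(row.page_id)
print(reverse_host(row.url) == row.reversed_url)
```

Rows containing escaped quotes or other SQL-specific escaping would likely need a real SQL parser rather than ast.literal_eval.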

Velasco622 (talk) 15:27, 20 June 2014 (UTC)