Wikipedia:Request a query/Archive 5

Links in common between two pages, with a twist
I realize PetScan can be used to show links that appear on two pages, but I need it to show such a result sorted by the order in which the links appear on the first page. Based on my limited understanding of the database, I don't think the order of the links on a page is tracked. But just in case I'm wrong, can someone show me a query to do this? Or suggest an alternative approach? Thanks. Stefen Towers among the rest!  Gab • Gruntwerk 23:49, 22 June 2024 (UTC)
 * No, that's not possible; Quarry doesn't contain information about the content of the page.
 * Could you give some additional information about what you want this for? There are several alternative approaches that would work, but I need to know a little more about what you want it for. BilledMammal (talk) 23:53, 22 June 2024 (UTC)
 * I thought Quarry knows the links on a page, so that's where I was coming from. Anyway, I have a page that shows links sorted by popularity (views, descending) and another page that shows links to articles having old issues. I'd like an intersection of the links between them, in order of popularity (order of appearance on first page). Stefen Towers among the rest!   Gab • Gruntwerk 00:03, 23 June 2024 (UTC)
 * It can see which pages a given page links to (or is linked from), but doesn't have any information on what order the links appear in - that would need the page content. Is this a one-time query, or will you need it repeated? —Cryptic 00:08, 23 June 2024 (UTC)
 * I'd like for it to be repeatable as the underlying pages will change. I'm fine with having to run it manually. Stefen Towers among the rest!   Gab • Gruntwerk 00:10, 23 June 2024 (UTC)
 * It does, but it doesn't have information beyond that, such as about the text of the page.
 * I'm not aware of any tools that can help you with that, but I threw together the information you wanted using a quick and dirty script:


 * 1) Deion Sanders
 * 2) Ford Explorer
 * 3) Maurice Lucas
 * 4) Josh Hamilton
 * 5) Fort Knox
 * 6) Diane Sawyer
 * 7) Aroldis Chapman
 * 8) Secretariat (film)
 * 9) UPS Airlines
 * 10) Joe Torre
 * 11) Presbyterian Church (USA)
 * 12) Damaris Phillips
 * 13) Jim Beam
 * 14) Louis Brandeis
 * 15) Jack McCall
 * 16) My Morning Jacket
 * 17) Carlton Fisk
 * 18) David Pajo
 * 19) Adam Dunn
 * 20) Kentucky Colonels
 * 21) Humana
 * 22) Earl Weaver
 * 23) Pope Lick Monster
 * 24) Meriwether Lewis Clark Jr.
 * 25) John Marshall Harlan
 * 26) B. Brian Blair
 * 27) Frank Ramsey (basketball)
 * 28) Interstate 71
 * 29) A. J. Foyt IV
 * 30) Oldham County, Kentucky
 * 31) Susanne Zenor
 * 32) Andy Van Slyke
 * 33) Harvey Fuqua
 * 34) Dan Uggla
 * 35) Homer Bailey
 * 36) Louisville Cardinals
 * 37) Aristides (horse)
 * 38) Playa (band)
 * 39) Greg Page (boxer)
 * 40) Terry Pendleton
 * 41) Kentucky Derby Festival
 * 42) Louisville Metro Police Department
 * 43) 5th Cavalry Regiment
 * 44) Taylor Nichols
 * 45) David Grissom
 * 46) Valley of the Drums
 * 47) Ward Hill Lamon
 * 48) Jefferson C. Davis
 * 49) Robert Nardelli
 * 50) Jim Caldwell (American football)
 * 51) John Cowan
 * 52) Mildred J. Hill
 * 53) Johnny Edwards (musician)
 * 54) Lance Burton
 * 55) IWA Mid-South
 * 56) Mickie Knuckles
 * 57) Run for the Roses (song)
 * 58) Taeler Hendrix
 * 59) Sovereign Grace Churches
 * 60) Fabian Ver
 * 61) Tori Hall
 * 62) Larry Collmus
 * 63) New Grass Revival
 * 64) Rebel (bourbon)
 * 65) 2011 Kentucky Derby
 * 66) Bullitt County, Kentucky
 * 67) Catherine McCord
 * 68) Shelley Duncan
 * 69) Dan Boyle (ice hockey)
 * 70) History of Louisville, Kentucky
 * 71) Rudy Rucker
 * 72) Big Four Bridge
 * 73) Optimist International
 * 74) Larnelle Harris
 * 75) C. J. Mahaney
 * 76) Thunder Over Louisville
 * 77) Belle of Louisville
 * 78) Bertha Palmer
 * 79) Sports in Louisville, Kentucky
 * 80) Gary Matthews Jr.
 * 81) George Devol
 * 82) John Yarmuth
 * 83) Travis Stone
 * 84) June of 44
 * 85) Ted Washington
 * 86) Larry Elmore
 * 87) Parents Involved in Community Schools v. Seattle School District No. 1
 * 88) Interstate 64 in Kentucky
 * 89) Corey Patterson
 * 90) Stith Thompson
 * 91) Roman Catholic Archdiocese of Louisville
 * 92) Louisville Zoo
 * 93) Boyce Watkins
 * 94) James Speed
 * 95) Jefferson County Public Schools (Kentucky)


 * BilledMammal (talk) 00:23, 23 June 2024 (UTC)
 * I see you want something that can run repeatedly. I don't have time right now to put something together for you, but if Cryptic doesn't come up with something I'll do it sometime in the next couple of weeks - if I don't, feel free to remind me on my talk page. BilledMammal (talk) 00:24, 23 June 2024 (UTC)
 * Thanks for the list - that's good for a start. Would you mind giving me a few clues on your approach for the script you did? It might snap me into figuring it out. Also, I have thought of using a spreadsheet or text compare software, but was hoping for an on-wiki or otherwise online approach. Stefen Towers among the rest!   Gab • Gruntwerk 00:40, 23 June 2024 (UTC)
 * A few regex operations to get just the links in order, and then a basic Python script that works down the first list and outputs each item that also exists on the second. Unfortunately, nothing online ATM. BilledMammal (talk) 00:49, 23 June 2024 (UTC)
 * Thanks for the tips! Stefen Towers among the rest!   Gab • Gruntwerk 00:50, 23 June 2024 (UTC)
 * Any pure-sql-against-the-wmf-databases approach would have to start with something similar to either "Make a page in your userspace with redlinks to 1!Tom Cruise, 2!Muhammad Ali, ... 1000!Frank Torre" or "manually edit this stupidly long query that includes all that data" (like how query/81948 includes the namespace names, but with a thousand items instead of 30). —Cryptic 01:04, 23 June 2024 (UTC)
 * Indeed, those don't seem like tenable approaches. But this discussion has helped me figure out a solution, not optimal but workable:
 * Copy popular articles wikitext into a text editor, and break down into a flat list using a macro I constructed with regex and other tricks.
 * Use PetScan to create a flat list of articles with old issues (this didn't have to be in any particular order).
 * Insert both flat lists into their own column in a spreadsheet, then find matches of first column entries in the second column, then apply a filter of matches, and voila.
 * Stefen Towers among the rest!  Gab • Gruntwerk 04:18, 23 June 2024 (UTC)
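
The approach discussed above (regex to pull the links out of the first page's wikitext in order, then keep only those that also appear on the second page) might be sketched roughly as follows. This is not the script actually used in this thread; the function names, the regex, and the sample wikitext are all assumptions made for illustration, and the title normalization only covers underscores and first-letter case.

```python
import re

# Assumed pattern: matches [[Target]], [[Target|label]] and [[Target#Section|label]],
# capturing only the target title. Templates and nested markup are not handled.
WIKILINK = re.compile(r"\[\[([^\[\]|#]+)(?:#[^\[\]|]*)?(?:\|[^\[\]]*)?\]\]")


def normalize(title):
    """Replace underscores with spaces and uppercase the first letter."""
    title = title.replace("_", " ").strip()
    return title[:1].upper() + title[1:]


def links_in_order(wikitext):
    """Return link targets in order of first appearance, without duplicates."""
    seen = set()
    ordered = []
    for match in WIKILINK.finditer(wikitext):
        title = normalize(match.group(1))
        if title and title not in seen:
            seen.add(title)
            ordered.append(title)
    return ordered


def common_links(first_wikitext, second_wikitext):
    """Links found on both pages, sorted by their order on the first page."""
    second = set(links_in_order(second_wikitext))
    return [title for title in links_in_order(first_wikitext) if title in second]


if __name__ == "__main__":
    # Made-up sample wikitext standing in for the two real pages.
    popular = "# [[Deion Sanders]]\n# [[Muhammad Ali]]\n# [[Ford Explorer|Explorer]]\n# [[Fort Knox]]"
    old_issues = "* [[fort Knox]]\n* [[Ford_Explorer]]\n* [[Deion Sanders]]"
    for rank, title in enumerate(common_links(popular, old_issues), start=1):
        print(rank, title)
```

The second page is reduced to a set because its order does not matter, which mirrors the point above that the PetScan list "didn't have to be in any particular order". The wikitext of each page would still need to be copied in by hand or fetched separately.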