User:Dank/Regex

List of US forest-inventory conifers
Start with a bulleted list of species in this format: (5 means the USDA symbol is ACNI5; 46 is the page number in the 1991 inventory)
 * Acer nigrum5.black maple.46

Create links to the USDA Plants Database (but remember this place in the article history; you'll need the version without the links, too).

This will add the 4-letter codes (run it once as written, then again with (\d\.) in place of (\.), then with (\d\d\.)): \*(\w\w)([a-z]+) (\w\w)([a-z]+)(\.)(.+?)(\n)
 * $1$2 $3$4.$1$3$5$6$7

And this creates the URLs: \*(\w+ \w+)\.(\w+)\.
 * $1.
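A quick way to check the code-building step outside the editor is a small Python sketch like the one below. It assumes each list line starts with `*` and no space, that `$n` in the replacement corresponds to Python's `\g<n>`, and that the three variants are run in the order (\.), (\d\.), (\d\d\.). The function name `add_codes` and the second and third sample lines are invented for illustration. Note that the generated code keeps the capitalisation of the name (Acni5 rather than ACNI5).

```python
import re

# The three variants differ only in the fifth group.
SEPARATORS = [r"(\.)", r"(\d\.)", r"(\d\d\.)"]
TEMPLATE = r"\*(\w\w)([a-z]+) (\w\w)([a-z]+){sep}(.+?)(\n)"
# $1$2 $3$4.$1$3$5$6$7, with the leading asterisk kept.
REPLACEMENT = r"*\g<1>\g<2> \g<3>\g<4>.\g<1>\g<3>\g<5>\g<6>\g<7>"


def add_codes(text: str) -> str:
    """Run the three passes in order; each line is matched by one pass only."""
    for sep in SEPARATORS:
        text = re.sub(TEMPLATE.replace("{sep}", sep), REPLACEMENT, text)
    return text


sample = (
    "*Acer nigrum5.black maple.46\n"     # example line from the notes
    "*Abies balsamea.balsam fir.26\n"    # invented sample, no digit
    "*Pinus strobus12.sample name.1\n"   # invented sample, two digits
)
print(add_codes(sample), end="")
```

The order matters: running the plain (\.) variant after the digit variants would match the already-converted lines a second time.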

Create a data table in this format: https://en.wikipedia.org/w/index.php?title=User:Dank/Sandbox/8&oldid=1224700887#temp4 (except: the "uses" column is a string of y and @, not y and n). "Uses" mirrors these categories from "Suitability/Use": Christmas Tree, Lumber, Naval Store, Nursery Stock, Post, Pulpwood, Veneer.

Create the real table: (\n)\|\-\n\|(\w+?)\n\|([y@]+?)\n\|(\w+?)\n\|(\w+?)\n\|(\w+?)\n\|(\w+?)\n\|(\w+?)\n\|(.+?)\n\|(.+?)\n\|(\w+?)\n\|(\w+?)\n\|(\w+?)\n\|(.+?)\n\|(\w+? \w+?) (.+?) (\d+) (\d+) ([`a-zA-Z]+) $1|-$1!scope="row" |$15 $1|Uses: $3@1$2 $15]: Characteristics}}$1|No$1$1$16$1$1$4 ft; $5@1$2 $15]: Characteristics}}$1|pH $9–$10$1$11 - $12 in $1$14 F@1$2 $15]: Characteristics}}$1|D: $7 F: $8 L: $6 S: $13 @1$2 $15]: Characteristics}}$1|

@1 {{sfn|National Plant Data Team|2023|loc=[https://plants.usda.gov/home/plantProfile?symbol=

S: intolerant S: intolerant

[in case I forget] intermediate{ medium{

Add the common names.

Rearrange the y-@ string to the proper order for: construction, landscaping, posts, pulpwood, terpenes, veneers, winter holiday decorations: ([y@])([y@])([y@])([y@])([y@])([y@])([y@]) -> $2$4$5$6$3$7$1
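The reordering can be checked with a short Python sketch. It assumes the incoming string follows the "Suitability/Use" order given earlier (Christmas Tree, Lumber, Naval Store, Nursery Stock, Post, Pulpwood, Veneer); the function name `reorder_uses` is made up.

```python
import re

PATTERN = r"([y@])([y@])([y@])([y@])([y@])([y@])([y@])"
# $2$4$5$6$3$7$1: lumber, nursery stock, post, pulpwood,
# naval store, veneer, Christmas tree.
REORDER = r"\2\4\5\6\3\7\1"


def reorder_uses(flags: str) -> str:
    """Move the seven y/@ flags into the table's display order."""
    return re.sub(PATTERN, REORDER, flags)


# Christmas tree, naval store and pulpwood set:
print(reorder_uses("y@y@@y@"))  # prints @@@yy@y
```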

Expand each y into its label, working from the last position to the first:
 * (?<=\|Uses\: ......)(y) -> , winter holiday decorations
 * (?<=\|Uses\: .....)(y) -> , veneers
 * (?<=\|Uses\: ....)(y) -> , terpenes
 * (?<=\|Uses\: ...)(y) -> , pulpwood
 * (?<=\|Uses\: ..)(y) -> , posts
 * (?<=\|Uses\: .)(y) -> , landscaping
 * (?<=\|Uses\: )(y) -> construction

Remove any leftover @


 * Uses:_,_ -> |Uses:_
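The lookbehind passes, the @ cleanup and the leading-comma fix can be strung together as in this Python sketch. It assumes the underscores in "Uses:_,_" stand for spaces, that the cell holds nothing but the "|Uses: " prefix and the seven flags, and that removing every @ in the cell is safe (in the full table @ is also used in the @1 placeholder, so this would only be run on the Uses cells). The name `expand_uses` is hypothetical.

```python
import re

# Labels for flag positions 7 down to 1; only the first flag has no comma.
LABELS = [
    ", winter holiday decorations",
    ", veneers",
    ", terpenes",
    ", pulpwood",
    ", posts",
    ", landscaping",
    "construction",
]


def expand_uses(cell: str) -> str:
    """Turn '|Uses: ' plus seven y/@ flags into a comma-separated list."""
    # Work right to left so earlier positions are not shifted.
    for dots, label in zip(range(6, -1, -1), LABELS):
        pattern = r"(?<=\|Uses\: " + "." * dots + r")(y)"
        cell = re.sub(pattern, label, cell)
    cell = cell.replace("@", "")                # remove any leftover @
    return cell.replace("Uses: , ", "Uses: ")   # drop a leading comma


print(expand_uses("|Uses: @@@yy@y"))
# prints |Uses: pulpwood, terpenes, winter holiday decorations
```

Python accepts these lookbehinds because each one has a fixed width.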

` -> |

https://en.wikipedia.org/w/index.php?title=User:Dank/Sandbox/8&oldid=1224733838#temp0 Alphabetize by last name, then add refs for single authors to reference section, from a table in that format: \*(\w+ \w+) (\d+) (\d+) (\w+)\, (.+?)(\n)
 * {{cite book |last1=$4 |first1=$5 |pages=$2–$3 |chapter=$1 | editor-last1=Burns | editor-first1=Russell M. | editor-last2=Honkala | editor-first2=Barbara H. | title=Silvics of North America, Volume 1. Conifers. | publisher=US Forest Service, Department of Agriculture (US Government Printing Office) | location=Washington, DC | year=1991 | isbn=978-0160292606 }}$6
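For testing the single-author citation step offline, a Python sketch along these lines may help. The input line is a placeholder (not a real chapter), the leading `*` in the replacement is assumed, and `to_cite_book` is an invented name.

```python
import re

PATTERN = r"\*(\w+ \w+) (\d+) (\d+) (\w+)\, (.+?)(\n)"
REPLACEMENT = (
    r"*{{cite book |last1=\4 |first1=\5 |pages=\2–\3 |chapter=\1"
    r" | editor-last1=Burns | editor-first1=Russell M."
    r" | editor-last2=Honkala | editor-first2=Barbara H."
    r" | title=Silvics of North America, Volume 1. Conifers."
    r" | publisher=US Forest Service, Department of Agriculture"
    r" (US Government Printing Office)"
    r" | location=Washington, DC | year=1991 | isbn=978-0160292606 }}\6"
)


def to_cite_book(text: str) -> str:
    """Convert '*Genus species first last Surname, Given' lines to cite book."""
    return re.sub(PATTERN, REPLACEMENT, text)


# Placeholder line in the "species, first page, last page, author" layout.
cited = to_cite_book("*Genus species 10 20 Example, Ann B.\n")
print(cited, end="")
```

Because the surname group is (\w+), hyphenated or multi-word surnames are not matched and would need to be done by hand.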

Fill in the second column and add images. If desired, this can be added manually to the last column: {{Multiple image |perrow=2 | total_width = 360px | image_style = border:none; | border = infobox }}
 * image1 =
 * alt1 =landscape
 * image2 =
 * alt2 =landscape
 * image3 =
 * alt3 =bark
 * image4 =
 * alt4 =cone and foliage

List of Canadian forest-inventory conifers
Remove uppercase codes at end of each line: [A-Z ]+(\n) -> $1

Do lines where the common name isn't two words by hand; add "/" at the end

For the remaining lines, remove all but first two and last two words of each line: (\w+ \w+ )(.+?)(\w+ \w+)(\n) -> $1$3$4

Remove each /. Add * at the beginning of each line
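The clean-up steps so far can be tried out on sample text with a Python sketch like this one. The two input lines are invented (the trailing codes in particular are made up), the hand-editing step is simulated with a plain string replace, and the function names are hypothetical.

```python
import re


def strip_codes(text: str) -> str:
    """[A-Z ]+(\\n) -> $1: drop the uppercase codes at the end of each line."""
    return re.sub(r"[A-Z ]+(\n)", r"\1", text)


def keep_outer_words(text: str) -> str:
    """Keep the first two and last two words; lines ending in / are skipped."""
    return re.sub(r"(\w+ \w+ )(.+?)(\w+ \w+)(\n)", r"\1\3\4", text)


def finish(text: str) -> str:
    """Remove each / and put * at the start of every line."""
    text = text.replace("/", "")
    return "".join("*" + line for line in text.splitlines(keepends=True))


raw = (
    "Abies balsamea (L.) Mill. balsam fir ABIE BAL\n"
    "Larix laricina (Du Roi) K. Koch tamarack LARI LAR\n"
)
stripped = strip_codes(raw)
# Stand-in for the hand edit: one-word common name, marked with "/".
hand_edited = stripped.replace(
    "Larix laricina (Du Roi) K. Koch tamarack", "Larix laricina tamarack/"
)
final = finish(keep_outer_words(hand_edited))
print(final, end="")
```

The trailing "/" works as a guard because the trimming pattern needs a word character immediately before the newline.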

Add links and italics: \*(.+?) (\w+) (\w+)(\n)
 * $2 $3, $1$4

Check POWO for synonyms and Commons for sufficient images. Check on maps.

Add:

Key

 * Provinces: AB Alberta, BC British Columbia, MB Manitoba, NB New Brunswick, NL Newfoundland and Labrador, NS Nova Scotia, NT Northwest Territories, NU Nunavut, ON Ontario, PE Prince Edward Island, QC Quebec, SK Saskatchewan, YT Yukon

Species
Create the table \*\[\[(\w+ \w+)\]\]\,\[(.+?)\] (\w+ \w+)( \w+)?( \w+)?(\n)
 * -$6!scope="row" |$1 ($3$4$5)$6|100px|center|BC |alt=Species distribution in Canada$6|$6|$6|$6|$6

At some point, fill in the "family" column, and (if necessary) add explanations to the Key.

Add distribution maps

If necessary, add : \)(\n)\| )$1|$1|

Do images; get heights approximately even by cropping. Do alt text. Add license info to the talk page.

Create list of parameters for. Do regex on chapter pages and authors from Silvics in the form "@456-462 Silas Little and Peter W. Garrett": \|@(\d+)\-(\d+) (.+?) (\w+)(\n) \|@(\d+)\-(\d+) (.+?) (\w+) and (.+?) (\w+)(\n) \|@(\d+)\-(\d+) (.+?) (\w+)\, (.+?) (\w+)\, and (.+?) (\w+)(\n)
 * first1=$3 |last1=$4 |pages=$1–$2$5
 * first1=$3 |last1=$4 |first2=$5 |last2=$6 |pages=$1–$2$7
 * first1=$3 |last1=$4 |first2=$5 |last2=$6 |first3=$7 |last3=$8 |pages=$1–$2$9
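A Python sketch of the three author patterns follows. Two things are assumptions here: the replacements are given a leading `|` (inferred from the later patterns, which match `\|first1=`), and the rules are applied with the three-author and two-author patterns before the single-author one, because the single-author pattern also matches multi-author lines. The second surname is written to `last2`. Only the Little and Garrett line comes from the notes; the other names are placeholders, and `to_params` is an invented name.

```python
import re

RULES = [
    # three authors: "A B, C D, and E F"
    (r"\|@(\d+)\-(\d+) (.+?) (\w+)\, (.+?) (\w+)\, and (.+?) (\w+)(\n)",
     r"|first1=\3 |last1=\4 |first2=\5 |last2=\6 |first3=\7 |last3=\8 |pages=\1–\2\9"),
    # two authors: "A B and C D"
    (r"\|@(\d+)\-(\d+) (.+?) (\w+) and (.+?) (\w+)(\n)",
     r"|first1=\3 |last1=\4 |first2=\5 |last2=\6 |pages=\1–\2\7"),
    # single author; must run last because it matches any "|@..." line
    (r"\|@(\d+)\-(\d+) (.+?) (\w+)(\n)",
     r"|first1=\3 |last1=\4 |pages=\1–\2\5"),
]


def to_params(text: str) -> str:
    """Turn '|@pages authors' lines into first/last/pages parameters."""
    for pattern, replacement in RULES:
        text = re.sub(pattern, replacement, text)
    return text


print(to_params("|@456-462 Silas Little and Peter W. Garrett\n"), end="")
# prints |first1=Silas |last1=Little |first2=Peter W. |last2=Garrett |pages=456–462
```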

Append chapter names: "row" \|\[\[(\w+ \w+)\]\](.+?)(\n)\|(.+?)\n\|(.+?)\n\| "row" |$1$2$3|$4$3|$5 |chapter=$1$3|

Create a blank "References" section. Create a bulleted separate entry in References for each list of parameters, except: when the author(s) is/are the same, combine into one ref.

Convert this bulleted list into properly formatted citations, swapping the "first1" and "last1" on each line: ''(\n) \*\|first1=(.+?)\|last1=(.+?)\|
 * editor-last1=Burns | editor-first1=Russell M. | editor-last2=Honkala | editor-first2=Barbara H. | title=Silvics of North America, Volume 1. Conifers. | url=https://www.fs.usda.gov/research/treesearch/1547 | publisher=United States Government Printing Office (Department of Agriculture, Forest Service) | location=Washington, DC | year=1991 | isbn=978-0160292606 }}$1
 * {{cite book |last1=$2|first1=$1|

Alphabetize the list

Convert each list of parameters into an appropriate {{sfn}} citation:
 * (\n)\|first1=(.+?)\|last1=(.+?) \|first2=(.+?)\|last2=(.+?) \|first3=(.+?)\|last3=(.+?) \|pages=(\d+.\d+) -> $1|{{sfn|$3|$5|$7|1991|pp=$8}}
 * (\n)\|first1=(.+?)\|last1=(.+?) \|first2=(.+?)\|last2=(.+?) \|pages=(\d+.\d+) -> $1|{{sfn|$3|$5|1991|pp=$6}}
 * (\n)\|first1=(.+?)\|last1=(.+?) \|pages=(\d+.\d+) -> $1|{{sfn|$3|1991|pp=$4}}
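The {{sfn}} conversion can be sketched in Python as below, assuming each parameter list sits on its own line directly after a newline and that the three patterns are applied in the order listed (most authors first, since the single-author pattern would otherwise swallow the longer lines). The name `to_sfn` and the single-author sample are invented.

```python
import re

RULES = [
    (r"(\n)\|first1=(.+?)\|last1=(.+?) \|first2=(.+?)\|last2=(.+?)"
     r" \|first3=(.+?)\|last3=(.+?) \|pages=(\d+.\d+)",
     r"\1|{{sfn|\3|\5|\7|1991|pp=\8}}"),
    (r"(\n)\|first1=(.+?)\|last1=(.+?) \|first2=(.+?)\|last2=(.+?)"
     r" \|pages=(\d+.\d+)",
     r"\1|{{sfn|\3|\5|1991|pp=\6}}"),
    (r"(\n)\|first1=(.+?)\|last1=(.+?) \|pages=(\d+.\d+)",
     r"\1|{{sfn|\3|1991|pp=\4}}"),
]


def to_sfn(text: str) -> str:
    """Replace first/last/pages parameter lines with short footnotes."""
    for pattern, replacement in RULES:
        text = re.sub(pattern, replacement, text)
    return text


table = (
    "\n|first1=Silas |last1=Little |first2=Peter W. |last2=Garrett |pages=456–462"
    "\n|first1=Ann B. |last1=Example |pages=1–9\n"
)
print(to_sfn(table))
```

The unescaped dot in (\d+.\d+) is what lets the page range match an en dash as well as a hyphen.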

Remove "chapter=..." from these lines: \|chapter(.+?)(\n) -> $2

Add end-sections

...

If I'm using the PLANTS database, this will add the species name to each sfn: "row" \|''\[\[(\w+ \w+)\]\](.+?)(\n)\|(.+?)\n\|(.+?)\|2023\|loc= "row" |$1$2$3|$4$3|$5|2023|loc=$1'':

...

convert e.g. "*https:...symbol=PIST, loc=Fact Sheet,|first2=John |last2=Dickerson" ^\*(.+?)\, loc=(.+?)\,(.+?)(\n)
 * $3 |url=$1$4

If necessary, to add Burns citations to the blank column: \}\{\{(.+?)\|1991\|pp=(\d+.\d+)\}\}(\n)\|\n\|(\w+) family }$3|$3|$4 family

[add where needed]

Adding a cite after : (\n)\|\{\{sfn\|National(.+?)cs\}\}\{\{(.+?)\}\}\n\|\{\{(.+?)\}\}\n\-\-\-\-\n $1|$1|$1$1

Plant family tables
Adding hair space before cites in 3rd col: !scope="row"(.+?)(\n)\|(.+?)\n\|(.+?)\{\{ !scope="row"$1$2|$3$2|$4&hair;$1|-

Removing "thumb" etc. from John's raw image lists: (\.jpg)\|(.+?)\]\](\n) $1]]$3
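As a minimal check of the image clean-up, the same pattern can be run in Python. The file name and caption below are placeholders and `strip_image_options` is an invented name. Since the pattern is case-sensitive and only looks for .jpg, files ending in .JPG or .png are left as they are.

```python
import re


def strip_image_options(text: str) -> str:
    """(\\.jpg)\\|(.+?)\\]\\](\\n) -> $1]]$3: drop everything after the file name."""
    return re.sub(r"(\.jpg)\|(.+?)\]\](\n)", r"\1]]\3", text)


line = "[[File:Example tree.jpg|thumb|Example caption]]\n"
print(strip_image_options(line), end="")  # prints [[File:Example tree.jpg]]
```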

Adding * and colon: [[F
 * [[:F

convert raw list of images to table: (\n)(.+?)\n\*\[\[\:File\:(.+?)\]\]\n\*\[\[\:File\:(.+?)\]\]\n
 * $1|-$1

(\n)(.+?)\n\*\[\[\:File\:(.+?)\]\]\n
 * $1|-$1

removing (...) in last col: \((.+?)\)(\n)\|\- $2|-

This might be needed after copying an image column: \}\}(\n)\n\|\- }}$1|-

Replacing @ for cites (example): @(\d+)\-(\d+)

@(\d+)

Double-hyphen to dash: (\d+)\-\-(\d+) -> $1–$2
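In Python the same substitution looks like this (a minimal sketch; the function name `fix_ranges` is made up):

```python
import re


def fix_ranges(text: str) -> str:
    """Turn a double hyphen between two numbers into an en dash."""
    return re.sub(r"(\d+)\-\-(\d+)", r"\1–\2", text)


print(fix_ranges("pp=456--462"))  # prints pp=456–462
```

Only double hyphens with digits on both sides are touched, so other uses of "--" in the text stay as they are.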

Moving POWO cites to the end of the cell: (1 genus|\d genera)\,\{\{sfn\|POWO\|loc=(\w+)\}\}(.+?)(\n) $1,$3$4

Remove Chr cites from Orders, or add them to the first column, as needed

TFA
(\W\W)\|\| (\w+)\|\| (\d{4}) $1|| $2 || $3