User talk:AlexNewArtBot/WomensHistory


 * Latest results at User:AlexNewArtBot/WomensHistorySearchResult

Current problem
I suppose picking up enough dates in the different bands will generate a hit. I'll tweak the rules. Rich Farmbrough, 16:05, 13 May 2012 (UTC).


 * Thanks. Maile66 (talk) 17:01, 13 May 2012 (UTC)

May 14
Looks better than yesterday, Rich. But it's still 502,746 bytes, 1,285 articles, which at a first-glance scan don't seem to have a lot to do with women. And it seems to like content from certain editors, which might mean nothing other than that some editors tend to focus on a given genre or topic. But it's getting better. Maile66 (talk) 17:29, 14 May 2012 (UTC)
 * You can look at User:TedderBot/NewPageSearch/WomensHistory/errors to see how certain phrases were used. For instance, I saw Centaur (1849 ship) on there. Obviously it doesn't belong. If you look at "errors", you'll see this:

Score: 10, pattern: ((nine)teenth|18th|18) century|\b18\d\d\b, inhibitor count: 2
 * So it matched "nineteenth" or some version of that. But the score was defaulted (it shouldn't have gotten 10 points for matching that). I'm comparing it to User:AlexNewArtBot/Oregon, which works for a higher default score and for specific point values. It looks like the inhibitors are wrong (they should have /slashes/). Actually, looking at it again, the issue is that "nineteenth" was matched several times. I'd suggest much lower points for those queries, or joining a "date" query with a "content" query, like "woman.*nineteenth|nineteenth.*woman". tedder (talk) 23:03, 14 May 2012 (UTC)
 * I had assumed that they weren't multiply matched. That is a bit of a disaster, as there is already a bias towards longer articles. Despite my Arb Case, which is going pretty badly at the moment, I will attempt to look at this tomorrow. Rich Farmbrough, 01:04, 15 May 2012 (UTC).


 * Thanks. And I understand if you aren't able to. I added two lines of keywords, and Tedder adjusted them. Maile66 (talk) 01:11, 15 May 2012 (UTC)
 * I didn't know about multiple matches either. I should change that, as they should only score once. tedder (talk) 02:18, 15 May 2012 (UTC)
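
For readers following the multiple-match point above, here is a minimal Python sketch of the difference between scoring every occurrence of a pattern and scoring it once per article. It is only an illustration under stated assumptions, not the bot's actual code: the function names, the sample sentence, and the case-insensitive flag on the combined query are hypothetical. The pattern and the 10-point value are copied from the errors entry quoted above, and the combined query is the one tedder suggested.

```python
import re

# Pattern and point value copied from the "errors" entry quoted in this thread.
DATE_PATTERN = re.compile(r"((nine)teenth|18th|18) century|\b18\d\d\b")
POINTS = 10

# The combined "date" + "content" query suggested in the thread.
# Assumption: case-insensitive, and "." may span line breaks.
COMBINED = re.compile(r"woman.*nineteenth|nineteenth.*woman",
                      re.IGNORECASE | re.DOTALL)


def score_per_match(text):
    """Add the points for every occurrence, which favours long articles."""
    return POINTS * sum(1 for _ in DATE_PATTERN.finditer(text))


def score_once(text):
    """Add the points at most once, however often the pattern occurs."""
    return POINTS if DATE_PATTERN.search(text) else 0


def score_combined(text):
    """Score only when a date term and a content term both appear."""
    return POINTS if COMBINED.search(text) else 0


# Hypothetical sample text, loosely modelled on a ship article.
ship = "The ship was launched in 1849 and wrecked in 1874, late in the nineteenth century."
print(score_per_match(ship), score_once(ship), score_combined(ship))
```

In this sketch the ship sentence contains three date matches, so per-match scoring triples its score, while the combined query gives it nothing because no content term is present.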

May 15
Today's count is 58,056 bytes and 146 articles. Better. I think the added search terms helped. I still see males and unrelated articles, but it's better. That Jim Reese (Texas politician) seems to show up no matter what we do. I added the rest of my search terms today. Maybe Tedder will have a look at my additions. Maile66 (talk) 16:52, 15 May 2012 (UTC)

Suggestions
+2 for 'she', -1 for 'he'? Change the rule for 'twentieth' etc. to only register dates early in the 20th century? Dsp13 (talk) 20:48, 15 May 2012 (UTC)
 * And, hopefully, Tedder or somebody will read this and know what you meant by that. I'm not a programmer. Maile66 (talk) 21:12, 15 May 2012 (UTC)
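
In case it helps to see the suggestion above spelled out, here is a rough Python sketch of what it might look like. Everything in it is an assumption made for illustration: the function names are hypothetical, "early" is taken to mean 1900 to 1929 because the thread gives no cutoff, and each pronoun is scored once per article in line with the earlier comment that matches should only score once. It does not use the bot's own rule syntax.

```python
import re

# Whole-word, case-insensitive pronoun patterns (an assumption).
SHE = re.compile(r"\bshe\b", re.IGNORECASE)
HE = re.compile(r"\bhe\b", re.IGNORECASE)

# Assumption: "early in the 20th century" means the years 1900-1929.
EARLY_C20 = re.compile(r"\b19[0-2]\d\b")


def pronoun_score(text):
    """Return +2 if 'she' appears, -1 if 'he' appears, each counted once."""
    score = 0
    if SHE.search(text):
        score += 2
    if HE.search(text):
        score -= 1
    return score


def has_early_c20_date(text):
    """True if the text mentions a year in the assumed early-century range."""
    return bool(EARLY_C20.search(text))


print(pronoun_score("She was a suffragist."))
print(has_early_c20_date("The league was founded in 1912."))
```

The word-boundary markers keep 'he' from matching inside 'she' or 'the', so an article that only uses 'she' is not penalised.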