Wikipedia:List of Wikipedians by article count

This is a list of Wikipedians by number of article creations.


 * The list is regenerated once a month, starting on the 1st. A run takes between one day and two weeks to complete, depending on load.
 * To opt out, add your name to this page; your position in the list will then appear as "Anonymous".
 * To display a template that will help keep track of this accomplishment, add to your user page.
 * For an alternative list of the same thing with more data, see Wikiscan.
 * Per WP:MASSCREATION, creating articles in bulk with any sort of automation requires community approval, and mass creation of articles can generally be disruptive. The order of the top 100 editors has therefore been scrambled to discourage mass creation by competitive users who seek a high rank, such as becoming the #1 Wikipedia editor.

For more information, see the FAQ, which describes how the list is created.

5001–10000
Continued at /5001–10000/.

FAQ

 * 1. How often does it update?


 * As of October 2021, each run starts on the 1st of the month and completes one day to two weeks later.


 * 2. What if it doesn't update on time?


 * Post a question at Wikipedia talk:List of Wikipedians by article count.


 * 3. What if the data is suspect - the numbers don't seem right or my name is missing?


 * Wait for the next run to confirm. The next run might clear it up. Also compare with prior runs.


 * 4. How does it count creations?


 * Only pages that exist at the time of list generation are counted; deleted pages are not. Because the list is generated by looking at the user associated with the oldest edit to a particular page, it may fail to account for scenarios such as a user creating a redirect that is later expanded by a different user, or an article that is later converted to a redirect.
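
As an illustration of "the user associated with the oldest edit", the following is a minimal sketch of how the first revision of a page can be requested from the MediaWiki Action API and how the creator can be read from the JSON reply. It is not the bot's actual code; the function names are hypothetical, and the reply layout assumes `formatversion=2`.

```python
from typing import Optional

# Standard English Wikipedia API endpoint (shown for context; no request is sent here).
API_URL = "https://en.wikipedia.org/w/api.php"


def first_revision_params(title: str) -> dict:
    """Build query parameters asking for the oldest revision of one page."""
    return {
        "action": "query",
        "format": "json",
        "formatversion": "2",
        "prop": "revisions",
        "titles": title,
        "rvlimit": "1",     # only one revision is needed
        "rvdir": "newer",   # list revisions oldest first
        "rvprop": "user",   # only the editor's name is needed
    }


def creator_from_response(data: dict) -> Optional[str]:
    """Return the user of the first revision, or None if it cannot be determined."""
    pages = data.get("query", {}).get("pages", [])
    if not pages or pages[0].get("missing"):
        return None  # page does not exist (for example, it was deleted)
    revisions = pages[0].get("revisions", [])
    if not revisions:
        return None
    return revisions[0].get("user")  # None if the user name is hidden
```

The parameters would be sent as a GET request to `API_URL`. Note that this returns whoever made the oldest surviving edit, which is why a redirect creator is credited even if someone else later wrote the article.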


 * 5. How does it work technically?


 * Due to the size of Enwiki, it does not use SQL queries, because of the CPU and memory load; instead it is 100% API driven. At the start of each run, it generates the complete list of ~6 million article titles via the API, which takes a few hours. For each title, it queries the API to see who made the first revision. The result is saved in an index file comprising two columns: the article title and who created it. The purpose of the index is speed: the article creator doesn't change and only needs to be determined once, so future runs don't have to query the API for every article. On the next run, it generates a new list of ~6 million article titles and checks whether each one is in the index from the previous run. If not (i.e. the article is new or was renamed), it retrieves the creator from the API and adds it to the index; this takes 12-48 hours. Likewise, any deleted pages (in the old index but not in the new list) are removed from the index. It then counts the index and posts the top 10,000. It operates in under 40MB of memory because it holds neither the full list of users nor the full index in memory, relying on a merge sort instead. As such it can scale indefinitely as Wikipedia grows in size, and it is easy on Wikimedia resources.
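
The index update described above can be sketched as a streaming merge of two sorted inputs. This is a minimal illustration under stated assumptions, not the bot's actual code: both inputs are assumed to be sorted in the same order, `update_index`, `top_creators` and `lookup_creator` are hypothetical names, and `lookup_creator` stands in for the API request for a page's first revision.

```python
from collections import Counter
from typing import Callable, Iterable, Iterator, List, Tuple

IndexRow = Tuple[str, str]  # (article title, creator)


def update_index(
    titles: Iterable[str],
    old_index: Iterable[IndexRow],
    lookup_creator: Callable[[str], str],
) -> Iterator[IndexRow]:
    """Merge a sorted title list with a sorted old index, one row at a time."""
    old_iter = iter(old_index)
    old_row = next(old_iter, None)
    for title in titles:
        # Drop index rows whose title is no longer in the list (deleted or renamed).
        while old_row is not None and old_row[0] < title:
            old_row = next(old_iter, None)
        if old_row is not None and old_row[0] == title:
            yield old_row  # creator already known from a previous run
            old_row = next(old_iter, None)
        else:
            yield (title, lookup_creator(title))  # new title: ask the API


def top_creators(index: Iterable[IndexRow], n: int) -> List[Tuple[str, int]]:
    """Count creations per user. Simplification: keeps one counter per user in memory."""
    return Counter(creator for _, creator in index).most_common(n)
```

Because only the current row of each input is held at any time, the merge step needs constant memory no matter how many titles there are. The counting helper is the simpler in-memory variant; the text above says the bot avoids holding the full user list in memory, so its counting step presumably differs.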


 * 6. How can I watchlist the list?


 * Watchlist one of the sub-pages such as WP:List of Wikipedians by article count/1–1000. Watchlisting this page (WP:List of Wikipedians by article count) will not work since regular updates are made to the sub-pages which are transcluded into this one.


 * 7. Where does it run?


 * It runs on a computer at my personal lab. It previously ran on Toolforge, but after 2023, they changed from the Grid to Containers, so I moved all my tools off Toolforge. The bot is designed to crash and restart without loss of data mid-process, so any temporary problems with the home system will not be noticed. In practice, my home system is more stable than Toolforge was.


 * The GitHub page.


 * 8. How long does it take?


 * In most cases it should finish in no more than 48 hours. Building an index for the first time can take 7 days or more. Periodically the index is purged and rebuilt to account for username renames.


 * 9. Where was it discussed?


 * Request_a_query (September 23, 2019)
 * Bot_requests (September 16, 2019)


 * 10. Why are the top 100 randomly sorted?
 * A concern was raised here (April 2021) that a goal to be #1 could result in unhealthy competition or self-aggrandizement, which can lead to questionable editing practices such as machine-assisted stub creation. Could a goal to be among the top 100 lead to the same problem? Possibly, but the software can be adjusted to scramble the top 500 or whatever might be required. Short of having no list at all there is no perfect solution, but this helps to de-emphasize the #1 position. This is a software option that can be disabled.
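
A minimal sketch of such a scrambling step, assuming the list is already ranked by creation count. The function name `scramble_top` and its parameters are hypothetical and are not taken from the bot's source.

```python
import random
from typing import List, Optional, Sequence, Tuple

Entry = Tuple[str, int]  # (username, number of article creations)


def scramble_top(
    ranked: Sequence[Entry],
    top_n: int = 100,
    rng: Optional[random.Random] = None,
) -> List[Entry]:
    """Shuffle the first top_n entries and leave the rest in ranked order."""
    rng = rng or random.Random()
    head = list(ranked[:top_n])
    rng.shuffle(head)  # in-place shuffle of the top block only
    return head + list(ranked[top_n:])
```

Setting `top_n` to 500 would widen the scrambled block, and setting it to 0 would disable the feature, matching the adjustable option described above.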


 * 11. Does it work with other language wikis?
 * Yes! It was designed with that in mind. Post on the talk page to set it up. It currently runs on Enwiki, Trwiki, and Slwiki. The top-100 feature discussed in FAQ #10 can be enabled or disabled.