Wikipedia:Miscellany for deletion/Wikipedia:Database reports/Talk pages by size

 __NOINDEX__
 * The following discussion is an archived debate of the proposed deletion of the miscellaneous page below. Please do not modify it. Subsequent comments should be made on the appropriate discussion page (such as the page's talk page or in a deletion review).  No further edits should be made to this page.

The result of the discussion was  keep. — ξ xplicit  00:19, 30 November 2010 (UTC)

Database reports/Talk pages by size
This is basically a tally of the total size of a given talk page including its archives. Unlike a report on single page sizes, this is not greatly productive, as we typically do not concern ourselves with how many archives a page has so long as they're individually within a reasonable length. I can't think of any reason to devote resources to updating this, and it's already being misinterpreted as being some kind of warning sign. Chris Cunningham (user:thumperward: not at work) - talk 12:58, 21 November 2010 (UTC)


 * I generally agree that large talk pages are more of a curiosity than anything else. But I don't think that deleting the page will make any difference. Anyone can upload the data into their user space, instead of Wikipedia space, and link to it there. Or they could post the data as an HTML page on the toolserver and link to it. In other words, deleting the page will not make the information go away as long as people are interested in it. -- Carl (CBM · talk) 13:10, 21 November 2010 (UTC)


 * Keep. As the author of the report, I think having statistics on what the most active discussion pages on Wikipedia are can be useful, or at least interesting. I don't know why Wavelength thinks those pages are some kind of a problem (maybe he doesn't understand WP:PERF?), but the fact that someone is misrepresenting data doesn't mean we should hide those data. Svick (talk) 13:18, 21 November 2010 (UTC)


 * Keep because I think it might be useful somehow...in tracking controversial pages maybe (?) Casliber (talk · contribs) 14:10, 21 November 2010 (UTC)
 * Keep - As well as the reasons above:
 * As WP continues to grow, pages like this are useful not only for determining the most active talk pages but also for providing metrics for future storage requirements. If we include all the historical snapshots across the wikis, it goes into the terabytes.
 * Having pages like this helps us to determine if we will need to take measures to reduce that footprint in the future like eliminating the spaces between = and titles, extra white spaces in talk page banners and other templates, adding some kind of archive feature to reduce the size of the snapshots by article, etc.
 * It also helps to provide visibility in case someone is using it for inappropriate means. If we suddenly see a page like a user page jump to the top of the list, that might be a sign that something has gone wrong.
 * Suggestion though - It might be useful to note the last date the page was updated and the size of the last edit if possible. --Kumioko (talk) 14:20, 21 November 2010 (UTC)
 * I don't understand your suggestion. The report includes "data as of 22:47, 20 November 2010 (UTC)". And the page history stores the edit time and page size. --MZMcBride (talk) 20:51, 21 November 2010 (UTC)
 * Delete - Just an overall ill-conceived idea; there's no need to address talk-page length on a generic, site-wide basis. "Saving kilobytes"? Really? No. Users can archive their own talk pages, or not, as they see fit; there's no need to watch over their shoulders. As for article talk pages, the regular users and participants in those pages will have much more of a feel for the pace of the discussions, and for what to set the archiving time to and when. Honestly, this is a solution in search of a problem. Tarc (talk) 14:53, 21 November 2010 (UTC)
 * Oh I don't know - there are, among many predictable pages, some surprises there, like Talk:Monty Hall problem and Wikipedia talk:WikiProject Ice Hockey, why not keep a record of these things so we may learn...something I guess. Casliber (talk · contribs) 15:13, 21 November 2010 (UTC)
 * Reply to Tarc. First, I highly doubt WP is going to start taking advertisements, so no worry there. Just brainstorming. When I mention the spaces issue, it's not so silly as you would make it out to be. Think of it like this: when you fill up the hard drive on your computer, what do you do? You delete some stuff, empty the recycle bin, and delete the temporary files. In WP we save history; there is no recycle bin, so everything just keeps building. Spaces in WP are like that: extra clutter that over time can build up, take up space, and waste resources. Including history, the WP database takes several terabytes (about 6, I think, last I heard), and at last count somewhere between 20 and 30 GB (maybe more) of it is wasted spaces, and that increases exponentially every day. I have no idea what the vandalism edits are, but given the rate of vandalism in WP I think we can safely assume another 10 GB (and that's probably way, way low). All I'm trying to say is that if we manage what we have more efficiently, it will require fewer dollars to maintain, and thereby fewer donations will be needed to maintain it. --Kumioko (talk) 15:37, 21 November 2010 (UTC)
 * I think you need to read WP:Don't worry about performance. If spaces, vandal revisions, or anything else really becomes a space problem, the people whose job (paid or volunteer) it is to actually watch for that sort of thing will tell us. Last I heard, the response to your concern from those who know was "terabytes are cheap". Anomie⚔ 17:43, 21 November 2010 (UTC)


 * Keep This page might serve a useful purpose in identifying active talk pages (as said above). Wavelength's posts about it to several of those pages are not relevant to this page's existence but should instead be discussed with him directly. Regards So Why  18:56, 21 November 2010 (UTC)


 * Keep: What SoWhy said. --MZMcBride (talk) 20:43, 21 November 2010 (UTC)


 * Comment: There are (generally speaking) two types of database reports: utility reports and statistical reports. Utility reports are used by community members to improve the encyclopedia (e.g., ). Statistical reports are simply used for informational value or historical informational value (e.g., ). This report is clearly intended to be a statistical report, not a utility report. Anyone treating it as a utility report should be ignored. There's no "call to action" here, nor should there be. --MZMcBride (talk) 21:21, 21 November 2010 (UTC)
 * Keep - Page misuse and terabytes are features external to that page. They do have some relevance, but it is the page content itself that should be the main focus of the discussion. As for the page content itself, the page simply provides some information about Wikipedia (what MZMcBride refers to as a statistical report), which is what Project namespace discusses: "Many pages simply provide some information about Wikipedia (like this page)". If you know of a policy, guideline, etc. that more particularly covers this situation, I would be happy to review that and reconsider my keep position. -- Uzma Gamal (talk) 00:39, 22 November 2010 (UTC)


 * Exclude Archives, which would then make this a useful page for determining which talk pages require archiving. D O N D E groovily   Talk to me  14:21, 22 November 2010 (UTC)
 * You want . --MZMcBride (talk) 19:34, 22 November 2010 (UTC)
 * Question How do we know this is accurate? What if someone created an archive page and didn't put any archive tags on it? What if people have non-talk content in a talk space? D O N D E groovily   Talk to me  14:24, 22 November 2010 (UTC)
 * It considers all subpages of talk pages, regardless of whether or not they are archives. So that includes the talk pages of individual RFAs, for example. Looking through the list, I do see one incorrect item. The software considers pages like Talk:9/11 conspiracy theories to be subpages of Talk:9, and so does the report. Reach Out to the Truth 17:17, 22 November 2010 (UTC)
 * Yes, this. Ideally the database would track whether a page is actually a subpage with a page.page_is_subpage column or something. Alas. --MZMcBride (talk) 19:34, 22 November 2010 (UTC)
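The subpage behaviour described in the two comments above can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that a report groups pages by the text before the first slash in the title; the report's actual query is not shown in this discussion, and `root_title`, `total_sizes`, and the sizes in `sample_pages` are made-up names and numbers.

```python
from collections import defaultdict


def root_title(title: str) -> str:
    """Return the part of a page title before the first slash (assumed grouping rule)."""
    return title.split("/", 1)[0]


def total_sizes(pages: dict[str, int]) -> dict[str, int]:
    """Sum page sizes per root title, counting every 'subpage' toward its root."""
    totals: defaultdict[str, int] = defaultdict(int)
    for title, size in pages.items():
        totals[root_title(title)] += size
    return dict(totals)


# Hypothetical sizes in kilobytes, chosen only to make the arithmetic easy to follow.
sample_pages = {
    "Talk:Monty Hall problem": 120,
    "Talk:Monty Hall problem/Archive 1": 300,
    "Talk:9": 5,
    "Talk:9/11 conspiracy theories": 400,
}

totals = total_sizes(sample_pages)
for title, size in sorted(totals.items(), key=lambda item: item[1], reverse=True):
    print(title, size)
```

Under this rule the archive is correctly folded into Talk:Monty Hall problem, but Talk:9/11 conspiracy theories is also folded into Talk:9, which is the incorrect item noted above. A title string alone cannot distinguish the two cases.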

—Wavelength (talk) 18:00, 23 November 2010 (UTC)
 * Comment (six in one):
 * There can be Database reports/Active talk pages by size.
 * There can be Database reports/Archived talk pages by size, for subpages containing the character string "rchiv" (allowing for "Archive", "archives", "Archival", and "archiving").
 * There can be Database reports/Total talk pages by size, combining active and archived talk pages.
 * Each of those three pages can have entries ranked according to their respective descriptors, while also having (in a sortable wikitable) columns for the other two kinds of data, for comparison.
 * The current page can be redirected to the appropriate new page.
 * (Database reports/Talk pages by size was provided in response to a request at Wikipedia talk:Database reports [permanent link here ], by an IP with no other contribution.)
 * Get coding. --MZMcBride (talk) 19:39, 23 November 2010 (UTC)
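As a rough sketch of the three-way split proposed above, the following Python treats any subpage whose name contains the string "rchiv" as archived and everything else as active, then reports active, archived, and total sizes side by side. The function names, the sample titles, and the sizes are all hypothetical; this is one possible reading of the proposal, not an existing report.

```python
def is_archive(title: str) -> bool:
    """True if the title has a subpage part containing 'rchiv' (as in the proposal)."""
    _, slash, subpage = title.partition("/")
    return bool(slash) and "rchiv" in subpage


def size_columns(pages: dict[str, int]) -> dict[str, dict[str, int]]:
    """Build active / archived / total size columns per root talk page."""
    rows: dict[str, dict[str, int]] = {}
    for title, size in pages.items():
        root = title.partition("/")[0]
        row = rows.setdefault(root, {"active": 0, "archived": 0, "total": 0})
        row["archived" if is_archive(title) else "active"] += size
        row["total"] += size
    return rows


# Hypothetical sizes in kilobytes.
sample_pages = {
    "Talk:Example": 50,
    "Talk:Example/Archive 1": 200,
    "Talk:Example/archives/2009": 150,
    "Talk:Example/FAQ": 10,
    "Talk:Other": 30,
}

rows = size_columns(sample_pages)
# Rank root pages by total size, largest first, as one of the three proposed orderings.
ranked_by_total = sorted(rows, key=lambda root: rows[root]["total"], reverse=True)
```

Sorting on the "active" or "archived" key instead of "total" would give the other two proposed rankings from the same table. The substring test is case-sensitive and misses subpages that hold archived content under other names, so it shares the accuracy limits raised earlier in the discussion.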


 * Keep, but please explain where the information came from. Redirecting to regularly created automatic reports is OK too. Graeme Bartlett (talk) 20:52, 23 November 2010 (UTC)
 * ✅ -- Uzma Gamal (talk) 16:29, 25 November 2010 (UTC)


 * Keep, I don't see how this is detrimental to the project. Might even entice people to archive big ones. Headbomb {talk / contribs / physics / books} 23:24, 24 November 2010 (UTC)
 * The above discussion is preserved as an archive of the debate. Please do not modify it. Subsequent comments should be made on the appropriate discussion page (such as the page's talk page or in a deletion review). No further edits should be made to this page.