Talk:Statistics of the COVID-19 pandemic in the United States

Daily new tests graphs - what is going on?
We currently have two graphs that show new tests. One shows total new tests per day (smoothed), and reflects an increase from zero to almost 1.8 million per day over the past eight months. The second graph is per capita (also smoothed), and shows a shallow, wobbly increase over the same time from 2.3/1000 to 4.something/1000. It also compares the US to Luxembourg, Denmark, and the UAE (the utility of which is a different discussion), the lines for which behave normally.

The second chart makes no sense, and doesn't match the graph from the second reference (OWID), which properly starts at zero and moves in a manner consistent with the bulk testing numbers. The first ref appears to be OWID's data dump. I believe this is the file with national data (more than one source mixed in). It has covidtracking.com and healthdata.gov interleaved, as well as a few irregularities in notation.

Can someone explain how the numbers in our graph are being calculated? pauli133 (talk) 18:34, 25 November 2020 (UTC)
 * Since it hasn't been updated in a couple weeks, I've commented the graph out for now. pauli133 (talk) 12:27, 26 November 2020 (UTC)
 * It is a bit more labor intensive for me to pull this data - I was using just the OWID source. I do agree that the "smoothed" data for the US has oscillations in it - while the other nations interestingly don't. I have been torn about this data set mostly because comparing to the "top 3" per capita countries is a bit apples to oranges, since their populations are all 2 orders of magnitude smaller. If we want to restore it, I can see if I can get non-oscillating data (it did have the appropriate trend for what it's worth). Scotty.tiberius (talk) 00:46, 11 December 2020 (UTC)
 * Ok, I have a script that is a lot cleaner at generating the text for this. Just updated through Dec 10th (the latest data I had pulled).  This plot is a different test number, because it is per thousand, instead of just a raw test number. Scotty.tiberius (talk) 12:06, 16 December 2020 (UTC)
 * Think this can be removed from Talk Page - any objections? Scotty.tiberius (talk) 15:59, 23 December 2020 (UTC)
 * Fine with me, archive away. It wouldn't be bad to document on the talk page how all the various bits of data are collected, processed, and posted, though. pauli133 (talk) 16:23, 23 December 2020 (UTC)
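
For reference, the per-thousand figure asked about above can be sketched as follows. This is only an illustration under stated assumptions, not the script actually used for the graph: the 7-day trailing window, the round population figure, and all function names are hypothetical, and the sample numbers are made up.

```python
# Sketch only: window size, population figure, and names are assumptions.
US_POPULATION = 331_000_000  # assumed round figure, not from the data source


def daily_new(cumulative):
    """Turn a cumulative series into daily increments (first day counts from zero)."""
    return [cur - prev for prev, cur in zip([0] + cumulative[:-1], cumulative)]


def rolling_mean(values, window=7):
    """Trailing mean over up to `window` most recent values."""
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out


def per_thousand(values, population):
    """Scale raw counts to a rate per 1,000 people."""
    return [v * 1000 / population for v in values]


cumulative_tests = [0, 1_000_000, 2_200_000, 3_300_000]  # made-up numbers
smoothed = rolling_mean(daily_new(cumulative_tests), window=7)
rate = per_thousand(smoothed, US_POPULATION)
```

As a sanity check on scale, 1.8 million tests per day over an assumed population of about 331 million works out to roughly 5.4 per thousand.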

Death Projections
Currently, the article reads: "Then, at the end of May, the CDC correctly projected the death toll would surpass 115,000 by June 20, by which point it had reached ~112,000. The CDC ensemble forecast on July 31 also correctly predicted at least 168,000 total deaths by August 22, by which point it had reached ~165,000. As of October 17, 2020, the CDC projected the cumulative number of deaths to reach 230,000-250,000 by November 14, four weeks from the estimate." I believe I was involved in writing these sentences (or some earlier version of them) on the original "COVID-19 pandemic in the United States" article before this "Statistics" article was split off.

In my recollection and my view, the original point of these sentences was simply to show that the CDC is generally an accurate source of death projections. (Corollary: If the CDC warns that Americans are going to die, Americans should listen.) There was nothing special about the July 31, August 22, or November 14 dates or the death projections for them. We just wanted people to pay attention and care at the time. These were just examples. They happened to appear in some news articles that we happened to see. I believe the CDC typically keeps a running projection of what's going to happen 3 or 4 weeks in advance -- expressed as a range of the likely total death count -- and their prediction usually proves correct. I suggest we write a more general comment explaining this, to entirely replace the existing sentences. Objections?

- Tuckerlieberman (talk) 22:29, 10 December 2020 (UTC)

Concerns about splitting off the statistics page
You are invited to join the discussion at Wikipedia talk:WikiProject COVID-19 § Article splitting. Sdkb talk 00:38, 30 November 2020 (UTC)


 * The discussion linked-to above is from 30 November 2020, and has since been archived; it is thus now obsolete. The discussion can be found at Archive #12 of the WikiProject COVID-19's Talk Page, here. Best regards to everyone, Mercy11 (talk) 14:44, 2 March 2022 (UTC)

Captions for the "Number of U.S. positive test individuals by date" charts.
All of the captions work except for the last chart. The caption reads "50,000–180,000 positive test individuals," but the Y-axis shows to 140,000 cases. In addition, the next chart up, the caption says "160,000–260,000 positive test individuals."

I don't know if this is deliberate or not, but it just struck me as odd.

TrilliumLady (talk) 06:20, 7 December 2020 (UTC)TrilliumLady
 * There are two things going on here: first, the charts are shuffled and renamed as the case counts change, and sometimes things are overlooked. Second, the graphs themselves are dynamically generated based on the data, so the Y axis will crop itself to fit.
 * If you see something that doesn't look right, you are encouraged to change it :) pauli133 (talk) 02:19, 9 December 2020 (UTC)


 * I have been trying to somewhat reasonably and equally balance the number of states above 50k across 3-4 charts. This involves migrating states up as they "graduate" and sometimes re-balancing the limits of the plots.  When I rebalanced most recently I realized the auto-y-axis sizing did not fit a 50-100,100-250,250-500,500+ type distribution, so I chose the multiples of 20k major y-axis ticks.  I think this is the best practice, but sometimes don't have time to move states around when entering daily values, sometimes leading to states that are outside of the title range for that plot. Scotty.tiberius (talk) 16:49, 12 December 2020 (UTC)
 * ...which is why I started coming in behind you to do that from time to time. Also to alphabetize and sort. All easy things to do, while you're doing the more serious stuff.
 * As long as we stay at or below sixteen states per graph, everything should fit visually. I've been leaning towards adjusting the graph ranges to keep from splitting up clusters of states - splitting things in the larger breaks between groups makes things easier for the reader. Just different competing needs (splitting things up evenly vs keeping groups together) when making these calls. I don't think it's vital to match the section headings to the Y axis marks, as long as the description is fundamentally accurate and they match the data in the end. pauli133 (talk) 18:06, 12 December 2020 (UTC)
 * Yes, thanks for the help! Just trying to lay out my methodology to try and have some constancy.  Also, if you're alphabetizing, it's easiest if alphabetized by the state postal abbreviation rather than the name;  the data source uses postal code.  Scotty.tiberius (talk) 16:51, 13 December 2020 (UTC)
 * I was wondering if that's what you were doing for alphabetization - it's reasonable. I'll likely go through and add the abbreviations after the names as comments, to make that clear. pauli133 (talk) 17:03, 13 December 2020 (UTC)
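
The grouping described in this thread (states assigned to a chart by case range, sorted by postal abbreviation, at most sixteen per graph) could be sketched roughly like this. It is an illustration only, not the maintainers' actual workflow; the function names and sample bounds are hypothetical, and the case counts are made up.

```python
MAX_PER_CHART = 16  # limit mentioned in the discussion above


def assign_charts(cases_by_state, lower_bounds):
    """Group states into charts by cumulative case count.

    cases_by_state: {postal abbreviation: cumulative cases}
    lower_bounds: lower limit of each chart's range, e.g. [50_000, 160_000]
    Returns {lower_bound: [postal abbreviations, sorted alphabetically]}.
    States below the smallest bound are left out.
    """
    charts = {bound: [] for bound in lower_bounds}
    for state, cases in cases_by_state.items():
        eligible = [b for b in lower_bounds if cases >= b]
        if eligible:
            charts[max(eligible)].append(state)
    for states in charts.values():
        states.sort()  # alphabetize by postal abbreviation, not state name
    return charts


def overfull(charts, limit=MAX_PER_CHART):
    """Return the lower bounds of charts holding more states than `limit`."""
    return [bound for bound, states in charts.items() if len(states) > limit]


sample = {"TX": 500_000, "CA": 600_000, "VT": 5_000, "NJ": 200_000}  # made up
charts = assign_charts(sample, [50_000, 440_000])
```

Choosing the bounds is the judgment call discussed above (even split versus keeping clusters together); the sketch only checks the result against the sixteen-state limit.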

Sudden change in chart history
A few days ago, the over 440,000 chart showed that during November, Texas briefly matched California, but never exceeded it. Now the chart shows Texas exceeded California handily for all of November. Maybe something broke (or was fixed)? 67.169.166.36 (talk) 13:51, 15 December 2020 (UTC)


 * I think I'm the only maintainer of these plots right now, and so sometimes they're only being updated 1-3x per week--and in Dec there were some significant changes in horse race positions. Also, with a poorly marked x-axis it's tough to tell when they're updated if you're not digging into the text details.  Scotty.tiberius (talk) 21:30, 30 December 2020 (UTC)
 * Speaking of sudden jumps - is the 50k increase in NJ legit, or just an artifact? I know it's in the source data, but it doesn't seem to correlate well with the NJ official site. pauli133 (talk) 19:54, 7 January 2021 (UTC)
 * I noticed the NJ jump when I typed in the 4th or 5th, but haven't had a chance to reextract the whole data string. I concur it's probably an artifact of them revising 2 weeks of numbers after the holidays, and I'll try and fix it in the next day or 2 if it's still up.  Scotty.tiberius (talk) 23:24, 7 January 2021 (UTC)
 * Fixed, and have reingested all of the states for other data adjustment issues. Scotty.tiberius (talk) 16:59, 8 January 2021 (UTC)
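
A jump like the NJ one could be flagged automatically before it reaches the chart. The following is a minimal sketch, assuming a cumulative daily series; the threshold (`factor`), the `lookback` window, and the function name are hypothetical choices, not anything the data source or the existing scripts define.

```python
from statistics import median


def flag_jumps(cumulative, factor=5, lookback=14):
    """Return indices of suspicious days in a cumulative series.

    A day is flagged if its increase is negative (a downward revision) or
    more than `factor` times the median increase of the preceding
    `lookback` days.
    """
    increases = [b - a for a, b in zip(cumulative, cumulative[1:])]
    flagged = []
    for i, inc in enumerate(increases):
        history = increases[max(0, i - lookback): i]
        if inc < 0:
            flagged.append(i + 1)
        elif history and median(history) > 0 and inc > factor * median(history):
            flagged.append(i + 1)
    return flagged
```

A flagged day still needs a manual look at the state's official site, since a large one-day increase can be a genuine backlog release rather than an artifact.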

Time to add DC, HI, ME, VT?
Right now the state graphs include 47 states and Puerto Rico, but not DC, Hawaii, Maine, or Vermont, because those are still under 50k cases each. It seems odd to not include them at this point. Would this be an onerous thing to ask you to add to your workflow? pauli133 (talk) 16:11, 18 January 2021 (UTC)
 * Not too onerous. I was thinking the same thing and can try and fit it in. Scotty.tiberius (talk) 00:48, 19 January 2021 (UTC)

Switching state graphs from daily to weekly?
This article is getting a bit heavy. We're also running into template limits. I don't know if it will help with the latter, but I suspect that the infections-per-state graphs could be dropped to weekly datapoints without sacrificing much useful information. Any thoughts? in particular. pauli133 (talk) 02:53, 3 March 2021 (UTC)
 * I might even think a bit deeper, how much maintenance and info is really worth including here when there are so many available (and automated) services to collect much of this information? I wonder how much trying to keep a page like this up to date is just a fool's errand, versus some level of reasonably regular update (weekly or monthly, depending) with each section linking to an up-to-date source? Bakkster Man (talk) 14:53, 3 March 2021 (UTC)
 * I think it is time to shift the plots in this article. Yes, weekly is probably the first pass improvement.  Covid Tracking Project is also sun-setting their collating and data cleaning, so we need to shift to some CDC-type summary data, and it's of inherently lower fidelity, but weekly might be a reasonable filter to apply. Scotty.tiberius (talk) 00:11, 15 March 2021 (UTC)
 * Currently state graphs are every 4th day, we can shift to weekly in a little bit. Reduced page size around 89kB.  Can the page template header come off? Scotty.tiberius (talk) 14:16, 19 March 2021 (UTC)
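
Thinning the daily series to every 4th or 7th point, as discussed above, can be sketched like this. It is a hypothetical helper, not the script in use; it counts back from the most recent point so that the latest value always stays on the chart.

```python
def downsample(points, step=7):
    """Keep every `step`-th point, counting back from the most recent one.

    points: list in chronological order, e.g. (date, cumulative cases) pairs.
    """
    if step < 1:
        raise ValueError("step must be at least 1")
    kept = points[::-1][::step]  # newest first, then every step-th entry
    return kept[::-1]            # restore chronological order
```

Because the plotted values are cumulative, dropping intermediate points loses only day-to-day detail, not the totals.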

Sources for Progression Charts (and cases per state over time) charts now that CTP has shut down
The COVID Tracking Project ceased data collection on 7 March. These charts still list CTP as the source for the numbers. Soooo .... whose numbers are being used now?

And does anybody happen to know why a giant turning gear appears on those charts in Mobile (not the app)? I was downloading them to my phone, but can't do that any more. No big, just curious. TrilliumLady (talk) 04:36, 16 March 2021 (UTC)


 * Right now .... nothing. I'm trying to adjust my data extraction scripts over to JHU, OWID, or NY Times githubs. Scotty.tiberius (talk) 11:37, 19 March 2021 (UTC)


 * Updated with NY Times as the data source. Scotty.tiberius (talk) 14:14, 19 March 2021 (UTC)

Data Update
I just updated some data. I changed the reporting frequency to monthly. Hopefully that is low enough that the numbers keep being updated. I didn't review any old data: they seem reasonable but are strange because it seems they have been modeled somehow. You cannot have 386641.32 cases... --McBayne (talk) 16:12, 22 August 2021 (UTC)

Lack of updates in the state graphs
Is there a reason to keep the "Number of U.S. positive test individuals by state over time" graphs, since they haven't been updated since the end of July? I don't know if it matters, but the X axis changes from monthly to quarterly once there are more than 400,000 cases. Who knew we'd still be doing this almost 2 years later! TrilliumLady (talk) 19:12, 26 October 2021 (UTC)TrilliumLady

Dating the State Graphs
I know the graphs in the "Number of U.S. positive test individuals by state over time" section are updated irregularly. Would it be possible to add a date when they're updated?

Thanks, TrilliumLady (talk) 06:40, 11 March 2022 (UTC)TrilliumLady
 * sure. Done. pauli133 (talk) 12:59, 11 March 2022 (UTC)

Wiki Education assignment: English Composition 1102 085
— Assignment last updated by Narangy (talk) 05:23, 11 April 2024 (UTC)