US Deaths

There are some articles claiming that US deaths from all causes are lower this year than in previous years. Is it true?

No, it is not true. Here’s my result:

Year Pop.      Deaths  %
2011 311556874 2512442 0.806%
2012 313830990 2501531 0.797%
2013 315993715 2608019 0.825%
2014 318301008 2582448 0.811%
2015 320635163 2699826 0.842%
2016 322941311 2703215 0.837%
2017 324985539 2788163 0.858%
2018 326687501 2824382 0.865%
2019 328239523 2835038 0.864%
2020 331937300 3290723 0.991%


# Zoe Phin
# 2020/12/15

# Fetch the deaths data (covid.csv) and the population data (popul.csv).
download() {
    wget -O covid.csv -c ''
    wget -O popul.csv -c ''
}

# Print population, all-cause deaths and death rate for each year.
usdeaths() {
    echo 'Year Pop.      Deaths  %'
    (sed -n 2p popul.csv | tr ',' '\n' | sed -n 9,17p > .pop
    sed -n 2p popul.csv | tr ',' '\n' | sed -n 39,47p > .death
    for y in {2011..2019}; do echo $y; done | paste -d ' ' - .pop .death
    # 2020: sum the US deaths so far (S), then scale by 366 / day-of-year (D)
    awk -F, 'NR<9999 && $2~/States/ && substr($1,1,4)==2020 {
        S+=$3; "date +%j -d "$1 | getline D } END {
        printf "2020 331002651 %d\n",366/D*S }' covid.csv
    ) | awk '{
        printf "%s %5.3f%%\n", $0, $3/$2*100
    }'
}

Run it:

$ source && download && usdeaths

Note 1: As of writing this, the current year has 340 days of data. What I did was multiply the death count by 366/340.
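The scaling in Note 1 can be tried on its own. This is only a minimal sketch: `annualize` is a hypothetical helper that is not part of the script above, and the numbers in the call are made up to keep the arithmetic easy to follow.

```shell
# Hypothetical helper (not in the original script): scale a partial-year
# death count up to a full 366-day leap year.
annualize() {
    # $1 = days of data so far, $2 = deaths counted over those days
    awk -v d="$1" -v s="$2" 'BEGIN { printf "%d\n", 366 / d * s }'
}

annualize 183 1000   # half a leap year of data doubles the count: 2000
```

For the run described in Note 1, the first argument would be 340 and the second the death count summed from covid.csv.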

Note 2: US 2020 population estimate number came from here. It will already be obsolete by the time you read this. Expect a little error in %.

Published by Zoe Phin

14 thoughts on “US Deaths”

  1. I believe the American population is growing older,
    i.e., the baby boom.
    And I have seen a graph of increasing deaths per year projected for the future {nothing to do with this year, 2020}.
    So I think you have to include this factor.


    1. That’s great. I’ve heard this before as well:
      More older people -> more deaths.

      What I don’t understand is how 1 year made such a jump in longevity. Please explain, if you can.

      I’d understand if there was a large migratory wave of those hardy 90+ year old Koreans and Japanese, but I don’t see that.


      1. It’s because they under-reported deaths in 2019, which is how they skew their averages when they calculate “excess deaths” and “expected deaths”. Of course there’s no way to prove this, but it seems like they “held” a certain number of 2019 deaths and reported them in 2020 instead. This is pure speculation but seems likely, given the massive drop in excess deaths YOY from 2017 to 2019, which is a major outlier when you compare it to the previous decade’s statistics. By doing this, they are able to say that 2020 had “excess deaths” above “normal” – where normal is quite abnormal.

        Here’s an article about the aging baby boomer population from the U.S. Census, from 2018:

        “The aging population of the United States is propelling the nation toward a milestone: A historic increase in the number of deaths every year.”

        The CDC estimates a number for total deaths it “expects” in 2020. The CDC then compares the actual reported number of deaths in 2020 to the number of deaths that the CDC “estimated to be expected in 2020”. Deaths above the “expected deaths” number are then “suggested” to have been “potentially” caused by COVID-19.

        So how do they arrive at the “Expected Deaths” number?

        On its main COVID-19 data page, the CDC labels the comparison between the reported number of deaths in 2020 and the “expected number of deaths in 2020” as the “percent of expected deaths.” To estimate the expected deaths in 2020, the CDC is using the AVERAGE total deaths (also known as the mean of total deaths) from 2017, 2018, and 2019.
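The averaging described above can be sketched in a few lines of shell. The helper name `percent_of_expected` is hypothetical, and the inputs are the 2017-2019 totals and the annualized 2020 figure from the table in the post, not the CDC's own file, so the result is only illustrative.

```shell
# Hypothetical helper: mean of three yearly totals, and an observed total
# expressed as a percentage of that mean.
percent_of_expected() {
    # $1..$3 = baseline yearly deaths, $4 = observed deaths
    awk -v a="$1" -v b="$2" -v c="$3" -v obs="$4" 'BEGIN {
        expected = (a + b + c) / 3
        printf "%.0f %.1f%%\n", expected, obs / expected * 100
    }'
}

# 2017, 2018, 2019 and annualized 2020 deaths from the table in the post
percent_of_expected 2788163 2824382 2835038 3290723
```

The first field printed is the three-year mean and the second is the observed total as a percentage of it.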

        The CDC’s data for 1999–2018 can be found here:

        Using this data, I calculated the year-over-year difference in deaths.

        2009 – 2010 – 31,272
        2010 – 2011 – 47,024
        2011 – 2012 – 27,814
        2012 – 2013 – 53,714
        2013 – 2014 – 29,425
        2014 – 2015 – 86,212
        2015 – 2016 – 31,618
        2016 – 2017 – 69,261
        2017 – 2018 – 25,702
        2018 – 2019 – 6,591

        Clearly, the last 2 years are outliers, with increases much lower than in all the other years. But these are the years they use to calculate 2020’s “expected deaths” number, which then allows them to report an artificially inflated “excess deaths” number. By using the average of only the last 2-3 years, they are using an inappropriate statistical method to calculate expected deaths and excess deaths.
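A year-over-year difference like the ones listed above can be computed with a short awk filter. This is a sketch only: `yoy_diff` is a hypothetical helper, and the sample input is taken from the table in the post rather than from the CDC file the comment refers to, so its output need not match the list above.

```shell
# Hypothetical helper: read "year deaths" pairs on stdin and print the change
# in deaths from each year to the next.
yoy_diff() {
    awk 'NR > 1 { printf "%d - %d: %d\n", prev_year, $1, $2 - prev_deaths }
         { prev_year = $1; prev_deaths = $2 }'
}

# 2017-2019 totals from the table in the post (not from the CDC file)
printf '%s\n' '2017 2788163' '2018 2824382' '2019 2835038' | yoy_diff
```

Each output line shows the two years compared followed by the change in deaths between them.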

        So, why would 2017-2019’s death numbers be so low when we have an aging population and we should be seeing more deaths every year, based on US Census analysis? I have no idea, only speculation.


      1. Well, the China virus killed more people,
        though particularly older people, of course.
        The shutdown also killed more people.
        It is not safe to cause economic hardship.
        I imagine that a recession, as a general matter, causes more deaths.
        But that is a guess.
        As a general matter, the politicians screwed up badly with nursing homes.


        1. As a general matter, I would say we were about 1 month too late {due to the China cover-up} and due to the impeachment.
          Trump was about 1 week late in shutting down air travel {related to the impeachment}, given the data we had.
          But we were 1 month {or more} late due to China’s mismanagement and bad actions.
          I always thought New York State, and the city in particular, were 1 week late with their shutdown, but I didn’t know the virus was already in the US {long before I imagined it was}. Or politically a week late. We should have been doing stuff a month earlier, but there was no data at that time to support it.


  2. Thanks Zoe.
    Great data.
    I had been hearing the other story, ie the flat line and no increase.

    What this might suggest is a 0.15% mortality rate, though,
    i.e. one in 700? Worst case on this data, 0.2%, i.e. 1 in 500 of the population per year?
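One way to check an estimate like this against the table is to subtract one year's death rate from the next. The sketch below assumes a hypothetical helper named `rate_gap` and uses the 2019 and 2020 rows from the table in the post. It says nothing about the cause of the gap.

```shell
# Hypothetical helper: gap between two yearly death rates, printed as a
# percentage of population and as "1 in N".
rate_gap() {
    # $1 = earlier pop, $2 = earlier deaths, $3 = later pop, $4 = later deaths
    awk -v p0="$1" -v d0="$2" -v p1="$3" -v d1="$4" 'BEGIN {
        gap = d1 / p1 - d0 / p0
        if (gap == 0) { print "no gap"; exit }   # avoid dividing by zero
        printf "%.3f%% 1 in %.0f\n", gap * 100, 1 / gap
    }'
}

# 2019 and 2020 rows from the table in the post
rate_gap 328239523 2835038 331937300 3290723
```

With the table's figures the gap comes out at roughly 0.13% of the population, which is a little under the 0.15% suggested above.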


  3. Ms. Zoe,
    I ran your code a couple x just now, 08may2021:
    all is ok until 2020:

    2019 328239523 2835038 0.864%
    awk: cmd. line:3: (FILENAME=covid.csv FNR=36289) fatal: division by zero attempted


      1. TY for dbug hint, but ng, yielded same err msg.
        For brute-force fun, to eliminate div by zero, changed “366/D*S” to “366/(D*S+1)”
        now, last row of output is:
        2020 331002651 366 0.000%

        I don’t expect you to support my debug, am just happy to have examples of collecting data off websites into ubuntu.
        Have been wanting to do stuff like this for awhile, never seemed to get around to learning it.


        1. This code was run in December, when the full year wasn’t available. Remove all divisions and see what happens. S is the sum.
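The error message comes from the expression 366/D*S, so D (the day of year read from the date command) must have been empty or zero on that run. A minimal sketch of a guard is shown below, assuming a hypothetical helper `project_2020` that takes D and S as arguments instead of reading covid.csv. It falls back to the raw sum when D is unusable, which matches the suggestion to drop the division.

```shell
# Hypothetical variant of the 2020 projection: fall back to the raw sum
# when no day-of-year value is available, instead of dividing by zero.
project_2020() {
    # $1 = last day of year with data (may be empty or 0), $2 = summed deaths
    awk -v d="$1" -v s="$2" 'BEGIN {
        if (d + 0 > 0) printf "%d\n", 366 / d * s
        else           printf "%d\n", s
    }'
}

project_2020 ""  1000   # no usable day of year: prints the raw sum, 1000
project_2020 183 1000   # half a leap year of data: prints 2000
```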

