Revision 72242ca8eade9659031ea00394a30e0cc5cc1c37 authored by Boud Roukema on 16 May 2021, 21:11:25 UTC, committed by Boud Roukema on 16 May 2021, 21:11:25 UTC
This commit adds a 'dist-journal' .tar.gz packaging
option in `reproduce/analysis/make/initialize.mk` and
solves a LaTeX/bibTeX bug related to escaping of '%'.
README-popular-science.md
What is noisiness in SARS-CoV-2 daily infection counts?
=======================================================

Copyright (C) CC-BY - 2020 Boud Roukema
This file is available under the Creative Commons Attribution licence
Licence URL: https://creativecommons.org/licenses/by/4.0/

Suppose that a government agency announces that on 10 successive
days during a pandemic, the numbers of daily new infections were:

169, 169, 169, 169, 169, 169, 169, 169, 169, 169   (Example 1)

This would look highly suspicious. People do not choose to get infected
in an orderly way. Medical testing stations cannot decide to publish
exactly the same number of positive (confirmed infection) test results
every day. The task of health agency administrative staff should be
to verify that the data from the testing stations are authentic, and to
add them up from around the country. There should be some randomness in
the numbers.

Suppose instead that the numbers of daily new infections were:

145, 150, 155, 160, 165, 170, 175, 180, 185, 190   (Example 2)

This would again look odd. Try plotting these against the numbers
from 1 to 10 for the days, and you'll see a perfect straight
line. Again, there is no random noise.
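If you would rather check this without plotting, here is a minimal Python sketch (the variable names are just illustrative) that lists the day-to-day increases in Example 2. On a perfect straight line, every increase is the same.

```python
# Daily counts from Example 2.
example2 = [145, 150, 155, 160, 165, 170, 175, 180, 185, 190]

# Increase from each day to the next.
steps = [later - earlier for earlier, later in zip(example2, example2[1:])]

# Every entry is identical, so the points lie exactly on a straight line.
print(steps)
```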

Now suppose that someone adds in a tiny bit of noise by hand, and the
daily infection counts are:

145, 150, 156, 163, 167, 170, 175, 182, 185, 190  (Example 3)

This already has a tiny bit of randomness added. But is Example 3
what is really expected statistically? Is this enough noise to be
realistic?  Is this the right sort of noise - randomness - to
look similar to the counts from other countries around the world?
Can people get infected by SARS-CoV-2 and have their positive
test results officially counted in a similar way to a military
march, with everyone (almost) perfectly in step?

The reality is that natural data has many different statistical
properties - properties of randomness. The paper "Anti-clustering
in the national SARS-CoV-2 daily infection counts" looks at just
one statistical property of the national SARS-CoV-2 counts.
Example 3 has too little noise compared to that expected from the
"Poisson distribution". Most countries have more noise than expected
for a Poisson distribution, which is already more than in Example 3.
Generally, the countries with more infections have data that are
a *lot* noisier.
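To get a feel for what "too little noise" means, here is a rough Python sketch. It is not the statistic used in the paper; it is only an illustration under simple assumptions. It fits a straight line to the counts, measures the typical scatter of the counts around that line, and divides by the scatter expected for Poisson noise, which is the square root of the mean count. The function names `linear_fit` and `noise_ratio` are made up for this example.

```python
import math

def linear_fit(xs, ys):
    """Ordinary least-squares straight line; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def noise_ratio(counts):
    """Scatter about the fitted line divided by the Poisson-expected scatter.

    A value near 1 is roughly Poisson-like; a value far below 1 means
    the counts are much smoother than Poisson noise would allow.
    """
    days = list(range(1, len(counts) + 1))
    slope, intercept = linear_fit(days, counts)
    model = [slope * day + intercept for day in days]
    # Two parameters were fitted, so divide by (n - 2).
    scatter = math.sqrt(
        sum((c - m) ** 2 for c, m in zip(counts, model)) / (len(counts) - 2)
    )
    poisson_scatter = math.sqrt(sum(model) / len(model))
    return scatter / poisson_scatter

example2 = [145, 150, 155, 160, 165, 170, 175, 180, 185, 190]
example3 = [145, 150, 156, 163, 167, 170, 175, 182, 185, 190]

print("Example 2:", noise_ratio(example2))
print("Example 3:", noise_ratio(example3))
```

For Example 2 the ratio is zero apart from rounding error, and for Example 3 it is still far below 1: counts of about 170 per day would typically scatter by about 13 under Poisson noise, not by 1 or 2.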

Likely explanations of the extra noise include super-spreader
events, which happen on some days and not on others; the somewhat
random number of laboratories that report their tests on any
particular day to the city, regional or other sub-national
coordinator; and the somewhat random number of sub-national
coordinators who report their data to the national coordinator.
It's not surprising that bigger countries, with more infections,
tend to have more noise.
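The effect of patchy reporting can be illustrated with a toy simulation. This is only a sketch under invented assumptions, not a model from the paper: the number of laboratories, the average count per laboratory and the probability that a laboratory reports on a given day are all hypothetical values chosen for illustration. It compares the variance-to-mean ratio (which is 1 for ideal Poisson noise) of pure Poisson counts with that of counts where a random subset of laboratories reports each day.

```python
import math
import random

def poisson_sample(rng, mean):
    """Draw one Poisson-distributed integer (Knuth's multiplication method)."""
    limit = math.exp(-mean)
    k = 0
    product = rng.random()
    while product > limit:
        k += 1
        product *= rng.random()
    return k

def variance_to_mean(counts):
    """Sample variance divided by sample mean; about 1 for Poisson noise."""
    mean = sum(counts) / len(counts)
    variance = sum((c - mean) ** 2 for c in counts) / (len(counts) - 1)
    return variance / mean

# All of these values are hypothetical, chosen only for illustration.
N_DAYS = 2000       # simulated days
N_LABS = 10         # laboratories in the toy country
LAB_MEAN = 20.0     # average positive tests per laboratory per day
REPORT_PROB = 0.8   # chance that a laboratory reports on a given day

rng = random.Random(2020)  # fixed seed so that the run is repeatable

# Pure Poisson noise with the same average daily total.
pure = [poisson_sample(rng, N_LABS * LAB_MEAN * REPORT_PROB)
        for _ in range(N_DAYS)]

# Each day, only a random subset of the laboratories reports.
patchy = [
    sum(poisson_sample(rng, LAB_MEAN)
        for _ in range(N_LABS) if rng.random() < REPORT_PROB)
    for _ in range(N_DAYS)
]

print("pure Poisson:    ", variance_to_mean(pure))
print("patchy reporting:", variance_to_mean(patchy))
```

In this toy model the expected ratio for the patchy case is 1 + (1 - `REPORT_PROB`) × `LAB_MEAN`, which is 5 for the values above, so irregular reporting alone is enough to make the counts noticeably noisier than Poisson.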

A few countries have count sequences whose noise properties look
more like those of Example 3 than like those of typical
countries. This is difficult to explain.

* ArXiv: https://arXiv.org/abs/2007.11779
* Zenodo: https://zenodo.org/record/3951152
* Git: https://codeberg.org/boud/subpoisson
* Software Heritage: https://archive.softwareheritage.org/swh:1:dir:fcc9d6b111e319e51af88502fe6b233dc78d5166