TODO
* clarify that the cases best modelled as sub-Poissonian are mildly
acceptable as Poissonian if considered individually (frequentist),
separately from the generic model; but that the fact that most cases
cannot be modelled as Poissonian argues against a purely Poissonian
model (Bayesian).
- see if the P_KS_0 column can be added to the 28-, 14- and 7-day
tables in the paper
* Zenodo: see README-hacking.md for what files make most
sense in Zenodo and also https://zenodo.org/record/3872248/ as an
example
- 0. ./project make dist-pdf
- subpoisson-8842dad-dirty.pdf
- 1. ./project make dist-arxiv
- subpoisson-8842dad-dirty-arXiv.tar.gz
-- arXiv .tar.gz: the LaTeXable source, including reproduce/ and
anc/ with the .build/data-to-publish/ files
- 2. ./project make git-bundle
- subpoisson-8842dad-git.bundle
- 3-8. the six data files in .build/data-to-publish/
- 9. ./project make dist-software
- software-cb32347-dirty.tar.gz
Does it make sense to store a 0.4 GB file as a matter of
principle for reproducibility of the paper? It feels excessive,
but the point is that a modern, powerful software environment
is not a tiny thing; Zenodo allows up to 50 GB per full dataset.
-rw-r--r-- 1 boud boud 367M Jul 20 02:59 software-cb32347-dirty.tar.gz
Argument in README.md:
[[TO AUTHORS: UPLOAD THE SOFTWARE TARBALLS WITH YOUR
DATA AND PROJECT SOURCE TO ZENODO OR OTHER SIMILAR SERVICES. THEN
ADD THE DOI/LINK HERE. DON'T FORGET THAT THE SOFTWARE ARE A
CRITICAL PART OF YOUR WORK'S REPRODUCIBILITY.]]
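As a pre-upload sanity step, something like the following could record the size and checksum of each deposit file; the choice of SHA-256 and the streaming block size are illustrative assumptions, not part of the project:

```python
import hashlib
import os

def sha256sum(path, blocksize=1 << 20):
    # Stream the file so multi-hundred-MB tarballs (e.g. the ~367 MB
    # software tarball above) need not fit in memory.
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(blocksize), b""):
            h.update(block)
    return h.hexdigest()

def manifest(paths):
    # (name, size in bytes, sha256) for each deposit file, to keep
    # alongside the Zenodo record for later verification.
    return [(os.path.basename(p), os.path.getsize(p), sha256sum(p))
            for p in paths]
```

Running `manifest()` over the deposit files before upload, and again on the downloaded copies, would catch truncated or corrupted transfers.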
* verify.mk
-- update for final version
* European JE (European Journal of Epidemiology):
- https://www.springer.com/journal/10654
- https://www.springer.com/gp/open-access/springer-open-choice/springer-compact/agreements-polish-authors
- includes NCU
- Poland Read and Publish (Springer Compact) agreement
- so open access is most likely available at zero cost
* PRE-SUBMISSION CHECKLIST
- update to the latest Wikipedia dataset (archive, checksum)
- see the checklist in README-hacking.md
- automated (or manual) check that the abstract word count is within
150--250 words (221 right now)
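A minimal sketch of the automated check, assuming the abstract can be extracted as a plain string; the crude regex de-TeXing is an illustrative stand-in for a proper tool such as detex:

```python
import re

ABSTRACT_MIN, ABSTRACT_MAX = 150, 250  # journal limits from the checklist

def abstract_word_count(text):
    # Crudely strip LaTeX control sequences and braces before counting;
    # good enough for a word-count sanity check, not for real parsing.
    plain = re.sub(r"\\[A-Za-z]+\*?", " ", text)
    plain = plain.replace("{", " ").replace("}", " ")
    return len(plain.split())

def abstract_length_ok(text):
    return ABSTRACT_MIN <= abstract_word_count(text) <= ABSTRACT_MAX
```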
* SPEED
- something like a binary search for the best phi_i', adapted to
non-monotonicity (only broad-scale smoothness), would be much faster
than the two-step brute-force scan presently implemented.
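A sketch of one such strategy: a coarse-to-fine grid zoom rather than a strict binary search (which would require monotonicity). The function name and parameters are illustrative, not the project's actual interface:

```python
import numpy as np

def coarse_to_fine_min(f, lo, hi, n_grid=32, n_levels=4, shrink=4.0):
    # Minimise a 1-D function that is smooth on broad scales but may be
    # non-monotonic locally: scan a coarse grid, then repeatedly zoom in
    # on a shrunken interval around the current best point.  Cost is
    # n_grid * n_levels evaluations instead of one dense brute-force scan.
    best_x = lo
    for _ in range(n_levels):
        xs = np.linspace(lo, hi, n_grid)
        ys = np.array([f(x) for x in xs])
        best_x = xs[int(np.argmin(ys))]
        half = (hi - lo) / (2.0 * shrink)
        lo, hi = best_x - half, best_x + half
    return best_x
```

With these defaults this costs about 128 evaluations, while a dense scan at the same final resolution over the full range would need roughly 2000.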