TODO
* clarify that the cases best modelled as sub-Poissonian are mildly
acceptable as Poissonian if considered individually (frequentist),
separately from the generic model; but that the fact that most cases
cannot be modelled as Poissonian argues against a purely Poissonian
model (Bayesian).
- see if the P_KS_0 column can be added to the 28-, 14- and 7-day
tables in the paper
* Zenodo: see README-hacking.md for what files make most
sense in Zenodo and also https://zenodo.org/record/3872248/ as an
example
- 0. ./project make dist-pdf
- subpoisson-8842dad-dirty.pdf
- 1. ./project make dist-arxiv
- subpoisson-8842dad-dirty-arXiv.tar.gz
-- arXiv .tar.gz: the LaTeXable source, including reproduce/ and
anc/ with the .build/data-to-publish/ files
- 2. ./project make git-bundle
- subpoisson-8842dad-git.bundle
- 3-8. the six data files in .build/data-to-publish/
- 9. ./project make dist-software
- software-cb32347-dirty.tar.gz
Does it make sense to store a 0.4 GB file as a matter of
principle for reproducibility of the paper? It feels excessive,
but the point is that a modern, powerful software environment
is not a tiny thing; Zenodo allows up to 50 GB per full dataset.
-rw-r--r-- 1 boud boud 367M Jul 20 02:59 software-cb32347-dirty.tar.gz
Argument in README.md:
[[TO AUTHORS: UPLOAD THE SOFTWARE TARBALLS WITH YOUR
DATA AND PROJECT SOURCE TO ZENODO OR OTHER SIMILAR SERVICES. THEN
ADD THE DOI/LINK HERE. DON'T FORGET THAT THE SOFTWARE ARE A
CRITICAL PART OF YOUR WORK'S REPRODUCIBILITY.]]
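As a pre-upload sanity step, something like the following could record the size and checksum of each deposit file; the choice of SHA-256 and the streaming block size are illustrative assumptions, not part of the project:

```python
import hashlib
import os

def sha256sum(path, blocksize=1 << 20):
    # Stream the file so multi-hundred-MB tarballs (e.g. the ~367 MB
    # software tarball above) need not fit in memory.
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(blocksize), b""):
            h.update(block)
    return h.hexdigest()

def manifest(paths):
    # (name, size in bytes, sha256) for each deposit file, to keep
    # alongside the Zenodo record for later verification.
    return [(os.path.basename(p), os.path.getsize(p), sha256sum(p))
            for p in paths]
```

Running `manifest()` over the deposit files before upload, and again on the downloaded copies, would catch truncated or corrupted transfers.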
* verify.mk
-- update for final version
* European JE (European Journal of Epidemiology):
- https://www.springer.com/journal/10654
- https://www.springer.com/gp/open-access/springer-open-choice/springer-compact/agreements-polish-authors
- includes NCU
- Poland Read and Publish (Springer Compact) agreement
- so open access is most likely available at zero cost
* PRE-SUBMISSION CHECKLIST
- update to the latest Wikipedia dataset (archive, checksum)
- see the checklist in README-hacking.md
- automated (or manual) check that the abstract word count is within
150--250 words (221 right now)
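A minimal sketch of the automated check, assuming the abstract can be extracted as a plain string; the crude regex de-TeXing is an illustrative stand-in for a proper tool such as detex:

```python
import re

ABSTRACT_MIN, ABSTRACT_MAX = 150, 250  # journal limits from the checklist

def abstract_word_count(text):
    # Crudely strip LaTeX control sequences and braces before counting;
    # good enough for a word-count sanity check, not for real parsing.
    plain = re.sub(r"\\[A-Za-z]+\*?", " ", text)
    plain = plain.replace("{", " ").replace("}", " ")
    return len(plain.split())

def abstract_length_ok(text):
    return ABSTRACT_MIN <= abstract_word_count(text) <= ABSTRACT_MAX
```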
* SPEED
- something like a binary search for the best phi_i', adapted to
non-monotonicity (only broad-scale smoothness), would be much faster
than the two-step brute-force scan presently implemented.
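A sketch of one such strategy: a coarse-to-fine grid zoom rather than a strict binary search (which would require monotonicity). The function name and parameters are illustrative, not the project's actual interface:

```python
import numpy as np

def coarse_to_fine_min(f, lo, hi, n_grid=32, n_levels=4, shrink=4.0):
    # Minimise a 1-D function that is smooth on broad scales but may be
    # non-monotonic locally: scan a coarse grid, then repeatedly zoom in
    # on a shrunken interval around the current best point.  Cost is
    # n_grid * n_levels evaluations instead of one dense brute-force scan.
    best_x = lo
    for _ in range(n_levels):
        xs = np.linspace(lo, hi, n_grid)
        ys = np.array([f(x) for x in xs])
        best_x = xs[int(np.argmin(ys))]
        half = (hi - lo) / (2.0 * shrink)
        lo, hi = best_x - half, best_x + half
    return best_x
```

With these defaults this costs about 128 evaluations, while a dense scan at the same final resolution over the full range would need roughly 2000.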