2ba8d31 | Boud Roukema | 19 August 2020, 02:43:16 UTC | TODO: refer to Balashov+2020 2007.14841 Add note to TODO file to refer to Balashov+2020 2007.14841 which does a Newcomb--Benford law on the initial (exponential) phase of the pandemic. | 19 August 2020, 02:43:16 UTC |
02548e9 | Boud Roukema | 18 August 2020, 23:43:33 UTC | Add Keegan Tan reference for Wikipedia | 18 August 2020, 23:43:33 UTC |
3e9d308 | Boud Roukema | 18 August 2020, 22:26:05 UTC | Merge branch 'jhu' into subpoisson | 18 August 2020, 22:26:05 UTC |
6e35ac0 | Boud Roukema | 18 August 2020, 22:24:31 UTC | Update zenodo ID for v2 The main difference between v1 and v2 is that JHU CSSE data is analysed in an appendix; the results have some minor differences with those for the Wikipedia COVID-19 Case Count Task Force data. | 18 August 2020, 22:24:31 UTC |
0d3f4f6 | Boud Roukema | 18 August 2020, 18:51:51 UTC | JHU appendix - minor wording improvement Minor improvement in the comment in the absence of India from the 7-day most un-noisy table in the case of the JHU data. | 18 August 2020, 18:51:51 UTC |
364a563 | Boud Roukema | 17 August 2020, 17:43:45 UTC | Swap calculation order of WP and JHU To simplify the question of naming output files, it is easier to calculate the JHU data first, and then recreate the equivalent files for the WP data, so that main results are for the WP data, with the JHU data only included in the appendix of the paper. This commit does this by swapping the order of analysing the two datasets in `reproduce/analysis/make/poisson.mk`. | 17 August 2020, 17:43:45 UTC |
9d9e9c3 | Boud Roukema | 16 August 2020, 23:11:51 UTC | Disable dev override - JHU branch | 16 August 2020, 23:11:51 UTC |
b2910af | Boud Roukema | 16 August 2020, 22:35:48 UTC | Updates for JHU, MathBiol This commit does several updates, including an Appendix and brief comments in the introduction and at the end of the discussion section to point to the appendix. This still has to be run with the devmode disabled to check the full resolution results, but they appear to be largely compatible with the WPC19CCTF results. Section numbers are reintroduced. Some of the verification md5sums are updated. | 16 August 2020, 22:36:54 UTC |
56ecbb0 | Boud Roukema | 16 August 2020, 00:40:21 UTC | Try summing JHU data over sub-national divisions The JHU CSSE GIS data was not used in the main analysis because many of its entries are for sub-national divisions. Since the primary question of interest is the validity of data provided by national authorities, the JHU dataset is inhomogeneous. Nevertheless, by summing over the data for province/states in countries for which these sub-national data are available, a dataset that is as close as possible to the national counts can be reconstructed. This branch/commit adds the JHU data, replaces the CCTF19 main dataset for analysis by the JHU data, and disables the `verify` config parameter. The pdf produced with this commit will be mostly quite misleading, since it will incorrectly claim that data is from CCTF19 when in reality it is JHU data; this pdf should be interpreted keeping this in mind. Probably the most useful role for the JHU analysis would be to add four tables as an appendix, equivalent to the four tables with the CCTF19 analyses, since the results only have small numerical differences for most of the low phi_i countries, and the lists of low phi_i countries are more or less the same as for the CCTF19 data. | 16 August 2020, 00:40:21 UTC |
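The sub-national summing described in commit 56ecbb0 can be illustrated with a minimal sketch. The row layout `(country, province, counts)` and the function name `sum_over_subdivisions` are assumptions made for this illustration only; they are not taken from the project's scripts or from the JHU CSSE file format.

```python
def sum_over_subdivisions(rows):
    """Sum daily counts over province/state rows, giving one series per country.

    rows: iterable of (country, province, counts) tuples (hypothetical layout),
    where counts is a list of daily counts with the same length for all rows
    of a given country.
    """
    national = {}
    for country, _province, counts in rows:
        if country not in national:
            # First row seen for this country: start its national series.
            national[country] = list(counts)
        else:
            # Later rows: add day by day to the running national series.
            national[country] = [a + b for a, b in zip(national[country], counts)]
    return national


# Illustrative numbers only, not real data.
rows = [
    ("Australia", "Victoria", [1, 2, 3]),
    ("Australia", "Tasmania", [0, 1, 1]),
    ("Poland", "", [5, 6, 7]),
]
national = sum_over_subdivisions(rows)
```

Countries that are already reported as a single national row pass through unchanged, so the result is one series per country regardless of how the source splits it.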
f9b5117 | Boud Roukema | 29 July 2020, 01:06:47 UTC | Clarify that Example 3 is not Poisson The previous version of 'README-popular-science.md' sounded like 'Example 3' was the usual expectation of noise, when we know nothing else about a random counting process - the Poisson point process. But that's not the case. This commit is intended to clarify the difference, at the cost of introducing words that may frighten some readers ("Poisson distribution"). | 29 July 2020, 01:06:47 UTC |
67a5e31 | Boud Roukema | 28 July 2020, 18:21:24 UTC | Insert ArXiv metadata ID: 2007.11779 Insert ArXiv ID in `reproduce/analysis/config/metadata.conf` . | 28 July 2020, 18:21:24 UTC |
d73c958 | Boud Roukema | 26 July 2020, 15:54:10 UTC | Fix up markdown for README headers Git repositories typically show README*.md files in markdown format [1]. This commit is intended to tidy the header sections of 'README-popular-science.md' and 'README.md'. [1] https://www.markdownguide.org | 26 July 2020, 15:54:10 UTC |
7ad5dec | Boud Roukema | 25 July 2020, 12:20:09 UTC | Add popular-level explanation - README-popular-science.md In this commit, the file 'README-popular-science.md' is added to give people a highly simplified explanation of the main idea of the paper. | 25 July 2020, 12:20:09 UTC |
8cca47d | Boud Roukema | 24 July 2020, 15:58:46 UTC | TODO: rm things done; v2 notes In this commit, some things on the TODO list that have already been done are removed, and a list of items to do for v2 is started. | 24 July 2020, 15:58:46 UTC |
84e4345 | Boud Roukema | 23 July 2020, 09:48:53 UTC | Update poisson.tex md5sum and reenable verify-outputs A 3-significant-digit version of a phi_i value was added recently to the tex outputs into the build tex macro file 'poisson.tex'; this is needed for the Russia phi value of 10.35 or 10.4, which looks odd as 10 \times (10^{log10 uncertainty}). The updated md5sum is inserted into the default version of 'reproduce/analysis/make/verify.mk' (not yet into the developers' version), and verify-outputs is re-enabled in 'reproduce/analysis/config/verify-outputs.conf'. | 23 July 2020, 09:48:53 UTC |
71ae2a0 | Boud Roukema | 23 July 2020, 02:24:31 UTC | Add missing word to abstract | 23 July 2020, 02:24:31 UTC |
252cf1c | Boud Roukema | 23 July 2020, 00:53:04 UTC | Three significant figures for RU phi_i | 23 July 2020, 00:53:04 UTC |
1c0ae47 | Boud Roukema | 23 July 2020, 00:42:31 UTC | This should be the submitted version Mostly minor language edits of the text. The abstract length according to pdf cut/paste + wc is 250 words. | 23 July 2020, 00:42:31 UTC |
c719f30 | Boud Roukema | 22 July 2020, 21:28:03 UTC | IN: +few more details; README.md overall software description | 22 July 2020, 21:28:03 UTC |
2fa9d5c | Boud Roukema | 22 July 2020, 17:36:56 UTC | Printable width error bars; PL values into paper | 22 July 2020, 17:36:56 UTC |
ee1e371 | Boud Roukema | 22 July 2020, 15:52:38 UTC | Revert accidental commit of default 50 cpus | 22 July 2020, 15:52:38 UTC |
75ecd83 | Boud Roukema | 22 July 2020, 15:51:39 UTC | Fix checksum for WHO_vs_WP.dat for 2020-07-15 | 22 July 2020, 15:51:39 UTC |
5e061a2 | Boud Roukema | 22 July 2020, 15:40:47 UTC | Try to fix numthreads syntax | 22 July 2020, 15:40:47 UTC |
26588b2 | Boud Roukema | 22 July 2020, 15:36:27 UTC | Checksum update for WHO 2020-07-15; not yet checked | 22 July 2020, 15:36:27 UTC |
3712c41 | Boud Roukema | 22 July 2020, 15:31:59 UTC | Start checksums for WHO 2020-07-15; numthreads As of today 2020-07-22, more recent Wayback snapshots of the WHO data than 2020-07-15 are not available - probably there's some sort of long-term storage issue that involves a delay. This commit restores the 2020-07-15 archived URL and a checksum; other checksums will have to be fixed. The numthreads rule in 'poisson.mk' is adjusted (hopefully fixed) here. A user-level (subpoisson.conf) value should override the default value. | 22 July 2020, 15:31:59 UTC |
b08f737 | Boud Roukema | 22 July 2020, 09:26:41 UTC | Merge branch 'subpoisson' of codeberg:boud/subpoisson into subpoisson | 22 July 2020, 09:26:41 UTC |
30c4e96 | Boud Roukema | 22 July 2020, 09:24:52 UTC | Update WHO Wayback archive to 20200722024239 The Wayback machine archive now seems to be stable for the 20200722024239 snapshot of the WHO dataset. This commit updates to this snapshot and sha512sum. | 22 July 2020, 09:24:52 UTC |
565454b | Boud Roukema | 22 July 2020, 09:06:16 UTC | Remove obsolete python import Subpoisson.py tries to import 'replace_pairs.py', which was used as a partial fix for the WHO data jumps/drops. This is obsolete since the higher quality data - the WP C19CCTF data - is now used. | 22 July 2020, 09:06:16 UTC |
0ab3169 | Boud Roukema | 22 July 2020, 03:14:23 UTC | Downdate two checksums for WHO 20200715 Since Wayback (and the WHO website too, I think) doesn't want to give the 20200721 version of the data file, here we give the checksums for the 20200715 version of their file. | 22 July 2020, 03:14:23 UTC |
76b49d2 | Boud Roukema | 22 July 2020, 02:49:21 UTC | TODO: git-less archive bug; Wayback volatility TODO list: Some of the scripts in the present set of *.mk and bash scripts use git; these will fail in a pure snapshot that has no git information. So the 'git archive' snapshot risks leading to more user errors than the git bundle and is better not distributed by default. The Wayback machine is sometimes a bit volatile in terms of snapshots - probably something to do with managing huge amounts of files and backing them up and mirroring them properly. Right now the 20200721 snapshot of the WHO data set redirects to a 20200715 snapshot. Probably waiting a few hours may be enough, but meanwhile, here's the 20200715 snapshot - which may be safer to use since it seems to be stored properly and available. The verify checksums would have to be updated for this 20200715 snapshot. | 22 July 2020, 02:49:21 UTC |
91f12d2 | Boud Roukema | 21 July 2020, 21:50:16 UTC | Preparing for distribution: arXiv version; snapshot This commit adds a command for creating a git snapshot: the user should be able to reproduce the full project from the state of the current git commit. Suffixes are added to some of the .tar.gz output files to reduce ambiguity in what packet is intended for what purpose. The TODO list for submission guidelines is updated. | 21 July 2020, 21:50:16 UTC |
6bb7f36 | Boud Roukema | 21 July 2020, 21:36:44 UTC | Trivial fix to printf lines in paper.mk | 21 July 2020, 21:36:44 UTC |
3ae050d | Boud Roukema | 21 July 2020, 21:11:54 UTC | Update md5sum; EJE/Springer Fig/Table trivia The high-resolution calculation verification checksum is updated in this commit to take into account the cutoff N_plot_N_lim_lowest_phi passing through the pipeline. Trivia to get "Fig. number" and "Table number" closer to Springer/EJE style are implemented. | 21 July 2020, 21:11:54 UTC |
b2af741 | Boud Roukema | 21 July 2020, 19:12:22 UTC | Text: subsubsections in discussion; N_i cutoff for cases curves The discussion section is reorganised in this commit to discuss the Finland case, which has low \phi_i and \psi_i but also low N_i, and to clarify the different subgroups of the general low \psi_i group. The selection of counts curves to display has a criterion added to exclude low-total-count countries, since these typically have closer to Poissonian counts. The summary line in the abstract and corresponding parameters in the pipeline have the "strongly sub-Poissonian" preferred models switched to a cutoff of \phi_i^{28} = 0.5, since 1.0 is not "strongly sub-". | 21 July 2020, 19:12:22 UTC |
7cadd4c | Boud Roukema | 21 July 2020, 16:36:11 UTC | Update default calculation checksums This commit updates the checksums for the default (high accuracy) calculations for the 21 July 2020 downloaded version of the source data files. | 21 July 2020, 16:36:11 UTC |
919d8bd | Boud Roukema | 21 July 2020, 15:18:22 UTC | Update WHO+WPC19CCTF data for 21 July - step 1 This commit does the first step in updating the input data files to those of 21 July (today). The verification checksums are done for the `enable_dev_override` mode in this commit. The second step will be to update the checksums from the default (higher accuracy) calculations. | 21 July 2020, 15:18:22 UTC |
b1823f5 | Boud Roukema | 20 July 2020, 23:29:36 UTC | Tables += poisson probability; discussion + conclusion In this commit: Tables: poisson probabilities are added; total counts per subsequence are removed - this is because the latter are not referred to in the text while the former are needed for frequentist interpretations. The discussion and conclusion sections have been generally reorganised, substructured, and had some media references added. The WP C19CCTF data download date is added to the system, so `verify.mk` was updated. A bash bug with testing `enable_dev_override` is fixed. | 20 July 2020, 23:29:36 UTC |
0eb85e8 | Boud Roukema | 20 July 2020, 13:51:09 UTC | Update poisson.tex checksum without extra comma The previous commit fixed an unnecessary comma in the second list of countries for the abstract. This modifies the poisson.tex include file, and so the checksum for `verify.mk` needs to be updated. This is done in this commit. | 20 July 2020, 13:51:09 UTC |
9842b61 | Boud Roukema | 20 July 2020, 02:46:09 UTC | Figure legends; list in abstract This commit should stop the legends from overshadowing points too much in the counts curves; and should fix the extra comma that appeared in the list of countries with phi_i^{28} between 1.0 and 3.0. | 20 July 2020, 02:46:09 UTC |
abe1dba | Boud Roukema | 20 July 2020, 01:17:31 UTC | Adapt README.md; update TODO The README.md is adapted for this project in this commit: only a few quick changes were needed. The TODO is updated. The project seems essentially ready. Only polishing is needed. | 20 July 2020, 01:17:31 UTC |
cb32347 | Boud Roukema | 20 July 2020, 00:50:03 UTC | Merge branch 'subpoisson' of codeberg:boud/subpoisson into subpoisson | 20 July 2020, 00:50:03 UTC |
f07c7f2 | Boud Roukema | 20 July 2020, 00:45:00 UTC | EJE/ArXiv/Zenodo/SWF submittable formats This commit adjusts the rules, especially in `reproduce/analysis/make/initialize.mk`, for making the various versions of the package available to Zenodo, to SWF, to ArXiv, and to the journal. TODO has been updated to summarise the new rule names. This is more hardwired than done modularly. Possibly the most useful line to contribute to upstream maneage is: ```` tar -cv -f - $$(git ls-files reproduce) | (cd $$dir ; tar -xv -f -) ```` so that only git-managed files - and not untracked working files - are stored in the .tar.gz or equivalent package. Texlive has `latexpand` added, so `./project configure --existing-conf` is needed with this commit. | 20 July 2020, 00:45:00 UTC |
8694bd8 | Boud Roukema | 19 July 2020, 21:54:59 UTC | Update output file md5sums This commit updates the output file md5sums for the verify step, for the high resolution calculation. | 19 July 2020, 21:54:59 UTC |
28e174f | Boud Roukema | 19 July 2020, 20:54:19 UTC | Disable enable_dev_override | 19 July 2020, 20:54:19 UTC |
8842dad | Boud Roukema | 19 July 2020, 20:49:06 UTC | LaTeX tidying; some content tidying Mostly minor LaTeX tidying; \usepackage[hyphenbreaks]{breakurl} was needed for long URLs. | 19 July 2020, 20:49:06 UTC |
5fe6a85 | Boud Roukema | 19 July 2020, 16:41:07 UTC | Reproducibility/zenodo/codeberg links; fig font This commit implements and tidies up links to reproducibility and archiving of the source package. The counts curves fonts are made bigger. | 19 July 2020, 16:41:07 UTC |
66bb0f0 | Boud Roukema | 19 July 2020, 15:06:16 UTC | Remove duplicate RU and TR from C19CCTF data; reproducibility Since RU and TR can be considered to be both in Europe and Asia, they were included twice in the list of countries taken from the C19CCTF template listing countries by geographical region. This commit removes the superfluous copies of RU and TR data. Reproducibility of the plain text data files requires countries to be output in a fixed order. The python implementation of 'set' seems to consider order arbitrary (as in ordinary mathematics), and the Pool.apply_async parallel processes do not consider order important. This commit does sorting to generate outputs in a fixed order independent of CPU calculation order. | 19 July 2020, 15:06:16 UTC |
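The ordering issue described in commit 66bb0f0 can be sketched as follows. This is an illustration under stated assumptions, not the code in `subpoisson.py`: the function `analyse` and the country codes are hypothetical, and `multiprocessing.pool.ThreadPool` (which offers the same `apply_async` interface as `Pool`) is used only to keep the example self-contained.

```python
from multiprocessing.pool import ThreadPool


def analyse(country):
    # Hypothetical stand-in for the per-country calculation.
    return country, len(country)


# A Python set has no guaranteed iteration order.
countries = {"PL", "RU", "TR", "FI"}

with ThreadPool(processes=2) as pool:
    # Submit one asynchronous task per country, then collect the results.
    pending = [pool.apply_async(analyse, (c,)) for c in countries]
    results = [p.get() for p in pending]

# Sort before writing, so that the plain text output does not depend on
# set iteration order or on which worker handled which country.
lines = [f"{country} {value}" for country, value in sorted(results)]
```

Sorting at the output stage, rather than trying to control the order of execution, keeps the parallel section simple and makes the checksum of the output file stable between runs.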
324e5ab | Boud Roukema | 19 July 2020, 13:11:07 UTC | Verify checksums for standard calculation In this commit, the verify checksums are calculated for either the `enable_dev_override` mode or the default production (publishable) mode, checking automatically which is currently activated. A warning is given if the user is in `enable_dev_override` mode. | 19 July 2020, 13:11:07 UTC |
2e48505 | Boud Roukema | 19 July 2020, 09:14:28 UTC | Start verifying the outputs This commit starts the `verify` system for output data files and output .tex files. The md5sums in this commit are for the `enable_dev_override` fast, inaccurate option and are not meant to stay in place. | 19 July 2020, 09:14:28 UTC |
f14189f | Boud Roukema | 19 July 2020, 01:01:52 UTC | Copyright declarations; rm irrelevant files This commit adds a few copyright declarations and removes some gnuastro/ files that are irrelevant to this project. | 19 July 2020, 01:01:52 UTC |
5842e77 | Boud Roukema | 19 July 2020, 00:46:57 UTC | Tidy up the six publishable data files Six data files for publishing on Zenodo (and in the ArXiv source) are now placed in `.build/data-to-publish` in a moderately friendly and data lineage traceable way. The aim is that readers should be able to easily replot and reuse the data whichever way they wish. | 19 July 2020, 00:46:57 UTC |
aaff072 | Boud Roukema | 18 July 2020, 03:13:16 UTC | Python enable_dev_override fix The variable `enable_dev_override` cannot be left undefined. Python conventions differ from `make` conventions. This commit should fix a bug caused by `enable_dev_override` being left undefined. | 18 July 2020, 03:13:16 UTC |
2fb8c69 | Boud Roukema | 18 July 2020, 03:06:21 UTC | Fixed rng seed values with parallel processes This commit sets the random number generator seed centrally from `subpoisson.conf` through to `subpoisson.py`, passing through to `theil_sen_robust_stderr`. Minor corrections are done in `paper.tex`. The logic of `poisson.mk` is fixed so that $(outdir) is an order-only parameter, since updating a directory should not cause targets needing that directory to be reperformed. Developer overrides are now done with a single parameter in `subpoisson.conf`, enable_dev_override. If n_cpus is left undefined, then the available number of cpus is now max(1,min(ntot-2, 20)), where ntot is the total number available. | 18 July 2020, 03:06:21 UTC |
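The default thread count rule quoted in commit 2fb8c69, max(1,min(ntot-2, 20)), and the idea of a centrally set seed can be sketched as below. The function names, the `reserve` and `cap` keyword arguments, and the seed value 1234 are assumptions for illustration; only the formula itself comes from the commit message.

```python
import os
import random


def default_n_cpus(n_total=None, cap=20, reserve=2):
    """Return max(1, min(n_total - reserve, cap)).

    With the default arguments this is the rule from the commit message:
    leave two cpus free, never use more than 20, always use at least one.
    """
    if n_total is None:
        # os.cpu_count() may return None on unusual platforms.
        n_total = os.cpu_count() or 1
    return max(1, min(n_total - reserve, cap))


def make_rng(fixed_rng_seed=True, seed=1234):
    """Return a random.Random instance; the seed value here is hypothetical.

    With fixed_rng_seed=False, Python seeds the generator from the operating
    system's entropy source (or the clock), so runs are not reproducible.
    """
    return random.Random(seed if fixed_rng_seed else None)
```

For example, `default_n_cpus(8)` gives 6 and `default_n_cpus(64)` gives 20, while two generators created with `make_rng()` produce identical sequences.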
b243835 | Boud Roukema | 17 July 2020, 21:03:07 UTC | High resolution phi_i calculation This commit increases the phi_i calculation resolution, making the results more accurate but increasing the calculation time. If you have many threads/cores, then increasing the value of `n_cpus` will speed up the calculation significantly. During the remaining few days of continued development, these values may be switched on and off several times. Check carefully if you are concerned about seeing publishable quality values in your pdf. | 17 July 2020, 21:03:07 UTC |
36b72f0 | Boud Roukema | 17 July 2020, 20:39:31 UTC | Add missing script count_jumps_mod.py This commit adds `reproduce/analysis/python/count_jumps_mod.py`, which is needed to compare the numbers of jumps present in the WHO and C19CCTF data. | 17 July 2020, 20:39:31 UTC |
90cd9d1 | Boud Roukema | 17 July 2020, 20:26:17 UTC | Compare WHO to Wikipedia Case Count Task Force data With this commit, the WHO and Wikipedia Case Count Task Force (C19CCTF) data in the medical cases chart templates are compared to see how many disruptive day-to-day 'jumps' (jumps or drops) occur. This way, the choice of using the C19CCTF data is justified quantitatively rather than forcing the reader to wonder if the claim about data quality is true or not. The text is updated, and a figure added. A plain text file with the jump counts is added. | 17 July 2020, 20:26:17 UTC |
87c03fe | Boud Roukema | 16 July 2020, 18:47:39 UTC | Minor text improvements to be done: TODO += Marius pointed out that explaining a little more - even just one sentence - what the KS actually *is* would help the reader unfamiliar with XX C statistical tests. | 16 July 2020, 18:47:39 UTC |
21d7314 | Boud Roukema | 16 July 2020, 03:00:58 UTC | Fix constitutional refs typo | 16 July 2020, 03:00:58 UTC |
2513d9a | Boud Roukema | 16 July 2020, 02:59:24 UTC | Fix cross-ref label; constitutional refs Minor fix. | 16 July 2020, 02:59:24 UTC |
03c4b09 | Boud Roukema | 16 July 2020, 02:44:12 UTC | Do all countries; do all 6 counts curves This commit does two minor changes for bigger calculation levels. | 16 July 2020, 02:44:12 UTC |
1f0bf9d | Boud Roukema | 16 July 2020, 02:38:37 UTC | First complete version This version should be the first complete version that could, in principle, be submitted, apart from any minor bugs that may have been missed. The `maneage` verification steps in `verify.mk` have not (yet?) been implemented. The first priority is to make sure that the plain text input and output data files are ready for the ArXiv package. A secondary priority will be to try to follow the `maneage` verification mechanism. See TODO for other items that could be useful to do. | 16 July 2020, 02:39:05 UTC |
097f8cc | Boud Roukema | 16 July 2020, 00:23:42 UTC | Lots of text changes Changes in this commit include: * EJE format changes; * general title, abstract, introduction, method improvements; * references matching the introduction updates | 16 July 2020, 00:23:42 UTC |
671eeb8 | Boud Roukema | 15 July 2020, 17:51:06 UTC | EJE style - sffamily, namerefs EJE uses the sans serif style for the title and section headers rather than numbers. This commit adjusts the paper style to closer match the EJE style. | 15 July 2020, 17:51:06 UTC |
0188c2d | Boud Roukema | 15 July 2020, 03:01:18 UTC | Improve comments in subpoisson.conf This commit aims to make the comments in `reproduce/analysis/config/subpoisson.conf` clearer. | 15 July 2020, 03:01:18 UTC |
c0f8b84 | Boud Roukema | 15 July 2020, 02:50:52 UTC | Update abstract to better match content The automatic results in the abstract piped from the calculations and text were somewhat misleading. The abstract should make more sense with this commit. Abstract word estimate about 240 (emacs) or 210 (wc on pdf + mouse cut/paste). This commit requires at least 6 countries enabled in the `for country in country_list:` line in `subpoisson.py` and should compile OK. | 15 July 2020, 02:50:52 UTC |
4bebc6a | Boud Roukema | 15 July 2020, 01:05:13 UTC | Include counts curves in paper This commit includes some of the least noisy daily counts curves, and some median-phi_i counts curves, in the paper. The aim is to make the result visually clear to the reader. | 15 July 2020, 01:05:13 UTC |
16bbeef | Boud Roukema | 14 July 2020, 23:12:15 UTC | Poisson and phi_i consistent curves Plot n_i(j) curves for the N_plot_lowest_phi countries in each of the four cases, comparing to the 68% Poisson band, and, if phi_i > 1, to the phi_i band. Do the same thing for the N_plot_median_phi countries at the median of the phi distribution for comparison, in the same way; these should all have phi_i > 1 and show the phi_i model. | 14 July 2020, 23:12:15 UTC |
d027060 | Boud Roukema | 14 July 2020, 17:14:56 UTC | N-day subsequence tables: choose correct column This commit corrects the column selection in `reproduce/analysis/make/poisson.mk` for the N-day subsequence tables. | 14 July 2020, 17:14:56 UTC |
7a645da | Boud Roukema | 14 July 2020, 14:00:12 UTC | Subsequence means and totals to tables The low-phi_i countries appear to favour 10k in either total or mean counts. This commit automates this and adds these values to the subsequence tables, and omits psi from them, since psi is of less interest. | 14 July 2020, 14:00:12 UTC |
4142391 | Boud Roukema | 14 July 2020, 02:28:58 UTC | Text/figs/tables tidying; simpler defaults The presentation of the article is starting to converge. This commit does several small steps in this tidying in terms of the presentation of the results that are the most relevant. The speedup default parameters are mostly left unset, apart from the `delta_log10_*` parameters which are kept at low resolution for speed. With these two parameters set at 0.5 and 6 cpus a full calculation took about 10 minutes to run on an ordinary few-years-old desktop computer. | 14 July 2020, 02:28:58 UTC |
cf83f3b | Boud Roukema | 14 July 2020, 01:41:33 UTC | Parallel processing: minor fixes This commit removes a testing parameter and adds a j_OK estimate. | 14 July 2020, 01:41:33 UTC |
9b1fb3d | Boud Roukema | 14 July 2020, 01:26:00 UTC | Parallel processing per country seems to work This commit implements python asynchronous parallel processing across countries. It appears to work correctly. The parameter `n_cpus` is added to `reproduce/analysis/config/subpoisson.conf`. The speedup is significant. | 14 July 2020, 01:26:00 UTC |
eefbc9d | Boud Roukema | 13 July 2020, 21:40:51 UTC | Solve make rules bug Until this commit, `reproduce/analysis/make/poisson.mk` had the target + prerequisite `$(outdir)/done-check-poisson: $(outdir)`. The motivation was to make sure that the directory is created if it doesn't yet exist. But the problem is that whenever a file is added to or removed from the directory, the directory last-modified timestamp is updated. Since `./project make clean-poisson` removes old output files from `$(outdir)`, this updated the timestamp of that directory. Thus, the `make` rule said that the rules for the target `$(outdir)/done-check-poisson` had to be updated, because the prerequisite (the directory) was newer than the target (the zero-byte file `done-check-poisson`). This is solved by converting the prerequisite to an `order-only` prerequisite: `$(outdir)/done-check-poisson: | $(outdir)`. This bug appears to be solved. | 13 July 2020, 21:40:51 UTC |
d76a9e1 | Boud Roukema | 13 July 2020, 12:37:52 UTC | LaTeX source minor tidying Two minor fixes: fig references and \sloppy in data availability. | 13 July 2020, 12:37:52 UTC |
78c31e3 | Boud Roukema | 13 July 2020, 03:20:17 UTC | Very rough discussion + conclusion | 13 July 2020, 03:20:17 UTC |
0324b06 | Boud Roukema | 13 July 2020, 02:12:02 UTC | Fix minor LaTeX bug Fix in newcommand in `reproduce/analysis/make/download.mk`. | 13 July 2020, 02:12:02 UTC |
29f9941 | Boud Roukema | 13 July 2020, 01:17:08 UTC | Centralise fixed vs clock rng seed With this commit, the boolean parameter `fixed_rng_seed` in `reproduce/analysis/config/subpoisson.conf` decides whether the calculations for the paper should be run with a fixed pseudo-random number generator. For reproducibility, the default is `True`. For checking that the results are not especially sensitive to the seed, `fixed_rng_seed` should be set to `False`. | 13 July 2020, 01:17:08 UTC |
37ba3d8 | Boud Roukema | 12 July 2020, 22:57:00 UTC | s/WHO/Wikipedia medical cases chart/g This commit updates the text from WHO to Wikipedia medical cases chart data, leaving the WHO bugs as an unfortunate problem. | 12 July 2020, 22:57:00 UTC |
05fbba8 | Boud Roukema | 12 July 2020, 22:07:36 UTC | Fix directory for input data - not output dir The input data directory is not the output data directory. This commit should fix that in `reproduce/analysis/make/poisson.mk`. | 12 July 2020, 22:07:36 UTC |
48d5f9c | Boud Roukema | 12 July 2020, 22:01:05 UTC | Try to fix make rule for medical cases file This commit hopefully fixes a missing target in `reproduce/analysis/make/download.mk` - for the Wikipedia medical cases file. The `TODO` file is also updated. | 12 July 2020, 22:01:05 UTC |
bcb11b5 | Boud Roukema | 12 July 2020, 21:43:56 UTC | Switch to Wikipedia medical cases charts This commit seems to work correctly for the Wikipedia medical cases charts, which are better curated data than the WHO official data. Minor title improvement. | 12 July 2020, 21:43:56 UTC |
779035a | Boud Roukema | 12 July 2020, 18:32:41 UTC | Add Wikipedia data to repository In previous commits, obvious errors in the WHO national daily SARS-CoV-2 data were partially corrected using the `replace_pairs` algorithm. This is not an ideal data curation method, since it requires guesswork and adds some noise to the data. With this commit, a script to manually download the Wikipedia `medical cases chart` data, along with the data as of today, are added. The `make` rules in this commit have not yet been tested. | 12 July 2020, 18:32:41 UTC |
0ba49f9 | Boud Roukema | 12 July 2020, 04:13:11 UTC | Plots of least noisy counts curves Counts curves for the least noisy countries are partly implemented in this commit. | 12 July 2020, 04:13:11 UTC |
c271289 | Boud Roukema | 12 July 2020, 00:53:49 UTC | Subsequence figures, tables, start dates The three types of subsequences (28, 14, 7 days) appear to work correctly. Figures of \psi_i and tables including start dates are given. | 12 July 2020, 00:53:49 UTC |
e067f5d | Boud Roukema | 11 July 2020, 23:16:56 UTC | Merge branch 'subpoisson' of codeberg:boud/subpoisson into subpoisson | 11 July 2020, 23:16:56 UTC |
eb2b074 | Boud Roukema | 11 July 2020, 23:15:03 UTC | Subsequence starting figs+tables This development-step commit starts to add figures and tables for the subsequences. | 11 July 2020, 23:15:03 UTC |
43ebb7c | Boud Roukema | 11 July 2020, 22:03:42 UTC | Fix EJE format patch file error The patch file patched against a patched file instead of the original file. This commit fixes that and should work. | 11 July 2020, 22:03:42 UTC |
5833232 | Boud Roukema | 11 July 2020, 20:26:42 UTC | Subseq start; covid-19 refs; EJE style In this commit, the routine for searching for optimal subsequences is added. Some COVID-19 references are added. Some improvements to bring the citation style closer to EJE format are done, though it doesn't quite match exactly. | 11 July 2020, 20:26:42 UTC |
e089981 | Boud Roukema | 11 July 2020, 16:22:54 UTC | Add psi figure; fix error bars This commit adds the psi_N figure to the paper and corrects the error bar, whose values didn't follow the correct matplotlib convention. | 11 July 2020, 16:22:54 UTC |
2471624 | Boud Roukema | 11 July 2020, 15:31:17 UTC | Table improvements; phi refinement In this commit, the first table is improved in style. The phi uncertainty column is sacrificed in favour of the raw Poisson probability, because some countries have values that are not rejections. A one-stage refinement in the accuracy of estimating phi is added. More powerful algorithms could be used for further speedup. | 11 July 2020, 15:31:17 UTC |
bc5df79 | Boud Roukema | 11 July 2020, 04:14:10 UTC | Make rules, plot tick label annoyance In this commit, some more `clean-*` rules are added in `reproduce/analysis/make/paper.mk` and documented in `project`, for the `./project --help` command. The difficulty in convincing `matplotlib` to label logarithmic axes nicely like `plotutils` does is briefly commented in `reproduce/analysis/python/subpoisson.py`. | 11 July 2020, 04:14:10 UTC |
fc49776 | Boud Roukema | 11 July 2020, 01:57:48 UTC | Add table of lowest phi, psi countries In this commit, the Theil-Sen fit to phi_i(N_i) is plotted and explained in the text. TODO: The psi_i(N_i) plot is started but not ready, and not yet for inclusion in `paper.tex`. A table with the key country criteria for the lowest values of phi and psi is included. The default values in this commit (and other recent ones) in `reproduce/analysis/config/subpoisson.conf` are for fast development and are inaccurate. Stronger values, for slower but more accurate calculations, are needed for proper results. TODO: The subsequence results have not yet been started. A table of the lowest phi and psi countries | 11 July 2020, 01:57:48 UTC |
fbcbde8 | Boud Roukema | 10 July 2020, 22:49:12 UTC | Start on results section; first two figures. With this commit, the results section with two figures has been started. A plain text file `.build/data-to-publish/phi_N.dat` is now created automatically. This will have to be included in ArXiv and maybe the publisher version of the data - with the appropriate copyright declarations. Plotutils is removed from `reproduce/software/config/TARGETS.conf`. Copyright declarations are added to the .py scripts. | 10 July 2020, 22:49:12 UTC |
b4ca5b5 | Boud Roukema | 10 July 2020, 19:04:47 UTC | Add plotutils to maneage This commit adds the Debian 2.6-11 patched version of plotutils 2.6 . Debian distributes the original and patched versions of software separately - it's up to the user to apply the patches if s/he wishes to compile from source. To satisfy the maneage system, until plotutils is added to the maneage zenodo archive, this commit uses a codeberg git repository which provides a downloadable .tar.gz file. A third parameter `tarball_download_name="$(strip $(3))";` is added to the `make` level `import-source` macro in `reproduce/software/make/build-rules.mk` in order to allow for non-standard URLs. | 10 July 2020, 19:04:47 UTC |
5218ffb | Boud Roukema | 10 July 2020, 15:12:56 UTC | Analysis section of method An initial rough draft of the analysis section is added in this commit, including the subsequences, and an appropriate Bonferroni-Sidak caveat. The names and indices of the variables in the science text have been made more specific. Consistency of these may still need to be checked. | 10 July 2020, 15:12:56 UTC |
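As background to the caveat mentioned in commit 5218ffb, the standard Sidak and Bonferroni expressions for multiple independent tests can be written as a short sketch. How the paper itself applies the caveat is not stated in this log; the function names below are chosen for illustration only.

```python
def sidak_familywise(p, m):
    """Probability that at least one of m independent tests yields a
    p-value of p or lower when the null hypothesis holds for all of them."""
    return 1.0 - (1.0 - p) ** m


def bonferroni_bound(p, m):
    """Bonferroni upper bound on the same probability; it does not
    require the tests to be independent."""
    return min(1.0, m * p)
```

The Sidak value never exceeds the Bonferroni bound, and the two are nearly equal when m times p is small.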
d4ead90 | Boud Roukema | 10 July 2020, 02:44:17 UTC | Theil-Sen module; min_days A Theil-Sen python module is added for robust linear fitting, adapted from my octave routine for this. The min_days starting sequence is now written in the paper.tex text. | 10 July 2020, 02:44:17 UTC |
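The Theil-Sen estimator named in commit d4ead90 takes the median of the slopes over all pairs of points, which makes the fit robust to outliers. The following is a minimal textbook sketch, not the project's module: the function name `theil_sen` and the choice of intercept (median of the residuals `y - slope * x`) are assumptions, and no uncertainty estimate is computed here.

```python
from itertools import combinations
from statistics import median


def theil_sen(x, y):
    """Return (slope, intercept) of a Theil-Sen robust linear fit."""
    # Slopes of the lines through every pair of points with distinct x.
    slopes = [
        (y2 - y1) / (x2 - x1)
        for (x1, y1), (x2, y2) in combinations(zip(x, y), 2)
        if x2 != x1
    ]
    slope = median(slopes)
    # One common convention for the intercept: median residual offset.
    intercept = median(yi - slope * xi for xi, yi in zip(x, y))
    return slope, intercept


# Four points on y = 2x + 1 plus one strong outlier (illustrative data).
x = [0, 1, 2, 3, 4]
y = [1, 3, 5, 7, 100]
slope, intercept = theil_sen(x, y)
```

In this example the outlier at x = 4 affects only four of the ten pairwise slopes, so the median slope and intercept remain those of the underlying line; an ordinary least-squares fit would be pulled strongly towards the outlier.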
1ca644f | Boud Roukema | 10 July 2020, 00:38:22 UTC | Rewrite introduction, method 2.1; simpler thresholds; arXiv refs This commit rewrites the introduction to better match the abstract. The data treatment part of the method (2.1) is updated. The threshold is set to a single threshold for both starting and ending, but the minimum days requirement is also set at the beginning, to avoid initial fluctuations artificially cutting off a sequence. Refs for burstiness are added. A patch to the EJE/Springer .bst file is applied for ArXiv IDs. Irrelevant maneage template 'delete-me' files are deleted. | 10 July 2020, 00:38:22 UTC |
32a6ea7 | Boud Roukema | 05 July 2020, 19:03:26 UTC | Add missing EJE patch file This commit adds the patch file `reproduce/analysis/patches/20200703_EJE_abstract.patch` which was missing. | 05 July 2020, 19:03:26 UTC |
8e13095 | Boud Roukema | 04 July 2020, 02:37:39 UTC | Abstract: complete update of method + early results This commit changes many files. It should generate a reasonable looking pdf for two reasonable choices of the start (and stop) thresholds, and give the basic results in the abstract. Runtime is a few minutes. Minor *.mk bug: if the pdf is not fully made, then `./project make` sometimes causes the `poisson` rule to be run too. | 04 July 2020, 02:37:39 UTC |
24de267 | Boud Roukema | 03 July 2020, 16:48:56 UTC | EJE style This commit does many changes to generate the first version which is more or less in Springer/EJE style, using the Springer LaTeX files. The abstract is only one column wide, in contrast to official EJE publications. This is a minor problem only - though it would look nice prior to submission to match the official style. | 03 July 2020, 16:48:56 UTC |