https://gitlab.com/makhlaghi/maneage-paper.git

Revision Author Date Message Commit Date
2628c8c Merge branch 'boud_sections_II_III' into 'master' Boud sections ii iii See merge request makhlaghi/maneage-paper!16 23 May 2020, 17:42:23 UTC
63b912a Added TeXLive's ulem package to also be built David reported this problem, it happened right after importing IEEEtran, but for some reason, it didn't happen for me. 23 May 2020, 17:34:38 UTC
198ed0e Corrected name of listings package when installing it with texlive When entering the name of the "listings" package, I had forgotten to add the final 's', so it wasn't being installed on a clean system! I didn't have a problem until now, because it remained from previous builds. 23 May 2020, 17:27:40 UTC
112f74b Section III edits - 5901 words This commit makes several small changes to Section III, some of which are quite significant in terms of meaning. It was difficult to improve the clarity without extending the word length. Now we're at 5901 words. 23 May 2020, 17:26:50 UTC
f8cb55e Section II edits + definition of solutions This commit implements quite a few minor changes in section II. The aim of most is to clarify the meaning and remove ambiguity. A few changes are that the reader will normally assume that successive sentences in a paragraph are closely related in terms of logical flow. It is superfluous - and considered excessive - to put too many "Therefore"'s and "Hence"'s in (at least) modern astronomy style. These are supposed to be used when there is a strong chain of reasoning. One change is done in the Introduction, because if we're going to use "solution(s)" throughout to mean "reproducible workflow solution(s)", then we have to clearly define this as jargon for this particular paper. It's probably preferable to RWS - reproducible workflow solution - or RWI - reproducible workflow implementation. But we can't just keep saying "solution" because that has many different meanings in a scientific context. Pdf word count = 5880 23 May 2020, 16:41:32 UTC
fdafd0a Cherry-pick 7bf5fcd to make merging easier This series of commits aims to edit sections II+III, but first implements the changes from 7bf5fcd, apart from one that conflicts in the abstract: this commit has ``Maneage'' without `(managing+lineage)` in the abstract. 23 May 2020, 16:16:51 UTC
39161aa Biography style reverted to CiSE PDF mode (different from webpage) After a look at the PDFs of the linked papers of the previous commit and a few 2020 papers, we noticed that the biography format of the webpage and PDFs are different! So it is now back in its old way (which is how biographies are presented in the PDF). A few other minor edits were made in the text. 23 May 2020, 15:56:32 UTC
b41a646 Affiliations CiSE style It appears from looking at https://ieeexplore.ieee.org/document/5725236/authors#authors https://ieeexplore.ieee.org/document/7878935/authors#authors that the affiliations section needs to start with a one-phrase definition of the author's main affiliation. In 5725236, the typesetters/proofreaders swapped van der Walt and Colbert, so don't be confused by that. It shows that nobody proofread properly. With this commit, each author's institute (single hierarchical level) is written as the first paragraph of the author's affiliation section. Since 5725236 allows a very-well-known acronym, I'm guessing that IAC can be defined for Mohammad and then re-used for the others. I've added a brief CV for me. If necessary, we could compress my main research together as "observational cosmology", but let's see how we go in the word count. I have not (yet) worked through the main text. There is also one minor language fix - `Because is complete` was incomplete. Pdf word count: 5873 23 May 2020, 13:55:03 UTC
70597b6 Edits, to make the text more readable After one day not looking at the first draft of this new version (commit 7b008dfbb9b2), I went through the text and made some general edits to make its presentation and logic smoother. 23 May 2020, 03:02:05 UTC
2eed85b Typo and style corrections in the text, Roberto's bio added Before this commit: several typos were present throughout the text. With this commit several typos have been corrected (types listed below) and my bio has been added. a) double words b) general typos c) commas after adverbs at the beginning of a sentence d) contractions are removed, e.g., don't vs do not e) three sentences in parentheses have been removed since I think they were out of context or unnecessary f) etc 23 May 2020, 00:48:54 UTC
f4e977e Corrected name of produced demonstration table In order to correspond to the updated datalineage plot, the name of the plotted columns was changed to 'columns.txt', but I had forgotten to update it in the LaTeX source and, since the old file still remained, I hadn't noticed. This was found by Boud and corrected. 22 May 2020, 23:38:14 UTC
7b008df Re-write of the paper to fit in ~6000 words and IEEE format Following the fact that the DSJ editor decided that this paper doesn't fit into their scope, we decided to submit it to IEEE's Computing in Science and Engineering (CiSE). So with this commit the text was re-written to fit into their style and word-count limitations. 22 May 2020, 01:18:42 UTC
2bfa3a0 First implementation of style in IEEEtran style The paper is no longer using LuaLaTeX, but raw LaTeX (that saves a DVI), it is so much faster! Initially I had used LuaLaTeX to use special fonts to resemble the CODATA Data Science Journal, but all that overhead is no longer necessary. Therefore I also removed the MANY extra LaTeX packages we were importing. The paper builds and is able to construct one of its images (the git-branching figure) with only 7 packages beyond the minimal TeX/LaTeX installation. Also in terms of processing it is so much faster. The text is just temporary now, and mainly just a place holder. With the next commit, I'll fill it with proper text. 02 May 2020, 03:42:58 UTC
7fee886 Added a .gitattributes file to avoid merging some files As explained in the new `README-hacking.md', this file greatly helps in avoiding unnecessary conflicts. 01 May 2020, 21:37:32 UTC
df878cc Imported recent changes in Maneage, minor conflicts fixed A few small conflicts showed up here and there. They are fixed with this merge. 01 May 2020, 21:36:45 UTC
8266607 Fixed OpenSSL deprecation bug on some OSs, causing problems in libgit2 Until this commit, the configure step would fail with an error when compiling libgit2 on a test system. The origin of this bug, on the OS that was tested, appears to be that in OpenSSL Version 1.1.1a, openssl/ec.h fails to include openssl/openconf.h. The bug is described in more detail at https://savannah.nongnu.org/bugs/index.php?58263 With this commit, this is fixed by manually inserting a necessary component. In particular, `sed` is used to insert a preprocessor instruction into `openssl/openconf.h`, defining `DEPRECATED_1_2_0(f)`, for an arbitrary section of code `f`, to include that code rather than exclude it or warn about it. This commit is valid provided that openssl remains at a version earlier than 1.2.0. Starting at version 1.2.0, deprecation warnings should be run normally. We have thus moved the version of OpenSSL in `versions.conf' to the section for programs that need to be manually checked for version updates, with a note to remind the user when reaching that version. Other packages that use OpenSSL may benefit from this commit, not just libgit2. 01 May 2020, 20:03:32 UTC
a6f5fcd Abstract: three minor language edits The difference between `that` and `which` is not strictly required, but it helps clarify the difference in meaning, which is important in science and software :). This is best shown by an example: * Maneage provides reproducibility, which is a good thing. The sentence would make sense if we drop `, which is a good thing.` The last part of the sentence is a comment rather than a necessary part of the sentence. * Maneage provides a quality of reproducibility that is missing from other implementations. The sentence would not quite make sense if we drop `that is ...`, since we would not know what sort of quality is provided. The fact that the quality is missing is key to the intended meaning of the sentence. 01 May 2020, 13:01:00 UTC
8f0ce4a Merged David's suggestions, further edited to be more clear It is also slightly shorter with this commit, without losing anything substantial. 01 May 2020, 11:52:22 UTC
4381686 Minor edits in abstract No need to invent a new word (archive-able) when an existing one (archivable) does the job. One issue that we have not included and which perhaps we could discuss in the paper (space permitting), is that this tool could bypass the use of blockchains in this context. 01 May 2020, 11:37:45 UTC
bff9cb5 Minor edits in abstract, link between analysis and narrative added As discussed by Boud in the previous commit, this is an important feature that was lost in the new abstract. So I added it as a criterion. 01 May 2020, 11:24:08 UTC
1c20614 Several minor edits to the title + abstract Most are minor English tidying, e.g. * spelling: achieving * archivable - https://en.wiktionary.org/wiki/archivable * `i.e.` does not look good in an abstract; * `when` didn't sound quite right; Comment: we no longer state one of the most interesting aspects of Maneage - producing the draft paper that is submittable for peer review in a way that makes it natural for the authors to achieve automatic consistency between the calculations/analysis and the values in the paper. But this is hard to describe in a compact way without disrupting the overall argument of the abstract, so it's a bit of a pity, but people will learn about it anyway from the body of the article (or from trying out the package!) `Peer-review verification` does not directly state producing a pdf. Related to this absence of talking about reproducing the *paper*, not just the calculations, I suggest dropping `, with snapshot \projectversion` from the abstract initially sent to the journal (they can't stop us updating it afterwards), because without the context of explaining that the paper itself is produced from the package, it's not clear what the snapshot means - a snapshot of the abstract? In the `real` paper, it makes sense, because the reader will have access to the rest of the paper. 01 May 2020, 03:25:55 UTC
2e525d9 Edited abstract for more clarity, still in the 250 word limit Boud's suggestions in the previous commit were great and really helped in improving the tone of the abstract (and thus the whole paper shortly!), better putting it in the big picture. I had forgotten to give the exact word limit (which was 250), so Boud had set it to a very conservative value of 190. I added around 22 words to better highlight the points we want to make, while still being below the limit. 01 May 2020, 02:39:38 UTC
8f14213 Abstract re-organized to be more research-oriented To make this a research article, we either have to present it as a theoretical advance, or as an empirical advance. An empirical research result would be something like doing a survey of users and getting statistics of their success/failure in using the system, and of whether their experience is consistent with the claimed properties and principles of Maneage (e.g. success/failure in creating paper.pdf as expected? was the user's system POSIX? did the user do the install with non-root privileges? was this a with-network or without-network ./project configure ?) This is doable, but would require a bit of extra work that we are not necessarily motivated to do or have the time to do right now. I think it's possible to present Maneage as a theoretical advance, but it has to be worded properly. Maneage is a tool, but it's a tool that satisfies what we can reasonably present as a unique theoretical proposal. Here's my proposed rewrite. I've aimed at minimum word length. I've also included (commented out) keywords for a structured research abstract - these are just for us, as a guideline to improve the abstract. I think "criteria" is safer than "standards". Whether a principle is good or bad tends to lead to debate. Whether a criterion is satisfied or not is a more objective question, independent of whether you agree with the criterion or not. In the rewrite below, we propose a theoretical standard and show that the new standard can be satisfied. Maneage is *used as a tool* to prove that the standard is not too difficult to achieve. Maneage is no longer the subject of the paper. (That won't change the main body of the paper too much, apart from compression, but the way it's presented will have to change, under this proposal.) The title would need to match this. E.g. 
TITLE.1: Evidence that a higher standard of reproducibility criteria is attainable TITLE.2: Evidence that a rigorous standard of reproducibility criteria is attainable TITLE.3: Towards a more rigorous standard of reproducibility criteria I would probably go for TITLE.3. 01 May 2020, 02:25:08 UTC
842fd2f Abstract re-written to better highlight the uniqueness of Maneage This abstract is a first step in order to put more focus on the research aspects of Maneage. 01 May 2020, 01:10:03 UTC
ce485fc Removed Definition and Summary sections and low-level figures Given the very strict limits of journals, we needed to remove these sections and images. The removed images are: the `figure-file-architecture', `figure-src-topmake' and `figure-src-inputconf'. In total, with `wc' we now have 9019 words. This will be further reduced when we remove all the technical parts of the Maneage section; in short, we will only describe the generalities, not any specific details. 01 May 2020, 00:56:35 UTC
638ec52 Added interesting references by David David suggested some interesting references, in particular about the problems with Jupyter notebooks, that are now added to the long version of the paper. We'll later decide if/how they can be used. 30 April 2020, 23:39:39 UTC
b953465 Reactivated --host-cc config option to use host C compiler Until now, if GCC couldn't be built for any reason, Maneage would crash and the user had no way forward. Since GCC is complicated, this may happen, and it is frustrating to wait until the bug is fixed. Also, while debugging Maneage, when we know GCC has no problem, building it takes so long that it discourages testing. With this commit, we have re-activated the `--host-cc' option. It was already defined in the options of `./project', but its effect was nullified by hard-coding it to zero in the configure script on GNU/Linux systems. So with this commit that has been removed and the user can use their own C compiler on a GNU/Linux operating system also. Furthermore, to inform the user about this option and its usefulness, when GCC fails to build, a clear warning message is printed, instructing the user to post the problem as a bug and telling them how to continue building the project with the `--host-cc' option. 29 April 2020, 02:53:02 UTC
c778a69 Better explanation at the end of the configuration Until now, at the end of the configuration step, we would tell the user this: "To change the configuration later, please re-run './project configure', DO NOT manually edit the relevant files". However, as Boud suggested in Bug #58243, this is against our principle to encourage users to modify Maneage. With this commit, that explanation has been expanded by a few sentences to tell the users what to change and warn them in case they decide to change the build-directory. 28 April 2020, 02:23:34 UTC
4a53bd5 Astropy will no longer be installed by default Until now Gnuastro and Astropy were installed by default in any clean build of Maneage. Gnuastro is used to do the demonstration analysis that is reported in the paper and Astropy was just there to help in testing the building of the MANY tools it depends on! It (and its dependencies) also had several papers that helped show software citation. However, as Boud suggested in task #15619, the burden of installing them for a new user may be too much and any future changes will cause merge conflicts. It may also give the impression that Maneage is only/mainly written for astronomers. So with this commit, I am removing Astropy as a default target. But we can only remove Gnuastro after we include an alternative analysis in the demonstration `delete-me' files. Following Boud's suggestion in that task, `TARGETS.conf' was also added to the files to be ignored in any future merge (in the checklist of `README-hacking.md'). The solution was already described there, but mainly focused on the deleted `delete-me' files. So with this commit, I brought out this item as a more prominent item in the list. Maybe we can later add the analysis done in the Maneage paper (not yet published). In terms of testing the software builds, we already have task #15272 (Single target to build all high-level software, for testing) that aims to have a single configure option to install ALL high-level software and we can ask people to try if they like and report errors. 28 April 2020, 01:43:22 UTC
2fb0b2a Configuration bug fixed: other problematic software names from tarball Similar to the previous commit (e43e3291483699), following a change made yesterday in the identification of software names from their tarballs, a few other problematic names are corrected with this commit: `apr-util', HDF5, TeX Live's installation tarball and `rpcsvc-proto'. Even though we have visually checked the list of software, other unidentified similar cases may remain and will be fixed when found in practice. 27 April 2020, 23:39:35 UTC
e43e329 Configuration bug fixed: identify pkg-config from its tarball name Until Commit 3409a54 (from yesterday), pkg-config was found correctly in `reproduce/software/make/basic.mk` by searching for `pkg`. However, commit a21ea20 made an improvement in the regular expression for relating package names and download filenames, and the string `pkg-config` with the new regex no longer simplifies to `pkg`. The result of this was that basic.mk could not find `pkg-config` in the list of packages, since it was still listed as `pkg`. This blocked downloading for a system without pkg-config preloaded. With this commit (of just a few bytes), the bug is fixed. 27 April 2020, 23:14:52 UTC
d474d4c Aborting with informative error when GNU gettext not found Until now, we wouldn't explicitly check for GNU gettext. If it was present on the system, we would just add a link to it in Maneage's installation directory. However, in bug #58248, Boud noticed that Git (a basic software) actually needs it to complete its installation. Unfortunately we haven't had the time to include a build of Gettext in Maneage. Because it is available on most systems, the problem hasn't been reported often; it also has many dependencies, which make it a little time-consuming to install. So with this commit, we actually check for GNU gettext right after checking the compiler, and if it's not available an informative error message is written to inform the user of the problem, along with suggestions on fixing it (how to install GNU gettext from their package manager). 27 April 2020, 01:55:01 UTC
acb13b9 Thanked Fabrizio, Tamara and Nadia for their support They supported my visit and talk on Maneage at the Barcelona Super Computing center. They have also offered to read the paper and are providing comments. Also, I noticed that in the author list, we had forgotten to put an `,' after Boud's name. That is also corrected here. 26 April 2020, 23:35:16 UTC
a21ea20 Configuration: improved version separation from tarball name Until now, the sed script for determining URL download rules in the three software building Makefiles (`basic.mk', `high-level.mk' and `python.mk') considered package names such as `fftw-3...` and `fftw2-2.1...` to be identical. As the example above shows, this would make it hard to include some software that may have conflicting non-number names. With this commit, the SED script that is used to separate the version from the tarball name only matches numbers that are after a dash (`-'). It therefore considers `fftw-3...` and `fftw-2...` to be identical, but `fftw-3-...` and `fftw2-2.1...` to be different. As a result of this change, the `elif' check for some of the other programs like `m4', or `help2man' was also corrected in all three Makefiles. While doing this check on all the software, we noticed that `zlib-version' was repeated twice in `version.conf', so the duplicate was removed. It caused no complications, because both were the same number, but could lead to bugs later. 26 April 2020, 23:22:20 UTC
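Note on a21ea20: the rule described above (only a number that follows a dash starts the version) can be sketched in a few lines. This is an illustrative Python sketch, not the sed script used in the Makefiles; the function name `package_name` and the full tarball names in the usage lines are assumptions made up for the example.

```python
import re

# A version is assumed to start at the first dash that is immediately
# followed by a digit. Digits that are part of the name (fftw2, m4,
# help2man) are therefore kept as part of the package name.
_VERSION_AFTER_DASH = re.compile(r"-[0-9]")


def package_name(tarball):
    """Return the part of a tarball name before its version (hypothetical helper)."""
    match = _VERSION_AFTER_DASH.search(tarball)
    # If no "-<digit>" is found, leave the name untouched.
    return tarball[:match.start()] if match else tarball


# Hypothetical tarball names, chosen only to exercise the rule.
for name in ("fftw-3.3.8.tar.gz", "fftw2-2.1.5.tar.gz",
             "pkg-config-0.29.2.tar.gz", "m4-1.4.18.tar.gz"):
    print(name, "->", package_name(name))
```

Under this rule `pkg-config` keeps its full name, which is consistent with the follow-up fix in e43e329, where `basic.mk` was still searching for `pkg`.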
3409a54 README-hacking.md: described why automatic preparation only occurs once Recently (since Commit 7d0c5ef77), the preparation is not run automatically every time. It is only run automatically the first time and needs to be manually called with the `--prepare-redo' option. But this wasn't explained in `README-hacking.md' (currently the main documentation of Maneage). With this commit, a description about invoking the preparation process after the first attempt of the running project has been added to `README-hacking.md'. 26 April 2020, 17:23:52 UTC
d058b0c Corrected Gnuastro configuration directory in initialize.mk Recently (in Commit 8eb0892e) the Gnuastro configuration files moved under the "reproduce/analysis/config/gnuastro" directory (before that they were in `reproduce/software/config/gnuastro'). But this hadn't been reflected in the variable that defines this directory in `initialize.mk'. With this commit, the address of the Gnuastro configuration files directory is corrected, allowing Gnuastro programs to operate properly when they are used. 26 April 2020, 17:13:25 UTC
cb74bd9 verify-outputs.conf: typo correction in comment to avoid confusion Until now, the comment in the file said that setting the `verify-outputs` variable to `yes` disables the verification. Looking at `reproduce/analysis/make/verify.mk` shows that the opposite is true. With this commit, the word `disable` is replaced with `enable` so that the user is not confused by the conflict between the source code in the other file and this comment. 26 April 2020, 02:58:29 UTC
4d3db9a Configure.sh: build directory checked for ability to modify permissions Until now we only checked for the existence and write-ability of the build directory. But we recently discovered that if the specified build-directory is in a non-POSIX compatible partition (for example NTFS), permissions can't be modified and this can cause crashes in some programs (in particular, while building Perl, see [1]). The thing that makes this problem hard to identify is that on such partitions, `chmod' will still return 0 (so it was hard to find). With this commit, a check has been added after the user specifies the build-directory. If the proposed build directory is not able to handle permissions as expected, the configure script will not continue and will let the user know and will ask them for another directory. Also, the two printed characters at the start of error messages were changed to `**' (instead of `--'). When everything is good, we'll use `--' to tell the user that their given directory will be used as the build directory. And since there are multiple checks now, the final message to specify a new build directory is now moved to the end and not repeated in every check. [1] https://savannah.nongnu.org/support/?110220 26 April 2020, 01:46:32 UTC
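Note on 4d3db9a: because `chmod' can report success even when the partition ignores the request, a check of this kind has to read the permissions back and compare them. The following is a minimal Python sketch of that idea only; the actual check lives in the shell configure script and may differ, and the function name `can_modify_permissions` and the two test modes are assumptions for illustration.

```python
import os
import stat
import tempfile


def can_modify_permissions(directory):
    """Check whether permission changes in `directory` really take effect.

    Hypothetical helper: a scratch file is created, its mode is changed,
    and the mode is read back, since the return status of chmod alone is
    not a reliable signal on some non-POSIX partitions.
    """
    fd, path = tempfile.mkstemp(dir=directory)
    os.close(fd)
    try:
        # Two different modes, so at least one differs from the initial one.
        for mode in (0o644, 0o600):
            os.chmod(path, mode)
            if stat.S_IMODE(os.stat(path).st_mode) != mode:
                return False
        return True
    finally:
        os.remove(path)


if __name__ == "__main__":
    build_dir = tempfile.mkdtemp()  # stand-in for a user-given build directory
    print(can_modify_permissions(build_dir))
    os.rmdir(build_dir)
```

Reading the mode back, rather than trusting the exit status, is the design point that the commit message motivates.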
8430c9a Demonstration cloning URL set to https://git.maneage.org/project.git Until now, we were using GitLab as the main Git repository of Maneage. But today I finally set up our own Git repository under `git.maneage.org' and enabled a CGit web interface for simple and fast viewing of the commits and changes. Since this URL is under our own control, we can always ensure that it will point to somewhere meaningful, on any server, so in the long run it's much better than publishing the paper with an explicit reliance on `gitlab.com'. 25 April 2020, 03:58:45 UTC
d73a262 IMPORTANT: Primary Maneage repositories are now under maneage.org Until now, the primary Maneage URLs were under GitLab, but since we now have a dedicated URL and Git repository, it's better to transfer to this as soon as possible. Therefore with this commit, throughout Maneage, any place that Maneage was referenced through GitLab has been corrected. Please correct your project's remote to point to the new repository at `git.maneage.org/project.git', and please make sure it follows the `maneage' branch. There is no more `master' branch on Maneage. 25 April 2020, 03:50:08 UTC
b336c13 Typo 24 April 2020, 16:59:45 UTC
8bdf5f8 Minor edits on Boud's great corrections Reading over Boud's edits, I noticed a few other parts that I could summarize more and corrected one or two other parts to fit the original purpose of the sentence better. 23 April 2020, 22:05:09 UTC
6e2ea98 Conclusion Reduction by about 5 words. Although it's true that the low-level tools - make, bash, gcc - are still being actively developed, only expert users will tend to notice the differences, and in this context, it's probably more useful to point out that these are actively *maintained*. (Comment: I felt that the first sentence in the Conclusion is missing one of the obvious criteria for handling big data - citizen control so that big data could hopefully become less Orwellian than it is right now, with GAFAM having the main big data databases that are used by AI researchers and will tend to affect people's lives more than traditional "scientific" databases. But there's no point adding this here, since the criteria that tend to satisfy the scientific requirements ("principles") and citizens' rights tend to overlap to a fair degree...) 23 April 2020, 15:36:12 UTC
1de931d Discussion/caveats section. Reduction of about 50 words. There were a couple of expressions that look a bit like some sort of software/research analysis jargon, such as `Research Objects`, `Software Heritage`, `Machine actionable`. Unless these are defined, capitalising them makes the reader assume that there is some well-known formal meaning and that s/he has to search for that him/herself. As lower case expressions, the reader can guess some reasonable meanings of these. The word "embargo" was introduced for proposal 2) to handle the third caveat. 23 April 2020, 15:12:56 UTC
3c5ae2c Further edits to summarize the parts corrected by Boud [Compared to first submission to DSJ last week with 11436 words in raw PDF, we have decreased the paper by ~1000 words to 10493 :-)] As with the previous commits, the moment Boud changed the structure of sentences, I was able to find the redundancies and remove them! This is a fascinating feature of collaboration I had never felt before: it is so hard to find redundancies in my own raw text, but even a minor correction by someone else suddenly breaks my mental memories/barrier on the sentence, allowing me to be more critical of it! Anyway, besides such corrections, I fixed a few other things: 1) In the DSJ's recently published papers, there is no `~' between "Figure" and its number. 2) I noticed that in `tex/src/figure-src-inputconf.tex' I was actually using manually input strings for the filename, checksum and size! This was contrary to the whole philosophy of Maneage(!), I must have rushed and forgotten! So LaTeX variables are now defined and used. 23 April 2020, 02:54:26 UTC
e27634f 4.6 Project analysis - publication About 20 words less. The ArXiv URL is added - this adds no extra length in words, and some readers will not be familiar with ArXiv (although the COVID-19 pandemic has attracted attention to BiorXiv). 23 April 2020, 00:55:51 UTC
52e9ae4 4.5 Project analysis - multi-user Increase by 5 words. We don't need to give a big warning here, but "Permissions management" is meant to be a brief way of saying that whether or not different users can really read/write/execute in subdirectories will firstly depend on whether the user who cloned Maneage has handled these permissions correctly and whether s/he is able to allow others to edit in his/her subdirectories. Comment: Users would have to check who else is logged in at the time, who else is running jobs, and so on. On a supercomputer this might make sense, to avoid unnecessary recompiles. Anyway, this edit summary is not the place to discuss this... 23 April 2020, 00:36:43 UTC
e1adce4 4.4 Project analysis - git branches Reduction by 15 words. "Branch" is fine as a verb, and "off" is fine as a preposition; there's no need for a second preposition. "We branched off the main forest path onto a smaller path". 23 April 2020, 00:25:43 UTC
59715db 4.3.6 Project analysis - configure files Length reduction by about 15 words. A semantically significant change is from `leading to more robust scientific results` to `evolves in the case of exploratory research papers, and better self-consistency in hypothesis testing papers`. I said this in a previous commit, but it can't hurt repeating: In the covidian epoch (though not only), it is especially important to distinguish bayesian type exploratory research (typical in astronomy or searching for a good COVID-19 treatment or vaccine) from hypothesis testing (clinical testing in double-blind random access trials with clinical trials methods published on a public registry prior to the trials taking place). In the latter case, you want your results to be analysed consistently with the plan published before the trials even begin, and ideally you want them to be published (or at least posted on the trial registry website) even if your results are insignificant, to avoid a publication bias in favour of significant results. Test homeopathy against placebos in 1000 independent experiments, analyse them all the same way, and 2-3 experiments will be significant at the 3 sigma level... 22 April 2020, 23:50:56 UTC
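Note on 59715db: the "2-3 experiments ... at the 3 sigma level" figure can be checked with a short calculation. This sketch assumes Gaussian statistics, a two-sided test, independent experiments and no true effect; none of these assumptions is stated in the commit message, and the function names are made up for the example.

```python
import math


def two_sided_tail_probability(n_sigma):
    """Probability that a Gaussian variable deviates by more than n_sigma."""
    return math.erfc(n_sigma / math.sqrt(2.0))


def expected_false_positives(n_experiments, n_sigma):
    """Expected number of chance 'detections' among null experiments."""
    return n_experiments * two_sided_tail_probability(n_sigma)


if __name__ == "__main__":
    # Roughly 0.27% of null experiments exceed 3 sigma, so 1000 of them
    # give about 2.7 spurious detections on average.
    print(round(expected_false_positives(1000, 3.0), 2))
```

Under these assumptions the expected count is about 2.7, in line with the "2-3 experiments" quoted in the message.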
708ec3b 4.3.5 Project analysis - downloads Reduction by about 7 words. I added "internet security" as an extra reason for having all the downloads in a single file. Modularity and minimal complexity in themselves generally contribute to internet security, but in this case, it's obvious that having all the communication with the outside world managed through a single file makes internet security management much simpler. I replaced the "fake URL" by the real one, because at least in the present format, the URL fits in nicely. So both `paper.tex` and `tex/src/figure-src-inputconf.tex` are modified in this commit. 22 April 2020, 22:46:20 UTC
7bdbef3 4.3.4 Project analysis - the analysis itself Reduction by about 20 words - minor rewording. 22 April 2020, 22:29:26 UTC
2f16793 Acknowledged the help of Idafen in Maneage Idafen has helped in testing Maneage a lot during the last year and has provided very useful feedback and suggestions. 22 April 2020, 20:23:01 UTC
f990bba Applied further comments by Konrad Regarding Docker, Konrad pointed out that "Linux has an excellent track record for stability. It's more likely that the Docker itself becomes incompatible with older containers. Docker isn't developed for reproducibility after all". So I tried to modify that paragraph to include this important point too. In the process, I also shrank it a little more (without losing anything substantial), so it doesn't add to the paper's length. 22 April 2020, 18:49:31 UTC
76a2148 Minor edits to summarize section on project.tex and verify.tex After going through Boud's corrections, I thought it could be further summarized without losing any major point. 22 April 2020, 18:18:10 UTC
c9d6492 4.3.3 Project analysis - verification Reduction by 4 words. Minor rewording; removal of "Note that" and "simply" (the opposite of "complicatedly"). If a checksum is simple for a given user, then s/he already knows that; if s/he doesn't yet know what a checksum is, then stating that it's simple doesn't help very much. :) 22 April 2020, 17:09:49 UTC
085141f 4.3.2 Project analysis - values within text Reduction of about 15 words. The phrase "which does not need it" is removed. On its own, this is a claim, not an explanation. If the reader is wondering why `paper.tex` is not a produced file, then stating that the file is not needed will not help very much. Looking at the diagram will show that `paper.tex` is the overall article template; and the diagram strongly suggests that values from initialize.tex, ..., are passed into verify.tex, and from there into project.tex, which goes into paper.tex. The phrase "files, possibly in another subMakefile" should really be something like "files, possibly created by another subMakefile". But this would add more words, and given that the user has full control to modify and adapt the overall scheme (including making a mess of it), we can safely drop the info that the scheme can be made more complicated. :) 22 April 2020, 16:58:43 UTC
282be64 4.3.1 Project analysis - paper.pdf Only 3 words are reduced in this commit, but I think the improvements are worth it. "Note that" and "It is worth mentioning" are phrases still quite often used by academics (even in astronomy) that can be politely described as "pontification" or informally as "empty blabla"; these add no meaning except "I am teaching you something and I expect you to pay attention to what I am saying". :) There are also less polite descriptions. 22 April 2020, 16:14:10 UTC
8d88566 4.3 Project analysis intro Minor rewording of 4.3 Project analysis - introduction. Reduction of about 40 words. 4.2 `parallel` quote: s/http:/https:/ 22 April 2020, 16:08:31 UTC
f0622d8 Implemented Konrad's suggestions, minor edits here and there Today Konrad made the following suggestions after reading through the paper (created from Commit 1ac5c12). Thanks a lot Konrad ;-). I tried to address them all in this commit. Afterwards, while looking over the corrected parts, some minor edits occurred to me to remove redundant parts and add extra points where it helps. In particular, to be able to print the International Phonetic Alphabet (IPA), I had to include the LaTeX `TIPA' package, but it was interesting to see that it was already available in the project as a dependency of another package we loaded. 22 April 2020, 03:58:12 UTC
7d26642 README-hacking.md: removed any mention of tags Tags are not a fixed piece of history (they can easily be moved and not imported in a different repository), so they are only confusing in the context of Maneage (where people should branch-off the main project). The raw commit hashes are a much more robust way to store a precise moment in history. Before this commit, I removed all Tags from the main Git repositories of Maneage and thus removed any mention of Tags in `README-hacking.md'. Of course, whether a project decides to use tags is up to them, but we won't implement it in the main branch. 21 April 2020, 18:10:53 UTC
1bf94d0 README-hacking.md: minor clarifications in checklist Roberto Baena recently tried building a new project with Maneage and provided the following suggestions to make it clearer for a new user: 1) In the part where we talk about creating a Git repository, we should highlight that it must be empty. This is because some (for example GitLab) propose to include a `README' file. But if the project is not empty, Git will not allow pushing to it. 2) The `(can be done later)' comment was removed from "Delete dummy parts" to avoid confusion about applying some of them, but not others: if only some are done, it may cause problems in the build. 21 April 2020, 17:18:20 UTC
ad84e26 Configuration: current directory printed properly in stdout Until now, the message that we printed just before starting to build software didn't actually print the current directory, but only `pwd'. With this commit, this is fixed (it uses the `currentdir' variable that is already found before). 20 April 2020, 19:33:39 UTC
1f8fca2 Configuration: current directory printed properly in stdout Until now, the message that we printed just before starting to build software didn't actually print the current directory, but only `pwd'. With this commit, this is fixed (it uses the `currentdir' variable that is already found before). 20 April 2020, 19:29:19 UTC
ac7b82b README-hacking.md: Removed TeXLive year problem and numbered checklist We recently fixed the problem of TeXLive that hard-codes the year of its build in its installation directory. But the note on this problem was still kept in `README-hacking.md'. That part is now removed. Also, to help in following the checklist, it is now an ordered list. 20 April 2020, 18:30:33 UTC
69e7422 Added link to citation from GNU Parallel, slightly summarized it Boud previously pointed out that he couldn't find a reference to the citation, so I added it as a link over "its FAQ" (since it's described in its `doc/citation-notice-faq.txt' file). I also removed the first part of the quote which was not really necessary; the heart of the quote is the latter part that still remains. 20 April 2020, 17:51:26 UTC
0321773 Minor edits on Boud's corrections to merge I tried to make it slightly shorter, but I felt that it is important to keep the quote from GNU Parallel and in particular the financial aid it asks for. It will help readers feel the gravity of the situation for this software author. The precise citation of the quote was given in the long version. 20 April 2020, 05:03:56 UTC
00500f6 Minor copyedits - 4.2.2 software citation This reduces the length by about 70 words. The biggest change is to remove what looks like a citation from `parallel'. I couldn't find the citation in GNU parallel 20161222-1 (Debian/stretch), nor with search engines. I don't think that the quote is really so useful (even assuming it's a valid quote from somewhere): citation practices are a mix between ethics, preparation to convince referees, citing those who are already cited frequently, and the practicality of searching for and verifying references against the information for which they are used. Showing that Maneage makes citation not only easy, but more or less automatic, bypasses some of the compromises between practicality and ethics. 20 April 2020, 05:03:14 UTC
c710370 Minor copyedits - 4.2.1 source verification Minor rewording; a reduction of about 12 words. 20 April 2020, 05:03:14 UTC
6fed651 Minor copyedits - 4.2 intro configuration Minor edits - reduces about 17 words. 20 April 2020, 05:03:14 UTC
30cfbab Minor copyedits - 4.1 Maneage orchestration This commit reduces about 25 words from the 4.1 Maneage orchestration, aka `make`, section. 20 April 2020, 05:03:14 UTC
e7bf184 Minor edits to 4 Maneage intro This drops the word count in the introductory part of the Maneage section by about 15 words. 20 April 2020, 05:03:14 UTC
ae4142f Clarification on free software complementing reproducibility Thanks to Boud's corrections, I see that the sentence can be confusing and not convey the point I wanted to make properly, so I am clarifying it here. The main point is that this principle complements the definition of reproducibility, not the other principles. 20 April 2020, 04:14:18 UTC
1d72bf8 minor language edits These tiny language edits add 1 word in length. 20 April 2020, 04:14:18 UTC
c6372f4 Boud moved to third author, Lyon affiliation for Mohammad, minor edits Boud has contributed a lot to Maneage over the last few years and with the last few commits he also contributed significantly to this paper, so I am moving him to third author. Thanks to Boud, I also remembered that even though I did the most important parts of Maneage in Lyon, I hadn't added it as an affiliation for myself, so I added it. Maneage became a separate project in Lyon. Finally, I tried to decrease the length of the acknowledgments by adding some abbreviations that were shared between various parts. 20 April 2020, 04:02:24 UTC
564b91a boud authorship/affil/acknowl Unfortunately, adding in my name/affiliations/acknowledgments adds about 90 words to the text. We don't really know if these are counted by the editor in the 8000-word limit. I changed `funded' to `funded/supported'. I only get funding from one out of the three sources I acknowledge, but it's important to acknowledge all three. 20 April 2020, 02:19:34 UTC
ce1a5ee Minor edits in the text While looking over the PDF, a few small edits were made to be more clear. 20 April 2020, 02:14:21 UTC
be8481f Maneage instead of Template in README-hacking.md and copyright notices Until now, throughout Maneage we were using the old name of "Reproducible Paper Template". But we have finally decided to use Maneage, so to avoid confusion, the name has been corrected in `README-hacking.md' and also in the copyright notices. Note also that in `README-hacking.md', the main Maneage branch is now called `maneage', and the main Git remote has been changed to `https://gitlab.com/maneage/project' (this is a new GitLab Group that I have set up for all Maneage-related projects). In this repository there is only one `maneage' branch to avoid complications with the `master' branch of the projects using Maneage later. 20 April 2020, 00:07:49 UTC
3a56aac Imported the recent parallel works on the principles section The conflict was only on the list of existing tools and that was easily corrected. 19 April 2020, 19:54:10 UTC
bf6e876 Further summarized the principles section Following Boud's great corrections, I was able to further summarize this section, decreasing roughly 150 more words from this section. 19 April 2020, 19:48:20 UTC
6e667d9 List of existing tools made cleaner in LaTeX source Until now the list of existing tools was written in one line which made it hard to read and follow, especially since we added links. It is now expanded into one line per item, which makes no difference in the final PDF. 19 April 2020, 16:13:45 UTC
22f380a Principles - P7 FOSS Reduction by 15 words. 19 April 2020, 15:52:41 UTC
e8eef37 Principles - P6 Scalability Reduction by 7 words. For a regular GNU/Linux or other Unix-like system user, the bit about ISO C compilers even existing for Microsoft systems more or less says "despite there being no point ever trying to do science on a Microsoft system, you *could* hypothetically compile and run any ISO C program on it". Interesting, but not directly of interest to this user, who is unlikely to actually want to do it. A Microsoft user who thinks that s/he can do science on a Microsoft system will typically think "Microsoft is good, so of course I can run anything I want on it". So the message here could more likely be seen as provocative rather than useful, since this user is unaware of the fundamental problems of Microsoft as an authoritarian, manipulative, centralised organisation providing bad software. So either way, the parenthesis about Microsoft can be safely removed given the space constraints. 19 April 2020, 15:40:37 UTC
1d281bf Principles - P5 History and temporal provenance Reduction by 5 words. The term "exploratory research" is intended in the specific sense listed at en.Wikipedia: https://en.wikipedia.org/wiki/Exploratory_research to distinguish it from hypothesis testing. The final phases of clinical (medical) research, for example, to test whether a candidate SARS-CoV-2 vaccine is (i) effective and (ii) safe in homo sapiens, cannot accept the exploratory methods that are acceptable in astronomy, or in other exploratory research (which is acceptable in the early stages of medical research). Clinical trial registration is aimed at *preventing* scientists from modifying their methods in a given project: https://en.wikipedia.org/wiki/Clinical_trial_registration 19 April 2020, 15:30:14 UTC
e8f5b6a Principles - P4 verifiable inputs and outputs One superfluous word was removed. 19 April 2020, 15:22:59 UTC
4e9e145 Principles - P3 minimal complexity Minor wording changes - reduction by 10 words. 19 April 2020, 15:20:46 UTC
13d0a68 principles - P2 modularity Minor wording improvements; reduction by 10 words. 19 April 2020, 15:14:18 UTC
a133918 principles: all nouns For consistency, the principles should either all be nouns, or all be adjectives. Most are nouns, so this commit switches the adjectives to nouns. 19 April 2020, 15:03:53 UTC
6e97fdd Principles - P1 - Complete Compression by about 40 words. Updating python2 to python3 is often nothing more than modifying print statements, so removing this doesn't weaken the text by much. Re-creation helps avoid thinking of watching movies, going to the beach, reading a novel, when seeing the word "recreation": https://en.wiktionary.org/wiki/recreation#Usage_notes The matplotlib sentence was not so clear: now it's a bit shorter and hopefully clearer. 19 April 2020, 14:55:51 UTC
49cdb17 Principles intro Word-length reduction (8 words) of the first part of 3 Principles. Change in meaning: we can argue that *results* are not part of science, but science needs aims as well as methods; hypotheses are needed too, but these overlap between the aims and methods. So I put "primarily". 19 April 2020, 14:17:04 UTC
e682332 Clickable URLs for the 19 earlier reproducibility solutions In this commit, the URLs for the 19 "earlier solutions" at the beginning of "3 Principles" are recovered from tex/src/paper-long.tex and put behind the package names as clickable words. To reduce the chance that these are interpreted as references, "Project1 (yyy1), Project2 (yyy1)" is changed to "yyy1: Project1, Project2". We cannot add full references because of the 8000-word space constraint. With a minor word improvement, this commit overall reduces the word count very slightly, by 9, according to pdftotext paper.pdf |wc paper.txt before and after the commit. 19 April 2020, 13:58:51 UTC
d4fb323 Added arbitrarily complex to description of scalability Scalability is not just on the size of the project, but also its complexity, so I added an `and/or complex' to the description of the scalability principle. 18 April 2020, 17:49:26 UTC
c7969da Added Scalability as a principle, minor edits/clippings Someone reading the principles section until now would think that IPOL is an almost perfect solution, and for its use case it certainly is. However, this is only because of the nature of its work: it only focuses on algorithms, not usage/analysis which cannot be done in raw ISO C. So with this commit, I added a new principle on Scalability and discussed this limitation of IPOL there. To avoid simply lengthening the text, to add this new principle, I had to remove/summarize some parts that seemed redundant. In the process, I also removed some of the existing tools (at the start of the principles section) that had several others in the same time frame; I have already mentioned (through the "and many more") that this list is not complete. Also, the list of people to thank in the acknowledgments is now put one line per name to be more easily maintainable: Boud and Mohammad-reza were added, and given that I have sent the paper to several other people for feedback, I expect the list to get longer. 18 April 2020, 17:14:09 UTC
063b74c Minor language edits in paper.tex A few more minor language edits. For parsable vs parseable, see https://en.wiktionary.org/wiki/parsable which recommends `parsable` for formal usage. 18 April 2020, 16:09:17 UTC
9ac77d0 Minor language edits in paper.tex These are mostly minor language edits. There is one significant fix: the word `typically' in `a non-free software project typically cannot be distributed by the project'. There is a whole range of licences between strictly free software definition, strictly OSI open-source definition, and fully closed source. For example, software with a no-commercial usage licence (similar to CC-BY-NC) can be publicly redistributed on any server, as long as there is no requirement of payment or no requirement of payment that is "commercial" (according to lawyers' interpretation of when a payment is commercial). 18 April 2020, 16:07:26 UTC
83fea23 Two papers cited, for research software and data management plans These are important aspects that are highly relevant to Maneage: its philosophy (the former) and usability (the latter). To add them, I tried to summarize some other parts of the paper. 18 April 2020, 02:46:16 UTC
e9b55c9 Imported recent updates in Maneage, no conflicts There weren't any conflicts in this merge. 18 April 2020, 00:13:45 UTC
a323fe1 Corrected several instances of n't to not Three such cases and they are fixed. 18 April 2020, 00:04:19 UTC
c003a2d Edits in the text to make it shorter and fix a few mistakes A few minor issues were found and fixed in the text. I also tried to shorten it a little further. 17 April 2020, 23:51:20 UTC
3a49e2c Properly adding libiconv to the libraries that libstdc++ links with Of the GCC dynamically linked libraries we need to manually add RPATH to all and for `libstdc++' we also need to tell it to link with `libiconv'. Until now, the conditional to check for libstdc++ was not working and thus libiconv wasn't being added to it. With this commit the conditional has been corrected and is now working. Also, to help in reading the logs, an echo statement was added after every call to PatchELF. 17 April 2020, 23:34:20 UTC
d91813a Replaced name of directory under akhlaghi.org as backup server Until now, when the raw tarball of some software wasn't usable, I would put it under my own webpage, or `akhlaghi.org/reproduce-software'. That same address was also used as a backup server. However, now the project has a proper name: Maneage. So I changed the directory on my own server to `akhlaghi.org/maneage-software'. With this commit, this new address has replaced the old one. But to avoid crashes in projects that haven't yet merged with the main Maneage branch, the old `reproduce-software' still works (it's actually a symbolic link to the new directory now). 17 April 2020, 22:55:26 UTC