https://gitlab.com/fkohrt/bachelorarbeit-code
Tip revision: 33bf772e64d81c6e230742ec1fd55cf66a428f8e authored by Florian Kohrt on 04 May 2022, 12:04:29 UTC
Provide more detailed instructions for running the experiment
README.md
<!--
SPDX-FileCopyrightText: 2021 Florian Kohrt
SPDX-License-Identifier: CC0-1.0
-->
# Bachelorarbeit Code
This repository contains my bachelor's thesis. It is developed online at <https://gitlab.com/fkohrt/bachelorarbeit-code> and archived in the Software Heritage universal source code archive.
## Usage
### Cloud
Follow this link to launch an interactive RStudio instance on Binder:
[Launch RStudio on mybinder.org](https://mybinder.org/v2/gl/fkohrt%2Fbachelorarbeit-code/main?urlpath=rstudio)
Now the document can be recreated by typing the following in the R console:
```r
# Render in a fresh R session so objects in the current workspace cannot affect the output
xfun::Rscript_call(
  rmarkdown::render,
  list(input = file.path("analysis", "paper", "paper.Rmd"))
)
```
Alternatively, navigate to the folder `analysis/paper/`, open the file `paper.Rmd` and click RStudio's _Knit_ button.
The same goes for the supplementary material, which has to be compiled after the main `paper.Rmd`:
```r
xfun::Rscript_call(
  rmarkdown::render,
  list(input = file.path("analysis", "paper", "supplementary.Rmd"))
)
```
### Local
This repository defines the environment it needs in the directory `binder/`, which contains configuration files understood by [repo2docker](https://repo2docker.readthedocs.io/). You can also reproduce the thesis without Docker or other containerization, although it may be harder. The configuration file `environment.yml` defines a [conda](https://anaconda.org/anaconda/conda) environment and lists most dependencies. Every entry that starts with `r-` is related to R and can usually be installed via `install.packages(c("<package-name>"))` (beware that some packages are capitalized differently). Some packages are set up in the `postBuild` file, so check that as well.
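To get an overview of which R packages the environment file asks for, the `r-` entries can be pulled out with standard tools. The following is a minimal sketch, not part of the repository: the helper name `list_r_packages` is hypothetical, it assumes one dependency per line in the form `- r-<name>` (optionally followed by a version pin), and it skips `r-base`, which is R itself rather than an installable R package.

```sh
# Hypothetical helper: print the names of all "r-" packages in a conda
# environment file, one per line, with the "r-" prefix and any version
# pin removed. Assumes one dependency per line ("- r-<name>[=version]").
list_r_packages() {
  # $1: path to the environment file
  sed -n 's/^[[:space:]]*-[[:space:]]*r-\([A-Za-z0-9._-]*\).*$/\1/p' "$1" |
    grep -vx 'base'   # r-base is the R interpreter, not an R package
}

# Example (assuming the file lives at binder/environment.yml):
#   list_r_packages binder/environment.yml
```

The names are printed as conda spells them, so the capitalization may still need adjusting before they are passed to `install.packages()`.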
Of course, you need to install **R** before installing R packages. R and most other system dependencies should be installable via `conda` or through your system's native package manager (R 3.x should do the job, but R 4.x is recommended in case you want to change model parameters as it's faster). In either case, you will also need to install Python 3.
If you use a Debian derivative, the following might work for you:
```sh
sudo apt-get install r-base librsvg2-dev pandoc libudunits2-dev libssl-dev libgdal-dev libmagick++-dev
```
On Fedora, try this:
```sh
sudo dnf install R-devel librsvg2-devel libcurl-devel pandoc udunits2-devel ImageMagick-c++-devel openssl-devel gdal-devel proj-devel sqlite-devel geos-devel
```
Other operating systems besides GNU/Linux should work as well, but I can't help with that.
Use the pip package manager to install [panflute](https://github.com/sergiocorreia/panflute) via `pip install panflute`.
Now you should be able to type `R` in your terminal and open the R console. Install the R packages as explained above and run the R code from `postBuild`. Leave the R console again with `quit()`.
Now run the following on your terminal:
```sh
git clone https://gitlab.com/fkohrt/bachelorarbeit-code.git
cd bachelorarbeit-code
Rscript -e 'rmarkdown::render(file.path("analysis", "paper", "paper.Rmd"))'
```
If you get the error `Could not find executable python`, create a symbolic link to your `python3` executable: `sudo ln -s /usr/bin/python3 /usr/bin/python`
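Before changing anything under `/usr/bin`, it can help to check which Python executables are already on the `PATH`. A small sketch using only POSIX shell features (the helper name `has_cmd` is made up for this example):

```sh
# Return success if the given command can be found on the PATH.
has_cmd() {
  command -v "$1" >/dev/null 2>&1
}

if has_cmd python; then
  echo "python resolves to: $(command -v python)"
elif has_cmd python3; then
  echo "only python3 found at: $(command -v python3); a link named python is needed"
else
  echo "no Python interpreter found; install Python 3 first"
fi
```

If the first branch is taken, no link is needed and the error probably has a different cause.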
### Full re-run
A full re-run has an expected runtime of 12 days. It is recommended to use a terminal multiplexer such as `tmux` together with the following command, which shows the progress on the console and writes it to a file at the same time:
```sh
Rscript -e 'rmarkdown::render(file.path("analysis", "paper", "paper.Rmd"))' |& tee log.txt
```
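The `|&` operator is shorthand available in Bash 4 and zsh; in other shells the same effect can be had with `2>&1 |`. The sketch below wraps this in a hypothetical helper `run_logged` and notes, as comments, how a `tmux` session could be started and reattached (the session name `rerun` is arbitrary).

```sh
# Start a named tmux session first, so the run survives a dropped connection:
#   tmux new-session -s rerun
# Detach with Ctrl-b d and reattach later with:
#   tmux attach -t rerun

# Hypothetical helper: run a command, show stdout and stderr on the
# console, and write both to a log file (portable form of `|& tee`).
# Note that the exit status is that of tee, not of the command itself.
run_logged() {
  # $1: log file; remaining arguments: the command to run
  log=$1
  shift
  "$@" 2>&1 | tee "$log"
}

# Example:
#   run_logged log.txt Rscript -e 'rmarkdown::render(file.path("analysis", "paper", "paper.Rmd"))'
```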
## Repository structure
Basic familiarity with [Git](https://git-scm.com/) is assumed; otherwise, see e.g. the chapter on [Version Control](https://the-turing-way.netlify.app/reproducible-research/vcs.html) in _The Turing Way's Guide for Reproducible Research_.
This repository hosts two related components: the `main` branch contains the thesis, and the `package` branch contains the **labEvolution** R package that is used for analyses. The `main` branch holds a fixed version of the package inside the `miniCRAN` folder, so the branches do not depend on each other.
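Because the two branches are independent, the package sources can be fetched on their own. A minimal sketch, assuming a reasonably recent `git`; the helper name `clone_branch` and the target directory in the example are made up for illustration:

```sh
# Hypothetical helper: clone a single branch of a repository.
# $1: repository URL or path, $2: branch name, $3: target directory
clone_branch() {
  git clone --branch "$2" --single-branch "$1" "$3"
}

# Example (not executed here):
#   clone_branch https://gitlab.com/fkohrt/bachelorarbeit-code.git package labEvolution-src
```

With `--single-branch`, only the history of the named branch is downloaded, so the thesis content on `main` is not fetched.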
The organization of files roughly follows Marwick et al. ([2018](https://oadoi.org/10.1080/00031305.2017.1375986)) and what the **[rrtools](https://github.com/benmarwick/rrtools)** R package would produce. The entry points for all computations are the documents they are used in; to understand the execution flow, it is therefore recommended to start with the two documents `paper.Rmd` and `supplementary.Rmd`.
## License
Most content is distributed under [CC0 1.0](https://creativecommons.org/publicdomain/zero/1.0/). As this repository conforms to the [REUSE Specification – Version 3.0](https://reuse.software/spec/), an SPDX document of all files detailing the licenses in use can be generated with the [reuse](https://reuse.readthedocs.io/) tool:
```sh
reuse spdx
```