Skip to main content
  • Home
  • Development
  • Documentation
  • Donate
  • Operational login
  • Browse the archive

swh logo
SoftwareHeritage
Software
Heritage
Archive
Features
  • Search

  • Downloads

  • Save code now

  • Add forge now

  • Help

  • 7f03a19
  • /
  • inst
  • /
  • doc
  • /
  • indicesEstimationComparison.Rmd
Raw File Download

To reference or cite the objects present in the Software Heritage archive, permalinks based on SoftWare Hash IDentifiers (SWHIDs) must be used.
Select below a type of object currently browsed in order to display its associated SWHID and permalink.

  • content
  • directory
content badge Iframe embedding
swh:1:cnt:be63e3cf46e233659b5797f2bf849009aafb1024
directory badge Iframe embedding
swh:1:dir:0602c377bfda199c198ed559ff1f70b0ebcb7ceb

This interface enables to generate software citations, provided that the root directory of browsed objects contains a citation.cff or codemeta.json file.
Select below a type of object currently browsed in order to generate citations for them.

  • content
  • directory
Generate software citation in BibTex format (requires biblatex-software package)
Generating citation ...
Generate software citation in BibTex format (requires biblatex-software package)
Generating citation ...
indicesEstimationComparison.Rmd
---
title: "In-Depth 1: Comparison of Point-Estimates"
output: 
  github_document:
    toc: true
    fig_width: 10.08
    fig_height: 6
  rmarkdown::html_vignette:
    toc: true
    fig_width: 10.08
    fig_height: 6
tags: [r, bayesian, posterior, test]
vignette: >
  \usepackage[utf8]{inputenc}
  %\VignetteIndexEntry{In-Depth 1: Comparison of Point-Estimates}
  %\VignetteEngine{knitr::rmarkdown}
editor_options: 
  chunk_output_type: console
---

This vignette can be referred to by citing the package:

- Makowski, D. \& Lüdecke, D. (2019). *Understand and Describe Bayesian Models and Posterior Distributions using BayestestR*. Available from https://github.com/easystats/bayestestR. DOI: [10.5281/zenodo.2556486](https://zenodo.org/record/2556486).



# Effect Point-Estimates in the Bayesian Framework


## Introduction 

One of the main difference between the Bayesian and the frequentist frameworks is that the former returns a probability distribution of each effect (*i.e.*, parameter of interest of a model, such as a regression slope) instead of a single value. However, there is still a need and demand, for reporting or use in further analysis, for a single value (**point-estimate**) that best characterise the underlying posterior distribution.

There are three main indices used in the literature for effect estimation: the **mean**, the **median** or the **MAP** (Maximum A Posteriori) estimate (roughly corresponding to the mode (the "peak") of the distribution). Unfortunately, there is no consensus about which one to use, as no systematic comparison has ever been done.

In the present work, we will compare these three point-estimates of effect between themselves, as well as with the widely known **beta**, extracted from a comparable frequentist model. With this comparison, we expect to draw bridges and relationships between the two frameworks, helping and easing the transition for the public.

## Methods

### Simulate Regression Data with Noise

We simulated 36000 Bayesian and frequentist linear regression models with systematic variation of "true" effect (1 or 0), sample size (20, 40, 60), levels of noise (0.1, 2.5, 5), and priors (correct - same as true effect or incorrect).

```{r message=FALSE, warning=FALSE}
library(ggplot2)
library(dplyr)
library(tidyr)

df <- read.csv("https://raw.github.com/easystats/circus/master/data/bayes_indices.csv")
```

For the sake of time and computational space, we downloaded the data from github. However, you can find the code to generate it [here](https://easystats.github.io/circus/articles/bayesian_indices.html).




## Results

### Relationship with the Theoretical True Effect

We subtracted the theoretical "true" effect (0 or 1) from the indices, to observe the influence of noise, sample size and priors. 


#### Sensitivity to Noise

```{r, message=FALSE, warning=FALSE}
df %>%
  select(noise, beta, prior_correct, effect, median, mean, map) %>%
  gather(estimate, value, -noise, -effect, -prior_correct) %>%
  mutate(noise = as.factor(noise),
         prior_correct = as.factor(prior_correct),
         value = value-effect) %>%
  ggplot(aes(x = noise, y = value, fill = estimate, color=prior_correct)) +
  geom_boxplot() +
  geom_hline(yintercept = 0) +
  theme_classic() +
  scale_fill_manual(values = c("beta" = "#607D8B", "map" = "#795548", "mean" = "#FF9800", "median" = "#FFEB3B"),
                    name = "Index") +
  scale_color_manual(values = c(`0`="#f44336", `1`="#8BC34A"),
                     name = "Correct Prior") +
  xlab("Point-estimate of the true value 0\n") +
  ylab("\nNoise") +
  coord_cartesian(ylim=c(-1, 1))
```


#### Sensitivity to Sample Size

```{r, message=FALSE, warning=FALSE}
df %>%
  select(sample_size, beta, effect, prior_correct, median, mean, map) %>%
  gather(estimate, value, -sample_size, -effect, -prior_correct) %>%
  mutate(sample_size = as.factor(sample_size),
         prior_correct = as.factor(prior_correct),
         value = value-effect) %>%
  ggplot(aes(x = sample_size, y = value, fill = estimate, color=prior_correct)) +
  geom_boxplot() +
  geom_hline(yintercept = 0) +
  theme_classic() +
  scale_fill_manual(values = c("beta" = "#607D8B", "map" = "#795548", "mean" = "#FF9800", "median" = "#FFEB3B"),
                    name = "Index") +
  scale_color_manual(values = c(`0`="#f44336", `1`="#8BC34A"),
                     name = "Correct Prior") +
  ylab("Point-estimate of the true value 0\n") +
  xlab("\nSample Size") +
  coord_cartesian(ylim=c(-1, 1))
```



#### Statistical Modelling

We fitted a (frequentist) multiple linear regression to statistically test the effects and interactions of noise, sample size and priors on the effect estimates.

```{r, message=FALSE, warning=FALSE}
df %>%
  select(sample_size, beta, effect, prior_correct, median, mean, map, noise) %>%
  gather(estimate, value, -sample_size, -effect, -prior_correct, -noise) %>%
  mutate(noise= scale(noise),
         sample_size = scale(sample_size),
         prior_correct = as.factor(prior_correct)) %>%
  glm(effect ~ estimate/value * noise * sample_size * prior_correct, data=., family="binomial") %>%
  broom::tidy() %>%
  select(term, estimate, p=p.value) %>%
  filter(stringr::str_detect(term, 'value')) %>%
  mutate(term = stringr::str_remove(term, ":value"),
         term = stringr::str_remove(term, "estimate"),
         p = ifelse(p < .001, "< .001***", ifelse(p < .01, "< .01**", ifelse(p < .05, "< .05*", "> .05")))) %>%
  filter(stringr::str_detect(term, ':')) %>%
  knitr::kable(digits=2)
```





### Relationship with the Frequentist Beta

In the next section, we will compare the three Bayesian indices with the frequentist beta.


```{r, message=FALSE, warning=FALSE}
df %>%
  select(sample_size, beta, effect, prior_correct, median, mean, map, noise) %>%
  gather(estimate, value, -beta, -sample_size, -effect, -prior_correct, -noise) %>%
  mutate(effect = as.factor(effect),
         sample_size = as.factor(sample_size),
         estimate = factor(estimate, levels=c("mean", "median", "map"))) %>%
  ggplot(aes(x = beta, y = value, color = effect, shape=sample_size)) +
  geom_point(alpha=0.05) +
  facet_wrap(~estimate, scales = "free") +
  theme_classic() +
  theme(strip.background = element_blank()) +
  scale_color_manual(values = c(`0` = "#f44336", `1` = "#8BC34A"), name="Effect") +
  guides(colour = guide_legend(override.aes = list(alpha = 1)),
         shape = guide_legend(override.aes = list(alpha = 1), title="Sample Size"))
```

## Discussion

Conclusions can be found in the [guidelines section](https://easystats.github.io/bayestestR/articles/guidelines.html).

back to top

Software Heritage — Copyright (C) 2015–2025, The Software Heritage developers. License: GNU AGPLv3+.
The source code of Software Heritage itself is available on our development forge.
The source code files archived by Software Heritage are available under their own copyright and licenses.
Terms of use: Archive access, API— Content policy— Contact— JavaScript license information— Web API