Content - 3afaded74b6f8d1678c8662334472067ff32c787 - 40805fe/vignettes/credible_interval.Rmd

visit type:
https://github.com/cran/bayestestR

08 March 2024, 11:09:09 UTC
Tip revision: 23ea3229abe72a5f23dcf3a4cfcd3478d744b536 authored by Dominique Makowski on 20 June 2019, 11:50:03 UTC
version 0.2.2
Tip revision: 23ea322
credible_interval.Rmd
---
title: "Credible Intervals (CI)"
output: 
  github_document:
    toc: true
    fig_width: 10.08
    fig_height: 6
  rmarkdown::html_vignette:
    toc: true
    fig_width: 10.08
    fig_height: 6
tags: [r, bayesian, posterior, test, ci, credible interval]
vignette: >
  \usepackage[utf8]{inputenc}
  %\VignetteIndexEntry{Credible Intervals (CI)}
  %\VignetteEngine{knitr::rmarkdown}
editor_options: 
  chunk_output_type: console
bibliography: bibliography.bib
---

This vignette can be referred to by citing the package:

- Makowski, D., Ben-Shachar M. S. \& Lüdecke, D. (2019). *Understand and Describe Bayesian Models and Posterior Distributions using bayestestR*. Available from https://github.com/easystats/bayestestR. DOI: [10.5281/zenodo.2556486](https://zenodo.org/record/2556486).

---

```{r message=FALSE, warning=FALSE, include=FALSE}
library(knitr)
options(knitr.kable.NA = '')
knitr::opts_chunk$set(comment=">")
options(digits=2)

set.seed(333)
```


# What is a *Credible* Interval?

Credible intervals are an important concept in Bayesian statistics. Its core purpose is to describe and summarise **the uncertainty** related to your parameters. In this regards, it could appear as quite similar to the frequentist **Confidence Intervals**. However, while their goal is similar, **their statistical definition annd meaning is very different**. Indeed, while the latter is obtained through a complex algorithm full of rarely-tested assumptions and approximations, the credible intervals are fairly straightforward to compute. 

As the Bayesian inference returns a distribution of possible effect values (the posterior), the credible interval is just the range containing a particular percentage of probable values. For instance, the 95\% credible interval simply is the central portion of the posterior distribution that contains 95\% of the values. 

Note that this drastically improve the interpretability of the Bayesian interval compared to the frequentist one. Indeed, the Bayesian framework allows to say *"given the observed data, the effect has 95% probability of falling within this range"*, while the frequentist less straightforward alternative (the 95\% ***Confidence* Interval**) would be "*there is a 95\% probability that when computing a confidence interval from data of this sort, the effect falls within this range*".


# Why is the default 89\%?

Naturally, when it came about choosing the CI level to report by default, **people started using 95\%**, the arbitrary convention used in the **frequentist** world. However, some authors suggested that 95\% might not be the most apppropriate for Bayesian posterior distributions, potentially lacking stability if not enough posterior samples are drawn [@kruschke2014doing]. 

The proposition was to use 90\% instead of 95\%. However, recently, McElreath (2014, 2018) suggested that if we were to use arbitrary tresholds in the first place, why not use 89\% as this value has the additional argument of being a prime number.

Thus, by default, the CIs are computed with 89\% intervals (`ci = 0.89`), deemed to be more stable than, for instance, 95\% intervals [@kruschke2014doing]. An effective sample size (ESS; see [here](https://easystats.github.io/bayestestR/reference/diagnostic_posterior.html)) of at least 10.000 is recommended if 95\% intervals should be computed (Kruschke, 2014, p. 183ff). Moreover, 89 is the highest **prime number** that does not exceed the already unstable 95\% threshold. What does it have to do with anything? *Nothing*, but it reminds us of the total arbitrarity of any of these conventions [@mcelreath2014rethinking, @mcelreath2018statistical].

# Different types of CIs

The reader might notice that `bayestestR` provides **two methods** to compute credible intervals, the **Highest Density Interval (HDI)** and the **Equal-tailed Interval (ETI)**, obtained via the quantile method. What is the difference? Let's see:

```{r warning=FALSE, message=FALSE}
library(bayestestR)
library(dplyr)
library(ggplot2)

# Generate a normal distribution
posterior <- distribution_normal(1000)

# Compute HDI and Quantile CI
ci_hdi <- hdi(posterior)
ci_eti <- ci(posterior)

# Plot the distribution and add the limits of the two CIs
posterior %>% 
  estimate_density(extend=TRUE) %>% 
  ggplot(aes(x=x, y=y)) +
  geom_area(fill="orange") +
  theme_classic() +
  # HDI in blue
  geom_vline(xintercept=ci_hdi$CI_low, color="royalblue", size=3) +
  geom_vline(xintercept=ci_hdi$CI_high, color="royalblue", size=3) +
  # Quantile in red
  geom_vline(xintercept=ci_eti$CI_low, color="red", size=1) +
  geom_vline(xintercept=ci_eti$CI_high, color="red", size=1)
```

> **These are exactly the same...**

But is it also the case for other types of distributions?

```{r warning=FALSE, message=FALSE}
library(bayestestR)
library(dplyr)
library(ggplot2)

# Generate a beta distribution
posterior <- distribution_beta(1000, 6, 2)

# Compute HDI and Quantile CI
ci_hdi <- hdi(posterior)
ci_eti <- ci(posterior)

# Plot the distribution and add the limits of the two CIs
posterior %>% 
  estimate_density(extend=TRUE) %>% 
  ggplot(aes(x=x, y=y)) +
  geom_area(fill="orange") +
  theme_classic() +
  # HDI in blue
  geom_vline(xintercept=ci_hdi$CI_low, color="royalblue", size=3) +
  geom_vline(xintercept=ci_hdi$CI_high, color="royalblue", size=3) +
  # Quantile in red
  geom_vline(xintercept=ci_eti$CI_low, color="red", size=1) +
  geom_vline(xintercept=ci_eti$CI_high, color="red", size=1)
```

> **The difference is strong with this one.**

Contrary to the **HDI**, for which all points within the interval have a higher probability density than points outside the interval, the **ETI** is **equal-tailed**. This means that a 90\% interval has 5\% of the distribution on either side of its limits. It indicates the 5th percentile and the 95h percentile. In symmetric distributions, the two methods of computing credible intervals, the ETI and the HDI, return similar results.

This is not the case for skewed distributions. Indeed, it is possible that parameter values in the ETI have lower credibility (are less probable) than parameter values outside the ETI. This property seems undesirable as a summary of the credible values in a distribution.

On the other hand, the ETI range does change when transformations are applied to the distribution (for instance, for a log odds scale to probabilities): the lower and higher bounds of the transformed distribution will correspond to the transformed lower and higher bounds of the original distribution. On the contrary, applying transformations to the distribution will change the resulting HDI.

# References