Raw File
Tip revision: 23ea3229abe72a5f23dcf3a4cfcd3478d744b536 authored by Dominique Makowski on 20 June 2019, 11:50:03 UTC
version 0.2.2
Tip revision: 23ea322

# bayestestR <img src='man/figures/logo.png' align="right" height="139" />


***Become a Bayesian master you will***

`bayestestR` is a lightweight package providing utilities to describe
posterior distributions and Bayesian models.

## Installation

Run the following:

``` r

``` r

## Documentation


Click on the buttons above to access the package
[**documentation**]( and the
[**easystats blog**](, and
check-out these vignettes:

#### Tutorials

  - [Get Started with Bayesian
  - [Example 1: Initiation to Bayesian
  - [Example 2: Confirmation of Bayesian
  - [Example 3: Become a Bayesian

#### Articles

  - [Credible Intervals
  - [Probability of Direction
  - [Region of Practical Equivalence
  - [Bayes Factors
  - [Comparison of
  - [Comparison of Indices of Effect
  - [Reporting

# Features

is the master function with which you can compute all of the indices
cited below *at once*.

``` r

## Point-estimates

### MAP Estimate

find the **Highest Maximum A Posteriori (MAP)** estimate of a posterior,
*i.e.,* the most probable value.

``` r
map_estimate(rnorm(1000, 1, 1))

![](man/figures/unnamed-chunk-6-1.png)<!-- -->

## Uncertainty

### Highest Density Interval (HDI) - The *Credible* Interval (CI)

computes the **Highest Density Interval (HDI)** of a posterior
distribution, *i.e.*, the interval which contains all points within the
interval have a higher probability density than points outside the
interval. The HDI can be used in the context of Bayesian posterior
characterisation as **Credible Interval (CI)**.

Unlike equal-tailed intervals (see
[ci]( that
typically exclude 2.5% from each tail of the distribution, the HDI is
*not* equal-tailed and therefore always includes the mode(s) of
posterior distributions.

By default, `hdi()` returns the 89% intervals (`ci = 0.89`), deemed to
be more stable than, for instance, 95% intervals (Kruschke, 2014). An
effective sample size of at least 10.000 is recommended if 95% intervals
should be computed (Kruschke 2014, p. 183ff). Moreover, 89 is the
highest prime number that does not exceed the already unstable 95%
threshold (McElreath, 2015).

``` r
hdi(rnorm(1000), ci = .89)

![](man/figures/unnamed-chunk-8-1.png)<!-- -->

## Null-Hypothesis Significance Testing (NHST)

### ROPE

computes the proportion (in percentage) of the HDI (default to the 89%
HDI) of a posterior distribution that lies within a region of practical

Statistically, the probability of a posterior distribution of being
different from 0 does not make much sense (the probability of it being
different from a single point being infinite). Therefore, the idea
underlining ROPE is to let the user define an area around the null value
enclosing values that are *equivalent to the null* value for practical
purposes (Kruschke 2010, 2011, 2014).

Kruschke (2018) suggests that such null value could be set, by default,
to the -0.1 to 0.1 range of a standardized parameter (negligible effect
size according to Cohen, 1988). This could be generalized: For instance,
for linear models, the ROPE could be set as `0 +/- .1 * sd(y)`. This
ROPE range can be automatically computed for models using the

Kruschke (2010, 2011, 2014) suggests using the proportion of the 95% (or
90%, considered more stable) HDI that falls within the ROPE as an index
for “null-hypothesis” testing (as understood under the Bayesian
framework, see

``` r
rope(rnorm(1000, 1, 1), range = c(-0.1, 0.1))

![](man/figures/unnamed-chunk-10-1.png)<!-- -->

### Equivalence test

a **Test for Practical Equivalence** based on the *“HDI+ROPE decision
rule”* (Kruschke, 2018) to check whether parameter values should be
accepted or rejected against an explicitly formulated “null hypothesis”
(*i.e.*, a

``` r
equivalence_test(rnorm(1000, 1, 1), range = c(-0.1, 0.1))

### Probability of Direction (*p*d)

computes the **Probability of Direction** (***p*d**, also known as the
Maximum Probability of Effect - *MPE*). It varies between 50% and 100%
and can be interpreted as the probability (expressed in percentage) that
a parameter (described by its posterior distribution) is strictly
positive or negative (whichever is the most probable). It is
mathematically defined as the proportion of the posterior distribution
that is of the median’s sign. Although differently expressed, this index
is fairly similar (*i.e.*, is strongly correlated) to the frequentist

**Relationship with the p-value**: In most cases, it seems that the *pd*
corresponds to the frequentist one-sided *p*-value through the formula
`p-value = (1-pd/100)` and to the two-sided *p*-value (the most commonly
reported) through the formula `p-value = 2*(1-pd/100)`. Thus, a `pd` of
`95%`, `97.5%` `99.5%` and `99.95%` corresponds approximately to a
two-sided *p*-value of respectively `.1`, `.05`, `.01` and `.001`. See
the [*reporting

``` r
p_direction(rnorm(1000, mean = 1, sd = 1))

![](man/figures/unnamed-chunk-13-1.png)<!-- -->

### Bayes Factor

computes the ratio between the density of a single value (typically the
null) in two distributions, typically the posterior vs. the prior
distributions. This method is used to examine if the hypothesis value is
less or more likely given the observed data.

``` r
prior <- rnorm(1000, mean = 0, sd = 1)
posterior <- rnorm(1000, mean = 1, sd = 0.7)

bayesfactor_savagedickey(posterior, prior, direction = "two-sided", hypothesis = 0)

![](man/figures/unnamed-chunk-15-1.png)<!-- -->

### MAP-based *p*-value

computes a Bayesian equivalent of the p-value, related to the odds that
a parameter (described by its posterior distribution) has against the
null hypothesis (*h0*) using Mills’ (2014, 2017) *Objective Bayesian
Hypothesis Testing* framework. It is mathematically based on the density
at the Maximum A Priori (MAP) and corresponds to the density value at 0
divided by the density of the MAP estimate.

``` r
p_map(rnorm(1000, 1, 1))

![](man/figures/unnamed-chunk-17-1.png)<!-- -->

## Utilities

### Find ROPE’s appropriate range

This function attempts at automatically finding suitable “default”
values for the Region Of Practical Equivalence (ROPE). Kruschke (2018)
suggests that such null value could be set, by default, to a range from
`-0.1` to `0.1` of a standardized parameter (negligible effect size
according to Cohen, 1988), which can be generalised for linear models to
`-0.1 * sd(y), 0.1 * sd(y)`. For logistic models, the parameters
expressed in log odds ratio can be converted to standardized difference
through the formula `sqrt(3)/pi`, resulting in a range of `-0.05` to

``` r

### Density Estimation

This function is a wrapper over different methods of density estimation.
By default, it uses the base R `density` with by default uses a
different smoothing bandwidth (`"SJ"`) from the legacy default
implemented the base R `density` function (`"nrd0"`). However, Deng &
Wickham suggest that `method = "KernSmooth"` is the fastest and the most

### Perfect Distributions

Generate a sample of size n with near-perfect distributions.

``` r
distribution(n = 10)

### Probability of a Value

Compute the density of a given point of a distribution.

``` r
density_at(rnorm(1000, 1, 1), 1)

## Credits

You can cite the package as following:

  - Makowski, D., Ben-Shachar M. S., & Lüdecke, D. (2019). *Understand
    and Describe Bayesian Models and Posterior Distributions using
    bayestestR*. Available from
back to top