# Precrec <img src="man/figures/logo.png" align="right" alt="" width="100" />

[![Travis](https://travis-ci.org/takayasaito/precrec.svg?branch=master)](https://travis-ci.org/takayasaito/precrec/)
[![AppVeyor Build
Status](https://ci.appveyor.com/api/projects/status/github/takayasaito/precrec?branch=master&svg=true)](https://ci.appveyor.com/project/takayasaito/precrec/)
[![codecov.io](https://codecov.io/github/takayasaito/precrec/coverage.svg?branch=master)](https://codecov.io/github/takayasaito/precrec?branch=master)
[![CodeFactor](https://www.codefactor.io/repository/github/takayasaito/precrec/badge)](https://www.codefactor.io/repository/github/takayasaito/precrec/)
[![CRAN\_Status\_Badge](https://www.r-pkg.org/badges/version-ago/precrec)](https://cran.r-project.org/package=precrec)
[![CRAN\_Logs\_Badge](https://cranlogs.r-pkg.org/badges/grand-total/precrec)](https://cran.r-project.org/package=precrec)

The aim of the `precrec` package is to provide an integrated platform
that enables robust performance evaluation of binary classifiers.
Specifically, `precrec` offers accurate calculations of ROC (Receiver
Operating Characteristic) and precision-recall curves. All the main
calculations of `precrec` are implemented in C++ via
[Rcpp](https://cran.r-project.org/package=Rcpp).

## Documentation

  - [Package website](https://takayasaito.github.io/precrec/) – GitHub
    pages that contain all precrec documentation.

  - [Introduction to
    precrec](https://takayasaito.github.io/precrec/articles/introduction.html)
    – a package vignette that contains the descriptions of the functions
    with several useful examples. View the vignette with
    `vignette("introduction", package = "precrec")` in R. The HTML
    version is also available on the [GitHub
    Pages](https://takayasaito.github.io/precrec/articles/introduction.html).

  - [Help pages](https://takayasaito.github.io/precrec/reference/) – all
    the functions and S3 generics (except `print`) have their own help
    pages with plenty of examples. View the main help
    page with `help(package = "precrec")` in R. The HTML version is also
    available on the [GitHub
    Pages](https://takayasaito.github.io/precrec/reference/).

## Six key features of precrec

### 1\. Accurate curve calculations

`precrec` provides accurate precision-recall curves.

  - Non-linear interpolation
  - Elongation to the y-axis to estimate the first point when necessary
  - Use of score-wise threshold values instead of fixed bins

`precrec` also calculates AUC scores with high accuracy.
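
For instance, a minimal sketch with made-up scores and labels (the
values below are purely illustrative):

``` r
library(precrec)

# Made-up scores and labels with a tied score (illustrative values only)
scores <- c(0.9, 0.7, 0.7, 0.4, 0.2)
labels <- c(1, 1, 0, 0, 1)

# Calculate ROC and precision-recall curves
sscurves <- evalmod(scores = scores, labels = labels)

# Retrieve the AUCs of both curves as a data frame
auc(sscurves)
```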

### 2\. Super fast

`precrec` calculates curves in a matter of seconds even for a fairly
large dataset. It is much faster than most other tools that calculate
ROC and precision-recall curves.

### 3\. Various evaluation metrics

In addition to precision-recall and ROC curves, `precrec` offers basic
evaluation measures.

  - Error rate
  - Accuracy
  - Specificity
  - Sensitivity, true positive rate (TPR), recall
  - Precision, positive predictive value (PPV)
  - Matthews correlation coefficient
  - F-score
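
These measures are available through the `mode` argument of `evalmod`.
A minimal sketch with the bundled `P10N10` dataset, assuming
`mode = "basic"` selects the basic measures:

``` r
library(precrec)
library(ggplot2)

# Load a test dataset
data(P10N10)

# mode = "basic" requests the basic evaluation measures instead of curves
mmpoints <- evalmod(scores = P10N10$scores, labels = P10N10$labels,
                    mode = "basic")

# Plot the basic measures (as.data.frame(mmpoints) gives a plain data frame)
autoplot(mmpoints)
```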

### 4\. Confidence interval band

`precrec` calculates confidence intervals when multiple test sets are
given. It automatically shows confidence bands around the averaged curve
in the corresponding plot.
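
A minimal sketch, assuming `create_sim_samples(10, 100, 100, "good_er")`
generates ten simulated test sets for the built-in `"good_er"`
performance level, that `evalmod` exposes the confidence level via
`cb_alpha`, and that `autoplot` accepts `show_cb`:

``` r
library(precrec)
library(ggplot2)

# Ten simulated test sets for the built-in "good_er" performance level
samps <- create_sim_samples(10, 100, 100, "good_er")

# Combine the test sets into a single input object
mdat <- mmdata(samps$scores, samps$labels,
               modnames = samps$modnames, dsids = samps$dsids)

# Calculate averaged curves; cb_alpha = 0.05 gives 95% confidence bounds
smcurves <- evalmod(mdat, cb_alpha = 0.05)

# Show the averaged curves with confidence bands
autoplot(smcurves, show_cb = TRUE)

# Confidence intervals of the AUC scores
auc_ci(smcurves)
```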

### 5\. Calculation of partial AUCs and visualization of partial curves

`precrec` calculates partial AUCs for specified x and y ranges. It can
also draw partial ROC and precision-recall curves for the specified
ranges.
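
A minimal sketch with the bundled `P10N10` dataset, assuming `part`
takes the ranges via `xlim` and `ylim` (the ranges below are arbitrary):

``` r
library(precrec)
library(ggplot2)

# Calculate the full curves first
data(P10N10)
sscurves <- evalmod(scores = P10N10$scores, labels = P10N10$labels)

# Restrict the curves to x in [0, 0.25] and y in [0.5, 1]
sscurves_part <- part(sscurves, xlim = c(0, 0.25), ylim = c(0.5, 1))

# Data frame with the partial AUC scores
pauc(sscurves_part)

# Draw the partial ROC and precision-recall curves
autoplot(sscurves_part)
```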

### 6\. Supporting functions

`precrec` provides several useful functions that are missing from most
other evaluation tools.

  - Handling multiple models and multiple test sets
  - Handling tied scores and missing scores
  - Pre- and post-processing functions for simple data preparation and
    curve analysis
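
For example, a brief sketch combining two models evaluated on the same
test set (the score and label vectors are made up for illustration):

``` r
library(precrec)

# Scores of two models for the same five instances (illustrative values)
s1 <- c(0.9, 0.8, 0.6, 0.4, 0.3)
s2 <- c(0.8, 0.7, 0.7, 0.5, 0.2)
scores <- join_scores(s1, s2)

# The observed labels, repeated once per model
l1 <- c(1, 1, 0, 1, 0)
labels <- join_labels(l1, l1)

# Reformat the input and evaluate both models at once
mdat <- mmdata(scores, labels, modnames = c("model1", "model2"))
mscurves <- evalmod(mdat)
```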

## Installation

  - Install the release version of `precrec` from CRAN with
    `install.packages("precrec")`.

  - Alternatively, you can install a development version of `precrec`
    from [our GitHub
    repository](https://github.com/takayasaito/precrec/). To install it:
    
    1.  Make sure you have a working development environment.
        
          - **Windows**: Install Rtools (available on the CRAN website).
          - **Mac**: Install Xcode from the Mac App Store.
          - **Linux**: Install a compiler and various development
            libraries (details vary across different flavors of Linux).
    
    2.  Install `devtools` from CRAN with
        `install.packages("devtools")`.
    
    3.  Install `precrec` from the GitHub repository with
        `devtools::install_github("takayasaito/precrec")`.

## Functions

The `precrec` package provides the following six functions.

| Function             | Description                                                |
| :------------------- | :--------------------------------------------------------- |
| evalmod              | Main function to calculate evaluation measures             |
| mmdata               | Reformat input data for performance evaluation calculation |
| join\_scores         | Join scores of multiple models into a list                 |
| join\_labels         | Join observed labels of multiple test datasets into a list |
| create\_sim\_samples | Create random samples for simulations                      |
| format\_nfold        | Create n-fold cross-validation datasets from a data frame  |

Moreover, the `precrec` package provides nine S3 generics for the S3
objects created by the `evalmod` function. **N.B.** S3 objects and S3
generic functions are part of the most basic object-oriented system in
R.

| S3 generic    | Package  | Description                                                    |
| :------------ | :------- | :------------------------------------------------------------- |
| print         | base     | Print the calculation results and the summary of the test data |
| as.data.frame | base     | Convert a precrec object to a data frame                       |
| plot          | graphics | Plot performance evaluation measures                           |
| autoplot      | ggplot2  | Plot performance evaluation measures with ggplot2              |
| fortify       | ggplot2  | Prepare a data frame for ggplot2                               |
| auc           | precrec  | Make a data frame with AUC scores                              |
| part          | precrec  | Calculate partial curves and partial AUC scores                |
| pauc          | precrec  | Make a data frame with pAUC scores                             |
| auc\_ci       | precrec  | Calculate confidence intervals of AUC scores                   |

## Examples

The following two examples show the basic usage of `precrec` functions.

### ROC and Precision-Recall calculations

The `evalmod` function calculates ROC and Precision-Recall curves and
returns an S3 object.

``` r
library(precrec)

# Load a test dataset
data(P10N10)

# Calculate ROC and Precision-Recall curves
sscurves <- evalmod(scores = P10N10$scores, labels = P10N10$labels)
```
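
The returned object can then be passed to the S3 generics listed above,
for example:

``` r
# Show the calculation results and a summary of the test data
sscurves

# AUC scores as a data frame
auc(sscurves)

# Convert the whole result to a data frame
as.data.frame(sscurves)
```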

### Visualization of the curves

The `autoplot` function outputs ROC and Precision-Recall curves by using
the `ggplot2` package.

``` r
# The ggplot2 package is required 
library(ggplot2)

# Show ROC and Precision-Recall plots
autoplot(sscurves)
```
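
The base `plot` generic listed above offers a non-ggplot2 alternative:

``` r
# Base graphics version of the ROC and Precision-Recall plots
plot(sscurves)
```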

![](https://rawgit.com/takayasaito/precrec/master/README_files/figure-markdown_github/unnamed-chunk-2-1.png)

## Citation

*Precrec: fast and accurate precision-recall and ROC curve calculations
in R*

Takaya Saito; Marc Rehmsmeier

Bioinformatics 2017; 33 (1): 145-147.

doi:
[10.1093/bioinformatics/btw570](https://doi.org/10.1093/bioinformatics/btw570)

## External links

  - [Classifier evaluation with imbalanced
    datasets](https://classeval.wordpress.com/) - our website that
    contains several pages of useful tips for the performance evaluation
    of binary classifiers.

  - [The Precision-Recall Plot Is More Informative than the ROC Plot
    When Evaluating Binary Classifiers on Imbalanced
    Datasets](https://doi.org/10.1371/journal.pone.0118432) - our paper
    that summarizes the potential pitfalls of ROC plots with imbalanced
    datasets and the advantages of using precision-recall plots instead.

  - [Advanced R](https://adv-r.hadley.nz) and [R
    packages](https://r-pkgs.org) - websites of two of Hadley Wickham’s
    books, which we used as references for the basic structure and
    coding style of `precrec`.