Revision 94a7e298b1a50d93e8a9ccb813a070f7b30f3da1 authored by Christian Thiele on 21 March 2018, 08:27:24 UTC, committed by cran-robot on 21 March 2018, 08:27:24 UTC
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/optimize_metric.R
\title{Optimize a metric function in binary classification after bootstrapping}
maximize_boot_metric(data, x, class, metric_func = youden, pos_class = NULL,
neg_class = NULL, direction, summary_func = mean, boot_cut = 50,
inf_rm = TRUE, tol_metric, use_midpoints, ...)
minimize_boot_metric(data, x, class, metric_func = youden, pos_class = NULL,
neg_class = NULL, direction, summary_func = mean, boot_cut = 50,
inf_rm = TRUE, tol_metric, use_midpoints, ...)
\item{data}{A data frame or tibble in which the columns that are given in x
and class can be found.}
\item{x}{(character) The variable name to be used for classification,
e.g. predictions or test values.}
\item{class}{(character) The variable name indicating class membership.}
\item{metric_func}{(function) A function that computes a single-number
metric to be maximized or minimized. See description.}
\item{pos_class}{The value of class that indicates the positive class.}
\item{neg_class}{The value of class that indicates the negative class.}
\item{direction}{(character) Use ">=" or "<=" to select whether an x value
>= or <= the cutoff predicts the positive class.}
\item{summary_func}{(function) After obtaining the bootstrapped optimal
cutpoints this function, e.g. mean or median, is applied to arrive at a single cutpoint.}
\item{boot_cut}{(numeric) Number of bootstrap repetitions. The optimal
cutpoints from these repetitions are combined by \code{summary_func}.}
\item{inf_rm}{(logical) Whether to remove infinite cutpoints before
calculating the summary.}
\item{tol_metric}{All cutpoints that lead to a metric value in the interval
[m_max - tol_metric, m_max + tol_metric], where m_max is the maximum
achievable metric value, will be passed to \code{summary_func}. This can be
used to return multiple decent cutpoints and to avoid floating-point problems.}
\item{use_midpoints}{(logical) If TRUE (default FALSE), the returned optimal
cutpoint will be the mean of the optimal cutpoint and the next highest
observation (for direction = ">=") or the next lowest observation
(for direction = "<="), which avoids biasing the optimal cutpoint.}
\item{...}{To capture further arguments that are always passed to the method
function by cutpointr. The cutpointr function passes data, x, class,
metric_func, direction, pos_class and neg_class to the method function.}
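The role of \code{tol_metric} can be illustrated with a small base R sketch. The cutpoints and metric values below are made up for illustration, and the selection rule is written directly from the argument description rather than taken from the package internals.

```r
# Made-up candidate cutpoints and their metric values (illustration only).
cuts <- c(1, 2, 3, 4, 5)
metric_vals <- c(0.50, 0.70, 0.80, 0.79, 0.60)
tol_metric <- 0.02

# Keep every cutpoint whose metric lies within tol_metric of the maximum.
m_max <- max(metric_vals)
near_optimal <- cuts[abs(metric_vals - m_max) <= tol_metric]
near_optimal

# A summary function such as mean then reduces them to a single cutpoint.
mean(near_optimal)
```

With \code{tol_metric = 0} only exact ties with the maximum would be retained, which is where floating-point differences can matter.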
A tibble with the column \code{optimal_cutpoint}.
Given a function for computing a metric in \code{metric_func}, these functions
bootstrap the data \code{boot_cut} times and
maximize or minimize the metric by selecting an optimal cutpoint. The returned
optimal cutpoint is the result of applying \code{summary_func}, e.g. the mean,
to all optimal cutpoints that were determined in the bootstrap samples.
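The procedure described above can be sketched in base R. This is a simplified illustration, not the cutpointr implementation: the helper names (\code{youden_sketch}, \code{best_cutpoint}, \code{boot_cutpoint_sketch}) are hypothetical, the direction is fixed to ">=", and only maximization is shown.

```r
# Sketch only: hypothetical helpers, not the cutpointr internals.
# Youden index computed from counts (sensitivity + specificity - 1).
youden_sketch <- function(tp, fp, tn, fn) {
  tp / (tp + fn) + tn / (tn + fp) - 1
}

# Find the cutpoint that maximizes metric_func, assuming direction ">=":
# observations with x >= cutpoint are predicted to be positive.
best_cutpoint <- function(x, is_pos, metric_func) {
  cuts <- c(-Inf, sort(unique(x)), Inf)
  m <- vapply(cuts, function(cp) {
    pred_pos <- x >= cp
    metric_func(tp = sum(pred_pos & is_pos), fp = sum(pred_pos & !is_pos),
                tn = sum(!pred_pos & !is_pos), fn = sum(!pred_pos & is_pos))
  }, numeric(1))
  cuts[which.max(m)]
}

# Resample with replacement boot_cut times, optimize in each resample,
# then summarize the finite optimal cutpoints (mirroring inf_rm = TRUE).
# Assumes every resample contains both classes.
boot_cutpoint_sketch <- function(x, is_pos, metric_func = youden_sketch,
                                 summary_func = mean, boot_cut = 50) {
  boot_cuts <- vapply(seq_len(boot_cut), function(i) {
    idx <- sample.int(length(x), replace = TRUE)
    best_cutpoint(x[idx], is_pos[idx], metric_func)
  }, numeric(1))
  summary_func(boot_cuts[is.finite(boot_cuts)])
}

# Simulated data: negatives centered at 0, positives centered at 2.
set.seed(1)
x <- c(rnorm(100, mean = 0), rnorm(100, mean = 2))
is_pos <- rep(c(FALSE, TRUE), each = 100)
res <- boot_cutpoint_sketch(x, is_pos, boot_cut = 30)
res
```

Passing \code{summary_func = median} instead of the default would make the result less sensitive to a few extreme bootstrap cutpoints.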
The \code{metric_func} function should accept the following inputs:
\item \code{tp}: vector of number of true positives
\item \code{fp}: vector of number of false positives
\item \code{tn}: vector of number of true negatives
\item \code{fn}: vector of number of false negatives
The above inputs are obtained by using all unique values in \code{x}, \code{Inf}, and
\code{-Inf} as possible cutpoints for classifying the variable in \code{class}.
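As a rough illustration of these inputs, the following base R sketch builds the four count vectors for a toy data set and passes them to a user-written metric. The function \code{f1_sketch} and the toy data are hypothetical, direction ">=" is assumed, and returning a plain numeric vector is an assumption; the metric functions shipped with cutpointr may use a different return format.

```r
# Hypothetical metric: F1 score, vectorized over all candidate cutpoints.
# tn is accepted for a uniform interface but not needed for F1.
f1_sketch <- function(tp, fp, tn, fn, ...) {
  2 * tp / (2 * tp + fp + fn)
}

# Toy data: predictor values and class membership.
x <- c(1, 2, 3, 4, 5, 6)
is_pos <- c(FALSE, FALSE, TRUE, FALSE, TRUE, TRUE)

# Candidate cutpoints: all unique values of x plus -Inf and Inf.
cuts <- c(-Inf, sort(unique(x)), Inf)

# Counts per cutpoint, assuming x >= cutpoint predicts the positive class.
tp <- vapply(cuts, function(cp) sum(x >= cp & is_pos), numeric(1))
fp <- vapply(cuts, function(cp) sum(x >= cp & !is_pos), numeric(1))
fn <- sum(is_pos) - tp
tn <- sum(!is_pos) - fp

f1 <- f1_sketch(tp, fp, tn, fn)
data.frame(cutpoint = cuts, tp, fp, tn, fn, f1)
```

Each element of \code{tp}, \code{fp}, \code{tn} and \code{fn} corresponds to one candidate cutpoint, so a metric written with vectorized arithmetic evaluates all cutpoints in a single call.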
The reported metric represents the usual in-sample performance of the
determined cutpoint.
cutpointr(suicide, dsi, suicide, method = maximize_boot_metric,
metric = accuracy, boot_cut = 30)
cutpointr(suicide, dsi, suicide, method = minimize_boot_metric,
metric = abs_d_sens_spec, boot_cut = 30)
Other method functions: \code{\link{maximize_gam_metric}},
\code{\link{oc_manual}}, \code{\link{oc_mean}},
\code{\link{oc_median}}, \code{\link{oc_youden_kernel}},