https://github.com/hadley/dplyr
Tip revision: 39ee11bfe78c4a301070b00d3b92127217786ba7 authored by Hadley Wickham on 23 January 2020, 16:18:08 UTC
Preserve drop attr
summarise_all.Rd
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/colwise-mutate.R
\name{summarise_all}
\alias{summarise_all}
\alias{summarise_if}
\alias{summarise_at}
\alias{summarize_all}
\alias{summarize_if}
\alias{summarize_at}
\title{Summarise multiple columns}
\usage{
summarise_all(.tbl, .funs, ...)

summarise_if(.tbl, .predicate, .funs, ...)

summarise_at(.tbl, .vars, .funs, ..., .cols = NULL)

summarize_all(.tbl, .funs, ...)

summarize_if(.tbl, .predicate, .funs, ...)

summarize_at(.tbl, .vars, .funs, ..., .cols = NULL)
}
\arguments{
\item{.tbl}{A \code{tbl} object.}

\item{.funs}{A function \code{fun}, a quosure-style lambda \code{~ fun(.)}, or a list of either form.}

\item{...}{Additional arguments for the function calls in
\code{.funs}. These are evaluated only once, with \link[rlang:tidy-dots]{tidy dots} support.}

\item{.predicate}{A predicate function to be applied to the columns
or a logical vector. The variables for which \code{.predicate} is or
returns \code{TRUE} are selected. This argument is passed to
\code{\link[rlang:as_function]{rlang::as_function()}} and thus supports quosure-style lambda
functions and strings representing function names.}

\item{.vars}{A list of columns generated by \code{\link[=vars]{vars()}},
a character vector of column names, a numeric vector of column
positions, or \code{NULL}.}

\item{.cols}{This argument has been renamed to \code{.vars} to fit
dplyr's terminology and is deprecated.}
}
\value{
A data frame. By default, the newly created columns have the shortest
names needed to uniquely identify the output. To force inclusion of a name,
even when not needed, name the input (see examples for details).
}
\description{
\Sexpr[results=rd, stage=render]{lifecycle::badge("retired")}

Scoped verbs (\verb{_if}, \verb{_at}, \verb{_all}) have been superseded by the use of
\code{\link[=across]{across()}} in an existing verb. See \code{vignette("colwise")} for details.

The \link{scoped} variants of \code{\link[=summarise]{summarise()}} make it easy to apply the same
transformation to multiple variables.
There are three variants.
\itemize{
\item \code{summarise_all()} affects every variable
\item \code{summarise_at()} affects variables selected with a character vector or
\code{\link[=vars]{vars()}}
\item \code{summarise_if()} affects variables selected with a predicate function
}
}
\section{Grouping variables}{


If applied to a grouped tibble, these operations are \emph{not} applied
to the grouping variables. The behaviour depends on whether the
selection is \strong{implicit} (\code{all} and \code{if} selections) or
\strong{explicit} (\code{at} selections).
\itemize{
\item Grouping variables covered by explicit selections in
\code{summarise_at()} are always an error. Add \code{-group_cols()} to the
\code{\link[=vars]{vars()}} selection to avoid this:\preformatted{data \%>\%
  summarise_at(vars(-group_cols(), ...), myoperation)
}

Or remove \code{group_vars()} from the character vector of column names:\preformatted{nms <- setdiff(nms, group_vars(data))
data \%>\% summarise_at(nms, myoperation)
}
\item Grouping variables covered by implicit selections are silently
ignored by \code{summarise_all()} and \code{summarise_if()}.
}
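
As a small illustration of the implicit case (a sketch, assuming dplyr is
attached and using the built-in \code{iris} data), the grouping column is left
out of the summary and kept only as the group key:\preformatted{by_species <- group_by(iris, Species)

# Species is the grouping variable, so mean() is applied only to the
# four measurement columns; Species itself is skipped without an error
summarise_all(by_species, mean)
}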
}

\section{Naming}{


The names of the new columns are derived from the names of the
input variables and the names of the functions.
\itemize{
\item if there is only one unnamed function (i.e., if \code{.funs} is an unnamed list
of length one),
the names of the input variables are used to name the new columns;
\item for \verb{_at} functions, if there is only one unnamed variable (i.e.,
if \code{.vars} is of the form \code{vars(a_single_column)}) and \code{.funs} has length
greater than one,
the names of the functions are used to name the new columns;
\item otherwise, the new names are created by
concatenating the names of the input variables and the names of the
functions, separated with an underscore \code{"_"}.
}

The \code{.funs} argument can be a named or unnamed list.
If a function is unnamed and the name cannot be derived automatically,
a name of the form "fn#" is used.
Similarly, \code{\link[=vars]{vars()}} accepts named and unnamed arguments.
If a variable in \code{.vars} is named, a new column by that name will be created.

Name collisions in the new columns are disambiguated using a unique suffix.
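
A minimal sketch of the three naming rules, assuming dplyr is attached and
using a hypothetical two-column data frame \code{df} that is not part of the
package:\preformatted{df <- data.frame(x = c(1, 2, 3), y = c(4, 5, 6))

# One unnamed function: the input names are kept (x, y)
summarise_at(df, vars(x, y), mean)

# One unnamed variable, several named functions: the function
# names are used (min, max)
summarise_at(df, vars(x), list(min = min, max = max))

# Several variables and several named functions: variable and function
# names are joined with "_" (x_min, y_min, x_max, y_max)
summarise_at(df, vars(x, y), list(min = min, max = max))
}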
}

\section{Life cycle}{


The functions are maturing, because the naming scheme and the
disambiguation algorithm are subject to change in dplyr 0.9.0.
}

\examples{
# The _at() variants directly support strings:
starwars \%>\%
  summarise_at(c("height", "mass"), mean, na.rm = TRUE)
# ->
starwars \%>\% summarise(across(c("height", "mass"), ~ mean(.x, na.rm = TRUE)))

# You can also supply selection helpers to _at() functions but you have
# to quote them with vars():
starwars \%>\%
  summarise_at(vars(height:mass), mean, na.rm = TRUE)
# ->
starwars \%>\%
  summarise(across(height:mass, ~ mean(.x, na.rm = TRUE)))

# The _if() variants apply a predicate function (a function that
# returns TRUE or FALSE) to determine the relevant subset of
# columns. Here we apply mean() to the numeric columns:
starwars \%>\%
  summarise_if(is.numeric, mean, na.rm = TRUE)
# ->
starwars \%>\%
  summarise(across(where(is.numeric), ~ mean(.x, na.rm = TRUE)))
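
# A sketch of the lambda form of .funs: instead of passing na.rm through
# `...`, the extra argument can be written inside a quosure-style lambda.
# This should give the same result as the summarise_if() call above:
summarise_if(starwars, is.numeric, ~ mean(., na.rm = TRUE))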

by_species <- iris \%>\%
  group_by(Species)

# If you want to apply multiple transformations, pass a list of
# functions. When there are multiple functions, they create new
# variables instead of modifying the variables in place:
by_species \%>\%
  summarise_all(list(min, max))
# ->
by_species \%>\%
  summarise(across(everything(), list(min = min, max = max)))
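
# A list passed to .funs may mix plain functions and lambdas. In this
# sketch the names `mean` and `range` are chosen only for illustration;
# they become the suffixes of the new columns (e.g. Sepal.Length_range):
summarise_at(
  by_species,
  vars(starts_with("Sepal")),
  list(mean = mean, range = ~ max(.) - min(.))
)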
}
\seealso{
\link[=scoped]{The other scoped verbs}, \code{\link[=vars]{vars()}}
}