https://github.com/hadley/dplyr
Tip revision: 98b8a0f5de25e238ac97514da24ec228610c8701 authored by Lionel Henry on 19 January 2021, 09:23:23 UTC
Merge pull request #5686 from lionel-/fix-warning-overhead
count.Rd
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/count-tally.R
\name{count}
\alias{count}
\alias{tally}
\alias{add_count}
\alias{add_tally}
\title{Count observations by group}
\usage{
count(x, ..., wt = NULL, sort = FALSE, name = NULL)

tally(x, wt = NULL, sort = FALSE, name = NULL)

add_count(x, ..., wt = NULL, sort = FALSE, name = NULL, .drop = deprecated())

add_tally(x, wt = NULL, sort = FALSE, name = NULL)
}
\arguments{
\item{x}{A data frame, data frame extension (e.g. a tibble), or a
lazy data frame (e.g. from dbplyr or dtplyr).}

\item{...}{<\code{\link[=dplyr_data_masking]{data-masking}}> Variables to group by.}

\item{wt}{<\code{\link[=dplyr_data_masking]{data-masking}}> Frequency weights.
Can be \code{NULL} or a variable:
\itemize{
\item If \code{NULL} (the default), counts the number of rows in each group.
\item If a variable, computes \code{sum(wt)} for each group.
}}

\item{sort}{If \code{TRUE}, will show the largest groups at the top.}

\item{name}{The name of the new column in the output.

If omitted, it defaults to \code{n}. If the data already has a column called \code{n},
the function errors and requires you to specify the name.}

\item{.drop}{For \code{count()}: if \code{FALSE}, includes counts for empty groups
(i.e. for levels of factors that don't exist in the data). Deprecated in
\code{add_count()} because it never affected the output.}
}
\value{
An object of the same type as \code{x}. \code{count()} and \code{add_count()}
group transiently, so the output has the same groups as the input.
}
\description{
\code{count()} lets you quickly count the unique values of one or more variables:
\code{df \%>\% count(a, b)} is roughly equivalent to
\code{df \%>\% group_by(a, b) \%>\% summarise(n = n())}.
\code{count()} is paired with \code{tally()}, a lower-level helper that is equivalent
to \code{df \%>\% summarise(n = n())}. Supply \code{wt} to perform weighted counts,
switching the summary from \code{n = n()} to \code{n = sum(wt)}.

\code{add_count()} and \code{add_tally()} are equivalent to \code{count()} and \code{tally()}
but use \code{mutate()} instead of \code{summarise()} so that they add a new column
with group-wise counts.
}
\examples{
# count() is a convenient way to get a sense of the distribution of
# values in a dataset
starwars \%>\% count(species)
starwars \%>\% count(species, sort = TRUE)
starwars \%>\% count(sex, gender, sort = TRUE)
starwars \%>\% count(birth_decade = round(birth_year, -1))
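
# A small sketch of the `.drop` argument, using a hypothetical tibble
# `sizes` (not part of the original examples) whose factor has an
# unused level "m"
sizes <- tibble(size = factor(c("s", "s", "l"), levels = c("s", "m", "l")))
# by default the empty level is dropped from the result
count(sizes, size)
# with .drop = FALSE the empty level is kept with a count of zero
count(sizes, size, .drop = FALSE)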

# use the `wt` argument to perform a weighted count. This is useful
# when the data has already been aggregated once
df <- tribble(
  ~name,    ~gender,   ~runs,
  "Max",    "male",       10,
  "Sandra", "female",      1,
  "Susan",  "female",      4
)
# counts rows:
df \%>\% count(gender)
# counts runs:
df \%>\% count(gender, wt = runs)
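# A hedged sketch of the equivalence stated in the description, reusing
# `df` from above: a weighted count should match grouping by hand and
# summing the weights. `by_count` and `by_hand` are illustrative names.
by_count <- count(df, gender, wt = runs)
by_hand <- summarise(group_by(df, gender), n = sum(runs))
identical(by_count$n, by_hand$n)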

# tally() is a lower-level function that assumes you've done the grouping
starwars \%>\% tally()
starwars \%>\% group_by(species) \%>\% tally()
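# the `name` argument sets the name of the count column; `n_characters`
# is an arbitrary illustrative name
tally(group_by(starwars, species), name = "n_characters")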

# both count() and tally() have add_ variants that work like
# mutate() instead of summarise()
df \%>\% add_count(gender, wt = runs)
df \%>\% add_tally(wt = runs)
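
# A rough sketch of what add_count() does, reusing `df` from above: it
# should match a grouped mutate() that is ungrouped afterwards.
# `with_counts` and `by_hand` are illustrative names.
with_counts <- add_count(df, gender)
by_hand <- ungroup(mutate(group_by(df, gender), n = n()))
identical(with_counts$n, by_hand$n)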
}