% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/group-by.r
\name{group_by}
\alias{group_by}
\alias{ungroup}
\title{Group by one or more variables}
\usage{
group_by(.data, ..., add = FALSE, .drop = group_by_drop_default(.data))

ungroup(x, ...)
}
\arguments{
\item{.data}{A tbl.}

\item{...}{Variables to group by. All tbls accept variable names. Some
tbls will accept functions of variables. Duplicated groups
will be silently dropped.}

\item{add}{When \code{add = FALSE}, the default, \code{group_by()} will
override existing groups. To add to the existing groups, use
\code{add = TRUE}.}

\item{.drop}{When \code{.drop = TRUE}, empty groups are dropped. See
\code{\link[=group_by_drop_default]{group_by_drop_default()}} for what the default value is for this argument.}

\item{x}{A \code{\link[=tbl]{tbl()}}}
}
\value{
A \link[=grouped_df]{grouped data frame}, unless the combination of
\code{...} and \code{add} yields an empty set of grouping columns, in which
case a regular (ungrouped) data frame is returned.
}
\description{
Most data operations are done on groups defined by variables.
\code{group_by()} takes an existing tbl and converts it into a grouped tbl
where operations are performed "by group". \code{ungroup()} removes grouping.
}
\section{Tbl types}{

\code{group_by()} is an S3 generic with methods for the built-in tbls.
See the help for the corresponding classes and their manip
methods for more details:
\itemize{
\item data.frame: \link{grouped_df}
\item data.table: \link[dtplyr:grouped_dt]{dtplyr::grouped_dt}
\item SQLite: \code{\link[=src_sqlite]{src_sqlite()}}
\item PostgreSQL: \code{\link[=src_postgres]{src_postgres()}}
\item MySQL: \code{\link[=src_mysql]{src_mysql()}}
}
}

\section{Scoped grouping}{

The three \link{scoped} variants (\code{\link[=group_by_all]{group_by_all()}}, \code{\link[=group_by_if]{group_by_if()}} and
\code{\link[=group_by_at]{group_by_at()}}) make it easy to group a dataset by a selection of
variables.
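
As a minimal sketch of two of the scoped variants, assuming dplyr is attached
and using the built-in \code{mtcars} and \code{iris} datasets (chosen here
purely for illustration), grouping by a selection of variables might look like
this:
\preformatted{
library(dplyr)

# Select grouping columns by name through vars()
by_vs_am <- group_by_at(mtcars, vars(vs, am))
group_vars(by_vs_am)

# Select grouping columns with a predicate function
by_factor <- group_by_if(iris, is.factor)
group_vars(by_factor)
}

Here \code{group_by_at()} picks the grouping columns through \code{vars()},
while \code{group_by_if()} keeps every column for which the predicate returns
\code{TRUE}.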
}

\examples{
by_cyl <- mtcars \%>\% group_by(cyl)

# grouping doesn't change how the data looks (apart from listing
# how it's grouped):
by_cyl

# It changes how it acts with the other dplyr verbs:
by_cyl \%>\% summarise(
  disp = mean(disp),
  hp = mean(hp)
)
by_cyl \%>\% filter(disp == max(disp))

# Each call to summarise() removes a layer of grouping
by_vs_am <- mtcars \%>\% group_by(vs, am)
by_vs <- by_vs_am \%>\% summarise(n = n())
by_vs
by_vs \%>\% summarise(n = sum(n))

# To remove grouping, use ungroup()
by_vs \%>\%
  ungroup() \%>\%
  summarise(n = sum(n))

# You can group by expressions: this is just short-hand for
# a mutate/rename followed by a simple group_by
mtcars \%>\% group_by(vsam = vs + am)

# By default, group_by overrides existing grouping
by_cyl \%>\%
  group_by(vs, am) \%>\%
  group_vars()

# Use add = TRUE to instead append
by_cyl \%>\%
  group_by(vs, am, add = TRUE) \%>\%
  group_vars()

# when factors are involved, groups can be empty
tbl <- tibble(
  x = 1:10,
  y = factor(rep(c("a", "c"), each = 5), levels = c("a", "b", "c"))
)
tbl \%>\%
  group_by(y) \%>\%
  group_rows()
}
\seealso{
Other grouping functions: \code{\link{group_by_all}},
  \code{\link{group_indices}}, \code{\link{group_keys}},
  \code{\link{group_map}}, \code{\link{group_nest}},
  \code{\link{group_rows}}, \code{\link{group_size}},
  \code{\link{group_trim}}, \code{\link{groups}}
}
\concept{grouping functions}