% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/join.r
\name{filter-joins}
\alias{filter-joins}
\alias{semi_join}
\alias{semi_join.data.frame}
\alias{anti_join}
\alias{anti_join.data.frame}
\title{Filtering joins}
\usage{
semi_join(x, y, by = NULL, copy = FALSE, ...)

\method{semi_join}{data.frame}(x, y, by = NULL, copy = FALSE, ..., na_matches = c("na", "never"))

anti_join(x, y, by = NULL, copy = FALSE, ...)

\method{anti_join}{data.frame}(x, y, by = NULL, copy = FALSE, ..., na_matches = c("na", "never"))
}
\arguments{
\item{x, y}{A pair of data frames, data frame extensions (e.g. a tibble), or
lazy data frames (e.g. from dbplyr or dtplyr). See \emph{Methods}, below, for
more details.}

\item{by}{A character vector of variables to join by.

If \code{NULL}, the default, \verb{*_join()} will perform a natural join, using all
variables in common across \code{x} and \code{y}. A message lists the variables so that you
can check they're correct; suppress the message by supplying \code{by} explicitly.

To join by different variables on \code{x} and \code{y}, use a named vector.
For example, \code{by = c("a" = "b")} will match \code{x$a} to \code{y$b}.

To join by multiple variables, use a vector with length > 1.
For example, \code{by = c("a", "b")} will match \code{x$a} to \code{y$a} and \code{x$b} to
\code{y$b}. Use a named vector to match different variables in \code{x} and \code{y}.
For example, \code{by = c("a" = "b", "c" = "d")} will match \code{x$a} to \code{y$b} and
\code{x$c} to \code{y$d}.

To perform a cross-join, generating all combinations of \code{x} and \code{y},
use \code{by = character()}.}

\item{copy}{If \code{x} and \code{y} are not from the same data source,
and \code{copy} is \code{TRUE}, then \code{y} will be copied into the
same src as \code{x}.  This allows you to join tables across srcs, but
it is a potentially expensive operation so you must opt into it.}

\item{...}{Other parameters passed onto methods.}

\item{na_matches}{Should \code{NA} and \code{NaN} values match one another?

The default, \code{"na"}, treats two \code{NA} or \code{NaN} values as equal, like
\code{\%in\%}, \code{\link[=match]{match()}}, \code{\link[=merge]{merge()}}.

Use \code{"never"} to always treat two \code{NA} or \code{NaN} values as different, like
joins for database sources, similarly to \code{merge(incomparables = FALSE)}.}
}
\value{
An object of the same type as \code{x}. The output has the following properties:
\itemize{
\item Rows are a subset of the input, but appear in the same order.
\item Columns are not modified.
\item Data frame attributes are preserved.
\item Groups are taken from \code{x}. The number of groups may be reduced.
}
}
\description{
Filtering joins filter rows from \code{x} based on the presence or absence
of matches in \code{y}:
\itemize{
\item \code{semi_join()} returns all rows from \code{x} with a match in \code{y}.
\item \code{anti_join()} returns all rows from \code{x} with\strong{out} a match in \code{y}.
}
}
\section{Methods}{

These functions are \strong{generic}s, which means that packages can provide
implementations (methods) for other classes. See the documentation of
individual methods for extra arguments and differences in behaviour.

Methods available in currently loaded packages:
\itemize{
\item \code{semi_join()}: \Sexpr[stage=render,results=rd]{dplyr:::methods_rd("semi_join")}.
\item \code{anti_join()}: \Sexpr[stage=render,results=rd]{dplyr:::methods_rd("anti_join")}.
}
}

\examples{
# "Filtering" joins keep cases from the LHS
band_members \%>\% semi_join(band_instruments)
band_members \%>\% anti_join(band_instruments)
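
# A rough single-key equivalent, shown only as a sketch: a semi join keeps
# the rows of x whose key appears in y, and an anti join keeps the rest.
# This is an illustration, not how the joins are implemented; handling of
# missing values and multi-column keys may differ.
band_members \%>\% filter(name \%in\% band_instruments$name)
band_members \%>\% filter(!name \%in\% band_instruments$name)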

# To suppress the message about joining variables, supply `by`
band_members \%>\% semi_join(band_instruments, by = "name")
# This is good practice in production code
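
# Sketch: joining on differently named keys with a named vector.
# `band_instruments2` ships with dplyr and stores the name in `artist`.
band_members \%>\% semi_join(band_instruments2, by = c("name" = "artist"))
band_members \%>\% anti_join(band_instruments2, by = c("name" = "artist"))

# Sketch: how `na_matches` affects the data frame method.
# `df1` and `df2` are made-up data used only for illustration.
df1 <- data.frame(x = c(1, NA))
df2 <- data.frame(x = c(NA, 2))
# With the default, "na", the NA in df1 matches the NA in df2
df1 \%>\% semi_join(df2, by = "x")
# With "never", NA values are not treated as matching each other
df1 \%>\% semi_join(df2, by = "x", na_matches = "never")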
}
\seealso{
Other joins: 
\code{\link{mutate-joins}},
\code{\link{nest_join}()}
}
\concept{joins}