https://github.com/cran/validate
Tip revision: 0eeca3fbc221772e8affd4030bedc6a11c051093 authored by Mark van der Loo on 30 March 2021, 14:50:02 UTC
version 1.0.2
syntax.Rd
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/syntax.R
\name{syntax}
\alias{syntax}
\title{Syntax to define validation or indicator rules}
\description{
A concise overview of the \code{validate} syntax.
}
\section{Basic syntax}{


The basic rule is that an R-statement that evaluates to a \code{logical} is a
validating statement. This is established by static code inspection when
\code{validator} reads a (set of) user-defined validation rule(s).
}

\section{Comparisons}{


All basic comparisons, including \code{>, >=, ==, !=, <=, <} and
\code{\%in\%}, are validating statements. When a validating statement is
executed, the \code{\%in\%} operator is replaced with
\code{\link[validate:vin]{\%vin\%}}.
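
As a minimal sketch of comparison rules (the data frame \code{dat} and its
column names are made up for illustration), rules can be defined with
\code{validator} and then checked against data with
\code{\link{confront}}:

\preformatted{
library(validate)

# Three comparison rules; each evaluates to a logical per record.
rules <- validator(
  height > 0,
  weight >= 0,
  sex \%in\% c("m", "f")
)

# Hypothetical data; the second record violates two of the rules.
dat <- data.frame(
  height = c(1.8, -1),
  weight = c(70, 80),
  sex    = c("m", "x")
)

cf <- confront(dat, rules)
summary(cf)
}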
}

\section{Logical operations}{


The unary logical operator `\code{!}' and the functions \code{all()} and
\code{any()} define validating statements. Binary logical operations,
including \code{&, &&, |, ||}, are validating when \code{P} and \code{Q} in
e.g. \code{P & Q} are validating. Note that the short-circuit operators
\code{&&} and \code{||} only return the first logical value in cases where,
for \code{P && Q}, \code{P} and/or \code{Q} are vectors. Binary logical
implication \eqn{P\Rightarrow Q} (P implies Q) is implemented as
\code{if ( P ) Q}. The latter is interpreted as \code{!(P) | Q}.
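
The implication syntax can be sketched as follows. The variable names
\code{age} and \code{income} and the data are assumptions chosen for
illustration. Following the interpretation \code{!(P) | Q}, a record fails
only when the condition holds and the consequence does not, so the second
record below would be expected to fail:

\preformatted{
library(validate)

# "If age is below 15, income must be zero."
rules <- validator(
  if (age < 15) income == 0
)

# Hypothetical data: the second record is a child with an income.
dat <- data.frame(
  age    = c(10, 10, 40),
  income = c(0, 500, 500)
)

# One logical value per record.
values(confront(dat, rules))
}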
}

\section{Type checking}{


Any function starting with \code{is.} (e.g. \code{is.numeric}) is a
validating expression.
}

\section{Text search}{


\code{grepl} is a validating expression.
}

\section{Functional dependencies}{


Armstrong's functional dependencies, of the form \eqn{A + B \to C + D}, are
represented using the \code{~} operator, e.g. \code{A + B ~ C + D}. For
example, \code{postcode ~ city} means that when two records have the same
value for \code{postcode}, they must have the same value for \code{city}.
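
A small sketch of the \code{postcode ~ city} rule, using a made-up data
frame in which the first two records share a postcode but name different
cities:

\preformatted{
library(validate)

rules <- validator(postcode ~ city)

# Hypothetical data; records 1 and 2 conflict on city.
dat <- data.frame(
  postcode = c("1234", "1234", "5678"),
  city     = c("A", "B", "C")
)

values(confront(dat, rules))
}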
}

\section{Reference the dataset as a whole}{


Metadata such as number of rows, columns, column names and so on can be
tested by referencing the whole data set with the '\code{.}'. For example,
the rule \code{nrow(.) == 15} checks whether there are 15 rows in the
dataset at hand.
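
As an illustration, the rule above can be tried on the built-in
\code{women} dataset, which has 15 rows and 2 columns. The second rule,
\code{ncol(.) == 2}, is an assumption added here to show another piece of
metadata being tested in the same way:

\preformatted{
library(validate)

# The dot refers to the dataset as a whole.
rules <- validator(
  nrow(.) == 15,
  ncol(.) == 2
)

cf <- confront(women, rules)
summary(cf)
}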
}

\section{Uniqueness, completeness}{


These can be tested in principle with the 'dot' syntax. However, there are
some convenience functions: \code{\link{is_complete}}, \code{\link{all_complete}},
\code{\link{is_unique}}, \code{\link{all_unique}}.
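
A minimal sketch of the convenience functions, assuming they are called
with bare variable names inside \code{validator}. The data frame and its
columns \code{id} and \code{value} are hypothetical:

\preformatted{
library(validate)

rules <- validator(
  is_unique(id),           # per record: is this id unique?
  all_unique(id),          # single value: are all ids unique?
  is_complete(id, value),  # per record: are id and value both present?
  all_complete(id, value)  # single value: are all records complete?
)

# Hypothetical data with a duplicated id and some missing values.
dat <- data.frame(
  id    = c(1, 2, 2, NA),
  value = c(10, NA, 30, 40)
)

summary(confront(dat, rules))
}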
}

\section{Local, transient assignment}{

The operator `\code{:=}' can be used to set up local variables (during, for
example, validation) to save time (the rhs of an assignment is computed only
once) or to make your validation code more maintainable.  Assignments work more
or less like common R assignments: they are only valid for statements coming
after the assignment and they may be overwritten. The result of computing the
rhs is not part of a \code{\link{confront}}ation with data.
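
A sketch of a transient assignment, using the built-in \code{women} dataset.
The name \code{m} and the chosen bounds are assumptions for illustration.
The mean is computed once and reused by the two rules that follow the
assignment:

\preformatted{
library(validate)

rules <- validator(
  m := mean(height, na.rm = TRUE),  # local variable, not itself a rule
  height > 0.5 * m,
  height < 1.5 * m
)

cf <- confront(women, rules)
summary(cf)
}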
}

\section{Groups}{

Often the same constraints/rules are valid for groups of variables. 
\code{validate} allows for compact notation. Variable groups can be used
in-statement or by defining them with the \code{:=} operator.

\code{validator( var_group(a,b) > 0 )}

is equivalent to

\code{validator(G := var_group(a,b), G > 0)}

is equivalent to

\code{validator(a>0,b>0)}.
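
As a sketch, the grouped notation can be tried on the built-in \code{women}
dataset, whose columns are \code{height} and \code{weight}. The group name
\code{G} is arbitrary:

\preformatted{
library(validate)

# One statement that stands for a rule on each variable in the group.
rules <- validator(
  G := var_group(height, weight),
  G > 0
)

cf <- confront(women, rules)
summary(cf)
}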

Using two groups results in the Cartesian product of checks. So the statement

\code{validator( f := var_group(c,d), g := var_group(a,b), g > f)}

is equivalent to

\code{validator(a > c, b > c, a > d, b > d)}
}

\section{File parsing}{

Please see the vignette on how to read rules from and write rules to file:

\code{vignette("rule_files",package="validate")}
}
