Revision 775c7e1be4c58aaf8adccdd2b92d07aa9cdc265f authored by Matthias Templ on 14 January 2020, 05:10:03 UTC, committed by cran-robot on 14 January 2020, 05:10:03 UTC
1 parent d761b2f
Raw File
smoothSplines.Rd
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/splineDensity.R
\name{smoothSplines}
\alias{smoothSplines}
\title{Estimate density from histogram}
\usage{
smoothSplines(
  k,
  l,
  alpha,
  data,
  xcp,
  knots,
  weights = matrix(1, dim(data)[1], dim(data)[2]),
  num_points = 100,
  prior = "default",
  cores = 1,
  fast = 0
)
}
\arguments{
\item{k}{smoothing splines degree}

\item{l}{order of derivative in the penalization term}

\item{alpha}{weight for penalization}

\item{data}{an object of class "matrix" containing data to be smoothed, row by row}

\item{xcp}{vector of control points}

\item{knots}{either vector of knots for the splines or a integer for the number of equispaced knots}

\item{weights}{matrix of weights. If not given, all data points will be weighted the same.}

\item{num_points}{number of points of the grid where to evaluate the density estimated}

\item{prior}{prior used for zero-replacements. This must be one of "perks", "jeffreys", "bayes_laplace", "sq" or "default"}

\item{cores}{number of cores for parallel execution, if the option was enabled before installing the package}

\item{fast}{1 if maximal performance is required (print statements suppressed), 0 otherwise}
}
\value{
An object of class \code{smoothSpl}, containing among the other the following variables:
\item{\code{bspline}}{each row is the vector of B-spline coefficients}
\item{\code{Y}}{the values of the smoothed curve, for the grid given}
\item{\code{Y_clr}}{the values of the smoothed curve, in the clr setting, for the grid given}
}
\description{
Given raw (discretized) distributional observations, \code{smoothSplines} computes the density
function that 'best' fits data, in a trade-off between smooth and least squares approximation, using B-spline basis functions.
}
\details{
The original discretized densities are not directly smoothed, but instead the centred logratio transformation is
first applied, to deal with the unit integral constraint related to density functions. \cr
Then the constrained variational problem is set. This minimization problem for the optimal
density is a compromise between staying close to the given data, at the corresponding \code{xcp},
and obtaining a smooth function.
The non-smoothness measure takes into account the \code{l}th derivative, while the fidelity term is weigthed by \code{alpha}. \cr
The solution is a natural spline. The vector of its coefficients is obtained by the minimum norm solution of a linear system.
The resulting splines can be either back-transformed to the original Bayes space of density
functions (in order to provide their smoothed counterparts for vizualization and interpretation
purposes), or retained for further statistical analysis in the clr space.
}
\examples{
SepalLengthCm <- iris$Sepal.Length
Species <- iris$Species

iris1 <- SepalLengthCm[iris$Species==levels(iris$Species)[1]]
h1 <- hist(iris1, nclass = 12, plot = FALSE)

midx1 <- h1$mids
midy1 <- matrix(h1$density, nrow=1, ncol = length(h1$density), byrow=TRUE)
knots <- 7
sol1 <- smoothSplines(k=3,l=2,alpha=1000,midy1,midx1,knots)
plot(sol1)

h1 <- hist(iris1, freq = FALSE, nclass = 12, xlab = "Sepal Length     [cm]", main = "Iris setosa")
# black line: kernel method; red line: smoothSplines result
lines(density(iris1), col = "black", lwd = 1.5)
xx1 <- seq(sol1$Xcp[1],tail(sol1$Xcp,n=1),length.out = sol1$NumPoints)
lines(xx1,sol1$Y[1,], col = 'red', lwd = 2)
}
\references{
J. Machalova, K. Hron & G.S. Monti (2016):
Preprocessing of centred logratio transformed density functions
using smoothing splines. Journal of Applied Statistics, 43:8, 1419-1435.
}
back to top