\name{lavaan.survey}
\alias{lavaan.survey}
\title{
	Complex survey analysis of structural equation models (SEM)
}
\description{
	Takes a lavaan fit object and a complex survey design object as input
	and returns a structural equation modeling analysis based on the fit 
	object, where the complex sampling design is taken into account. 
	
	The structural equation model parameter estimates are "aggregated" (Skinner, Holt & Smith 1989):
	they consistently estimate parameters aggregated over any clusters and strata, and no
	explicit modeling of the effects of clusters and strata is involved. Standard errors are design-based.
	See Satorra and Muthen (1995) and the references below for details on the procedure.
  
 Both the pseudo-maximum likelihood (PML) procedure popular in the SEM world 
 (e.g. Asparouhov 2005; Stapleton 2006) and 
 weighted least squares procedures similar to aggregate regression modeling with
 complex sampling (e.g. Fuller 2009, chapter 6) are implemented.
 
 It is possible to pass a list of multiply imputed datasets to \code{svydesign} as data.
 \code{lavaan.survey} will then apply the standard Rubin (1987) formula to obtain
 point and variance estimates under multiple imputation. Some care is required with
 this procedure when survey weights are also involved, however (see Note).
}
\usage{
lavaan.survey(lavaan.fit, survey.design, 
              estimator=c("MLM", "MLMV", "MLMVS", "WLS", "DWLS", "ML"),
              estimator.gamma=c("default", "Yuan-Bentler"))
}
\arguments{
  \item{lavaan.fit}{
	A \code{\linkS4class{lavaan}} object resulting from a lavaan call. 
	
	The estimator given in the \code{lavaan.survey} call, not the one used for this
	object, determines the complex sample estimates. For comparability it can be
	convenient to use the same estimator in the call generating the \pkg{lavaan}
	fit object as in the \code{lavaan.survey} call. By default this is "MLM".
}
  \item{survey.design}{
	An \code{\link{svydesign}} object resulting from a call to 
	\code{svydesign} in the \pkg{survey} package. This allows for incorporation of
	clustering, stratification, unequal probability weights, finite
	population correction, and multiple imputation. 
	See the \pkg{survey} documentation for more information.
}
   \item{estimator}{
	The estimator used determines how parameter estimates are obtained, 
	how standard errors are calculated, and how the test statistic and 
	all measures derived from it are adjusted. See \code{\link{lavaan}}.

	The default estimator is "MLM". It is recommended to use one
	of the ML estimators.
   }
   \item{estimator.gamma}{
	Whether to use the usual estimator of Gamma as given by \code{svyvar} (the variance-covariance
	matrix of the observed variances and covariances), or apply some kind
	of smoothing or adjustment. Currently the only other option is the
	Yuan-Bentler (1998) adjustment based on model residuals.
   }
}
\details{
	The user specifies a complex sampling design with the \pkg{survey} package's
	\code{\link{svydesign}} function, and a structural equation model with
	\code{\link{lavaan}}.
	
	\code{lavaan.survey} follows these steps:
	\enumerate{
		\item The covariance matrix of the observed variables
		(or matrices in the case of multiple
		group analysis) is estimated using the \code{svyvar} command from the
		\pkg{survey} package. 
		\item The asymptotic covariance matrix of the variances and
		covariances (the "Gamma" matrix) is obtained from the \code{svyvar}
		output.
		\item The last step depends on the estimation method chosen:
		
		\describe{
		\item{MLM, MLMV, MLMVS}{The \pkg{lavaan} model is re-fit using maximum likelihood
		with the covariance matrix as data. After normal-theory ML
		estimation, the standard errors (\code{vcov} matrix), likelihood ratio
		("chi-square") statistic, and all derived fit indices and
		statistics are adjusted for the complex sampling design using
		the Gamma matrix. That is, the Satorra-Bentler (SB) corrections are
		obtained ("MLM" estimation in \pkg{lavaan} terminology). This procedure
		is equivalent to "pseudo"-maximum likelihood (PML).}
		\item{WLS, DWLS}{The \pkg{lavaan} model is re-fit using weighted least squares
		with the covariance matrix as data, and the Moore-Penrose inverse
		of the Gamma matrix as estimation weights. If DWLS is chosen,
		only the diagonal of the weight matrix is used.}
		}
	}
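
	The first two steps can be sketched roughly as follows. This is only an
	illustration under stated assumptions, not the package's internal code: the
	object names, the equal-probability design with made-up clusters, and the
	selection of the unique (lower-triangular) elements are choices made for
	this sketch. Any rescaling of the resulting matrix before it is passed to
	\pkg{lavaan} is not shown.

	\preformatted{
library(lavaan)   # used here only for the HolzingerSwineford1939 data
library(survey)

# Hypothetical design: 50 made-up clusters, equal probabilities
set.seed(1)
hs <- HolzingerSwineford1939
hs$clus <- sample(1:50, size = nrow(hs), replace = TRUE)
des <- svydesign(ids = ~clus, prob = ~1, data = hs)

# Step 1: design-based covariance matrix of three observed variables
ov.cov <- svyvar(~x1 + x2 + x3, design = des)

# Step 2: sampling covariance matrix of all 3 * 3 = 9 (co)variance estimates
V <- attr(ov.cov, "var")

# Keep the 6 unique elements (lower triangle, column-major order)
idx <- which(lower.tri(diag(3), diag = TRUE))
Gamma.cov <- V[idx, idx]
dim(Gamma.cov)
	}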


}
\value{
	An object of class \code{\linkS4class{lavaan}}, where the estimates, 
	standard errors, \code{vcov} matrix, chi-square statistic, and fit measures 
	based on the chi-square take into account the complex survey 
	design. Several methods are available for \code{\linkS4class{lavaan}} 
	objects, including a \code{summary} method.}
\references{
Asparouhov T (2005). Sampling Weights in Latent Variable Modeling. Structural 
Equation Modeling, 12(3), 411-434.

Fuller WA (2009). Sampling Statistics. John Wiley & Sons, New York.

Kim J, Brick J, Fuller WA, Kalton G (2006). On the Bias of the Multiple-Imputation Variance Estimator in Survey Sampling. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 68(3), 509-521.

Oberski D, Saris W (2012). A model-based procedure to evaluate
    the relative effects of different TSE components on structural equation
    model parameter estimates. Presentation given at the International
    Total Survey Error Workshop in Santpoort, the Netherlands. 
    \url{http://daob.org/}

Rubin DB (1987). Multiple Imputation for Nonresponse in Surveys. John Wiley & Sons,
New York.

Satorra A, Bentler PM (1994). Corrections to test statistics
	and standard errors in covariance structure analysis. 

Satorra A, Muthen B (1995). Complex sample data in structural
   equation modeling. Sociological Methodology, 25, 267-316.

Skinner C, Holt D, Smith T (1989). Analysis of Complex Surveys. John Wiley & Sons,     
New York.

Stapleton L (2006). An Assessment of Practical Solutions for Structural Equation Modeling with Complex Sample Data. Structural Equation Modeling, 13(1), 28-58.

Stapleton L (2008). Variance Estimation Using Replication Methods in Structural 
Equation Modeling with Complex Sample Data. Structural Equation Modeling, 15(2), 183-210.

Yuan K, Bentler P (1998). Normal Theory Based Test Statistics in Structural Equation
Modelling. British Journal of Mathematical and Statistical Psychology, 51(2), 289-309.
    
}
\author{
	Daniel Oberski - \url{http://daob.org} - \email{daniel.oberski@gmail.com}
}
\note{    
	\enumerate{
	\item Some care should be taken when applying multiple imputation with survey
	weights. The weights should be incorporated in the imputation, and even 
	then the variance produced by the usual Rubin (1987) estimator may not
	be consistent (Kott 1995; Kim et al. 2006).

	If multiple imputation is used to deal with unit nonresponse,
	calibration and/or propensity score weighting with jackknifing may be a 
	more appropriate method. See the \pkg{survey} package.

	\item When using PML or WLS, the Gamma matrix need not be positive definite.
	Preliminary investigations suggest that it often is not. This may happen due to 
	reduction of effective sample size from clustering, for instance. 
	In itself this need not be a problem, depending on the restrictiveness of the model.
	In such cases \code{lavaan.survey} checks explicitly whether the covariance matrix
	of the parameter estimates is still positive definite and produces a warning otherwise. 

	\item Currently only structural equation models for continuous variables are 
	implemented.
	}
}


\seealso{
	\code{\link{svydesign}}
	\code{\link{svyvar}}
	\code{\link{lavaan}}
}
\examples{
###### A single group example #######

# European Social Survey Denmark data (SRS)
data(ess.dk)

# A saturated model with reciprocal effects from Saris & Gallhofer
dk.model <- "
  socialTrust ~ 1 + systemTrust + fearCrime
  systemTrust ~ 1 + socialTrust + efficacy
  socialTrust ~~ systemTrust
"
lavaan.fit <- lavaan(dk.model, data=ess.dk, auto.var=TRUE, estimator="MLM")
summary(lavaan.fit)

# Create a survey design object with interviewer clustering
survey.design <- svydesign(ids=~intnum, prob=~1, data=ess.dk)

survey.fit <- lavaan.survey(lavaan.fit=lavaan.fit, survey.design=survey.design)
summary(survey.fit)



###### A multiple group example #######

data(HolzingerSwineford1939)

# The Holzinger and Swineford (1939) example - a model with some loadings
# constrained to be equal across groups
HS.model <- ' visual  =~ x1 + x2 + c(lam31, lam31)*x3
              textual =~ x4 + x5 + c(lam62, lam62)*x6
              speed   =~ x7 + x8 + c(lam93, lam93)*x9 
             speed ~ textual 
             textual ~ visual'

# Fit a multiple-group model, with one group per school
fit <- lavaan(HS.model, data=HolzingerSwineford1939,
              auto.var=TRUE, auto.fix.first=TRUE, group="school",
              auto.cov.lv.x=TRUE, estimator="MLM")
summary(fit, fit.measures=TRUE)

# Create fictional clusters in the HS data
set.seed(20121025)
HolzingerSwineford1939$clus <- sample(1:100, size=nrow(HolzingerSwineford1939), replace=TRUE)
survey.design <- svydesign(ids=~clus, prob=~1, data=HolzingerSwineford1939)

summary(fit.survey <- lavaan.survey(fit, survey.design))
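
###### Pooling under multiple imputation (illustrative sketch) #######

# The description above states that the Rubin (1987) formula is applied when
# the design holds multiply imputed data. The base R sketch below shows that
# formula for a single parameter. The function name pool.rubin and all
# numbers are hypothetical, chosen only for illustration; this is not the
# package's internal code.
pool.rubin <- function(est, var.within) {
  m <- length(est)                    # number of imputed datasets
  q.bar <- mean(est)                  # pooled point estimate
  u.bar <- mean(var.within)           # average within-imputation variance
  b <- var(est)                       # between-imputation variance
  total.var <- u.bar + (1 + 1/m) * b  # total variance of the pooled estimate
  c(estimate = q.bar, se = sqrt(total.var))
}

# Made-up estimates and squared standard errors from m = 5 imputations
pooled <- pool.rubin(est = c(0.50, 0.55, 0.45, 0.52, 0.48),
                     var.within = c(0.010, 0.011, 0.009, 0.010, 0.010))
pooled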


}
\keyword{survey}
\keyword{models}
\keyword{regression}
\keyword{robust}
\keyword{multivariate}