https://github.com/cran/crs
Tip revision: b94b3504e86f89b8498f947181f7fbb0881e284a authored by Jeffrey S. Racine on 04 June 2012, 00:00:00 UTC
version 0.15-17
version 0.15-17
Tip revision: b94b350
CHANGELOG
Changes from Version 0.15-16 to 0.15-17 [04-Jun-2012]
* Updated NOMAD from 3.5.0 (released Jan 2011) to 3.5.1 (released Mar
2012)
Changes from Version 0.15-15 to 0.15-16 [30-Apr-2012]
* Fixed bug when deriv= is used in crs() (derivatives for categorical
variables were not correct in some cases - derivatives computed by
plot and plot-data were correct however)
* Fixed issue when plot called with "plot-data" or "data" and
par(mfrow()) is not reset
* Startup message points to the faq, faq is now a vignette
* Glitch fixed in npglpreg() where kernel types were not being passed
to the NOMAD solver...
* Added ckerorder option to npglpreg()
* More information output by summary for npglpreg() objects (kernel
type, order etc.)
* tol for determining ill-conditioned bases now consistent with
Octave/Matlab etc. and uses the tolerance
max(dim(x))*max(sqrt(abs(e)))*.Machine$double.eps
Changes from Version 0.15-14 to 0.15-15 [16-Apr-2012]
* NOTE: ** major change in default setting ** Default basis set to
"auto" from "additive" for more robust results when using default
settings with more than one continuous predictor (at the cost of
additional computation for all supported bases, more memory
required). Warnings changed to reflect this, additional warnings
provided, but mainly when cv="none" (i.e. user is doing things
manually and failing to provide defaults)
* Added switch for discarding singular bases during cross-validation
(singular.ok=FALSE is default, so default is to discard singular
bases)
* More stringent checking for ill-conditioned bases during
cross-validation - previous checks were insufficient allowing poorly
conditioned bases to creep into the final model (this may result in
increased runtime for large datasets and multivariate tensor models)
* Potential runtime improvement for large bases (test for
ill-conditioned bases was still relying in parts on rcond(t(B)%*%B),
now using crossprod(B) which is more computationally efficient than
rcond(t(B)%*%B) and has a substantially smaller memory footprint)
* No longer report basis type for one continuous predictor as all
bases are identical in this case
Changes from Version 0.15-13 to 0.15-14 [22-Mar-2012]
* Added support for weights to crs()
* Corrected issue with reporting of cv function value by summary()
with quantile regression splines
* Test for coexistence of pruning and quantile regression splines and
stop with a message that these cannot coexist
* Check for valid derivative integer
Changes from Version 0.15-12 to 0.15-13 [05-Mar-2012]
* Added new function crsivderiv()
* More options added to crsiv()
* Modified vignette (spline_primer)
* Corrected error in demo for IV regression when exhaustive search was
selected (nmulti needed to be set though it is not used)
* Changes to code to improve compliance with R `Writing portable
packages' guidelines and correct partial argument matches
Changes from Version 0.15-11 to 0.15-12 [7-Dec-2011]
* Added option to limit the minimum degrees of freedom when conducting
cross validation (see cv.df.min which defaults to 1) which can save
memory and computation time for large sample sizes by avoiding
computation of potentially very large dimensioned spline bases
Changes from Version 0.15-10 to 0.15-11 [3-Dec-2011]
* Fixed glitch in bwtype="auto" when adaptive_nn is selected
Changes from Version 0.15-9 to 0.15-10 [3-Dec-2011]
* Fixed regression in code when using cv.func=cv.aic
* Added option bwtype="auto" for npglpreg (automatically determine the
bandwidth type via cross-validation)
Changes from Version 0.15-8 to 0.15-9 [25-Nov-2011]
* Fixed glitch when computing derivatives with crs and multiple
predictors are of degree zero
* Added support for parallel npglpreg (calls npRmpi rather than np -
see demo)
* Improved handling of complex bases via dim.bs (test for negative
degrees of freedom prior to attempting to construct the basis
function - dramatically reduces memory overhead and can cut down on
unnecessary computation)
Changes from Version 0.15-7 to 0.15-8 [16-Nov-2011]
* Added support for is.fullrank testing for generalized local
polynomial kernel regression for cross-validation (default remains
ridging a la Seifert & Gasser), replaced rcond(t(X)%*%X) with
is.fullrank(X) (smaller memory footprint for large datasets)
* Fixed glitch with derivative computation when one or more degrees is
0 when kernel=TRUE and additionally when one or more factors is
excluded when using kernel=FALSE
* Fixed issue with reported cross-validation score only corresponding
to leave-one-out cross-validation by passing back cv function from
solver rather than computing post estimation
* Added ability to estimate quantile regression splines
* More rigorous testing for rank deficient fit via rcond() in cv
function
* Fixed issue where degree.min was set > 1 but initial degree was 1
(corresponding to the linear model which is the default for the
initial degree otherwise)
* Added the option to treat the continuous bandwidths as discrete with
lambda.discrete.num+1 values in the range [0,1] which can be more
computationally efficient when a `quick and dirty' solution is
sufficient rather than conducting mixed integer search treating the
lambda as real-valued
* Corrected incorrect warning about using basis="auto" when there was
only one continuous predictor
Changes from Version 0.15-6 to 0.15-7 [24-Oct-2011]
* Added logical model.return to crs (default model.return=FALSE) which
previously returned a list of models corresponding to each unique
combination of the categorical predictors when kernel=TRUE (the
memory footprint could be potentially very large so this allows the
user to generate this list if so needed)
Changes from Version 0.15-5 to 0.15-6 [17-Oct-2011]
* Compiler error thrown on some systems due to changes in
Eval_Point.hpp corrected
Changes from Version 0.15-4 to 0.15-5 [16-Oct-2011]
* Thanks to Professor Brian Ripley, additional Solaris C/C++ compiler
warnings/issues have been resolved
* Some internal changes for soon to be deprecated functionality
(sd(<matrix>) expected to be deprecated shortly)
Changes from Version 0.15-3 to 0.15-4 [15-Oct-2011]
* Extended the gsl.bs functionality to permit out-of-sample prediction
of the spline basis and its derivatives
* Added option knots="auto" to automatically determine via
cross-validation whether to use quantile or uniform knots
* Minor changes to help page examples and descriptions and to the crs
vignette
Changes from Version 0.15-2 to 0.15-3 [05-Sep-2011]
* Fixed glitch when all degrees are zero when computing the
cross-validation function (also fixes glitch when all degrees are
zero when plotting the partial surfaces)
* Added new function crssigtest (to be considered in beta status until
further notice)
* Added F test for no effect (joint test of significance) in crs
summary
* Both degree and segments now set to one for first multistart in crs
(previously only degree was, but intent was always to begin from a
linear model (with interactions where appropriate) so this glitch is
corrected)
* Test for pathological case in npglpreg when initializing bandwidths
where IQR is zero but sd > 0 (for setting robust sd) which occurs
when there exist many repeated values for a continuous predictor
* Added `typical usage' preformatted illustrations for docs
Changes from Version 0.15-1 to 0.15-2 [30-Jul-2011]
* Renamed COPYING file to COPYRIGHTS
Changes from Version 0.15-0 to 0.15-1 [29-Jul-2011]
* Automated detection of ordered/unordered factors implemented
* Initial degree values set to 1 when conducting NOMAD search (only
for initial, when nmulti > 1 random valid values are generated)
* Multiple tests for well-conditioned B-spline bases, dynamic
modification of search boundaries when ill-conditioned bases are
detected, and detection of non-positive degrees of freedom and full
column rank of the spline basis (otherwise the penalty
sqrt(.Machine$double.xmax) is returned during search) - this can
lead to a significant reduction in the memory footprint
* Added support for generalized B-spline kernel bases (varying order
generalized polynomial)
* Corrected issue with plot when variables were cast as factors in the
model formula
* Fixed glitch with return object and i/o when cv="bandwidth" and
degree=c(0,0,...,0)
* Added tests for pathological cases (e.g. optimize degree and knots
but set max degree to min degree or max segments to min hence no
search possible).
* Added argument cv.threshold that uses exhaustive search for simple
cases where the number of objective function evaluations is less
than cv.threshold (currently set to 1000 but user can
set). Naturally exhaustive search is always preferred but often
unfeasible, so when it is feasible use it.
* Added additional demos for constrained estimation (Du, Parmeter, and
Racine (2011)), inference, and a sine-based function.
* Substantial reductions in run-time realized.
- Product kernel computation modified for improved run-time of
kernel-based cross-validation and estimation.
- Moved from lsfit to lm.fit and from lm to lm.wfit/lm.fit in
cv.kernel.spline and cv.factor.spline (compute objective function
values). Two effects - R devel indicates lm.fit/lm.wfit are more
robust (confirmed for large number of predictors) and much faster
cv.kernel.spline function emerges (run-time cut 20-30%).
- The combined effects of these changes are noticeable. For instance,
run-time for wage1 with 7 predictor cross-validation goes from 510
seconds in 0.15-0 to 304 seconds due to use of lm.fit/lm.wfit
described below to 148 seconds due to the modified kernel function.
Changes from Version 0.14-9 to 0.15-0 [23-Jun-2011]
* Thanks to Professor Brian Ripley, compile on Solaris system issues
are resolved, and check/examples are reduced in run time to
alleviate the excessive check times by the R development team. Many
thanks to them for their patience and guidance.
* Minor changes to radial_rgl demo
Changes from Version 0.14-8 to 0.14-9 [20-Jun-2011]
* Cleaned up issues for creating binary for windows
* Setting seed in crs -> frscvNOMAD/krscvNOMAD fed to snomadr.cpp via
snomadr.R for starting points when nmulti > 0
* frscvNOMAD and krscvNOMAD will check the number of times the
objective function, compare with MAX_BB_EVAL and give the warning if
the maximum is reached (nmulti * MAX_BB_EVAL - note this will only
detect the case where every restart hit MAX_BB_EVAL)
* Increased default MAX_BB_EVAL from 500 to 10000 (makes a difference
for difficult problems) and modified default EPSILON in NOMAD along
with other parameters (MIN_MESH_SIZE, MIN_POLL_SIZE) to reflect
actual machine precision (using R's .Machine$double.eps where NOMAD
fixed EPSILON at 1e-13)
* Zhenghua added help functionality for retrieving help via snomadr
* Now default number of restarts in crs is 5 (zero is not reliable
and I want sensible defaults in this package - higher is better but
for many problems this ought to suffice)
* Corrected glitches in interactive demos where options were not being
passed, updated docs to reflect demos
Changes from Version 0.14-7 to 0.14-8 [10-Jun-2011]
* crsiv now returns a crs model object that supports residuals,
fitted, predict and other generic functions. Note that this approach
is based on first computing the model via regularization and then
feeding a transformed response to a crs model object. You can test
how close the two approaches are to one another by comparing
model$phihat with fitted(model) via
all.equal(as.numeric(fitted(model)),as.numeric(model$phihat))
Changes from Version 0.14-6 to 0.14-7 [09-Jun-2011]
* Checking for non-auto basis corrected, modify crsiv to use auto
basis.
* When degree==0 segments is not used, so set segments==1 in this case
(could be NA).
Changes from Version 0.14-3 to 0.14-6 [08-Jun-2011]
* Initial release of the crs package on CRAN.
* Added internal support for NOMAD via snomadr (Zhenghua's simple
NOMAD interface). Ideally in the future Sebastien Le Digabel will be
releasing an R package for NOMAD and we will be able to dump our
internal code and rely on his R package. In the meantime this allows
us to proceed with an R package.
* Numerous internal changes incorporated, most noteworthy additional
flexibility in search provided by degree.min, degree.max,
segments.min, and segments.max. Replaces basis.maxdim.
* Default now search via NOMAD, kernel=TRUE.
* crsiv implemented.
* Engel95 data added.
* Demos added.
Changes from Version 0.14-2 to 0.14-3 [01-May-2011]
* Zhenghua migrated the necessary files from the gsl so that the crs
package is now standalone with the exception of NOMAD.
Changes from Version 0.14-0 to 0.14-2 [30-Apr-2011]
* Zhenghua migrated the bare-bones code required for the mgcv functions
uniquecombs and tensor.prod.model.matrix, and I migrated the
associated manpages, modified NAMESPACE and now the package does not
require loading of mgcv which was sub-optimal.
* dropped cv.norm and replaced with cv.func=c("cv.ls","cv.gcv","cv.aic").
Changes from Version 0.13-8 to 0.14-0 [14-Apr-2011]
* tensor products now working properly (need gsl.bs intercept and no
intercept in lm()) so dumped "additive-tensor" option and had to
rework code to support additive (lm() has intercept, gsl.bs no
intercept here) etc. In the process realized derivatives were all
effed up and now for the first time they work (with segments being
any value at all etc.).
Changes from Version 0.13-7 to 0.13-8 [30-Jan-2011]
* cv now default, increased basis.maxdim to 10
* segments and degree hit different maxima (basis.maxdim+1,
basis.maxdim) but `degree equal to basis.maxdim' warning was only
being issued when equal to basis.maxdim.
* restarts was not being passed to krscv
Changes from Version 0.13-6 to 0.13-7 [9-Dec-2010]
* added option to choose both degree and knots independently for each
predictor and (more importantly) cross-validate both
Changes from Version 0.13-5 to 0.13-6 [8-Dec-2010]
* added option to use either uniform knots (equally spaced segments)
or quantile knots (equal number of observations in each segment)
Changes from Version 0.13-4 to 0.13-5 [17-Apr-2010]
* changed package and function names to `crs' as Lijian strictly
prefers `regression spline' to `smoothing spline' (The major
difference is to place knots on sample points (smoothing spline,
Wahba) or on equidistant deterministic points (B spline, Stone))
Changes from Version 0.13-3 to 0.13-4 [07-Apr-2010]
* fixed annoying "Working... Working" in plot (message produced by
predict.*.spline sufficient here)
* added option for L1 CV (incomplete as properly this would require L1
estimation as well)
* neglected to set cv and cv.pruned when degree is zero (fixed)
* adjusted R-squared corrected
Changes from Version 0.13-2 to 0.13-3 [28-Mar-2010]
* had to revamp derivative code to support pruning by taking out the
reordering prior to derivative estimation - new approach is more
natural with less issues. Examples all pass with flying colors
* had to modify the add1.lm function (modified version is add1.lm.cv)
to flesh out stepCV
Changes from Version 0.13-1 to 0.13-2 [28-Mar-2010]
* modified stepAIC code from the MASS package to admit the
cross-validation objective function, renamed to stepCV to not
clobber users using stepAIC - this is a three line addition of the
extractCV() function followed by wholesale replacement of AIC with
CV - appears to work perfectly and now pruning returns models that
indeed improve in terms of CV (was mixing objective functions
pruning with stepAIC at the final stage (BIC) and hence testing for
model that improves etc.)
Changes from Version 0.13-0 to 0.13-1 [25-Mar-2010]
* migrated pruning inside predict.factor.spline, derivatives not
crashing and are working for one continuous regressor - one major
benefit is that the object is now a proper cssreg object hence
predict etc. are fully supported (the old function prune.cssreg
could not do this)
Changes from Version 0.12-6 to 0.13-0 [22-Mar-2010]
* added prune.cssreg that allows for a final pruning stage a la
Friedman's MARS, or alternatively a BIC model selection criterion to
complement cross-validation - I prefer first running
cross-validation as it operates at the _variable_ level while
stepAIC operates at the individual basis level
* added passing back of hatvalues from predict.factor.spline and
predict.kernel.spline to facilitate default plot (added Cook's
Distance)
* fixed corner case via data.frame(as.matrix(P))
* added passing of se back in fitted.values[,4] so now confidence
intervals for differences in levels are implemented (though no cov()
term). Also, went through code and cleaned up all lingering
issues. A true milestone hence the new cssver
Changes from Version 0.12-5 to 0.12-6 [22-Mar-2010]
* lwr/upr error bounds (95% cis) now being returned and used in plot
when ci=TRUE - lingering issue with differences in levels
Changes from Version 0.12-4 to 0.12-5 [21-Mar-2010]
* no longer using ncoeffs as a passable parameter, rather nbreak (>=2)
* all appears to be working smoothly. Now passing back model/model
list for cssreg. Next standard errors from predict and for
derivatives.
Changes from Version 0.12-3 to 0.12-4 [20-Mar-2010]
* kernel derivatives implemented and appear to be working
properly. drop=FALSE populated so that as.matrix() no longer needed
and handles corner cases of nrow=1 in matrices etc. This is the
first real fully functioning version with multivariate derivatives
etc. appearing to work. Passes R CMD check and numerous toy
examples. One big improvement is adding factor.to.numeric to
splitFrame() to admit both numeric and character string factors.
Changes from Version 0.12-2 to 0.12-3 [20-Mar-2010]
* streamlining code, removing now redundant functions (prod.spline can
be handled by prod.kernel.spline so removed former and renamed the
latter to prod.spline throughout)
Changes from Version 0.12-1 to 0.12-2 [16-Mar-2010]
* fitted and predict are faster than the matrix multiplication I am
using (not to mention simpler) - moving to these for simplicity and
to facilitate standard errors
Changes from Version 0.12-0 to 0.12-1 [16-Mar-2010]
* multivariate gradients working with tensor enabled/disabled - now
cleaning up to match np in terms of functionality
Changes from Version 0.11-9 to 0.12-0 [13-Mar-2010]
* gradients successfully implemented - remains to get factor/ordered
in plot and allowing for plot to handle factor in formula (not a
high priority)
Changes from Version 0.11-8 to 0.11-9 [13-Mar-2010]
* adding S3 plot class
* support for mean and (continuous) derivatives with drawback that
data must be cast as a factor to work
Changes from Version 0.11-6 to 0.11-8 [13-Mar-2010]
* multivariate bugs squashed for factor estimator... now to test for
kernel... if working, we move on to a plot method for the fit and
derivatives
Changes from Version 0.11-5 to 0.11-6 [11-Mar-2010]
* multivariate derivatives implemented and appear to be working, at
least when tensor is disabled so I can verify
* next to add to predict in order to get evaluation (attribute? could
add gradients function to do this? a bit odd but not sure what else
would be easier)
Changes from Version 0.11-4 to 0.11-5 [11-Mar-2010]
* about to begin adding derivatives
Changes from Version 0.10-8 to 0.11-4 [11-Mar-2010]
* total reworking of package for S3 methods - now one function cssreg
that does it all, supports predict properly etc. Big effort but now
much happier with code (streamlined and simpler to maintain)
* gsl spline support for derivatives, have dumped spline package as it
is missing its S-Plus counterpart's derivative capabilities
Changes from Version 0.10-8 to 0.10-9 [4-Mar-2010]
* cleaned up and lingering bugs gone related to use of predict.bs (now
compares exactly with pre package code)
Changes from Version 0.10-7 to 0.10-8 [4-Mar-2010]
* added factor smoothing splines using indicator basis functions
* this (fssreg) supplants ssreg, so will have a change from x=matrix
to x=data frame shortly
* corrected glitch in B-splines and reverting to using predict.bs as
previously
* removing poly() splines and switch
Changes from Version 0.10-6 to 0.10-7 [2-Mar-2010]
* removed straight search pending MIQP solver in R
* added better i/o to help gauge time to completion
* fixed bug in dimension warning
* added more trapping of issues for small cells having less data than
the spline degree etc. prior to beginning search
Changes from Version 0.10-5 to 0.10-6 [1-Mar-2010]
* multivariate x now implemented and working in ssregcv and cssregcv
for exhaustive search - issues remain with straight search
(parscale, ndeps etc.)
Changes from Version 0.10-4 to 0.10-5 [27-Feb-2010]
* new functions predict.spline, model.spline, and cv.spline
automatically handle K=0 and z/no z cases - code much more modular
hence handling multivariate x will in principle be simplified
Changes from Version 0.10-3 to 0.10-4 [27-Feb-2010]
* implementing tensor product splines in function to facilitate
multivariate x and make code base easier to handle - handle base case
of zero in this function so use intercept=TRUE
* revisiting search with discrete K
Changes from Version 0.10-2 to 0.10-3 [26-Feb-2010]
* added S3 predict method for ssreg and cssreg
* added ssreg and ssregcv for continuous-only x case to facilitate
comparison of improvements
* added wage1 and cps71 data and examples
Changes from Version 0.00-0 to 0.10-2 [24-Feb-2010]
* S3 methods for summary, print
* created functions cssreg and cssregcv
* initial version of the css package