https://github.com/cran/RecordLinkage
Tip revision: b32452149857b15849e9c82fb81e812df6e921fc authored by Murat Sariyar on 08 November 2022, 13:10:15 UTC
version 0.4-12.4
version 0.4-12.4
Tip revision: b324521
NEWS
Changes in version 0.4-12
-adapted factor handling to the new R standard
-switched from sweave to knitr for the vignettes
-fixed some SQL statement issues in RSQLite
-fixed the Pointer Mismatch (const char *, const unsigned char *) in sqlite_extensions.c
-removed the phonetic code function due to its encoding-dependent mapping tables
-made the package independent from the now orphaned ffbase package
-Replace class(x)==y statementy by is(x,y) statements
Changes in version 0.4-11
-added missing S3 method exports
-fixed broken/non-canonical URLs
-fixed some logical expressions to deal with non-scalar values
(contributed by Kurt Hornik)
Changes in version 0.4-10
-added SQLite header files to be compatible with upcoming RSQLite package,
which does not export them anymore
Changes in version 0.4-9
-added import declarations to pass R 3.3.0 package checks
Changes in version 0.4-8
-fixed import declaration for ffbase imports
-translated source code omments
-removed non-ASCII characters
-added S3method-declarations in NAMESPACE
Changes in version 0.4-7
-added unit tests (inst/unitTests) to .Rbuildignore
-accelerated example for epiWeights and epiClassify
Changes in version 0.4-6
-updated maintainer, author and contact information
-fixed notes "no visible binding" in R >= 2.15.1
-moved packages in "Depends" to "Imports" where appropriate and added
import statement in NAMESPACE
Changes in version 0.4-5
-fixed truncated lines in PDF manual
Changes in version 0.4-4
-fixed memory access problem in C code
Changes in version 0.4-3
-added check if bringToTop() is availabe in getParetoThreshold
-avoid warning in countpattern
Changes in version 0.4-2
-changed the way keys are set for data tables for compatibility with data.table 1.7.8
Changes in version 0.4-1
-Removed printf statements from compiled code to prevent NOTE in R CMD check
Changes in version 0.4-0
-The classes for linkage of big data sets have been redesigned. The mechanism of creating comparison patterns on the fly from a database is replaced by saving all comparison patterns in a ffdf object (disk-based data frame, see package ff). This approach limits the amount of comparison patterns with respect to available disk space (the database approach was designed to handle even more patterns, at least in theory), but is more efficient regarding execution speed. In addition, some methods that were only usable with the S3 class RecLinkData, can now be used with RLBigData objects, most notably getMinimalTrain(). For the structure of the redesigned classes, see the documention entries for classes "RLBigData", "RLBigDataDedup" and "RLBigDataLinkage".
Changes in version 0.3-5
-bug fixes:
-getExpectedSize failed in rare cases
Changes in version 0.3-4
-added summary method for RLResult objects
-bug fixes:
-summary.RecLinkResult failed for object without known
matching status
Changes in version 0.3-3
-added new classification method "bumping", see ?trainSupv
-bug fixes:
- string comparators crashed for character(0) input
(reported by Dominik Reusser)
- getPairs could fail when the result is emtpy
-various performance improvements
-renamed vignette "Fellegi-Sunter Deduplication" to
"Weight-based deduplication"
-package is now in a namespace
Changes in version 0.3-2
- performance improvements in various functions
- bug fixes:
- epiClassify failed for datasets with only one column
(reported by Richard Herron)
- getErrorMeasures returned wrong values in rare cases
Changes in version 0.3-1
- fixed bug in emClassify that threw an error during classification
with on-the-fly calculation of weights
- added checks in unit tests
- fixed bug in optimalThreshold which caused wrong results in some
cases when ny was supplied
- added status messages and progress bars to various classification functions
- fixed numerical problem that produced wrong thresholds in some cases
for emClassify (with my / ny supplied)
Changes in version 0.3-0
- A framework of S4 methods and classes tailored for processing large
data sets has been added. See the vignette 'BigData' for details.
- The output format of getPairs (existing method for RecLinkData and
RecLinkResult objects) has changed: Column "Weight" is now last instead
of first. Also, for single.rows=FALSE, record pairs are speparated by
a blank line.
Changes in version 0.2-6
- added subset operator [ for RecLinkData and RecLinkPairs objects
- improved running time of epiWeights significantly
Changes in version 0.2-5
- Fixed a bug in getMinimalTrain which caused the function to fail
Changes in version 0.2-4
- Fixed a bug in getMinimalTrain that caused pairs with unknown
matching status to become non-matches in the training set.
- Some examples were modified to reduce execution time.
- Unit tests have been added in inst/unitTests.
- Checking of arguments improved in some functions.
- Small documentation changes.
Changes in version 0.2-3
- getParetoThreshold now prompts a message if no data lies in the
interactively selected interval and asks the user to choose again
- Corrected maintainer e-mail address
- Several bug fixes