Package: RecordLinkage 0.4-12.4
RecordLinkage: Record Linkage Functions for Linking and Deduplicating Data Sets
Provides functions for linking and deduplicating data sets. Methods based on a stochastic approach are implemented as well as classification algorithms from the machine learning domain. For details, see our paper "The RecordLinkage Package: Detecting Errors in Data" Sariyar M / Borg A (2010) <doi:10.32614/RJ-2010-017>.
RecordLinkage_0.4-12.4.tar.gz
RecordLinkage_0.4-12.4.zip (r-4.5), RecordLinkage_0.4-12.4.zip (r-4.4), RecordLinkage_0.4-12.4.zip (r-4.3)
RecordLinkage_0.4-12.4.tgz (r-4.4-x86_64), RecordLinkage_0.4-12.4.tgz (r-4.4-arm64), RecordLinkage_0.4-12.4.tgz (r-4.3-x86_64), RecordLinkage_0.4-12.4.tgz (r-4.3-arm64)
RecordLinkage_0.4-12.4.tar.gz (r-4.5-noble), RecordLinkage_0.4-12.4.tar.gz (r-4.4-noble)
RecordLinkage_0.4-12.4.tgz (r-4.4-emscripten), RecordLinkage_0.4-12.4.tgz (r-4.3-emscripten)
RecordLinkage.pdf | RecordLinkage.html
RecordLinkage/json (API)
NEWS
# Install 'RecordLinkage' in R:
install.packages('RecordLinkage', repos = c('https://sym33.r-universe.dev', 'https://cloud.r-project.org'))
- RLdata10000 - Test data for Record Linkage
- RLdata500 - Test data for Record Linkage
- identity.RLdata10000 - Test data for Record Linkage
- identity.RLdata500 - Test data for Record Linkage
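As a rough illustration of how the bundled test data might be used, the sketch below deduplicates RLdata500 with EpiLink weights. The choice of blocking fields and the threshold of 0.5 are assumptions made for this example, not recommended settings.

```r
library(RecordLinkage)

# Load the 500-record test set; this also provides identity.RLdata500,
# the vector of true identities used to evaluate the result.
data(RLdata500)

# Build comparison patterns. Blocking on the first name (column 1) or on
# the full date of birth (columns 5 to 7) limits the candidate pairs.
rpairs <- compare.dedup(RLdata500,
                        blockfld = list(1, 5:7),
                        identity = identity.RLdata500)

# Compute EpiLink weights and classify the pairs; 0.5 is an arbitrary
# threshold chosen for this sketch.
rpairs <- epiWeights(rpairs)
result <- epiClassify(rpairs, threshold.upper = 0.5)

# Compare the predicted links against the known identities.
summary(result)
getTable(result)
```

Lowering the threshold should classify more pairs as links at the cost of more false positives; `getErrorMeasures(result)` reports the resulting error rates.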
This package does not link to any GitHub/GitLab/R-Forge repository. No issue tracker or development information is available.
Last updated 2 years ago from:b324521498. Checks: OK: 7, NOTE: 2. Indexed: yes.
Target | Result | Date |
---|---|---|
Doc / Vignettes | OK | Nov 17 2024 |
R-4.5-win-x86_64 | NOTE | Nov 17 2024 |
R-4.5-linux-x86_64 | NOTE | Nov 17 2024 |
R-4.4-win-x86_64 | OK | Nov 17 2024 |
R-4.4-mac-x86_64 | OK | Nov 17 2024 |
R-4.4-mac-aarch64 | OK | Nov 17 2024 |
R-4.3-win-x86_64 | OK | Nov 17 2024 |
R-4.3-mac-x86_64 | OK | Nov 17 2024 |
R-4.3-mac-aarch64 | OK | Nov 17 2024 |
Exports: [.RecLinkData, [.RecLinkResult, [.RLBigData, [.RLResult, %append%, begin, blockfldfun, classifySupv, classifyUnsup, clear, clone, compare.dedup, compare.linkage, countpattern, deleteNULLs, editMatch, emClassify, emWeights, epiClassify, epiWeights, errorMeasures, fsClassify, fsWeights, genSamples, getColumnNames, getErrorMeasures, getExpectedSize, getFalse, getFalseNeg, getFalsePos, getFrequencies, getMatchCount, getMinimalTrain, getNACount, getNonMatchCount, getPairs, getPairsBackend, getParetoThreshold, getPatternCounts, getSQLStatement, getTable, getThresholds, gpdEst, hasWeights, init_sqlite_extensions, isFALSE, jarowinkler, levenshteinDist, levenshteinSim, loadRLObject, makeBlockingPairs, mrl, mygllm, nextPairs, optimalThreshold, plotMRL, print.summaryRLBigDataDedup, print.summaryRLBigDataLinkage, print.summaryRLResult, resample, RLBigDataDedup, RLBigDataLinkage, saveRLObject, soundex, splitData, summary.RecLinkData, summary.RecLinkResult, summary.RLBigDataDedup, summary.RLBigDataLinkage, summary.RLResult, texSummary, trainSupv, unorderedPairs
Dependencies: ada, bit, bit64, blob, cachem, class, cli, codetools, cpp11, data.table, DBI, diagram, digest, e1071, evd, fastmap, ff, future, future.apply, globals, glue, ipred, KernSmooth, lattice, lava, lifecycle, listenv, MASS, Matrix, memoise, nnet, numDeriv, parallelly, pkgconfig, plogr, prodlim, progressr, proxy, Rcpp, rlang, rpart, RSQLite, shape, SQUAREM, survival, vctrs, xtable
Classes for record linkage of big data sets
Rendered from BigData.rnw using knitr::knitr on Nov 17 2024. Last update: 2020-04-09. Started: 2012-01-11
Record Linkage with Extreme Value Theory
Rendered from EVT.rnw using knitr::knitr on Nov 17 2024. Last update: 2020-04-09. Started: 2012-01-11
Supervised Classification
Rendered from Supervised.rnw using knitr::knitr on Nov 17 2024. Last update: 2020-04-09. Started: 2012-01-11
Weight-based deduplication
Rendered from WeightBased.rnw using knitr::knitr on Nov 17 2024. Last update: 2022-11-08. Started: 2012-01-11
Readme and manuals
Help Manual
Help page | Topics |
---|---|
Concatenate comparison patterns or classification results | %append% %append%,RecLinkData,RecLinkData-method %append%,RecLinkResult,RecLinkResult-method %append%-methods |
Supervised Classification | classifySupv classifySupv,RecLinkClassif,RecLinkData-method classifySupv,RecLinkClassif,RLBigData-method classifySupv-methods |
Unsupervised Classification | classifyUnsup |
Serialization of record linkage object. | clone clone,RLBigData-method clone,RLResult-method clone-methods loadRLObject saveRLObject saveRLObject,RLBigData-method saveRLObject,RLResult-method saveRLObject-methods |
Compare Records | compare.dedup compare.linkage |
Remove NULL Values | deleteNULLs |
Edit Matching Status | editMatch editMatch,RecLinkData-method editMatch,RLBigData-method editMatch-methods |
Weight-based Classification of Data Pairs | emClassify emClassify,RecLinkData,ANY,ANY-method emClassify,RecLinkData,missing,missing-method emClassify,RLBigData,ANY,ANY-method emClassify,RLBigData,missing,missing-method emClassify,RLBigData-method |
Calculate weights | emWeights emWeights,RecLinkData-method emWeights,RLBigData-method emWeights-methods |
Classify record pairs with EpiLink weights | epiClassify epiClassify,RecLinkData-method epiClassify,RLBigData-method epiClassify-methods |
Calculate EpiLink weights | epiWeights epiWeights,RecLinkData-method epiWeights,RLBigData-method epiWeights-methods |
Class '"ff_vector"' | ff_vector-class |
Class '"ffdf"' | ffdf-class |
Generate Training Set | genSamples |
Calculate Error Measures | errorMeasures getErrorMeasures getErrorMeasures,RecLinkResult-method getErrorMeasures,RLResult-method getErrorMeasures-methods |
Estimate number of record pairs. | getExpectedSize getExpectedSize,data.frame-method getExpectedSize,RLBigDataDedup-method getExpectedSize,RLBigDataLinkage-method getExpectedSize-methods |
Get attribute frequencies | getFrequencies getFrequencies,RLBigData-method getFrequencies-methods |
Create a minimal training set | getMinimalTrain getMinimalTrain,RecLinkData-method getMinimalTrain,RLBigData-method getMinimalTrain-methods |
Extract Record Pairs | getFalse getFalseNeg getFalsePos getPairs getPairs,RecLinkData-method getPairs,RecLinkResult-method getPairs,RLBigData-method getPairs,RLResult-method getPairs-methods |
Estimate Threshold from Pareto Distribution | getParetoThreshold getParetoThreshold,RecLinkData-method getParetoThreshold,RLBigData-method getParetoThreshold-methods |
Build contingency table | getTable getTable,RecLinkResult-method getTable,RLResult-method getTable-methods |
Estimate Threshold from Pareto Distribution | gpdEst |
Check for FALSE | isFALSE |
Generalized Log-Linear Fitting | mygllm |
Optimal Threshold for Record Linkage | optimalThreshold optimalThreshold,RecLinkData-method optimalThreshold,RLBigData-method optimalThreshold-methods |
Phonetic Code | phonetics soundex |
Class "RecLinkClassif" | RecLinkClassif RecLinkClassif-class |
Class "RecLinkData" | RecLinkData-class |
Record Linkage Data Object | RecLinkData RecLinkData.object |
Class "RecLinkResult" | RecLinkResult-class |
Record Linkage Result Object | RecLinkResult |
Safe Sampling | resample |
Class "RLBigData" | RLBigData-class |
Constructors for big data objects. | RLBigDataDedup RLBigDataLinkage |
Class "RLBigDataDedup" | RLBigDataDedup-class |
Class "RLBigDataLinkage" | RLBigDataLinkage-class |
Test data for Record Linkage | identity.RLdata10000 identity.RLdata500 RLdata10000 RLdata500 |
Class "RLResult" | RLResult-class |
Show a RLBigData object | show show,RLBigData-method |
Split Data | splitData |
Stochastic record linkage. | fsClassify fsClassify,RecLinkData-method fsClassify,RLBigData-method fsClassify-methods fsWeights fsWeights,RecLinkData-method fsWeights,RLBigData-method fsWeights-methods |
String Metrics | jaro jarowinkler levenshtein levenshteinDist levenshteinSim strcmp winkler |
Subset operator for record linkage objects | [.RecLinkData [.RecLinkResult [.RLBigData [.RLResult |
Print Summary of Record Linkage Data | summary.RecLinkData summary.RecLinkResult |
summary methods for '"RLBigData"' objects. | print.summaryRLBigDataDedup print.summaryRLBigDataLinkage summary.RLBigData summary.RLBigDataDedup summary.RLBigDataLinkage |
Summary method for '"RLResult"' objects. | print.summaryRLResult summary,RLResult-method summary.RLResult |
Train a Classifier | trainSupv |
Create Unordered Pairs | unorderedPairs |
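
The string metrics and phonetic code listed in the help pages above can be tried directly on character vectors. The following is a minimal sketch using only exported functions; the example names are made up for illustration.

```r
library(RecordLinkage)

# Levenshtein edit distance: one substitution turns "Meyer" into "Meier".
levenshteinDist("Meyer", "Meier")   # 1

# The same comparison as a similarity in [0, 1]:
# 1 - distance / length of the longer string = 1 - 1/5.
levenshteinSim("Meyer", "Meier")    # 0.8

# Jaro-Winkler similarity, also in [0, 1]; higher means more similar.
jarowinkler("Meyer", "Meier")

# Soundex phonetic code, applied element-wise to a character vector.
soundex(c("Meyer", "Meier"))
```

These are the same kinds of functions that `compare.dedup` and `compare.linkage` accept through their `strcmpfun` and `phonfun` arguments.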