Error output when using RLBigData class in R's RecordLinkage package
When using the R package RecordLinkage, some calls made after the epiClassify() or emClassify() functions (both of which can operate on RLBigDataLinkage class objects) throw errors. I do not see these errors when I use the functions designed for smaller data comparisons, such as compare.linkage(). The documentation describing these classes can be found in the package vignettes.
My overall objective is to 'fuzzy match' data between two tables; once I know which rows are similar between the tables based on other variables, I can then use the index to grab unique IDs from the column in one table which are missing from the other table.
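To make the objective concrete, here is a minimal base-R sketch of the last step only, assuming the linkage has already produced a table of accepted pairs. The `links` data frame and its `id1`/`id2` columns are hypothetical placeholders for whatever row indices the classification step returns; they are not taken from RecordLinkage output.

```r
# Two small data frames with values like the dummy data in this question
table1 <- data.frame(
  col1 = c("JIMMY", "SARA", "AYIL", "JIM", "JOHN"),
  col3 = c("jdt322", "jdb122", "", "ddd532", "ddd444"),
  stringsAsFactors = FALSE
)
table2 <- data.frame(
  col1 = c("JIMMY", "SARAH", "SARA", "AYIL", "JIM", "JOHN", "timm"),
  col3 = c("jdt322", "jda122", "jdb112", "", "ddd532", "ddd444", "ddd322"),
  stringsAsFactors = FALSE
)

# Hypothetical link table (an assumption for illustration): one row per
# accepted pair, holding the row index in table1 and the row index in table2
links <- data.frame(id1 = 1:5, id2 = c(1L, 3L, 4L, 5L, 6L))

# Rows of table2 that were not linked to any row of table1
unmatched_rows <- setdiff(seq_len(nrow(table2)), links$id2)

# IDs present in table2 but missing from table1
missing_ids <- table2$col3[unmatched_rows]
print(missing_ids)
```

With real data, `links` would be built from the pairs classified as links, so the only part that changes is how the two index columns are obtained.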
Data: This is some dummy data to reproduce the errors...
library(tibble)
library(RecordLinkage)

table1 <- tibble(col1 = c("JIMMY", "SARA", "AYIL", "JIM", "JOHN"),
                 col2 = c("OHEARN", "HANDLE", "HASE", "JHORN", ""),
                 col3 = c("jdt322", "jdb122", "", "ddd532", "ddd444"))
table2 <- tibble(col1 = c("JIMMY", "SARAH", "SARA", "AYIL", "JIM", "JOHN", "timm"),
                 col2 = c("OHEARN", "HAND", "H", "HASE", "JORN", "", ""),
                 col3 = c("jdt322", "jda122", "jdb112", "", "ddd532", "ddd444", "ddd322"))
When I use the above data in the functions for smaller comparisons, it works fine, without errors:
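# Compare all record pairs between the two tables using string comparison
mypairs <- RecordLinkage::compare.linkage(table1, table2, strcmp = TRUE)
# Compute EM weights, then classify with a lower threshold of 1
mypairs_weights <- emWeights(mypairs)
result <- emClassify(mypairs_weights, threshold.lower = 1)
getTable(result)
summary(result)
getPairs(result, min.weight = 1)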
However, after I use the RLBigData class (as in the code below), I get errors when:
- I try to access the object result after emClassify():
  Error in nrow(object@pairs) :
  no slot of name "pairs" for this object of class "RLResult"
- Trying to grab the summary with summary(result):
  Error in dbGetQuery(object@con, "select count() from data1") :
  no slot of name "con" for this object of class "RLBigDataLinkage"
- Trying to grab the comparison table with getTable(result):
  Error in table.ff(object@data@pairs$is_match, object@prediction, useNA = "ifany") :
  Only vmodes integer currently allowed - are you sure ... contains only factors or integers?
Warnings are also thrown when first running the RLBigDataLinkage() function:
Warning messages:
1: In result_fetch(res@ptr, n = n) :
  Don't need to call dbFetch() for statements, only for queries
The following code should reproduce these errors:
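# Same comparison, but using the big-data class with Jaro-Winkler string comparison
mypairs <- RLBigDataLinkage(table1, table2,
                            strcmp = TRUE,
                            strcmpfun = "jarowinkler")
mypairs_weights <- emWeights(mypairs)
result <- emClassify(mypairs_weights, 0.6)
result                              # error: no slot of name "pairs"
getTable(result)                    # error: Only vmodes integer currently allowed
summary(result)                     # error: no slot of name "con"
getPairs(result, min.weight = 0.5)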
I am not certain why the code produces these errors in the second case. I wanted to troubleshoot the code on a small dataset before moving on to larger datasets. I would greatly appreciate it if anyone could shed light on the errors and warnings output by this package.
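One way to narrow down "no slot of name" errors like the ones above is to check which slots an S4 object actually has before a method tries to read them. The sketch below uses a toy class, ToyResult, which is a hypothetical stand-in and not the real RecordLinkage class definition; only slotNames(), .hasSlot() and tryCatch() are real functions, from the methods and base packages.

```r
library(methods)

# Toy S4 class for illustration only (an assumption, not RecordLinkage's RLResult)
setClass("ToyResult", representation(data = "list", prediction = "factor"))
res <- new("ToyResult", data = list(), prediction = factor(c("L", "N")))

# List the slots that exist on the object
existing_slots <- slotNames(res)
print(existing_slots)

# Check for a specific slot without triggering an error
has_pairs <- .hasSlot(res, "pairs")
print(has_pairs)

# Reading a missing slot raises the same kind of error seen in the question
msg <- tryCatch(res@pairs, error = function(e) conditionMessage(e))
print(msg)
```

Running class(result) and slotNames(result) on the real object, together with packageVersion("RecordLinkage"), would show whether the method being dispatched expects slots that the installed class definition does not provide.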
r bigdata record-linkage
asked Mar 24 at 23:34 by Klink
edited Mar 25 at 14:17 by Klink