How can I convert values to a meaning separated with a colon (double point)?
I have data like this:
df<- structure(list(df = structure(c(10L, 8L, 2L, 8L, 7L, 7L, 10L,
8L, 3L, 10L, 10L, 9L, 9L, 1L, 1L, 3L, 1L, 5L, 5L, 4L, 10L, 8L,
1L, 1L, 2L, 6L), .Label = c("-1:-1:2", "-1:2:-1", "-1:2:2", "1:01:01",
"1:1(2):1", "1(1)|1(2):1(1)|1(2):1(1)|1(2)", "1(1)|1(2):2:2",
"2:-1:-1", "2:-1:2", "2:02:02"), class = "factor")), class = "data.frame", row.names = c(NA,
-26L))
I want to expand it into words that I define. I want to add as many columns as there are colon-separated fields; for example, each value here has three, so we add 3 columns after df. Then we fill them with these words:
2 = Homo
-1 = No
1 = Het
1(1) = Het1
1(2) = Het2
The expected output looks like this:
2:02:02 Homo Homo Homo
2:-1:-1 Homo No No
-1:2:-1 No Homo No
2:-1:-1 Homo No No
1(1)|1(2):2:2 Het1 Het2 Homo Homo
1(1)|1(2):2:2 Het1 Het2 Homo Homo
2:02:02 Homo Homo Homo
2:-1:-1 Homo No No
-1:2:2 No Homo Homo
2:02:02 Homo Homo Homo
2:02:02 Homo Homo Homo
2:-1:2 Homo No Homo
2:-1:2 Homo No Homo
-1:-1:2 No No Homo
-1:-1:2 No No Homo
-1:2:2 No Homo Homo
-1:-1:2 No No Homo
1:1(2):1 Het Het2 Het
1:1(2):1 Het Het2 Het
1:01:01 Het Het Het
2:02:02 Homo Homo Homo
2:-1:-1 Homo No No
-1:-1:2 No No Homo
-1:-1:2 No No Homo
-1:2:-1 No Homo No
1(1)|1(2):1(1)|1(2):1(1)|1(2) Het1 Het2 Het1 Het2 Het1 Het2
r
asked Mar 28 at 20:57 by Learner, edited Mar 28 at 21:08
By "double points" do you mean a colon? Is this a regional term? Never heard it in the US
– camille
Mar 28 at 21:03
@camille In Portugal it is "dois pontos", meaning "two points".
– Rui Barradas
Mar 28 at 21:05
@Camille I meant :
– Learner
Mar 28 at 21:06
Will 02 and 2 match the same string?
– akrun
Mar 28 at 21:09
@akrun Yes 02 and 2 are the same
– Learner
Mar 28 at 21:09
2 Answers
Not sure if the result is exactly what you need, but maybe this could help.
It's probably not the most efficient or elegant solution, but it can be a starting point.
I called your data dats:
head(dats)
df
1 2:02:02
2 2:-1:-1
3 -1:2:-1
4 2:-1:-1
5 1(1)|1(2):2:2
6 1(1)|1(2):2:2
And I created a mapping data.frame:
mapping
id value
1 2 Homo
2 -1 No
3 1 Het
4 1(1) Het1
5 1(2) Het2
First, I split on the colons with stringr::str_split_fixed():
library(stringr)
double_point <- as.data.frame.matrix(str_split_fixed(dats$df, ":", 3))
Now, for each column, we have to split the values on |:
listed <- list() # empty list
for (i in seq_len(ncol(double_point))) {
  listed[[i]] <- double_point[, i]
  # split each field on a literal "|" into 2 parts (padded with "")
  listed[[i]] <- str_split_fixed(listed[[i]], "\\|", 2)
}
# bind the pieces into one character matrix
df_ <- do.call(cbind, listed)
# keep a copy of the split values for the final output
df_1 <- df_
# result so far:
head(df_1)
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] "2" "" "02" "" "02" ""
[2,] "2" "" "-1" "" "-1" ""
[3,] "-1" "" "2" "" "-1" ""
[4,] "2" "" "-1" "" "-1" ""
[5,] "1(1)" "1(2)" "2" "" "2" ""
[6,] "1(1)" "1(2)" "2" "" "2" ""
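For comparison, here is a minimal base R sketch of the same two-stage split. It is not the code from this answer, only an illustration under the assumption that the input is a plain character vector (the toy vector x below is made up in the question's format). Passing fixed = TRUE makes strsplit() treat "|" as a literal character, which matters because an unescaped | is the regex alternation operator.

```r
# Toy input in the same format as the question's df column (hypothetical)
x <- c("2:02:02", "1(1)|1(2):2:2")

# Stage 1: split every string on ":" (fixed = TRUE means no regex)
fields <- strsplit(x, ":", fixed = TRUE)

# Stage 2: split every field on a literal "|"
parts <- lapply(fields, strsplit, split = "|", fixed = TRUE)

# parts is a nested list: one element per input string,
# holding one character vector per colon-separated field
str(parts)
```

Unlike str_split_fixed(), this keeps the result ragged instead of padding with empty strings, so it would need an extra step before binding into a matrix.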
Now we have to replace the values using the mapping and bind the result to the split data:
listed <- list()
for (i in seq_len(ncol(df_))) {
  # drop zeros so that "02" matches the id "2"
  df_[, i] <- gsub("0", "", df_[, i])
  # look up the word for each value; unmatched values become NA
  listed[[i]] <- mapping[match(df_[, i], mapping$id), 2, drop = FALSE]
}
df_final <- cbind(df_1, do.call(cbind, listed))
head(df_final)
1 2 3 4 5 6 value value value value value value
1 2 02 02 Homo <NA> Homo <NA> Homo <NA>
1.1 2 -1 -1 Homo <NA> No <NA> No <NA>
2 -1 2 -1 No <NA> Homo <NA> No <NA>
1.2 2 -1 -1 Homo <NA> No <NA> No <NA>
4 1(1) 1(2) 2 2 Het1 Het2 Homo <NA> Homo <NA>
4.1 1(1) 1(2) 2 2 Het1 Het2 Homo <NA> Homo <NA>
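One caveat about the zero handling: gsub("0", "", ...) removes every zero, which is fine for the values in the question but would also change a code such as "10". A possible alternative, sketched below on a made-up vector (the "10" entry is hypothetical and does not occur in the question's data), is to strip only leading zeros that are followed by a digit.

```r
# Hypothetical values; "10" is not in the question's data
x <- c("02", "2", "-1", "01", "10", "1(1)")

# Removing every zero also alters "10"
stripped_all <- gsub("0", "", x)

# Anchored alternative: drop zeros only at the start of the string,
# and only when a digit follows (lookahead needs perl = TRUE)
normalized <- sub("^0+(?=[0-9])", "", x, perl = TRUE)

print(stripped_all)
print(normalized)
```

This sketch does not cover negative values with leading zeros such as "-02"; that would need a slightly different pattern.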
Hope it helps!
EDIT
Here are the dput() and str() of mapping:
dput(mapping)
structure(list(id = structure(c(5L, 1L, 2L, 3L, 4L), .Label = c("-1",
"1", "1(1)", "1(2)", "2"), class = "factor"), value = structure(c(4L,
5L, 1L, 2L, 3L), .Label = c("Het", "Het1", "Het2", "Homo", "No"
), class = "factor")), class = "data.frame", row.names = c("1",
"2", "3", "4", "5"))
str(mapping)
'data.frame': 5 obs. of 2 variables:
$ id : Factor w/ 5 levels "-1","1","1(1)",..: 5 1 2 3 4
$ value: Factor w/ 5 levels "Het","Het1","Het2",..: 4 5 1 2 3
answered Mar 29 at 10:32 by s_t, edited Mar 29 at 15:09
Thanks, your code does not print for 1(1) or 1(2). Can you please tell me the str of mapping?
– Learner
Mar 29 at 14:38
Hi, posted the edit. It seems that in the last rows of the last output, it prints for the cases you mention.
– s_t
Mar 29 at 15:10
I accepted and liked your answer
– Learner
Mar 29 at 20:02
You can explicitly define all the possible values in a num2words data frame and then run the following:
df<- structure(list(df = structure(c(10L, 8L, 2L, 8L, 7L, 7L, 10L,
8L, 3L, 10L, 10L, 9L, 9L, 1L, 1L, 3L, 1L, 5L, 5L, 4L, 10L, 8L,
1L, 1L, 2L, 6L), .Label = c("-1:-1:2", "-1:2:-1", "-1:2:2", "1:01:01",
"1:1(2):1", "1(1)|1(2):1(1)|1(2):1(1)|1(2)", "1(1)|1(2):2:2",
"2:-1:-1", "2:-1:2", "2:02:02"), class = "factor")), class = "data.frame", row.names = c(NA,
-26L))
num2words <- read.table(text = "
num word
2 Homo
02 Homo
-1 No
1 Het
01 Het
1(1) Het1
1(2) Het2
1(1)|1(2) Het1-Het2
1(2)|1(1) Het2-Het1
", header = T, stringsAsFactors = F)
lst=lapply(1:nrow(df), function(x)
split.nums <- unlist(strsplit(as.character(df[x,]), ":"))
num2words$word[match(split.nums, num2words$num)]
)
new.df=cbind(df, do.call(rbind, lst))
> new.df
df 1 2 3
1 2:02:02 Homo Homo Homo
2 2:-1:-1 Homo No No
3 -1:2:-1 No Homo No
4 2:-1:-1 Homo No No
5 1(1)|1(2):2:2 Het1-Het2 Homo Homo
6 1(1)|1(2):2:2 Het1-Het2 Homo Homo
7 2:02:02 Homo Homo Homo
8 2:-1:-1 Homo No No
9 -1:2:2 No Homo Homo
10 2:02:02 Homo Homo Homo
11 2:02:02 Homo Homo Homo
12 2:-1:2 Homo No Homo
13 2:-1:2 Homo No Homo
14 -1:-1:2 No No Homo
15 -1:-1:2 No No Homo
16 -1:2:2 No Homo Homo
17 -1:-1:2 No No Homo
18 1:1(2):1 Het Het2 Het
19 1:1(2):1 Het Het2 Het
20 1:01:01 Het Het Het
21 2:02:02 Homo Homo Homo
22 2:-1:-1 Homo No No
23 -1:-1:2 No No Homo
24 -1:-1:2 No No Homo
25 -1:2:-1 No Homo No
26 1(1)|1(2):1(1)|1(2):1(1)|1(2) Het1-Het2 Het1-Het2 Het1-Het2
answered Mar 29 at 16:39
SinghTheCoder
I liked your answer thank you
– Learner
Mar 29 at 20:02
Thanks. I think it is succinct and easy to read.
– SinghTheCoder
Mar 29 at 23:45
By "double points" do you mean a colon? Is this a regional term? Never heard it in the US
– camille
Mar 28 at 21:03
@camille In Portugal it is "dois pontos", meaning "two points".
– Rui Barradas
Mar 28 at 21:05
@Camille I meant :
– Learner
Mar 28 at 21:06
Will 02 and 2 match to the same string?
– akrun
Mar 28 at 21:09
@akrun Yes 02 and 2 are the same
– Learner
Mar 28 at 21:09
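Since 02 and 2 are meant to be the same code, one option is to strip leading zeros before the lookup, so the table needs only one entry per code. A minimal sketch under that assumption follows. `normalize_code` is a hypothetical helper, and the pattern only removes zeros at the start of a token that are followed by another digit.

```r
normalize_code <- function(x) {
  # Drop leading zeros that precede another digit: "02" -> "2", "01" -> "1".
  # A lone "0" and codes such as "-1" or "1(2)" are left unchanged.
  sub("^0+(?=[0-9])", "", x, perl = TRUE)
}

codes <- c("02", "2", "01", "-1", "1(2)", "0")
normalized <- normalize_code(codes)
print(normalized)
```

The lookahead needs `perl = TRUE`. Zeros after a minus sign (for example "-01") are not touched by this pattern and would need their own rule if they occur in the data.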