Subset multiple different rows of a Data frame
Hi, how can I draw two different random samples of N rows each from a data frame? See the example below.
I have df, the main dataset, and I need two subsets of it. I created each subset by sampling 3 random rows from df, but I need the two subsets to have no rows in common.
> df = data.frame(matrix(rnorm(20), nrow=10))
> df
X1 X2
1 0.19234071 -0.86702704
2 -0.18264853 1.75276062
3 0.75824257 -0.51314220
4 -0.84571563 -1.24841675
5 0.75470152 1.51408945
6 1.04546517 1.33292716
7 -0.51449011 -1.51275633
8 1.36014747 0.07400024
9 -0.02397481 0.17177997
10 -1.37967248 -0.50416489
> df1 = df[sample(nrow(df), 3), ]
> df1
X1 X2
10 -1.3796725 -0.5041649
1 0.1923407 -0.8670270
4 -0.8457156 -1.2484167
> df2 = df[sample(nrow(df), 3), ]
> df2
X1 X2
3 0.7582426 -0.5131422
4 -0.8457156 -1.2484167
6 1.0454652 1.3329272
As you can see, the random subsets df1 and df2 share a row (row 4). I need two random subsets of the data frame that have no rows in common.
r
asked Mar 25 at 15:35 by Mr. Buster
split(head(df[sample(nrow(df)),]), 1:2)?
– Frank
Mar 25 at 16:07
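To unpack the one-liner in the comment above, here is a minimal sketch of what each step does. The seed and the intermediate names (shuffled, first6, parts) are only for illustration and are not part of the original comment.

```r
set.seed(1)  # arbitrary seed, only so the sketch is reproducible
df <- data.frame(matrix(rnorm(20), nrow = 10))

shuffled <- df[sample(nrow(df)), ]  # all 10 rows in random order
first6   <- head(shuffled)          # head() keeps the first 6 rows by default
parts    <- split(first6, 1:2)      # 1:2 is recycled, so rows alternate between groups "1" and "2"

df1 <- parts[["1"]]  # 3 rows
df2 <- parts[["2"]]  # 3 other rows
```

Because every row appears exactly once in the shuffled frame, the two pieces cannot overlap.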
3 Answers
If you want to split the data into two distinct sets, you can sample an index and use it to split the data frame, something like this:
set.seed(42)
idx <- sample(1:nrow(df), 3)
df1 <- df[idx, ]
df2 <- df[-idx, ]
df1
X1 X2
10 1.359814 0.6919378
9 1.248144 0.9783253
3 1.903994 0.4371896
df2
X1 X2
1 -0.3743900 0.54040310
2 -0.3204993 0.02383999
4 -0.2552918 0.94148533
5 -0.7327228 -1.25263998
6 -1.0648850 0.06567222
7 -0.2147909 -0.19137447
8 1.2148835 1.36361765
For more complex splits, see caret::createDataPartition.
answered Mar 25 at 15:39 by Sonny
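The answer above yields one subset of 3 rows and one with the remaining 7. If both subsets should have 3 rows, as in the question, one possible variation is to draw the second sample only from the rows not used by the first. This is a sketch, not part of the original answer; the names n, idx1, idx2 and remaining are assumptions chosen for illustration.

```r
set.seed(42)  # arbitrary seed, only so the sketch is reproducible
df <- data.frame(matrix(rnorm(20), nrow = 10))
n <- 3

idx1 <- sample(nrow(df), n)                    # row numbers for the first subset
remaining <- setdiff(seq_len(nrow(df)), idx1)  # row numbers not yet used
# Index into `remaining` rather than calling sample(remaining, n), which
# would behave differently if `remaining` ever had length 1.
idx2 <- remaining[sample.int(length(remaining), n)]

df1 <- df[idx1, ]
df2 <- df[idx2, ]
```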
We can create a function if we need to reuse the same logic:
f1 <- function(data, n)
data[sample(nrow(data), n),]
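Or, if we need a train/test split, we can use split, which returns a list whose "TRUE" element holds the 3 sampled rows and whose "FALSE" element holds the rest: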
lst1 <- split(df, seq_len(nrow(df)) %in% sample(nrow(df), 3))
answered Mar 25 at 15:37 by akrun (edited Mar 25 at 15:56)
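Note that calling f1 twice can return overlapping rows, since each call samples independently. A reusable helper that guarantees disjoint subsets might look like the sketch below. The function name disjoint_samples and its arguments n and k are hypothetical and do not appear in the original answer.

```r
# Hypothetical helper: returns a list of k non-overlapping subsets of n rows each.
disjoint_samples <- function(data, n, k = 2) {
  stopifnot(n * k <= nrow(data))         # not enough rows otherwise
  idx <- sample.int(nrow(data), n * k)   # n * k distinct row numbers
  groups <- rep(seq_len(k), each = n)    # first n go to group 1, next n to group 2, ...
  lapply(split(idx, groups), function(i) data[i, , drop = FALSE])
}

set.seed(7)  # arbitrary seed, only so the sketch is reproducible
df <- data.frame(matrix(rnorm(20), nrow = 10))
subsets <- disjoint_samples(df, n = 3, k = 2)
```

Here subsets[["1"]] and subsets[["2"]] play the roles of df1 and df2 from the question.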
You could also do something like this:
idx <- sample(seq(1, 2), size = nrow(df), replace = TRUE, prob = c(.8, .2))
set1 <- df[idx == 1,]
set2 <- df[idx == 2,]
Output:
> set1
X1 X2
1 -0.85768451 -0.1545485
2 -0.76420259 1.2054883
3 -0.91973457 1.4867429
6 -1.07558176 0.2527374
7 0.03189408 1.4057502
8 0.64270649 1.3742131
9 1.59246097 -0.3845688
10 -0.14158552 -1.5792062
> set2
X1 X2
4 -0.6317524 0.06571271
5 0.5005460 0.46277511
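Note: you can change the split proportions through the prob argument of sample(); I have used 80/20. Because each row is assigned independently, the group sizes are random and only approximately 80/20.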
answered Mar 25 at 18:57 by Rushabh
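If the two sets need exact sizes rather than sizes that vary from run to run, one option is to build a label vector with fixed counts and shuffle it. This is a sketch under that assumption, not the original answer's method; n1 and labels are names introduced here for illustration.

```r
set.seed(123)  # arbitrary seed, only so the sketch is reproducible
df <- data.frame(matrix(rnorm(20), nrow = 10))

n1 <- round(0.8 * nrow(df))                               # rows wanted in the first set
labels <- sample(rep(1:2, times = c(n1, nrow(df) - n1)))  # shuffled labels with fixed counts

set1 <- df[labels == 1, ]
set2 <- df[labels == 2, ]
```

With 10 rows this always gives 8 rows in set1 and 2 in set2; only which rows land in each set is random.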