Subset multiple different rows of a Data frame


How can I draw two different random samples of N rows each from a data frame? See the example below.

I have a main dataset, df, and I need two subsets of it. I created each subset by sampling 3 random rows from df, but I need the two subsets to have no rows in common.

> df = data.frame(matrix(rnorm(20), nrow=10))
> df
 X1 X2
1 0.19234071 -0.86702704
2 -0.18264853 1.75276062
3 0.75824257 -0.51314220
4 -0.84571563 -1.24841675
5 0.75470152 1.51408945
6 1.04546517 1.33292716
7 -0.51449011 -1.51275633
8 1.36014747 0.07400024
9 -0.02397481 0.17177997
10 -1.37967248 -0.50416489

df1 = df[sample(nrow(df), 3), ]
df1
 X1 X2
10 -1.3796725 -0.5041649
1 0.1923407 -0.8670270
4 -0.8457156 -1.2484167

df2 = df[sample(nrow(df), 3), ]
df2
 X1 X2
3 0.7582426 -0.5131422
4 -0.8457156 -1.2484167
6 1.0454652 1.3329272

As you can see, the random subsets df1 and df2 share a row (row 4). I need two random subsets of the data frame that contain different rows.

asked Mar 25 at 15:35

Mr. Buster

644 bronze badges

split(head(df[sample(nrow(df)),]), 1:2)?

– Frank
Mar 25 at 16:07
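
The one-liner in the comment above shuffles all rows once, keeps the first six with head(), and deals them alternately into two groups, so no row can land in both. A minimal sketch of the same idea with an explicit sample size follows; the helper name disjoint_samples and its arguments are made up for illustration and are not part of the original thread.

```r
# Hypothetical helper: draw k non-overlapping random samples of n rows each.
disjoint_samples <- function(data, n, k = 2) {
  stopifnot(n * k <= nrow(data))        # not enough rows otherwise
  idx <- sample(nrow(data), n * k)      # n * k distinct row numbers, random order
  groups <- rep(seq_len(k), each = n)   # first n go to sample 1, next n to sample 2, ...
  lapply(split(idx, groups), function(i) data[i, , drop = FALSE])
}

set.seed(1)
df <- data.frame(matrix(rnorm(20), nrow = 10))
parts <- disjoint_samples(df, 3)
df1 <- parts[[1]]
df2 <- parts[[2]]
```

Because the row numbers are drawn once without replacement and then cut into slices, the samples cannot overlap, whatever the seed.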


3 Answers

If you want to split the data into two disjoint sets, you can sample an index vector, select those rows for one set, and exclude them to get the other, something like this:

set.seed(42)
idx <- sample(1:nrow(df), 3)
df1 <- df[idx, ]
df2 <- df[-idx, ]
df1
 X1 X2
10 1.359814 0.6919378
9 1.248144 0.9783253
3 1.903994 0.4371896
df2
 X1 X2
1 -0.3743900 0.54040310
2 -0.3204993 0.02383999
4 -0.2552918 0.94148533
5 -0.7327228 -1.25263998
6 -1.0648850 0.06567222
7 -0.2147909 -0.19137447
8 1.2148835 1.36361765

For more complex splits, see caret::createDataPartition.

answered Mar 25 at 15:39

Sonny

2,665 reputation, 1 gold badge, 5 silver badges, 17 bronze badges
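
With the index approach in the answer above, df2 receives all 7 remaining rows. If the second subset should also have exactly 3 rows, as in the question, one possible variation is to draw the second sample only from the rows not yet used. This is a sketch under that assumption, and the names idx1, idx2, and remaining are illustrative.

```r
set.seed(42)
df <- data.frame(matrix(rnorm(20), nrow = 10))

idx1 <- sample(nrow(df), 3)                     # rows for the first subset
remaining <- setdiff(seq_len(nrow(df)), idx1)   # row numbers not used yet
# 'remaining' has 7 elements here. Note that sample() treats a single
# number x as 1:x, so guard that case if only one row could be left.
idx2 <- sample(remaining, 3)                    # drawn only from unused rows

df1 <- df[idx1, ]
df2 <- df[idx2, ]
```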


We can create a function if we need to reuse the same logic:

f1 <- function(data, n) 
 data[sample(nrow(data), n),]

Or, if we need to create train/test datasets, we can use split:

lst1 <- split(df, seq_len(nrow(df)) %in% sample(nrow(df), 3))

edited Mar 25 at 15:56

answered Mar 25 at 15:37

akrun

445k reputation, 15 gold badges, 247 silver badges, 329 bronze badges
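
The split call in the answer above returns a list with two data frames, named after the values of the logical grouping vector. A short sketch of how the pieces might be pulled out; the names in_test, test, and train are illustrative only.

```r
set.seed(123)
df <- data.frame(matrix(rnorm(20), nrow = 10))

# TRUE for 3 randomly chosen rows, FALSE for the rest
in_test <- seq_len(nrow(df)) %in% sample(nrow(df), 3)
lst1 <- split(df, in_test)

test  <- lst1[["TRUE"]]    # the 3 sampled rows
train <- lst1[["FALSE"]]   # the other 7 rows
```

Since every row gets exactly one TRUE or FALSE label, the two data frames are disjoint and together cover all of df.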


You could also do something like this:

idx <- sample(seq(1, 2), size = nrow(df), replace = TRUE, prob = c(.8, .2))
set1 <- df[idx == 1,]
set2 <- df[idx == 2,]

Output:

> set1
 X1 X2
1 -0.85768451 -0.1545485
2 -0.76420259 1.2054883
3 -0.91973457 1.4867429
6 -1.07558176 0.2527374
7 0.03189408 1.4057502
8 0.64270649 1.3742131
9 1.59246097 -0.3845688
10 -0.14158552 -1.5792062

> set2
 X1 X2
4 -0.6317524 0.06571271
5 0.5005460 0.46277511

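Note: you can change the split proportions through the prob argument of sample. I have used 80/20. Because each row is assigned independently, the resulting set sizes are only approximately 80/20 and can differ from run to run.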

answered Mar 25 at 18:57

Rushabh

1,491 reputation, 4 silver badges, 22 bronze badges
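
If the two sets must have exact sizes rather than sizes that vary with the random draw, one option is to build the label vector with fixed counts and shuffle it. This is a sketch of that variation, not the method from the answer above; n1 and labels are illustrative names.

```r
set.seed(2019)
df <- data.frame(matrix(rnorm(20), nrow = 10))

n1 <- round(0.8 * nrow(df))                        # rows wanted in set1
labels <- rep(1:2, times = c(n1, nrow(df) - n1))   # exactly n1 ones, the rest twos
idx <- sample(labels)                              # shuffle the labels

set1 <- df[idx == 1, ]
set2 <- df[idx == 2, ]
```

Here set1 always gets 8 of the 10 rows and set2 the other 2; only which rows go where is random.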
