Rank subset into quantiles using Ntile
I have a dataset containing 42840 observations spanning 119 unique months (Dataset$date). I want to assign a quantile to every Dataset$Value within each month, ranking the values from 1 (lowest values) to 5 (highest values).
    Date     Name (ID)  Value  Quantile
    2009-03  1          35     (1-5)
    2009-04  1          20     ...
    2009-05  1          65     ...
    2009-03  2          24     ...
    2009-04  2          77     ...
    2009-03  3          110    ...
    ...
    2018-12  3          125    ...
    2009-03  56         24     ...
    2009-04  56         65     ...
    2009-03  57         26     ...
    2009-04  57         67     ...
    2009-03  58         99     ...

(Quantile is the column I want to add, assigning each value a quantile from 1 to 5.)
I've tried the Ntile function, which works well on the whole dataset, but I can't find a way to apply it separately to each date subset.
Any suggestions?
r subset quantile
asked Mar 21 at 21:57 by Sondre Fiskerstrand
What package is the Ntile function from? Why can't you just subset your data using square bracket notation and then pass that new, subset data frame into your function?
– divibisan
Mar 21 at 22:33
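For reference, the square-bracket approach from this comment might look roughly like the sketch below for a single month. The data is made up, and because the question does not say which package Ntile comes from, a rank-based formula stands in for it here (an assumption, not the asker's function).

```r
# Toy data standing in for Dataset (values are invented for illustration)
Dataset <- data.frame(
  date  = rep(c("2009-03", "2009-04"), each = 10),
  Value = c(35, 24, 110, 26, 99, 12, 58, 71, 43, 87,
            20, 77, 65, 67, 14, 91, 33, 52, 48, 80)
)

# Subset one month with square brackets
march <- Dataset[Dataset$date == "2009-03", ]

# Equal-sized bins 1..5 from the rank within that month
# (hypothetical stand-in for the Ntile call in the question)
march$Quantile <- ceiling(5 * rank(march$Value, ties.method = "first") / nrow(march))

# Number of observations that landed in each bin
table(march$Quantile)
```

This would have to be repeated for each of the 119 months, which is why a grouped operation tends to be more convenient.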
1 Answer
You could use the base rank function with dplyr's group_by:

    library(dplyr)

    # Create some data
    N <- 3
    dat <- tibble(
      date = rep(1:12, N),
      value = runif(12 * N, 0, 100)
    )

    # The rescale function we will use later to fit your 1-5 scale
    ## Adapted from https://stackoverflow.com/questions/25962508/rescaling-a-variable-in-r
    RESCALE <- function(x, nx1, nx2, minx, maxx) {
      nx <- nx1 + (nx2 - nx1) * (x - minx) / (maxx - minx)
      return(ceiling(nx))
    }

    # What you want
    dat %>%
      group_by(date) %>%  # group by date so that mutate computes the ranks within each month
      mutate(
        rank_detail = rank(value),  # rank the values within each group
        rank_group = RESCALE(rank_detail, 1, 5, min(rank_detail), max(rank_detail))  # rescale the ranks to your 1 to 5 scale
      ) %>%
      arrange(date)

    # A tibble: 36 x 4
    # # Groups:   date [12]
    #    date value rank_detail rank_group
    #   <int> <dbl>       <dbl>      <dbl>
    # 1     1  92.7           3          5
    # 2     1  53.6           2          3
    # 3     1  47.8           1          1
    # 4     2  24.6           2          3
    # 5     2  72.2           3          5
    # 6     2  11.5           1          1

answered Mar 21 at 23:17 by Ismail Müller
It seems that the value 1 in rank_group is only assigned if rank_detail is 1 as well. This results in 119 values in rank 1, which is the number of months (03/2009-12/2018 + Global). Based on the total number of values, each rank should have 8568 values. Do you have any suggestions for a solution? Otherwise, the code works great!
– Sondre Fiskerstrand
Mar 22 at 10:03
Hi there. This is probably due to return(ceiling(nx)) in the RESCALE function. Try return(round(nx)) instead.
– Ismail Müller
Mar 24 at 19:11
It now seems that the function isn't dividing each subset equally. For each month, the quantile sizes are as follows: Quantile 1: 45 observations, Quantile 2: 90, Quantile 3: 90, Quantile 4: 90, Quantile 5: 45. Do you know why this is?
– Sondre Fiskerstrand
Mar 25 at 10:23
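The uneven counts appear to follow from how rounding interacts with the linear rescale. Here is a minimal sketch, assuming 360 untied observations per month (42840 / 119, which also equals 45 + 90 + 90 + 90 + 45), so that the ranks are simply 1..360. The variable names are illustrative only.

```r
# Assumption: one month with 360 untied observations, so the ranks are 1..360
r <- 1:360

# Same linear map as RESCALE before any rounding: rank 1 -> 1, rank 360 -> 5
nx <- 1 + (5 - 1) * (r - min(r)) / (max(r) - min(r))

# ceiling(): only the exact minimum stays at 1, everything above it moves up
ceiling_counts <- table(ceiling(nx))

# round(): the end bins cover [1, 1.5) and [4.5, 5], half the width of the others
round_counts <- table(round(nx))

print(ceiling_counts)  # 1, 89, 90, 90, 90 observations in groups 1..5
print(round_counts)    # 45, 90, 90, 90, 45 observations in groups 1..5
```

Either way the rescale cuts the rank range into intervals of unequal width, so neither rounding choice yields five equal groups.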
Do you need Value to be ranked within each month? If so, and you don't have the same number of observations in every month, you won't get equal-sized groups when you look at your entire dataset!
– Ismail Müller
Mar 26 at 18:43
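If equal-sized groups within each month are the goal, one option is to skip the rescale and bin directly inside the grouped mutate. This is only a sketch: it assumes the Ntile in the question behaves like dplyr's ntile() (the question does not say which package it comes from), and it uses made-up data with 360 observations per month.

```r
library(dplyr)

set.seed(1)  # only so the toy data is reproducible

# Toy stand-in for the real dataset: 3 months x 360 observations (assumed sizes)
dat <- tibble(
  date  = rep(c("2009-03", "2009-04", "2009-05"), each = 360),
  value = runif(3 * 360, 0, 100)
)

ranked <- dat %>%
  group_by(date) %>%                      # bin separately within each month
  mutate(quantile = ntile(value, 5)) %>%  # 1 = lowest fifth, 5 = highest fifth
  ungroup()

# Observations per month and quantile; with 360 rows per month each cell is 72
count(ranked, date, quantile)
```

With 72 observations per quantile in each of 119 months, the totals would come to 8568 per quantile, the figure expected in the comments above. If the months differ in size, the per-month groups stay as even as possible but the overall totals will not match exactly.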