In R: How to extract info about valid time periods and apply it to another dataset?Linking characters from one data.frame to other datasetsas.Date is throwing a row number mismatch, but all vectors are same lengthHow to convert column values to rows for each unique value in a dataframe in R?Create monthly mean by time intervalsR - Create monthly mean by time intervals of multiple observationsHow do I update data from an incomplete lookup table?How can I find periodically appearing NA values in an 3D array (along dimension time) with RCalculatin yearly accumulative growth rate by months in ts object in R?Compute standard deviation of a percent above a threshold by groupValidate time series index
What is the best translation for "slot" in the context of multiplayer video games?
How do we know the LHC results are robust?
Method to test if a number is a perfect power?
Applicability of Single Responsibility Principle
How many times can American Tourist re-enter UK in same 6 month period?
Detecting if an element is found inside a container
How can I get through very long and very dry, but also very useful technical documents when learning a new tool?
How do scammers retract money, while you can’t?
You cannot touch me, but I can touch you, who am I?
Are student evaluations of teaching assistants read by others in the faculty?
Avoiding estate tax by giving multiple gifts
Is exact Kanji stroke length important?
Is this version of a gravity generator feasible?
Is a stroke of luck acceptable after a series of unfavorable events?
How did Doctor Strange see the winning outcome in Avengers: Infinity War?
Why escape if the_content isnt?
when is out of tune ok?
How can we prove that any integral in the set of non-elementary integrals cannot be expressed in the form of elementary functions?
Do sorcerers' subtle spells require a skill check to be unseen?
Class Action - which options I have?
Unreliable Magic - Is it worth it?
Type int? vs type int
Why didn't Theresa May consult with Parliament before negotiating a deal with the EU?
Is the destination of a commercial flight important for the pilot?
In R: How to extract info about valid time periods and apply it to another dataset?
Linking characters from one data.frame to other datasetsas.Date is throwing a row number mismatch, but all vectors are same lengthHow to convert column values to rows for each unique value in a dataframe in R?Create monthly mean by time intervalsR - Create monthly mean by time intervals of multiple observationsHow do I update data from an incomplete lookup table?How can I find periodically appearing NA values in an 3D array (along dimension time) with RCalculatin yearly accumulative growth rate by months in ts object in R?Compute standard deviation of a percent above a threshold by groupValidate time series index
I have to datasets, that I want to combine:
Dataset 1: Contains time periods for which "perc" is valid:
set.seed(1)
example_df <- data.frame(ID = rep(1:2, each=2),
start = c(as.Date("2014-01-01"), as.Date("2014-03-05"), as.Date("2014-01-13"), as.Date("2014-03-15")),
end = c(as.Date("2014-03-05"), as.Date("2014-04-12"), as.Date("2014-03-01"), as.Date("2014-04-02")),
perc = rnorm(mean= 30, sd= 10, 4))
Dataset 2: Contains pay for each month:
month_start <- as.Date("2014-01-01") + months(0:3)
month_end <- ceiling_date(month_start, "month") - days(1)
set.seed(1)
example_df2 <- data.frame(month_start, month_end,
ID = rep(1:2, each=4),
pay = rnorm(mean= 2000, sd= 80, 8))
The goal is to calculate pay for each individual for each month based on how much perc they worked. Important is to take into account the valid time periods for perc, which might change within the month.
e.g.:
Jan 2014 for ID 1: Pay = 1949.884 (pay)*23.73546 (perc)/100
because perc is valid for the entire January.
However, for March, perc is 23.73546 until 5th and perc is 31.83643 for the rest of March.
Thus,
Mar 2014 for ID 1: Pay = 1949.884 (pay)*23.73546 (perc)/100/ 31 (days of March)*5 + 1949.884 (pay)*31.83643 (perc)/100/ 31 (days of March)*26
r time dplyr period
add a comment |
I have to datasets, that I want to combine:
Dataset 1: Contains time periods for which "perc" is valid:
set.seed(1)
example_df <- data.frame(ID = rep(1:2, each=2),
start = c(as.Date("2014-01-01"), as.Date("2014-03-05"), as.Date("2014-01-13"), as.Date("2014-03-15")),
end = c(as.Date("2014-03-05"), as.Date("2014-04-12"), as.Date("2014-03-01"), as.Date("2014-04-02")),
perc = rnorm(mean= 30, sd= 10, 4))
Dataset 2: Contains pay for each month:
month_start <- as.Date("2014-01-01") + months(0:3)
month_end <- ceiling_date(month_start, "month") - days(1)
set.seed(1)
example_df2 <- data.frame(month_start, month_end,
ID = rep(1:2, each=4),
pay = rnorm(mean= 2000, sd= 80, 8))
The goal is to calculate pay for each individual for each month based on how much perc they worked. Important is to take into account the valid time periods for perc, which might change within the month.
e.g.:
Jan 2014 for ID 1: Pay = 1949.884 (pay)*23.73546 (perc)/100
because perc is valid for the entire January.
However, for March, perc is 23.73546 until 5th and perc is 31.83643 for the rest of March.
Thus,
Mar 2014 for ID 1: Pay = 1949.884 (pay)*23.73546 (perc)/100/ 31 (days of March)*5 + 1949.884 (pay)*31.83643 (perc)/100/ 31 (days of March)*26
r time dplyr period
add a comment |
I have to datasets, that I want to combine:
Dataset 1: Contains time periods for which "perc" is valid:
set.seed(1)
example_df <- data.frame(ID = rep(1:2, each=2),
start = c(as.Date("2014-01-01"), as.Date("2014-03-05"), as.Date("2014-01-13"), as.Date("2014-03-15")),
end = c(as.Date("2014-03-05"), as.Date("2014-04-12"), as.Date("2014-03-01"), as.Date("2014-04-02")),
perc = rnorm(mean= 30, sd= 10, 4))
Dataset 2: Contains pay for each month:
month_start <- as.Date("2014-01-01") + months(0:3)
month_end <- ceiling_date(month_start, "month") - days(1)
set.seed(1)
example_df2 <- data.frame(month_start, month_end,
ID = rep(1:2, each=4),
pay = rnorm(mean= 2000, sd= 80, 8))
The goal is to calculate pay for each individual for each month based on how much perc they worked. Important is to take into account the valid time periods for perc, which might change within the month.
e.g.:
Jan 2014 for ID 1: Pay = 1949.884 (pay)*23.73546 (perc)/100
because perc is valid for the entire January.
However, for March, perc is 23.73546 until 5th and perc is 31.83643 for the rest of March.
Thus,
Mar 2014 for ID 1: Pay = 1949.884 (pay)*23.73546 (perc)/100/ 31 (days of March)*5 + 1949.884 (pay)*31.83643 (perc)/100/ 31 (days of March)*26
r time dplyr period
I have to datasets, that I want to combine:
Dataset 1: Contains time periods for which "perc" is valid:
set.seed(1)
example_df <- data.frame(ID = rep(1:2, each=2),
start = c(as.Date("2014-01-01"), as.Date("2014-03-05"), as.Date("2014-01-13"), as.Date("2014-03-15")),
end = c(as.Date("2014-03-05"), as.Date("2014-04-12"), as.Date("2014-03-01"), as.Date("2014-04-02")),
perc = rnorm(mean= 30, sd= 10, 4))
Dataset 2: Contains pay for each month:
month_start <- as.Date("2014-01-01") + months(0:3)
month_end <- ceiling_date(month_start, "month") - days(1)
set.seed(1)
example_df2 <- data.frame(month_start, month_end,
ID = rep(1:2, each=4),
pay = rnorm(mean= 2000, sd= 80, 8))
The goal is to calculate pay for each individual for each month based on how much perc they worked. Important is to take into account the valid time periods for perc, which might change within the month.
e.g.:
Jan 2014 for ID 1: Pay = 1949.884 (pay)*23.73546 (perc)/100
because perc is valid for the entire January.
However, for March, perc is 23.73546 until 5th and perc is 31.83643 for the rest of March.
Thus,
Mar 2014 for ID 1: Pay = 1949.884 (pay)*23.73546 (perc)/100/ 31 (days of March)*5 + 1949.884 (pay)*31.83643 (perc)/100/ 31 (days of March)*26
r time dplyr period
r time dplyr period
edited Mar 21 at 16:02
user138089
asked Mar 21 at 15:54
user138089user138089
123
123
add a comment |
add a comment |
1 Answer
1
active
oldest
votes
Start with a left_join()
between your 2 dataframes. Each period of work of an ID
will be replicate on each pay month period of this ID
.
Then, we a succession of ifelse()
, you can determine if the total month should be count, only a part, or not at all.
library(tidyverse)
result <- example_df %>%
left_join(example_df2, by = 'ID') %>%
mutate(
TEST_MONTH = ifelse(end >= month_start & start < month_end, 1, 0),
TEST_DAYS = ifelse(TEST_MONTH == 1,
ifelse(end > month_end,
ifelse(start >= month_start, month_end - start + 1, month_end - month_start + 1),
end - month_start + 1),
0),
PAID = pay * perc/100 * TEST_DAYS / as.numeric(month_end - month_start + 1)
)
result %>% filter(ID == 1)
# ID start end perc month_start month_end pay TEST_MONTH TEST_DAYS PAID
# 1 1 2014-01-01 2014-03-05 23.73546 2014-01-01 2014-01-31 1949.884 1 31 462.81390
# 2 1 2014-01-01 2014-03-05 23.73546 2014-02-01 2014-02-28 2014.691 1 28 478.19633
# 3 1 2014-01-01 2014-03-05 23.73546 2014-03-01 2014-03-31 1933.150 1 5 74.00678
# 4 1 2014-01-01 2014-03-05 23.73546 2014-04-01 2014-04-30 2127.622 0 0 0.00000
# 5 1 2014-03-05 2014-04-12 31.83643 2014-01-01 2014-01-31 1949.884 0 0 0.00000
# 6 1 2014-03-05 2014-04-12 31.83643 2014-02-01 2014-02-28 2014.691 0 0 0.00000
# 7 1 2014-03-05 2014-04-12 31.83643 2014-03-01 2014-03-31 1933.150 1 27 536.03354
# 8 1 2014-03-05 2014-04-12 31.83643 2014-04-01 2014-04-30 2127.622 1 12 270.94364
Thank you very much!!! Yes, this works. I would not have come up with the left_join. Makes perfect sense. :-)
– user138089
Mar 21 at 17:43
add a comment |
Your Answer
StackExchange.ifUsing("editor", function ()
StackExchange.using("externalEditor", function ()
StackExchange.using("snippets", function ()
StackExchange.snippets.init();
);
);
, "code-snippets");
StackExchange.ready(function()
var channelOptions =
tags: "".split(" "),
id: "1"
;
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function()
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled)
StackExchange.using("snippets", function()
createEditor();
);
else
createEditor();
);
function createEditor()
StackExchange.prepareEditor(
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader:
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
,
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
);
);
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f55284428%2fin-r-how-to-extract-info-about-valid-time-periods-and-apply-it-to-another-datas%23new-answer', 'question_page');
);
Post as a guest
Required, but never shown
1 Answer
1
active
oldest
votes
1 Answer
1
active
oldest
votes
active
oldest
votes
active
oldest
votes
Start with a left_join()
between your 2 dataframes. Each period of work of an ID
will be replicate on each pay month period of this ID
.
Then, we a succession of ifelse()
, you can determine if the total month should be count, only a part, or not at all.
library(tidyverse)
result <- example_df %>%
left_join(example_df2, by = 'ID') %>%
mutate(
TEST_MONTH = ifelse(end >= month_start & start < month_end, 1, 0),
TEST_DAYS = ifelse(TEST_MONTH == 1,
ifelse(end > month_end,
ifelse(start >= month_start, month_end - start + 1, month_end - month_start + 1),
end - month_start + 1),
0),
PAID = pay * perc/100 * TEST_DAYS / as.numeric(month_end - month_start + 1)
)
result %>% filter(ID == 1)
# ID start end perc month_start month_end pay TEST_MONTH TEST_DAYS PAID
# 1 1 2014-01-01 2014-03-05 23.73546 2014-01-01 2014-01-31 1949.884 1 31 462.81390
# 2 1 2014-01-01 2014-03-05 23.73546 2014-02-01 2014-02-28 2014.691 1 28 478.19633
# 3 1 2014-01-01 2014-03-05 23.73546 2014-03-01 2014-03-31 1933.150 1 5 74.00678
# 4 1 2014-01-01 2014-03-05 23.73546 2014-04-01 2014-04-30 2127.622 0 0 0.00000
# 5 1 2014-03-05 2014-04-12 31.83643 2014-01-01 2014-01-31 1949.884 0 0 0.00000
# 6 1 2014-03-05 2014-04-12 31.83643 2014-02-01 2014-02-28 2014.691 0 0 0.00000
# 7 1 2014-03-05 2014-04-12 31.83643 2014-03-01 2014-03-31 1933.150 1 27 536.03354
# 8 1 2014-03-05 2014-04-12 31.83643 2014-04-01 2014-04-30 2127.622 1 12 270.94364
Thank you very much!!! Yes, this works. I would not have come up with the left_join. Makes perfect sense. :-)
– user138089
Mar 21 at 17:43
add a comment |
Start with a left_join()
between your 2 dataframes. Each period of work of an ID
will be replicate on each pay month period of this ID
.
Then, we a succession of ifelse()
, you can determine if the total month should be count, only a part, or not at all.
library(tidyverse)
result <- example_df %>%
left_join(example_df2, by = 'ID') %>%
mutate(
TEST_MONTH = ifelse(end >= month_start & start < month_end, 1, 0),
TEST_DAYS = ifelse(TEST_MONTH == 1,
ifelse(end > month_end,
ifelse(start >= month_start, month_end - start + 1, month_end - month_start + 1),
end - month_start + 1),
0),
PAID = pay * perc/100 * TEST_DAYS / as.numeric(month_end - month_start + 1)
)
result %>% filter(ID == 1)
# ID start end perc month_start month_end pay TEST_MONTH TEST_DAYS PAID
# 1 1 2014-01-01 2014-03-05 23.73546 2014-01-01 2014-01-31 1949.884 1 31 462.81390
# 2 1 2014-01-01 2014-03-05 23.73546 2014-02-01 2014-02-28 2014.691 1 28 478.19633
# 3 1 2014-01-01 2014-03-05 23.73546 2014-03-01 2014-03-31 1933.150 1 5 74.00678
# 4 1 2014-01-01 2014-03-05 23.73546 2014-04-01 2014-04-30 2127.622 0 0 0.00000
# 5 1 2014-03-05 2014-04-12 31.83643 2014-01-01 2014-01-31 1949.884 0 0 0.00000
# 6 1 2014-03-05 2014-04-12 31.83643 2014-02-01 2014-02-28 2014.691 0 0 0.00000
# 7 1 2014-03-05 2014-04-12 31.83643 2014-03-01 2014-03-31 1933.150 1 27 536.03354
# 8 1 2014-03-05 2014-04-12 31.83643 2014-04-01 2014-04-30 2127.622 1 12 270.94364
Thank you very much!!! Yes, this works. I would not have come up with the left_join. Makes perfect sense. :-)
– user138089
Mar 21 at 17:43
add a comment |
Start with a left_join()
between your 2 dataframes. Each period of work of an ID
will be replicate on each pay month period of this ID
.
Then, we a succession of ifelse()
, you can determine if the total month should be count, only a part, or not at all.
library(tidyverse)
result <- example_df %>%
left_join(example_df2, by = 'ID') %>%
mutate(
TEST_MONTH = ifelse(end >= month_start & start < month_end, 1, 0),
TEST_DAYS = ifelse(TEST_MONTH == 1,
ifelse(end > month_end,
ifelse(start >= month_start, month_end - start + 1, month_end - month_start + 1),
end - month_start + 1),
0),
PAID = pay * perc/100 * TEST_DAYS / as.numeric(month_end - month_start + 1)
)
result %>% filter(ID == 1)
# ID start end perc month_start month_end pay TEST_MONTH TEST_DAYS PAID
# 1 1 2014-01-01 2014-03-05 23.73546 2014-01-01 2014-01-31 1949.884 1 31 462.81390
# 2 1 2014-01-01 2014-03-05 23.73546 2014-02-01 2014-02-28 2014.691 1 28 478.19633
# 3 1 2014-01-01 2014-03-05 23.73546 2014-03-01 2014-03-31 1933.150 1 5 74.00678
# 4 1 2014-01-01 2014-03-05 23.73546 2014-04-01 2014-04-30 2127.622 0 0 0.00000
# 5 1 2014-03-05 2014-04-12 31.83643 2014-01-01 2014-01-31 1949.884 0 0 0.00000
# 6 1 2014-03-05 2014-04-12 31.83643 2014-02-01 2014-02-28 2014.691 0 0 0.00000
# 7 1 2014-03-05 2014-04-12 31.83643 2014-03-01 2014-03-31 1933.150 1 27 536.03354
# 8 1 2014-03-05 2014-04-12 31.83643 2014-04-01 2014-04-30 2127.622 1 12 270.94364
Start with a left_join()
between your 2 dataframes. Each period of work of an ID
will be replicate on each pay month period of this ID
.
Then, we a succession of ifelse()
, you can determine if the total month should be count, only a part, or not at all.
library(tidyverse)
result <- example_df %>%
left_join(example_df2, by = 'ID') %>%
mutate(
TEST_MONTH = ifelse(end >= month_start & start < month_end, 1, 0),
TEST_DAYS = ifelse(TEST_MONTH == 1,
ifelse(end > month_end,
ifelse(start >= month_start, month_end - start + 1, month_end - month_start + 1),
end - month_start + 1),
0),
PAID = pay * perc/100 * TEST_DAYS / as.numeric(month_end - month_start + 1)
)
result %>% filter(ID == 1)
# ID start end perc month_start month_end pay TEST_MONTH TEST_DAYS PAID
# 1 1 2014-01-01 2014-03-05 23.73546 2014-01-01 2014-01-31 1949.884 1 31 462.81390
# 2 1 2014-01-01 2014-03-05 23.73546 2014-02-01 2014-02-28 2014.691 1 28 478.19633
# 3 1 2014-01-01 2014-03-05 23.73546 2014-03-01 2014-03-31 1933.150 1 5 74.00678
# 4 1 2014-01-01 2014-03-05 23.73546 2014-04-01 2014-04-30 2127.622 0 0 0.00000
# 5 1 2014-03-05 2014-04-12 31.83643 2014-01-01 2014-01-31 1949.884 0 0 0.00000
# 6 1 2014-03-05 2014-04-12 31.83643 2014-02-01 2014-02-28 2014.691 0 0 0.00000
# 7 1 2014-03-05 2014-04-12 31.83643 2014-03-01 2014-03-31 1933.150 1 27 536.03354
# 8 1 2014-03-05 2014-04-12 31.83643 2014-04-01 2014-04-30 2127.622 1 12 270.94364
edited Mar 21 at 17:37
answered Mar 21 at 16:48
demarsylvaindemarsylvain
677214
677214
Thank you very much!!! Yes, this works. I would not have come up with the left_join. Makes perfect sense. :-)
– user138089
Mar 21 at 17:43
add a comment |
Thank you very much!!! Yes, this works. I would not have come up with the left_join. Makes perfect sense. :-)
– user138089
Mar 21 at 17:43
Thank you very much!!! Yes, this works. I would not have come up with the left_join. Makes perfect sense. :-)
– user138089
Mar 21 at 17:43
Thank you very much!!! Yes, this works. I would not have come up with the left_join. Makes perfect sense. :-)
– user138089
Mar 21 at 17:43
add a comment |
Thanks for contributing an answer to Stack Overflow!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f55284428%2fin-r-how-to-extract-info-about-valid-time-periods-and-apply-it-to-another-datas%23new-answer', 'question_page');
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown