How can I tidy student enrollment data on a per semester basis?


I have a dataset that lists student information by term (e.g., 201610, 201620, 201630, 201640, 201710), where the suffix 10 = fall, 20 = winter, 30 = spring, and 40 = summer. Not every term is necessarily listed for every student.

What I would like to do is identify the first term in which a student was enrolled, presumably the fall, as T1, and subsequent terms as T2, T3, etc. Since some students may take a winter or summer term, I would like to identify those as T1_Winter, T2_Summer, etc.

I've been able to isolate the individual terms in which a student enrolled and to number the first, intermediate, and last terms as 1, 2, 3, etc. However, I can't wrap my head around how to number fall and spring as 1, 2, 3, 4 and the intermediary terms, winter and summer, as 1.5, 2.5, 3.5, 4.5, etc.



    # Create the sample dataset
    data <- data.frame(
      ID = c(1, 1, 1, 2, 2, 2, 2),
      RegTerm = c(201810, 201820, 201830, 201910, 201930, 201940, 202010)
    )

    # Isolate student IDs and terms
    stdTerm <- subset(data, select = c("ID", "RegTerm"))

    # Sort according to ID and RegTerm
    stdTerm <- stdTerm[with(stdTerm, order(ID, RegTerm)), ]

    # Remove duplicate combinations of ID and term
    y <- stdTerm[!duplicated(stdTerm[c(1, 2)]), ]

    # Create an index to identify the term number
    # for which a student enrolled
    library(dplyr)
    z <- y %>%
      arrange(ID, RegTerm) %>%
      group_by(ID) %>%
      mutate(StdTermIndex = seq(n()))


Right now, it's identifying the progression of all terms for a student as 1, 2, 3, etc., but not winter and summer as intermediary terms. That is, if a student enrolled in fall and winter, winter will appear as 2 and spring will appear as 3.



In the sample data provided, I would like Student ID 1 to reflect 201810 as 1, 201820 as 1.5, and 201830 as 2, etc. Any suggestions or previous code I could reference to wrap my head around how I can code the intermediary semesters?










  • give us a sample data so we can better understand your problem

    – Felipe Alvarenga
    Mar 27 at 20:29











  • Also, check this out stackoverflow.com/questions/5963269/…

    – Felipe Alvarenga
    Mar 27 at 20:30











  • Thanks, @FelipeAlvarenga! My apologies as it's my first time posting here. I've included a sample dataset in my question and hope it clarifies the problem.

    – Anna K
    Mar 27 at 20:38

















r dplyr data-analysis

asked Mar 27 at 19:52 by Anna K, edited Mar 27 at 20:35















2 Answers
To do this on your sample, I created a helper variable that records whether the term digit (the fifth digit of RegTerm) is odd or even.

The reason is simple: an odd term digit (1 or 3) marks a regular term (fall or spring), whereas an even one (2 or 4) marks a winter or summer term.



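    library(dplyr)
    library(stringr)  # provides str_extract()

    data <- data.frame(
      ID = c(1, 1, 1, 2, 2, 2, 2),
      RegTerm = c(201810, 201820, 201830, 201910, 201930, 201940, 202010)
    )

    dat <- data %>%
      # Fifth digit of RegTerm (1-4), then its parity: 1 = fall/spring, 0 = winter/summer
      mutate(term = str_extract(as.character(RegTerm), "(?<=\\d{4})\\d(?=0)"),
             term = as.numeric(term) %% 2) %>%
      group_by(ID) %>%
      # Count the regular terms so far; add 0.5 on winter/summer rows
      mutate(numTerm = cumsum(term),
             numTerm = ifelse(term == 0, numTerm + 0.5, numTerm))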


The first mutate extracts the fifth digit of RegTerm and takes its remainder after division by 2. A result of 1 means a regular term; a result of 0 means a winter or summer term.

Next I take the cumulative sum of this variable within each ID, which gives the number of regular terms the student has reached so far. Then, for every row with term == 0, I add 0.5 to numTerm to account for the winter and summer terms.



    # A tibble: 7 x 4
    # Groups:   ID [2]
         ID RegTerm  term numTerm
      <dbl>   <dbl> <dbl>   <dbl>
    1     1  201810     1     1
    2     1  201820     0     1.5
    3     1  201830     1     2
    4     2  201910     1     1
    5     2  201930     1     2
    6     2  201940     0     2.5
    7     2  202010     1     3
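The arithmetic behind this table can be traced by hand for a single student. The following is a minimal base R sketch, assuming the term codes are always six digits and already sorted; it swaps the regular expression for integer arithmetic, and the variable names (reg_term, term_digit, is_regular, num_term) are illustrative rather than taken from the answer.

```r
# Term codes for student 2 from the sample data, already sorted
reg_term <- c(201910, 201930, 201940, 202010)

# Fifth digit of the six-digit code: 1 = fall, 2 = winter, 3 = spring, 4 = summer
term_digit <- (reg_term %/% 10) %% 10

# 1 for regular terms (fall, spring), 0 for intermediary terms (winter, summer)
is_regular <- term_digit %% 2

# Running count of regular terms, plus 0.5 on each intermediary term
num_term <- cumsum(is_regular) + ifelse(is_regular == 0, 0.5, 0)

print(num_term)
```

Here the parity vector is 1, 1, 0, 1, its cumulative sum is 1, 2, 2, 3, and adding 0.5 to the one intermediary term gives 1, 2, 2.5, 3, matching rows 4 to 7 of the table.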


This way, a student who starts in a winter term is assigned numTerm = 0.5 and reaches numTerm = 1 only at their first regular term (term == 1).
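To check that edge case, here is a small base R sketch with a hypothetical student (ID 3) who first enrolls in a winter term. This student is not in the original sample data, and the grouped cumulative sum uses ave() in place of the dplyr pipeline, so treat it as an illustration of the same logic rather than the answer's exact code.

```r
# Hypothetical student 3 starts in winter 2018 (not part of the original sample)
late_start <- data.frame(
  ID = c(3, 3, 3, 3),
  RegTerm = c(201820, 201830, 201840, 201910)
)

# The cumulative sum depends on row order, so sort by student and term first
late_start <- late_start[order(late_start$ID, late_start$RegTerm), ]

# Parity of the fifth digit: 1 = fall/spring, 0 = winter/summer
late_start$term <- ((late_start$RegTerm %/% 10) %% 10) %% 2

# Per-student running count of regular terms, plus 0.5 on intermediary terms
late_start$numTerm <- ave(late_start$term, late_start$ID, FUN = cumsum) +
  ifelse(late_start$term == 0, 0.5, 0)

print(late_start)
```

The winter start comes out as 0.5, followed by 1 for spring, 1.5 for summer, and 2 for the next fall.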






I think a good way to do this would be to separate your RegTerm column into a year and a suffix, and then apply some conditional logic once the values are split up.

The code below does that for a single value; we then just have to apply it to the whole column and do some rejigging.



    paste(strsplit(as.character(201810), "")[[1]][1:4], collapse = "")
    # "2018"
    paste(strsplit(as.character(201810), "")[[1]][5:6], collapse = "")
    # "10"


To do it on the data frame, use something like lapply, then unlist the result and add it as a new column. After that, you can convert the values to numeric and use conditional statements in a mutate call to set the intermediary values.



    z$year <- unlist(lapply(z$RegTerm, function(x) paste(strsplit(as.character(x), "")[[1]][1:4], collapse = "")))
    z$suf <- unlist(lapply(z$RegTerm, function(x) paste(strsplit(as.character(x), "")[[1]][5:6], collapse = "")))


It looks a bit ugly, but all it does is split RegTerm into single characters, select the first 4 or last 2 of them for year and suf respectively, and collapse them (using collapse = "" in paste) into a single string. We lapply this over the whole column and then unlist the result to make a vector.

Once you understand the first two lines of code in this answer, the rest should be clear.
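The remaining steps described in this answer (converting to numeric and applying conditionals) are not spelled out, so here is one possible sketch in base R. It is an assumption about how those steps could look, not the answerer's code: it uses substr() instead of strsplit() for the split, ave() instead of a grouped mutate, and the season, regular, numTerm, and label columns are hypothetical names. The label format follows the T1_Winter and T2_Summer wording from the question.

```r
# Sample data from the question (already sorted by ID and RegTerm)
z <- data.frame(
  ID = c(1, 1, 1, 2, 2, 2, 2),
  RegTerm = c(201810, 201820, 201830, 201910, 201930, 201940, 202010)
)

# Split the six-digit code into year and suffix (substr() instead of strsplit())
term_chr <- as.character(z$RegTerm)
z$year <- as.numeric(substr(term_chr, 1, 4))
z$suf <- as.numeric(substr(term_chr, 5, 6))

# Suffix 10/20/30/40 maps to position 1/2/3/4 in this vector
z$season <- c("Fall", "Winter", "Spring", "Summer")[z$suf / 10]

# Fall and spring count as regular terms; winter and summer sit half a step later
z$regular <- z$suf %in% c(10, 30)
z$numTerm <- ave(as.numeric(z$regular), z$ID, FUN = cumsum) +
  ifelse(z$regular, 0, 0.5)

# Hypothetical label format: T1, T2, ... for regular terms, T1_Winter etc. otherwise
z$label <- ifelse(
  z$regular,
  paste0("T", z$numTerm),
  paste0("T", floor(z$numTerm), "_", z$season)
)

print(z[, c("ID", "RegTerm", "numTerm", "label")])
```

Note that a student whose first term is a winter or summer term would get a label starting with T0 under this scheme, which may or may not be what you want.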






First answer: answered Mar 28 at 11:56 by Felipe Alvarenga, edited Mar 28 at 12:26


























Second answer: answered Mar 27 at 23:08 by Croote





























