


Combine two data frames by rows (rbind) when they have different sets of columns











177















Is it possible to row bind two data frames that don't have the same set of columns? I am hoping to retain the columns that do not match after the bind.










      r dataframe r-faq














      asked Aug 4 '10 at 3:25 by Btibert3
      edited Jun 20 '17 at 7:54 by zx8754






















          13 Answers


















          187














          rbind.fill from the package plyr might be what you are looking for.
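To make that concrete, here is a minimal sketch, assuming the plyr package is installed. The data frames df1 and df2 are made up for illustration and are not from the question.

```r
library(plyr)

# two illustrative data frames with partly different columns
df1 <- data.frame(a = 1:3, b = 4:6)
df2 <- data.frame(a = 7:8, c = c("x", "y"))

# rbind.fill keeps the union of the columns and pads the gaps with NA
combined <- rbind.fill(df1, df2)
print(combined)
```

The result has columns a, b and c. Rows that came from df1 have NA in c, and rows from df2 have NA in b.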
























          • (5) rbind.fill and bind_rows() both silently drop rownames. – MERose, Dec 5 '17 at 16:40
          • (2) @MERose Hadley: "Yes, all dplyr methods ignore rownames." – zx8754, Dec 7 '17 at 9:11
          • Here is a link to documentation: rdocumentation.org/packages/plyr/versions/1.8.4/topics/… – Gabriel Fair, Apr 14 '18 at 16:42


















          91














          A more recent solution is to use dplyr's bind_rows function, which I assume is more efficient than smartbind.
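A short sketch of how this might look, assuming dplyr is installed. The data frames and the "source" column name are illustrative choices, not part of the original answer.

```r
library(dplyr)

# two illustrative data frames with partly different columns
df1 <- data.frame(a = 1:3, b = 4:6)
df2 <- data.frame(a = 7:8, c = c("x", "y"))

# columns are matched by name; columns missing from an input are filled with NA
combined <- bind_rows(df1, df2)

# naming the inputs and setting .id records which input each row came from
labelled <- bind_rows(first = df1, second = df2, .id = "source")
print(labelled)
```

The .id variant is handy when you later need to tell the original data frames apart.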






































            45














            You can use smartbind from the gtools package.



            Example:



            library(gtools)
            df1 <- data.frame(a = c(1:5), b = c(6:10))
            df2 <- data.frame(a = c(11:15), b = c(16:20), c = LETTERS[1:5])
            smartbind(df1, df2)
            # result
            #      a  b    c
            # 1.1  1  6 <NA>
            # 1.2  2  7 <NA>
            # 1.3  3  8 <NA>
            # 1.4  4  9 <NA>
            # 1.5  5 10 <NA>
            # 2.1 11 16    A
            # 2.2 12 17    B
            # 2.3 13 18    C
            # 2.4 14 19    D
            # 2.5 15 20    E






























            • I tried smartbind with two large data frames (in total roughly 3*10^6 rows) and aborted it after 10 minutes. – Joe, May 11 '17 at 11:39


















            32














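            If the columns of df1 are a subset of those of df2 (by column name), select just those columns from df2 before binding. Note that this drops the columns that exist only in df2: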



            df3 <- rbind(df1, df2[, names(df1)])





































              24














              An alternative with data.table:



              library(data.table)
              df1 = data.frame(a = c(1:5), b = c(6:10))
              df2 = data.frame(a = c(11:15), b = c(16:20), c = LETTERS[1:5])
              rbindlist(list(df1, df2), fill = TRUE)



              rbind also accepts fill = TRUE once the objects are data.tables (setDT converts a data.frame to a data.table in place, by reference), so



              rbind(setDT(df1), setDT(df2), fill=TRUE)


              will also work in this situation. This can be preferable when you have a couple of data.tables and don't want to construct a list.
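The list form extends naturally to more than two inputs. The following is a sketch under the assumption that data.table is installed; df3, the list names and the "source" column are illustrative.

```r
library(data.table)

# three illustrative data frames with different column sets
df1 <- data.frame(a = 1:2, b = 3:4)
df2 <- data.frame(a = 5:6, c = c("x", "y"))
df3 <- data.frame(d = c(TRUE, FALSE))

# a named list lets idcol record which element each row came from
dfs <- list(first = df1, second = df2, third = df3)
combined <- rbindlist(dfs, fill = TRUE, idcol = "source")
print(combined)
```

Because the inputs live in a list, the same call works for any number of data frames.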































              • This is the simplest out-of-the-box solution, and it easily generalizes to any number of dataframes, since you can store them all in separate list elements. Other answers, like the intersect approach, only work for 2 dataframes and don't easily generalize. – Rich Pauloo, Mar 19 at 16:37


















              17














              You could also just pull out the common column names. Note that this keeps only the columns present in both data frames and drops the rest.



              cols <- intersect(colnames(df1), colnames(df2))
              rbind(df1[,cols], df2[,cols])



































                17














                Most of the base R answers address the situation where only one data.frame has additional columns, or where the resulting data.frame keeps only the intersection of the columns. Since the OP writes "I am hoping to retain the columns that do not match after the bind", an answer using base R methods to address this issue is probably worth posting.



                Below, I present two base R methods: One that alters the original data.frames, and one that doesn't. Additionally, I offer a method that generalizes the non-destructive method to more than two data.frames.



                First, let's get some sample data.



                # sample data, variable d is in df1, variable c is in df2
                df1 = data.frame(a=1:5, b=6:10, d=month.name[1:5])
                df2 = data.frame(a=6:10, b=16:20, c = letters[8:12])



                Two data.frames, alter originals

                In order to retain all columns from both data.frames in an rbind (and allow the function to work without resulting in an error), you add NA columns to each data.frame with the appropriate missing names filled in using setdiff.



                # fill in non-overlapping columns with NAs
                df1[setdiff(names(df2), names(df1))] <- NA
                df2[setdiff(names(df1), names(df2))] <- NA


                Now, rbind them:



                rbind(df1, df2)
                #     a  b        d    c
                # 1   1  6  January <NA>
                # 2   2  7 February <NA>
                # 3   3  8    March <NA>
                # 4   4  9    April <NA>
                # 5   5 10      May <NA>
                # 6   6 16     <NA>    h
                # 7   7 17     <NA>    i
                # 8   8 18     <NA>    j
                # 9   9 19     <NA>    k
                # 10 10 20     <NA>    l


                Note that the first two lines alter the original data.frames, df1 and df2, adding the full set of columns to both.




                Two data.frames, do not alter originals

                To leave the original data.frames intact, loop over the names that are missing from each data.frame with sapply, returning a named vector of NAs, and concatenate that vector onto the data.frame (as a list) using c. Then data.frame converts the result back into a data.frame suitable for rbind.



                rbind(
                data.frame(c(df1, sapply(setdiff(names(df2), names(df1)), function(x) NA))),
                data.frame(c(df2, sapply(setdiff(names(df1), names(df2)), function(x) NA)))
                )



                Many data.frames, do not alter originals

                In the instance that you have more than two data.frames, you could do the following.



                # put data.frames into list (dfs named df1, df2, df3, etc)
                mydflist <- mget(ls(pattern="df\\d+"))
                # get all variable names
                allNms <- unique(unlist(lapply(mydflist, names)))

                # put em all together
                do.call(rbind,
                lapply(mydflist,
                function(x) data.frame(c(x, sapply(setdiff(allNms, names(x)),
                function(y) NA)))))


                Maybe a bit nicer to not see the row names of original data.frames? Then do this.



                do.call(rbind,
                c(lapply(mydflist,
                function(x) data.frame(c(x, sapply(setdiff(allNms, names(x)),
                function(y) NA)))),
                make.row.names=FALSE))
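If you would rather pass the data frames explicitly than collect them with mget(ls(...)), the same idea can be wrapped in a small function. This is only a sketch: rbind_fill_base is a made-up name, and instead of the c()/sapply() construction above it pads a local copy of each data frame with NA columns.

```r
# hypothetical helper (name is illustrative): row-bind a list of data frames,
# keeping the union of their columns and filling the gaps with NA
rbind_fill_base <- function(dfs) {
  all_names <- unique(unlist(lapply(dfs, names)))
  padded <- lapply(dfs, function(x) {
    missing_cols <- setdiff(all_names, names(x))
    if (length(missing_cols) > 0) {
      x[missing_cols] <- NA   # changes only the local copy, not the caller's object
    }
    x[all_names]              # put the columns in one common order
  })
  do.call(rbind, c(unname(padded), make.row.names = FALSE))
}

df1 <- data.frame(a = 1:2, b = c("x", "y"))
df2 <- data.frame(a = 3L, c = TRUE)
df3 <- data.frame(b = "z")
out <- rbind_fill_base(list(df1, df2, df3))
print(out)
```

The originals are untouched because R copies each data frame inside the anonymous function before the NA columns are added.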

























                • (1) base R methods are best. thanks for this – user3479780, Dec 17 '18 at 5:04


















                6














                I wrote a function to do this because I like my code to tell me if something is wrong. This function will explicitly tell you which column names don't match and if you have a type mismatch. Then it will do its best to combine the data.frames anyway. The limitation is that you can only combine two data.frames at a time.



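                ### combines data frames (like rbind) but by matching column names
                # columns without matches in the other data frame are still combined
                # but with NA in the rows corresponding to the data frame without
                # the variable
                # A warning is issued if there is a type mismatch between columns of
                # the same name and an attempt is made to combine the columns
                combineByName <- function(A,B) {
                  a.names <- names(A)
                  b.names <- names(B)
                  all.names <- union(a.names,b.names)
                  print(paste("Number of columns:",length(all.names)))
                  a.type <- NULL
                  for (i in seq_len(ncol(A))) {
                    a.type[i] <- typeof(A[,i])
                  }
                  b.type <- NULL
                  for (i in seq_len(ncol(B))) {
                    b.type[i] <- typeof(B[,i])
                  }
                  a_b.names <- names(A)[!names(A)%in%names(B)]
                  b_a.names <- names(B)[!names(B)%in%names(A)]
                  # NOTE: the original listing breaks off at "if (length(a_b.names)>0".
                  # Everything below is a reconstruction from the description above,
                  # not the author's code.
                  if (length(a_b.names)>0) {
                    print(paste("Columns only in A:", paste(a_b.names, collapse=", ")))
                    B[a_b.names] <- NA
                  }
                  if (length(b_a.names)>0) {
                    print(paste("Columns only in B:", paste(b_a.names, collapse=", ")))
                    A[b_a.names] <- NA
                  }
                  # warn about same-named columns whose storage types differ
                  common <- intersect(a.names, b.names)
                  mismatch <- common[a.type[match(common, a.names)] != b.type[match(common, b.names)]]
                  if (length(mismatch)>0) {
                    warning("Type mismatch in columns: ", paste(mismatch, collapse=", "))
                  }
                  rbind(A, B[, names(A), drop=FALSE])
                }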



































                  2














                  For completeness: you can try the Stack package and its function Stack, in the following form:



                  Stack(df_1, df_2)


                  I also have the impression that it is faster than other methods for large data sets.




































                    1














                    Maybe I completely misread your question, but the "I am hoping to retain the columns that do not match after the bind" makes me think you are looking for a left join or right join similar to an SQL query. R has the merge function that lets you specify left, right, or inner joins similar to joining tables in SQL.



                    There is already a great question and answer on this topic here: How to join (merge) data frames (inner, outer, left, right)?
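To illustrate how a join differs from a row bind, here is a small base R sketch. The data frames are made up, and the shared key column a is an assumption for the example.

```r
# two illustrative data frames that share the key column "a"
df1 <- data.frame(a = 1:3, b = 4:6)
df2 <- data.frame(a = 3:5, c = c("x", "y", "z"))

# full outer join on "a": all = TRUE keeps unmatched rows from both sides
joined <- merge(df1, df2, by = "a", all = TRUE)
print(joined)

# the rows with a == 3 from both inputs are merged into a single row,
# whereas a row bind would keep them as two separate rows
```

So merge keeps the non-matching columns too, but it combines rows that share key values rather than stacking them.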






































                      1














                      gtools::smartbind didn't like working with Dates, probably because it was calling as.vector on them. So here's my solution...



                      sbind = function(x, y, fill=NA) {
                          # add each missing column to d, filled with the fill value
                          sbind.fill = function(d, cols) {
                              for(c in cols)
                                  d[[c]] = fill
                              d
                          }

                          x = sbind.fill(x, setdiff(names(y),names(x)))
                          y = sbind.fill(y, setdiff(names(x),names(y)))

                          rbind(x, y)
                      }




































                        0














                        You could also use sjmisc::add_rows(), which uses dplyr::bind_rows(); unlike bind_rows(), however, add_rows() preserves attributes and hence is useful for labelled data.



                        See the following example with a labelled dataset. The frq() function prints frequency tables with value labels if the data is labelled.



                        library(sjmisc)
                        library(dplyr)

                        data(efc)
                        # select two subsets, with some shared and some different columns
                        x1 <- efc %>% select(1:5) %>% slice(1:10)
                        x2 <- efc %>% select(3:7) %>% slice(11:20)

                        str(x1)
                        #> 'data.frame': 10 obs. of 5 variables:
                        #> $ c12hour : num 16 148 70 168 168 16 161 110 28 40
                        #> ..- attr(*, "label")= chr "average number of hours of care per week"
                        #> $ e15relat: num 2 2 1 1 2 2 1 4 2 2
                        #> ..- attr(*, "label")= chr "relationship to elder"
                        #> ..- attr(*, "labels")= Named num 1 2 3 4 5 6 7 8
                        #> .. ..- attr(*, "names")= chr "spouse/partner" "child" "sibling" "daughter or son -in-law" ...
                        #> $ e16sex : num 2 2 2 2 2 2 1 2 2 2
                        #> ..- attr(*, "label")= chr "elder's gender"
                        #> ..- attr(*, "labels")= Named num 1 2
                        #> .. ..- attr(*, "names")= chr "male" "female"
                        #> $ e17age : num 83 88 82 67 84 85 74 87 79 83
                        #> ..- attr(*, "label")= chr "elder' age"
                        #> $ e42dep : num 3 3 3 4 4 4 4 4 4 4
                        #> ..- attr(*, "label")= chr "elder's dependency"
                        #> ..- attr(*, "labels")= Named num 1 2 3 4
                        #> .. ..- attr(*, "names")= chr "independent" "slightly dependent" "moderately dependent" "severely dependent"

                        bind_rows(x1, x1) %>% frq(e42dep)
                        #>
                        #> # e42dep <numeric>
                        #> # total N=20 valid N=20 mean=3.70 sd=0.47
                        #>
                        #> val frq raw.prc valid.prc cum.prc
                        #> 3 6 30 30 30
                        #> 4 14 70 70 100
                        #> <NA> 0 0 NA NA

                        add_rows(x1, x1) %>% frq(e42dep)
                        #>
                        #> # elder's dependency (e42dep) <numeric>
                        #> # total N=20 valid N=20 mean=3.70 sd=0.47
                        #>
                        #> val label frq raw.prc valid.prc cum.prc
                        #> 1 independent 0 0 0 0
                        #> 2 slightly dependent 0 0 0 0
                        #> 3 moderately dependent 6 30 30 30
                        #> 4 severely dependent 14 70 70 100
                        #> NA NA 0 0 NA NA



































                          -1














                          rbind.ordered=function(x,y){

                            diffCol = setdiff(colnames(x),colnames(y))
                            if (length(diffCol)>0){
                              cols=colnames(y)
                              for (i in 1:length(diffCol)) y=cbind(y,NA)
                              colnames(y)=c(cols,diffCol)
                            }

                            diffCol = setdiff(colnames(y),colnames(x))
                            if (length(diffCol)>0){
                              cols=colnames(x)
                              for (i in 1:length(diffCol)) x=cbind(x,NA)
                              colnames(x)=c(cols,diffCol)
                            }
                            return(rbind(x, y[, colnames(x)]))
                          }



























                            protected by zx8754 Mar 21 '18 at 11:12

















                            rbind.fill answer: answered Aug 4 '10 at 4:00 by Jyotirmoy Bhattacharya







                                bind_rows answer: answered Jan 7 '15 at 2:33 by xiaodai, edited Jun 23 '15 at 9:56 by Henrik





















                                    answered Aug 4 '10 at 3:45 by neilfws
                                    edited Jun 23 '15 at 10:03 by Henrik












                                    • I tried smartbind with two large data frames (in total roughly 3*10^6 rows) and aborted it after 10 minutes.

                                      – Joe
                                      May 11 '17 at 11:39




























                                    32














                                    If the columns in df1 are a subset of those in df2 (by column names):



                                    # drop = FALSE keeps a data frame even when df1 has a single column
                                    df3 <- rbind(df1, df2[, names(df1), drop = FALSE])
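As a quick illustration of the subset approach, here is a minimal sketch on made-up sample data (the contents of df1 and df2 are assumptions for the example). Note that the extra column of df2 is dropped rather than retained:

```r
# Sample data (an assumption for illustration): df2 has an extra column `c`
df1 <- data.frame(a = 1:5, b = 6:10)
df2 <- data.frame(a = 11:15, b = 16:20, c = LETTERS[1:5])

# Keep only the columns of df2 that also exist in df1, then bind the rows
df3 <- rbind(df1, df2[, names(df1), drop = FALSE])

dim(df3)    # 10 rows, 2 columns
names(df3)  # "a" "b"
```

If the extra columns need to survive the bind, one of the NA-filling approaches in the other answers is likely the better fit.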













                                        answered Aug 4 '10 at 4:33 by Aaron Statham
                                        edited Dec 7 '17 at 9:08 by zx8754





















                                            24














                                            An alternative with data.table:



                                            library(data.table)
                                            df1 = data.frame(a = c(1:5), b = c(6:10))
                                            df2 = data.frame(a = c(11:15), b = c(16:20), c = LETTERS[1:5])
                                            rbindlist(list(df1, df2), fill = TRUE)



                                            rbind also works once the objects are converted to data.table objects (note that setDT converts them in place, by reference), so



                                            rbind(setDT(df1), setDT(df2), fill = TRUE)


                                            gives the same result. This can be preferable when you have a couple of data.tables and don't want to construct a list.














                                            answered Feb 22 '16 at 1:51 by kdauria
                                            edited Mar 22 at 2:52












                                            • This is the most simple, out-of-the-box solution that easily generalizes to any number of dataframes, since you can store them all in separate list elements. Other answers, like the intersect approach, only work for 2 dataframes and don't easily generalize.

                                              – Rich Pauloo
                                              Mar 19 at 16:37




























                                            17














                                            You could also just pull out the common column names (columns that are not shared are dropped).



                                            cols <- intersect(colnames(df1), colnames(df2))
                                            rbind(df1[, cols, drop = FALSE], df2[, cols, drop = FALSE])
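The same idea can be stretched to more than two data frames by folding intersect over the column names with Reduce. The following is only a sketch on made-up data (the list dfs and its contents are assumptions for the example):

```r
# Hypothetical sample data: three data frames that share only columns `a` and `b`
dfs <- list(
  data.frame(a = 1:2, b = 3:4),
  data.frame(a = 5:6, b = 7:8, c = c("x", "y")),
  data.frame(b = 9:10, a = 11:12, d = c(TRUE, FALSE))
)

# Column names present in every data frame
common <- Reduce(intersect, lapply(dfs, names))

# Subset each data frame to the common columns, then bind all rows
combined <- do.call(rbind, lapply(dfs, function(x) x[, common, drop = FALSE]))
```

As with the two-frame version, any column that is missing from at least one data frame is discarded.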















                                                answered Aug 4 '10 at 3:50 by Jonathan Chang





















                                                    17














                                                    Most of the base R answers cover only the cases where one data.frame has additional columns, or where the result keeps just the intersection of the columns. Since the OP writes "I am hoping to retain the columns that do not match after the bind", an answer that does this with base R methods is probably worth posting.



                                                    Below, I present two base R methods: one that alters the original data.frames, and one that doesn't. Additionally, I offer a method that generalizes the non-destructive method to more than two data.frames.



                                                    First, let's get some sample data.



                                                    # sample data, variable d is in df1, variable c is in df2
                                                    df1 = data.frame(a=1:5, b=6:10, d=month.name[1:5])
                                                    df2 = data.frame(a=6:10, b=16:20, c = letters[8:12])



                                                    Two data.frames, alter originals

                                                    To retain all columns from both data.frames in an rbind (and to avoid the error rbind throws when the column names differ), add NA columns to each data.frame for the names it is missing, which setdiff identifies.



                                                    # fill in non-overlapping columns with NAs
                                                    df1[setdiff(names(df2), names(df1))] <- NA
                                                    df2[setdiff(names(df1), names(df2))] <- NA


                                                    Now, rbind-em



                                                    rbind(df1, df2)
                                                    #     a  b        d    c
                                                    # 1   1  6  January <NA>
                                                    # 2   2  7 February <NA>
                                                    # 3   3  8    March <NA>
                                                    # 4   4  9    April <NA>
                                                    # 5   5 10      May <NA>
                                                    # 6   6 16     <NA>    h
                                                    # 7   7 17     <NA>    i
                                                    # 8   8 18     <NA>    j
                                                    # 9   9 19     <NA>    k
                                                    # 10 10 20     <NA>    l


                                                    Note that the first two lines alter the original data.frames, df1 and df2, adding the full set of columns to both.
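After the padding step the two data.frames hold their columns in different orders (a, b, d, c versus a, b, c, d). That is harmless, because rbind for data frames matches columns by name rather than by position. A minimal sketch on made-up data (x and y are assumptions for the example):

```r
# Made-up data: same column names, different order
x <- data.frame(a = 1:2, b = c("p", "q"))
y <- data.frame(b = c("r", "s"), a = 3:4)

# rbind aligns y's columns to x's names before stacking the rows
z <- rbind(x, y)

names(z)  # "a" "b" (the order of the first data frame)
z$a       # 1 2 3 4
```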




                                                    Two data.frames, do not alter originals

                                                    To leave the original data.frames intact, loop through the names that differ with sapply, returning a named vector of NAs, and concatenate that vector onto the data.frame with c, which yields a list. data.frame then converts the list back into a data.frame suitable for rbind.



                                                    rbind(
                                                      data.frame(c(df1, sapply(setdiff(names(df2), names(df1)), function(x) NA))),
                                                      data.frame(c(df2, sapply(setdiff(names(df1), names(df2)), function(x) NA)))
                                                    )



                                                    Many data.frames, do not alter originals

                                                    If you have more than two data.frames, you could do the following.



                                                    # put data.frames into list (dfs named df1, df2, df3, etc)
                                                    mydflist <- mget(ls(pattern = "df\\d+"))
                                                    # get all variable names
                                                    allNms <- unique(unlist(lapply(mydflist, names)))

                                                    # put em all together
                                                    do.call(rbind,
                                                            lapply(mydflist,
                                                                   function(x) data.frame(c(x, sapply(setdiff(allNms, names(x)),
                                                                                                      function(y) NA)))))


                                                    Maybe a bit nicer to not see the row names of original data.frames? Then do this.



                                                    do.call(rbind,
                                                            c(lapply(mydflist,
                                                                     function(x) data.frame(c(x, sapply(setdiff(allNms, names(x)),
                                                                                                        function(y) NA)))),
                                                              make.row.names=FALSE))
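If collecting the data.frames from the workspace with mget and ls feels fragile, the same steps can be wrapped in a small function that takes an explicit list. This is only a sketch: rbind_fill_base is a hypothetical helper name (not part of base R), and df3 is made-up data added for the example.

```r
# Hypothetical helper (not part of base R): row-bind a list of data frames,
# padding columns that a given data frame lacks with NA
rbind_fill_base <- function(dfs) {
  all_names <- unique(unlist(lapply(dfs, names)))
  padded <- lapply(dfs, function(x) {
    missing_cols <- setdiff(all_names, names(x))
    if (length(missing_cols) > 0) {
      x[missing_cols] <- NA  # modifies the local copy only
    }
    x[all_names]             # put columns in a common order
  })
  do.call(rbind, c(padded, make.row.names = FALSE))
}

df1 <- data.frame(a = 1:5, b = 6:10, d = month.name[1:5])
df2 <- data.frame(a = 6:10, b = 16:20, c = letters[8:12])
df3 <- data.frame(a = 11:12, e = c(TRUE, FALSE))  # made-up third data frame

res <- rbind_fill_base(list(df1, df2, df3))
dim(res)    # 12 rows, 5 columns
names(res)  # "a" "b" "d" "c" "e"
```

Because the padding happens on copies inside lapply, the original data.frames are left untouched.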

























                                                    • base R methods are best. thanks for this

                                                      – user3479780
                                                      Dec 17 '18 at 5:04






























                                                    Most of the base R answers address the situation where only one data.frame has additional columns or that the resulting data.frame would have the intersection of the columns. Since the OP writes I am hoping to retain the columns that do not match after the bind, an answer using base R methods to address this issue is probably worth posting.



                                                    Below, I present two base R methods: One that alters the original data.frames, and one that doesn't. Additionally, I offer a method that generalizes the non-destructive method to more than two data.frames.



                                                    First, let's get some sample data.



                                                    # sample data, variable c is in df1, variable d is in df2
                                                    df1 = data.frame(a=1:5, b=6:10, d=month.name[1:5])
                                                    df2 = data.frame(a=6:10, b=16:20, c = letters[8:12])



                                                    Two data.frames, alter originals

                                                    In order to retain all columns from both data.frames in an rbind (and allow the function to work without resulting in an error), you add NA columns to each data.frame with the appropriate missing names filled in using setdiff.



                                                    # fill in non-overlapping columns with NAs
                                                    df1[setdiff(names(df2), names(df1))] <- NA
                                                    df2[setdiff(names(df1), names(df2))] <- NA


                                                    Now, rbind-em



rbind(df1, df2)
    a  b        d    c
1   1  6  January <NA>
2   2  7 February <NA>
3   3  8    March <NA>
4   4  9    April <NA>
5   5 10      May <NA>
6   6 16     <NA>    h
7   7 17     <NA>    i
8   8 18     <NA>    j
9   9 19     <NA>    k
10 10 20     <NA>    l


                                                    Note that the first two lines alter the original data.frames, df1 and df2, adding the full set of columns to both.
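For context, here is a minimal sketch of the error that the NA columns avoid, followed by the same fill applied to copies. The names x, y, err and res are introduced only for this illustration.

```r
df1 <- data.frame(a = 1:5, b = 6:10, d = month.name[1:5])
df2 <- data.frame(a = 6:10, b = 16:20, c = letters[8:12])

# without the NA columns, rbind() stops because the column names differ
err <- tryCatch(rbind(df1, df2), error = function(e) conditionMessage(e))
print(err)

# apply the fill to copies so that df1 and df2 keep their original columns
x <- df1
y <- df2
x[setdiff(names(y), names(x))] <- NA
y[setdiff(names(x), names(y))] <- NA

res <- rbind(x, y)
dim(res)
```

rbind matches data.frame columns by name, so it does not matter that x and y end up with their columns in a different order.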




                                                    Two data.frames, do not alter originals

To leave the original data.frames intact, loop over the names that each data.frame lacks with sapply, which returns a named vector of NAs. c then concatenates that vector onto the data.frame (giving a list), and data.frame converts the result back into a data.frame suitable for rbind.



                                                    rbind(
                                                    data.frame(c(df1, sapply(setdiff(names(df2), names(df1)), function(x) NA))),
                                                    data.frame(c(df2, sapply(setdiff(names(df1), names(df2)), function(x) NA)))
                                                    )



                                                    Many data.frames, do not alter originals

                                                    In the instance that you have more than two data.frames, you could do the following.



                                                    # put data.frames into list (dfs named df1, df2, df3, etc)
mydflist <- mget(ls(pattern="df\\d+"))
                                                    # get all variable names
                                                    allNms <- unique(unlist(lapply(mydflist, names)))

                                                    # put em all together
                                                    do.call(rbind,
                                                    lapply(mydflist,
                                                    function(x) data.frame(c(x, sapply(setdiff(allNms, names(x)),
                                                    function(y) NA)))))


                                                    Maybe a bit nicer to not see the row names of original data.frames? Then do this.



                                                    do.call(rbind,
                                                    c(lapply(mydflist,
                                                    function(x) data.frame(c(x, sapply(setdiff(allNms, names(x)),
                                                    function(y) NA)))),
                                                    make.row.names=FALSE))
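The same idea can be wrapped in a small function that takes an explicit list instead of relying on mget. This is only a sketch: the name rbind_fill and the three sample data.frames are made up for illustration.

```r
# hypothetical helper (base R only): pad every data.frame in the list with
# NA columns for the names it lacks, then bind all of them by row
rbind_fill <- function(dfs) {
  all_names <- unique(unlist(lapply(dfs, names)))
  padded <- lapply(dfs, function(x) {
    # x is a local copy, so the caller's data.frames are not altered
    for (nm in setdiff(all_names, names(x))) x[[nm]] <- NA
    x[all_names]
  })
  do.call(rbind, c(padded, make.row.names = FALSE))
}

# made-up sample data with partly overlapping columns
dfs <- list(
  data.frame(a = 1:2, b = c("x", "y")),
  data.frame(a = 3L, c = TRUE),
  data.frame(d = 4.5)
)

out <- rbind_fill(dfs)
str(out)
```

The for loop is used instead of a vectorised assignment so that a data.frame that already has every column passes through unchanged.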














                                                    edited Jan 27 '18 at 23:44

























                                                    answered Oct 8 '17 at 20:16









lmo







                                                    • 1





                                                      base R methods are best. thanks for this

                                                      – user3479780
                                                      Dec 17 '18 at 5:04























                                                    6














                                                    I wrote a function to do this because I like my code to tell me if something is wrong. This function will explicitly tell you which column names don't match and if you have a type mismatch. Then it will do its best to combine the data.frames anyway. The limitation is that you can only combine two data.frames at a time.



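### combines data frames (like rbind) but by matching column names
# columns without matches in the other data frame are still combined
# but with NA in the rows corresponding to the data frame without
# the variable
# A warning is issued if there is a type mismatch between columns of
# the same name and an attempt is made to combine the columns
combineByName <- function(A, B) {
  a.names <- names(A)
  b.names <- names(B)
  all.names <- union(a.names, b.names)
  print(paste("Number of columns:", length(all.names)))
  a.type <- NULL
  for (i in seq_len(ncol(A))) {
    a.type[i] <- typeof(A[, i])
  }
  b.type <- NULL
  for (i in seq_len(ncol(B))) {
    b.type[i] <- typeof(B[, i])
  }
  a_b.names <- names(A)[!names(A) %in% names(B)]
  b_a.names <- names(B)[!names(B) %in% names(A)]
  # NOTE: the source is cut off after "if (length(a_b.names)>0". Everything
  # below is a reconstruction from the description above, not the original code.
  if (length(a_b.names) > 0) {
    message("Columns only in A: ", paste(a_b.names, collapse = ", "))
    B[a_b.names] <- NA
  }
  if (length(b_a.names) > 0) {
    message("Columns only in B: ", paste(b_a.names, collapse = ", "))
    A[b_a.names] <- NA
  }
  shared <- intersect(a.names, b.names)
  mismatch <- shared[a.type[match(shared, a.names)] != b.type[match(shared, b.names)]]
  if (length(mismatch) > 0) {
    warning("Type mismatch in: ", paste(mismatch, collapse = ", "))
  }
  rbind(A, B[names(A)])
}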















                                                        answered Feb 3 '11 at 5:27







                                                        user399470




























                                                            2














Just for the record: you can also try the Stack package and its Stack function, in the following form:



                                                            Stack(df_1, df_2)


My impression is that it is also faster than the other methods on large data sets.
















                                                                answered Aug 15 '17 at 19:48









Cro-Magnon





















                                                                    1














                                                                    Maybe I completely misread your question, but the "I am hoping to retain the columns that do not match after the bind" makes me think you are looking for a left join or right join similar to an SQL query. R has the merge function that lets you specify left, right, or inner joins similar to joining tables in SQL.



                                                                    There is already a great question and answer on this topic here: How to join (merge) data frames (inner, outer, left, right)?
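To make the distinction concrete, here is a minimal sketch of the join types that merge offers. The two data.frames and the key column id are invented for this example. Unlike a row bind, a join matches rows on the key and places the other columns side by side.

```r
# made-up sample data sharing the key column "id"
df1 <- data.frame(id = 1:3, x = c("a", "b", "c"))
df2 <- data.frame(id = 2:4, y = c(TRUE, FALSE, TRUE))

inner <- merge(df1, df2, by = "id")                # only ids found in both
left  <- merge(df1, df2, by = "id", all.x = TRUE)  # every row of df1
right <- merge(df1, df2, by = "id", all.y = TRUE)  # every row of df2
full  <- merge(df1, df2, by = "id", all = TRUE)    # every row of either

# row counts of the four results
sapply(list(inner = inner, left = left, right = right, full = full), nrow)
```

Rows without a match get NA in the columns that come from the other data.frame, for example y for id 1 in the left join.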














                                                                        edited May 23 '17 at 12:10









                                                                        Community











                                                                        answered Aug 4 '10 at 13:13









Chase





















                                                                            1














gtools::smartbind didn't work well with Dates, probably because it converts the columns with as.vector. So here's my solution:



sbind = function(x, y, fill=NA) {
    # add each missing column to d, filled with `fill`
    sbind.fill = function(d, cols) {
        for(c in cols)
            d[[c]] = fill
        d
    }

    x = sbind.fill(x, setdiff(names(y),names(x)))
    y = sbind.fill(y, setdiff(names(x),names(y)))

    rbind(x, y)
}
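As a quick check of the Date behaviour, here is a sketch on made-up data that inlines the same fill step instead of calling sbind. The column names when and note are assumptions for illustration.

```r
# made-up data: only x has a Date column, only y has a note column
x <- data.frame(id = 1:2, when = as.Date(c("2013-11-12", "2013-11-13")))
y <- data.frame(id = 3:4, note = c("a", "b"))

# fill the missing columns with NA, as sbind.fill does
x$note <- NA
y$when <- NA

res <- rbind(x, y)
class(res$when)
res
```

Here the Date column comes from the first data.frame, so rbind assigns the NA values into an existing Date vector.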
















                                                                                answered Nov 13 '13 at 16:22









aaron





















                                                                                    0














                                                                                    You could also use sjmisc::add_rows(), which uses dplyr::bind_rows(), but unlike bind_rows(), add_rows() preserves attributes and hence is useful for labelled data.



See the following example with a labelled dataset. The frq() function prints frequency tables with value labels if the data is labelled.



                                                                                    library(sjmisc)
                                                                                    library(dplyr)

                                                                                    data(efc)
# select two subsets, with some shared and some different columns
                                                                                    x1 <- efc %>% select(1:5) %>% slice(1:10)
                                                                                    x2 <- efc %>% select(3:7) %>% slice(11:20)

                                                                                    str(x1)
                                                                                    #> 'data.frame': 10 obs. of 5 variables:
                                                                                    #> $ c12hour : num 16 148 70 168 168 16 161 110 28 40
                                                                                    #> ..- attr(*, "label")= chr "average number of hours of care per week"
                                                                                    #> $ e15relat: num 2 2 1 1 2 2 1 4 2 2
                                                                                    #> ..- attr(*, "label")= chr "relationship to elder"
                                                                                    #> ..- attr(*, "labels")= Named num 1 2 3 4 5 6 7 8
                                                                                    #> .. ..- attr(*, "names")= chr "spouse/partner" "child" "sibling" "daughter or son -in-law" ...
                                                                                    #> $ e16sex : num 2 2 2 2 2 2 1 2 2 2
                                                                                    #> ..- attr(*, "label")= chr "elder's gender"
                                                                                    #> ..- attr(*, "labels")= Named num 1 2
                                                                                    #> .. ..- attr(*, "names")= chr "male" "female"
                                                                                    #> $ e17age : num 83 88 82 67 84 85 74 87 79 83
                                                                                    #> ..- attr(*, "label")= chr "elder' age"
                                                                                    #> $ e42dep : num 3 3 3 4 4 4 4 4 4 4
                                                                                    #> ..- attr(*, "label")= chr "elder's dependency"
                                                                                    #> ..- attr(*, "labels")= Named num 1 2 3 4
                                                                                    #> .. ..- attr(*, "names")= chr "independent" "slightly dependent" "moderately dependent" "severely dependent"

                                                                                    bind_rows(x1, x1) %>% frq(e42dep)
                                                                                    #>
                                                                                    #> # e42dep <numeric>
                                                                                    #> # total N=20 valid N=20 mean=3.70 sd=0.47
                                                                                    #>
                                                                                    #> val frq raw.prc valid.prc cum.prc
                                                                                    #> 3 6 30 30 30
                                                                                    #> 4 14 70 70 100
                                                                                    #> <NA> 0 0 NA NA

                                                                                    add_rows(x1, x1) %>% frq(e42dep)
                                                                                    #>
                                                                                    #> # elder's dependency (e42dep) <numeric>
                                                                                    #> # total N=20 valid N=20 mean=3.70 sd=0.47
                                                                                    #>
                                                                                    #> val label frq raw.prc valid.prc cum.prc
                                                                                    #> 1 independent 0 0 0 0
                                                                                    #> 2 slightly dependent 0 0 0 0
                                                                                    #> 3 moderately dependent 6 30 30 30
                                                                                    #> 4 severely dependent 14 70 70 100
                                                                                    #> NA NA 0 0 NA NA
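The difference between the two frequency tables above comes down to whether the "label" and "labels" attributes are still attached to the column after the rows are bound. As a rough base R illustration of that idea (this is not how sjmisc implements add_rows(); the helper name copy_labels_sketch and the toy data are assumptions made for this sketch), the attributes can be copied back onto a combined data frame from one of its sources:

```r
# Hypothetical helper (not part of sjmisc or dplyr): copy the "label" and
# "labels" attributes from a source data frame onto the same-named columns
# of a combined data frame.
copy_labels_sketch <- function(combined, source) {
  for (nm in intersect(names(combined), names(source))) {
    for (a in c("label", "labels")) {
      value <- attr(source[[nm]], a, exact = TRUE)
      if (!is.null(value)) attr(combined[[nm]], a) <- value
    }
  }
  combined
}

# Toy labelled column, mimicking the structure str(x1) shows for e42dep
toy <- data.frame(dep = c(3, 4, 4))
attr(toy$dep, "label") <- "elder's dependency"
attr(toy$dep, "labels") <- c(independent = 1, `slightly dependent` = 2,
                             `moderately dependent` = 3, `severely dependent` = 4)

# Stack the plain values (as.vector() strips the attributes on purpose),
# then restore the labels from the source data frame
stacked <- data.frame(dep = c(as.vector(toy$dep), as.vector(toy$dep)))
stacked <- copy_labels_sketch(stacked, toy)
attributes(stacked$dep)
```

For real labelled data, add_rows() as shown in the answer is the more convenient route; the sketch only makes the mechanism visible.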





                                                                                        answered Sep 24 '18 at 11:28









Daniel





















                                                                                            -1














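rbind.ordered <- function(x, y) {
  # Add to y, filled with NA, every column that exists only in x
  diffCol <- setdiff(colnames(x), colnames(y))
  if (length(diffCol) > 0) {
    cols <- colnames(y)
    for (i in seq_along(diffCol)) y <- cbind(y, NA)
    colnames(y) <- c(cols, diffCol)
  }

  # Add to x, filled with NA, every column that exists only in y
  diffCol <- setdiff(colnames(y), colnames(x))
  if (length(diffCol) > 0) {
    cols <- colnames(x)
    for (i in seq_along(diffCol)) x <- cbind(x, NA)
    colnames(x) <- c(cols, diffCol)
  }

  # Put y's columns in the same order as x, then stack the rows
  rbind(x, y[, colnames(x), drop = FALSE])
}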
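The idea behind rbind.ordered() above (pad each data frame with NA columns for the names it lacks, align the column order, then rbind) can be tried on a small example. The following is a minimal base R sketch under stated assumptions: the helper name rbind_fill_sketch and the data frames df1 and df2 are made up for illustration and are not part of the original answer.

```r
# Hypothetical helper: same approach as rbind.ordered(), written compactly.
rbind_fill_sketch <- function(x, y) {
  # Add, as NA, the columns each data frame is missing relative to the other
  for (nm in setdiff(colnames(y), colnames(x))) x[[nm]] <- NA
  for (nm in setdiff(colnames(x), colnames(y))) y[[nm]] <- NA
  # Align y's column order with x before stacking the rows
  rbind(x, y[, colnames(x), drop = FALSE])
}

# Two data frames that share only column "b"
df1 <- data.frame(a = 1:2, b = c("u", "v"), stringsAsFactors = FALSE)
df2 <- data.frame(b = c("w", "x"), c = c(TRUE, FALSE), stringsAsFactors = FALSE)

combined <- rbind_fill_sketch(df1, df2)
print(combined)
```

The result has the union of the column names, with NA wherever a row came from a data frame that lacked that column. Note that the sketch assumes both inputs have at least one row; assigning NA to a new column of a zero-row data frame would need separate handling.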






                                                                                                answered Jul 24 '12 at 11:21









RockScience

















