Join lines with the same value in the first column











3















I have a tab-delimited file with three columns (excerpt):



AC147602.5_FG004 IPR000146 Fructose-1,6-bisphosphatase class 1/Sedoheputulose-1,7-bisphosphatase
AC147602.5_FG004 IPR023079 Sedoheptulose-1,7-bisphosphatase
AC148152.3_FG001 IPR002110 Ankyrin repeat
AC148152.3_FG001 IPR026961 PGG domain


and I'd like to get this using bash:



AC147602.5_FG004 IPR000146 Fructose-1,6-bisphosphatase class 1/Sedoheputulose-1,7-bisphosphatase IPR023079 Sedoheptulose-1,7-bisphosphatase
AC148152.3_FG001 IPR002110 Ankyrin repeat IPR026961 PGG domain


So if the ID in the first column is the same on several lines, the result should contain one line per ID, with the remaining parts of those lines joined. For the example this gives a two-row file.





























  • 1





    @oberlies, it is sometimes OK to add tags to a question that cover technologies used in answers, but not mentioned in the question. This would be one of those cases, especially when the alternative is creating new meta tags.

    – Charles
    Mar 29 '14 at 3:22











  • @close-voters: How can this question be too broad? The answer is a one-line awk script.

    – oberlies
    Apr 18 '14 at 14:57
















bash awk














edited Mar 29 '14 at 3:20 by Charles

asked Nov 6 '13 at 22:14 by boczniak767







3 Answers


















7














give this one-liner a try:



 awk -F'\t' -v OFS='\t' '{x=$1;$1="";a[x]=a[x]$0}END{for(x in a)print x,a[x]}' file
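For readers who want to see the grouping step by step, here is an expanded sketch of the same idea. It is not the exact command from this answer: the function name `join_by_first_column` is made up for illustration, and two details are my own assumptions about what is wanted. It remembers the order in which IDs first appear (a plain `for (x in a)` loop visits keys in an unspecified order), and it strips the key with `sub()` so that only a single tab follows the ID (with `$1=""` followed by `print x,a[x]`, the ID ends up followed by two tabs).

```shell
#!/bin/sh
# Hypothetical helper: group tab-delimited lines by their first column.
join_by_first_column() {
    awk -F'\t' -v OFS='\t' '
    {
        key = $1
        # Remember each key the first time it is seen, to keep input order.
        if (!(key in rest)) order[++n] = key
        # Drop the key and its tab, leaving the remaining fields untouched.
        sub(/^[^\t]*\t/, "")
        rest[key] = rest[key] OFS $0
    }
    END {
        for (i = 1; i <= n; i++) print order[i] rest[order[i]]
    }' "$@"
}

# Example with the data from the question (tabs written as \t for printf).
printf 'AC147602.5_FG004\tIPR000146\tFructose-1,6-bisphosphatase class 1/Sedoheputulose-1,7-bisphosphatase\nAC147602.5_FG004\tIPR023079\tSedoheptulose-1,7-bisphosphatase\nAC148152.3_FG001\tIPR002110\tAnkyrin repeat\nAC148152.3_FG001\tIPR026961\tPGG domain\n' | join_by_first_column
```

Like the one-liner, this keeps every group in memory until the end, so it suits files that fit comfortably in RAM. It can also be called with a file name instead of a pipe, for example `join_by_first_column file`.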




























  • Thanks! It works fantastic!

    – boczniak767
    Nov 7 '13 at 11:53



















0














For whatever reason, the awk solution does not work for me in Cygwin, so I used Perl instead. It joins the values with a tab character and separates lines with \n.

cat FILENAME | perl -e 'foreach $Line (<STDIN>) { @Cols=($Line=~/^\s*(\d+)\s*(.*?)\s*$/); push(@{$Link{$Cols[0]}}, $Cols[1]); } foreach $List (values %Link) { print join("\t", @$List)."\n"; }'



































    0














    This depends on the file size (and on awk's limits).

    If the file is too big, sorting it first reduces what awk has to hold: only one label is kept in memory for printing.

    A classic version that modifies the whole line and prints each group once it is complete:



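    sort YourFile \
    | awk '
    # same key as the previous line: strip the key and append the rest
    $1 == Last { sub( /^[^[:blank:]]*[[:blank:]]+/, ""); C = C " " $0; next }
    # new key: print the finished group, then start a new one with this line
    NR > 1 { print Last C }
    { Last = $1; sub( /^[^[:blank:]]*[[:blank:]]+/, ""); C = " " $0 }
    END { if (NR) print Last C }
    '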


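    Another version that works field by field and prints as it goes, but is less human-readable:

    sort YourFile \
    | awk '
    # new key: end the previous output line (if any) and print the key
    Last != $1 { printf( "%s%s", (NR > 1 ? "\n" : ""), Last = $1) }
    # then print every remaining field of the current line
    Last == $1 { for( i = 2; i <= NF; i++) printf( " %s", $i) }
    END { if (NR) print "" }
    '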
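Since the file in the question is tab-delimited and its third column contains spaces, a tab-aware variant of the sort-then-stream idea may be useful. The following is only a sketch under a few assumptions: the function name `join_sorted_by_first_column` is hypothetical, and it relies on the `-s` (stable) option that GNU and BSD `sort` provide but POSIX does not require.

```shell
#!/bin/sh
# Hypothetical helper: streaming variant that first groups the input by key.
join_sorted_by_first_column() {
    tab=$(printf '\t')
    # -k1,1 sorts on the first field only; -s (GNU/BSD sort, assumed available)
    # keeps lines with equal keys in their original relative order.
    sort -s -t "$tab" -k1,1 "$@" |
    awk -F'\t' -v OFS='\t' '
    # New key: finish the previous output line, then print the key.
    $1 != last { if (NR > 1) printf "\n"; printf "%s", $1; last = $1 }
    # Every line: append its remaining fields to the current output line.
    { for (i = 2; i <= NF; i++) printf "%s%s", OFS, $i }
    END { if (NR) printf "\n" }'
}
```

Here awk only remembers the current key, so memory use stays flat no matter how many IDs the file contains; the cost is moved into `sort`, which can spill to temporary files. Usage would be `join_sorted_by_first_column YourFile`.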





Answer 1 (awk one-liner): answered Nov 6 '13 at 22:37 by Kent












Answer 2 (Perl): answered Jan 20 '17 at 14:03 by Dakusan





















Answer 3 (sort + awk): answered Jan 20 '17 at 14:30 by NeronLeVelu, edited Jan 20 '17 at 14:38


























