Join lines with the same value in the first column
I have a tab-delimited file with three columns (excerpt):
AC147602.5_FG004 IPR000146 Fructose-1,6-bisphosphatase class 1/Sedoheputulose-1,7-bisphosphatase
AC147602.5_FG004 IPR023079 Sedoheptulose-1,7-bisphosphatase
AC148152.3_FG001 IPR002110 Ankyrin repeat
AC148152.3_FG001 IPR026961 PGG domain
and I'd like to get this using bash:
AC147602.5_FG004 IPR000146 Fructose-1,6-bisphosphatase class 1/Sedoheputulose-1,7-bisphosphatase IPR023079 Sedoheptulose-1,7-bisphosphatase
AC148152.3_FG001 IPR002110 Ankyrin repeat IPR026961 PGG domain
So if the ID in the first column is the same on several lines, the result should contain one line per ID, with the remaining columns of those lines joined. For the example above, this gives a two-row file.
bash awk
edited Mar 29 '14 at 3:20 by Charles
asked Nov 6 '13 at 22:14 by boczniak767
@oberlies, it is sometimes OK to add tags to a question that cover technologies used in answers, but not mentioned in the question. This would be one of those cases, especially when the alternative is creating new meta tags.
– Charles
Mar 29 '14 at 3:22
@close-voters: How can this question be too broad? The answer is a one-line awk script.
– oberlies
Apr 18 '14 at 14:57
3 Answers
Give this one-liner a try:
awk -F'\t' -v OFS='\t' '{x=$1;$1="";a[x]=a[x]$0}END{for(x in a)print x,a[x]}' file
answered Nov 6 '13 at 22:37 by Kent
Thanks! It works fantastic!
– boczniak767
Nov 7 '13 at 11:53
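The one-liner returns groups in whatever order `for (x in a)` yields, which awk leaves unspecified, and because `$1=""` leaves an empty first field, each output line carries two tabs after the ID. If input order and single-tab separators matter, a variant along these lines may help. It is only a sketch: the function name `join_by_first_col` and the arrays `seen`, `order` and `rest` are illustrative and not part of the answer.

```shell
#!/usr/bin/env bash
# Sketch: join rows that share the first tab-separated column,
# keeping IDs in the order they first appear in the input.
# The name join_by_first_col is hypothetical, chosen for this example.
join_by_first_col() {
  awk -F'\t' -v OFS='\t' '
    {
      key = $1
      # Remember each new key once, in first-seen order.
      if (!(key in seen)) { seen[key] = 1; order[++n] = key }
      # Append every field after the key, tab-separated.
      for (i = 2; i <= NF; i++) rest[key] = rest[key] OFS $i
    }
    END {
      for (k = 1; k <= n; k++) print order[k] rest[order[k]]
    }
  ' "$@"
}

# Example run on the excerpt from the question (tabs written as \t).
printf 'AC147602.5_FG004\tIPR000146\tFructose-1,6-bisphosphatase class 1/Sedoheputulose-1,7-bisphosphatase\nAC147602.5_FG004\tIPR023079\tSedoheptulose-1,7-bisphosphatase\nAC148152.3_FG001\tIPR002110\tAnkyrin repeat\nAC148152.3_FG001\tIPR026961\tPGG domain\n' | join_by_first_col
```

Like the one-liner, this still holds every group in memory until the end of the input, so it suits files that fit comfortably in RAM.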
For whatever reason, the awk solution does not work for me in Cygwin, so I used Perl instead. It prints each ID followed by its values, joined by tab characters, with one ID per line (separated by \n).
cat FILENAME | perl -e 'foreach $Line (<STDIN>) { @Cols=($Line=~/^\s*(\S+)\s*(.*?)\s*$/); push(@{$Link{$Cols[0]}}, $Cols[1]); } foreach $Key (sort keys %Link) { print join("\t", $Key, @{$Link{$Key}})."\n"; }'
answered Jan 20 '17 at 14:03 by Dakusan
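Whether the array-based approach works depends on the file size (and on awk's memory limits).
If the file is too big, sorting it first reduces what awk has to hold: only one label and its joined values are kept in memory at a time for printing.
A classical version that prints each group once it is complete, modifying the whole line:
sort YourFile \
  | awk -F'\t' '
    # Same key as the previous line: drop the key, append the rest.
    NR > 1 && $1 == Last { sub(/^[^\t]*\t/, ""); C = C "\t" $0; next }
    # New key: print the finished group, then start a new one.
    NR > 1 { print C }
    { Last = $1; C = $0 }
    END { if (NR) print C }
  '
Another version that works field by field and prints as it goes, but is less "human readable":
sort YourFile \
  | awk -F'\t' '
    # New key: end the previous output line (if any) and print the key.
    NR == 1 || $1 != Last { printf("%s%s", (NR > 1 ? "\n" : ""), Last = $1) }
    # Every line: print the fields after the key.
    { for (i = 2; i <= NF; i++) printf("\t%s", $i) }
    END { if (NR) printf("\n") }
  '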
edited Jan 20 '17 at 14:38
answered Jan 20 '17 at 14:30 by NeronLeVelu
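The sort-then-stream idea can be tried end to end on the excerpt from the question. The sketch below is an illustration under assumptions: the function name `join_sorted` is made up here, and `sort -s` (stable sort) is assumed to be available, as it is in GNU and BSD sort, so that rows within one ID keep their original order instead of being reordered by the remaining columns.

```shell
#!/usr/bin/env bash
# Sketch: sort by the first column, then stream through awk so that
# only the current group is held in memory.
# join_sorted is a hypothetical name; sort -s is assumed to be supported.
join_sorted() {
  tab=$(printf '\t')
  # LC_ALL=C gives byte-wise comparison, so equal keys are always adjacent.
  LC_ALL=C sort -s -t "$tab" -k1,1 "$@" | awk -F'\t' '
    # Same key as the previous line: drop the key, append the rest.
    NR > 1 && $1 == last { sub(/^[^\t]*\t/, ""); line = line "\t" $0; next }
    # New key: flush the finished group before starting the next one.
    NR > 1 { print line }
    { last = $1; line = $0 }
    END { if (NR) print line }
  '
}

# Example run: the rows are deliberately interleaved to show the grouping.
printf 'AC148152.3_FG001\tIPR002110\tAnkyrin repeat\nAC147602.5_FG004\tIPR000146\tFructose-1,6-bisphosphatase class 1/Sedoheputulose-1,7-bisphosphatase\nAC148152.3_FG001\tIPR026961\tPGG domain\nAC147602.5_FG004\tIPR023079\tSedoheptulose-1,7-bisphosphatase\n' | join_sorted
```

The trade-off is that the output comes out ordered by ID rather than by first appearance, in exchange for awk never holding more than one group at a time.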