How to format phone numbers in bash with awk
I'm writing a bash script to format phone numbers to a French standard.
Almost everything is done, but I don't know how to change values in a CSV file.
Specifications:
- Delete all non-numeric characters (except "+" if it is in the first position)
- Substitutions:
  - 06xxx -> +336xxx
  - 07xxx -> +337xxx
  - +3306xxx -> +336xxx
  - +3307xxx -> +337xxx
Sample data (assuming the data is in the third column of my CSV file, with | separators):
||0612345678|
||+33612345678f|
||+33712345678|
||+330612345678|
||+330712345678|
||06.12.34.56.78|
||06 12 34 56 78|
||06d12d34.h*56-78|
||+2258475|
||+65823|
Expected result:
||+33612345678|
||+33612345678|
||+33712345678|
||+33612345678|
||+33712345678|
||+33612345678|
||+33612345678|
||+33612345678|
||+2258475|
||+65823|
- Current State
I tried to do this with sed. It actually works with these expressions:
sed -e "s/\b[^0-9]//g" sample > test
sed -e "s/[a-z]//g" test > test2
sed -e "s/\b[^0-9]//g" test2 > test3
sed -e "s/^06/+336/g" test3 > test4
sed -e "s/^07/+337/g" test4 > test5
sed -e "s/^+3306/+336/g" test5 > test6
sed -e "s/^+3307/+337/g" result
However, I don't know how to apply the substitutions only to the third column of my CSV file.
Then I tried with awk:
awk '
BEGIN { print substr($1,2); }
{
FS=OFS="|"
gsub("\b[^0-9]","",$1);
gsub("[a-z]","",$1);
gsub("\b[^0-9]","",$1);
gsub("^06","+336",$1);
gsub("^07","+337",$1);
gsub("^+3306","+336",$1);
gsub("^+3307","+337",$1)
}
1
' sample
but awk doesn't understand all of the regular expressions.
The result when using awk:
+33612345678|
+33612345678|
+33712345678|
+33612345678|
+33712345678|
+336.12.34.56.78|
+336 12 34 56 78|
+3361234.*56-78|
+2258475|
+65823|
I would like to use my regular expressions directly on my CSV file. Any advice will be much appreciated!
bash awk sed
asked Mar 25 at 16:35
Milky
Will only the third column have data like this, or can other columns have similar data? Is the data separated by | pipes? In your sample input lines I see two pipes at the beginning: is one of them part of your data, or are both delimiters? For the sample data alone, the following works fine for me:
cat sampledata.txt | sed 's/||0/||+33/; s/+330/+33/; s/[^0-9|+]*//g'
– Ibraheem
Mar 25 at 16:55
2 Answers
Sounds like this is all you need:
$ cat tst.awk
BEGIN { FS=OFS="|" }
$3 != "" {
    gsub(/[^0-9]+/,"",$3)
    sub(/^(33)?06/,"336",$3)
    sub(/^(33)?07/,"337",$3)
    $3 = "+" $3
}
{ print }
$ awk -f tst.awk file
||+33612345678|
||+33612345678|
||+33712345678|
||+33612345678|
||+33712345678|
||+33612345678|
||+33612345678|
||+33612345678|
||+2258475|
||+65823|
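For anyone who wants to try this without creating tst.awk and an input file, here is a minimal self-contained sketch. It assumes the script has the shape shown above; the variable names `sample` and `out` are only illustrative, and just a few of the question's rows are used.

```shell
#!/bin/sh
# Sketch: run the answer's awk logic on a few of the question's sample rows.
# The variable names "sample" and "out" are illustrative assumptions.
sample='||0612345678|
||+33612345678f|
||+330712345678|
||06d12d34.h*56-78|
||+2258475|'

out=$(printf '%s\n' "$sample" | awk '
BEGIN { FS = OFS = "|" }          # read and write pipe-separated fields
$3 != "" {                        # leave empty third fields untouched
    gsub(/[^0-9]+/, "", $3)       # keep digits only
    sub(/^(33)?06/, "336", $3)    # 06... and 3306... become 336...
    sub(/^(33)?07/, "337", $3)    # 07... and 3307... become 337...
    $3 = "+" $3                   # put the leading plus back
}
{ print }
')
printf '%s\n' "$out"
```

Because every non-digit is stripped first, the leading "+" is removed along with the rest and then added back unconditionally, which is why the two sub() patterns do not need to mention it.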
edited Mar 26 at 13:17
answered Mar 25 at 17:19
Ed Morton
sub(/^(33)?06/,"336",$3)
In this substitution, if you just replace 330 with 33, it will cover both the 3306 and 3307 cases.
– Ibraheem
Mar 25 at 17:23
@Ibraheem Right but then it'd also convert 3305, 3308, etc. which the OP didn't say she wanted changed and I assume she has good reason for being so specific in her text, code, and examples.
– Ed Morton
Mar 25 at 17:27
You are also right, but since the OP mentioned he is standardizing phone numbers, I expect there won't be a 0 after +33 in any case. Anyway, one extra line of code won't be bothersome in order to match the exact specification.
– Ibraheem
Mar 25 at 17:41
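The trade-off in the exchange above can be checked with a quick sketch. The helper names `broad` and `specific` are hypothetical, and the inputs are assumed to be already reduced to digits only.

```shell
#!/bin/sh
# Hypothetical helpers comparing the two rules on digit-only input.
# broad:    replace any leading 330 with 33 (the suggestion in the comment)
# specific: rewrite only leading 3306 and 3307 (what the question lists)
broad()    { awk '{ sub(/^330/, "33"); print }'; }
specific() { awk '{ sub(/^3306/, "336"); sub(/^3307/, "337"); print }'; }

broad_out=$(printf '%s\n' 330612345678 330512345678 | broad)
specific_out=$(printf '%s\n' 330612345678 330512345678 | specific)

printf 'broad:\n%s\nspecific:\n%s\n' "$broad_out" "$specific_out"
```

Both rules turn 3306... into 336..., but only the broad rule also rewrites 3305..., which is the behaviour the question did not ask for.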
Thank you! However, the script will add "+" in empty rows. I will correct this with a simple sed.
– Milky
Mar 26 at 9:54
No, don't add it with awk and then remove it with sed - just don't add it in the first place. You didn't show any empty lines in your input which is why it's not accounted for in the script but it's easily handled however you want it handled - I've updated it to one possibility (only modify $3 if it's non-null) and if that's not what you want then update your question to include how you want empty $3 and/or whole empty lines handled.
– Ed Morton
Mar 26 at 13:18
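To make the point about empty fields concrete, here is a small sketch that runs the same logic with and without a `$3 != ""` condition on a row whose third field is empty. The function names and the sample row `a|b||d` are made up for illustration, and the 06/07 substitutions are left out for brevity.

```shell
#!/bin/sh
# Hypothetical helpers: the same awk logic without and with the guard.
without_guard() {
    awk 'BEGIN { FS = OFS = "|" }
         { gsub(/[^0-9]+/, "", $3); $3 = "+" $3; print }'
}
with_guard() {
    awk 'BEGIN { FS = OFS = "|" }
         $3 != "" { gsub(/[^0-9]+/, "", $3); $3 = "+" $3 }
         { print }'
}

empty_row='a|b||d'   # made-up row with an empty third field
unguarded=$(printf '%s\n' "$empty_row" | without_guard)
guarded=$(printf '%s\n' "$empty_row" | with_guard)

printf '%s\n%s\n' "$unguarded" "$guarded"
```

Without the condition a lone "+" lands in the empty field; with it the row passes through unchanged, so no sed clean-up step is needed afterwards.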
I can get you a little closer. I found a couple of mistakes in your awk script that should be corrected before making more progress. First, the BEGIN statement looks to be in error. Rather than print substr($1,2), it should just set FS and OFS. As you probably already know, BEGIN only gets executed once.
Also, once FS is set to pipe '|', you'll need to modify the third field in each input line. Thus, the target parameter for all your gsub calls should be $3, not $1.
Well, that's all I've got for you. I suspect the remaining mismatch between your output and the expected results is due to the reason you mention: different regexp handling.
awk '
BEGIN { FS=OFS="|" }
{
gsub("\b[^0-9]","",$3);
gsub("[a-z]","",$3);
gsub("\b[^0-9]","",$3);
gsub("^06","+336",$3);
gsub("^07","+337",$3);
gsub("^+3306","+336",$3);
gsub("^+3307","+337",$3)
}
1
' sample
answered Mar 25 at 17:13
Mark
YMMV with what any given awk thinks \b means.
– Ed Morton
Mar 25 at 17:20
Thank you for the explanation!
– Milky
Mar 26 at 9:43
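On the \b remark: in awk string and regexp escapes \b is a backspace character, and GNU awk spells its word-boundary operator \y, so a pattern copied from sed does not carry over. One way to avoid the issue is to not use a boundary at all. The sketch below is only an illustration under that assumption; the function name `normalize` and the two input rows are made up.

```shell
#!/bin/sh
# Sketch: strip everything except digits from field 3 without using \b,
# then put a leading plus back only when the original field had one.
normalize() {
    awk 'BEGIN { FS = OFS = "|" }
         {
             plus = ($3 ~ /^[+]/) ? "+" : ""   # remember a leading plus
             gsub(/[^0-9]/, "", $3)            # plain bracket expression
             $3 = plus $3
             print
         }'
}

result=$(printf '%s\n' '||+33 6.12-34|' '||06 12x34|' | normalize)
printf '%s\n' "$result"
```

This covers only the "delete all non-numeric characters except a leading +" part of the specification; the 06/07 substitutions would still be applied afterwards.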