Find and replace partial string in dataframe?
I currently have two dataframes, pulled from CSV files, that I need to join. The problem is that the join column doesn't match, and there are many files I must go through, so manual cleaning in Excel isn't an option.
Here is what I am working with:
DF1
    ID  Title  HIF
    1   A      HIF-1101
    2   AB     HIF-1102
DF2
    HIF            Date      Type
    HIF-1101 CD42  01/12/19  Image
    HIF-1102 JH96  01/14/19  Image
I need to eliminate the extra letter/number combination in DF2. All rows in the join columns (there are a few thousand) carry the same format, 'HIF-XXXX'. Maybe there is a way to find 'HIF' and then index 5 characters to the right?
python python-3.x pandas
asked Mar 25 at 19:53
Trace R.
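For anyone who wants to reproduce the problem, here is a minimal sketch that rebuilds the two sample frames from the question and tries the idea floated there (find 'HIF', then take a fixed-width window). The variable names naive, start and keys are illustrative, and the width of 8 is an assumption based on the stated 'HIF-XXXX' format; the real data would be read from CSV files.

```python
import pandas as pd

# Sample data copied from the question; the real frames come from CSV files.
df1 = pd.DataFrame({
    "ID": [1, 2],
    "Title": ["A", "AB"],
    "HIF": ["HIF-1101", "HIF-1102"],
})
df2 = pd.DataFrame({
    "HIF": ["HIF-1101 CD42", "HIF-1102 JH96"],
    "Date": ["01/12/19", "01/14/19"],
    "Type": ["Image", "Image"],
})

# Merging on the raw keys finds no matching rows.
naive = df1.merge(df2, on="HIF")
print(len(naive))  # 0

# The idea from the question: locate 'HIF' and keep the next 8 characters
# ('HIF' plus the 5 characters to its right). The width is an assumption.
start = df2["HIF"].str.find("HIF")
keys = [value[pos:pos + 8] for value, pos in zip(df2["HIF"], start)]
print(keys)  # ['HIF-1101', 'HIF-1102']
```

This only shows that the fixed-width idea works on the sample rows.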
3 Answers
Use str.extract to extract the pattern HIF-\w{4} from df2['HIF']; you can then merge df1 and df2 together on "HIF".

    df1.merge(df2.assign(HIF=df2['HIF'].str.extract(r'(HIF-\w{4})')), on='HIF')

       ID Title       HIF      Date   Type
    0   1     A  HIF-1101  01/12/19  Image
    1   2    AB  HIF-1102  01/14/19  Image
answered Mar 25 at 19:59
cs95
Safer and more general answer than mine, +1
– Erfan, Mar 25 at 20:01
@Erfan Err on the side of caution! Btw, returned the upvote.
– cs95, Mar 25 at 20:02
Agree, mine would fail if there is extra whitespace, for example.
– Erfan, Mar 25 at 20:06
Thanks guys! I had no idea \w even existed. Y'all teach me something new every day.
– Trace R., Mar 25 at 20:14
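As a rough, self-contained sketch of the str.extract approach discussed above (not the answerer's exact code): it passes expand=False so that the result is a Series, and counts rows where the pattern was not found before merging. The names key, n_unmatched and merged are illustrative only.

```python
import pandas as pd

# Sample data copied from the question.
df1 = pd.DataFrame({
    "ID": [1, 2],
    "Title": ["A", "AB"],
    "HIF": ["HIF-1101", "HIF-1102"],
})
df2 = pd.DataFrame({
    "HIF": ["HIF-1101 CD42", "HIF-1102 JH96"],
    "Date": ["01/12/19", "01/14/19"],
    "Type": ["Image", "Image"],
})

# \w{4} matches exactly four word characters after the literal 'HIF-'.
# expand=False returns a Series instead of a one-column DataFrame.
key = df2["HIF"].str.extract(r"(HIF-\w{4})", expand=False)

# Rows where the pattern is absent come back as NaN; count them first.
n_unmatched = int(key.isna().sum())
print(n_unmatched)

# Merge on the cleaned key without modifying df2 in place.
merged = df1.merge(df2.assign(HIF=key), on="HIF")
print(merged)
```

Checking n_unmatched before merging is a cheap way to notice rows that don't follow the 'HIF-XXXX' format, which would otherwise silently drop out of an inner merge.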
You can use pandas.Series.str.slice:

    df2['HIF'] = df2['HIF'].str.slice(stop=-5)
    print(df2)

            HIF      Date   Type
    0  HIF-1101  01/12/19  Image
    1  HIF-1102  01/14/19  Image

Then merge:

    df_merge = pd.merge(df1, df2, on='HIF')
    print(df_merge)

       ID Title       HIF      Date   Type
    0   1     A  HIF-1101  01/12/19  Image
    1   2    AB  HIF-1102  01/14/19  Image
answered Mar 25 at 20:00
Erfan
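The slice approach relies on the suffix being exactly five characters long. A small sketch of how one might check that assumption after slicing is below; the third value (with a trailing space) is hypothetical and not from the question, and the names raw, trimmed, ok and fixed are illustrative.

```python
import pandas as pd

# First two values are from the question; the third is a HYPOTHETICAL row
# with a trailing space, added only to show how a fixed-width slice can break.
raw = pd.Series(["HIF-1101 CD42", "HIF-1102 JH96", "HIF-1103 XY12 "])

# Drop the last five characters, as in the slice-based answer.
trimmed = raw.str.slice(stop=-5)

# Check that every result has the expected 'HIF-XXXX' shape.
ok = trimmed.str.match(r"^HIF-\w{4}$")
print(ok.tolist())  # [True, True, False]

# Stripping surrounding whitespace first makes the slice line up again.
fixed = raw.str.strip().str.slice(stop=-5)
print(fixed.tolist())  # ['HIF-1101', 'HIF-1102', 'HIF-1103']
```

If ok contains any False values, the fixed-width assumption does not hold for those rows and a pattern-based extraction is probably the safer choice.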
I am using str.findall:

    df2.HIF = df2.HIF.str.findall('|'.join(df1.HIF.tolist())).str[0]
    df1.merge(df2, on='HIF')
    Out[73]:
       ID Title       HIF      Date   Type
    0   1     A  HIF-1101  01/12/19  Image
    1   2    AB  HIF-1102  01/14/19  Image
answered Mar 25 at 20:13
WeNYoBen
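A hedged sketch of the same findall idea, with two additions that are not part of the answer: re.escape is applied to each key in case the keys ever contain regex metacharacters, and a hypothetical third df2 row (key 'HIF-9999', absent from df1) shows what happens when nothing matches. The names pattern, found and merged are illustrative.

```python
import re
import pandas as pd

# df1 is copied from the question.
df1 = pd.DataFrame({
    "ID": [1, 2],
    "Title": ["A", "AB"],
    "HIF": ["HIF-1101", "HIF-1102"],
})
# The first two rows are from the question; the third row is HYPOTHETICAL.
df2 = pd.DataFrame({
    "HIF": ["HIF-1101 CD42", "HIF-1102 JH96", "HIF-9999 ZZ00"],
    "Date": ["01/12/19", "01/14/19", "01/15/19"],
    "Type": ["Image", "Image", "Image"],
})

# One alternation built from the keys that df1 already knows about.
pattern = "|".join(re.escape(k) for k in df1["HIF"])

# findall returns a list per row; .str[0] takes the first hit, or NaN if none.
found = df2["HIF"].str.findall(pattern).str[0]
print(found.isna().tolist())  # [False, False, True]

merged = df1.merge(df2.assign(HIF=found), on="HIF")
print(len(merged))  # 2
```

Unlike a generic pattern such as HIF-\w{4}, this only recognises keys that already exist in df1, so rows of df2 with other keys end up as NaN and drop out of the inner merge.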