Count number of rows containing string per index using pandas
I have a data set like this:
index sentence
1 bobby went to the gym
1 sally the bad
1 days are good
2 sunny side up
2 the weird
I want to count how many times 'the' appears in the column 'sentence', grouped by index:
index count_the
1 2
2 1
How would I do this in pandas?
python-3.x pandas
asked Mar 27 at 18:17
song0089
4 Answers
import pandas as pd

df = pd.DataFrame({'index': [1,1,1,2,2], 'sentence': ['bobby went to the gym','sally the bad','days are good','sunny side up','the weird']})
# count occurrences of 'the' in each row, then sum the counts per index value
df['counts'] = df['sentence'].str.count('the')
print(df.groupby('index')['counts'].sum())
answered Mar 27 at 18:30
Akhilesh
Thank you! This worked well. – song0089, Mar 27 at 19:33
Welcome @song0089 – Akhilesh, Mar 28 at 2:47
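For readers who want the exact output layout from the question (an `index` column next to a `count_the` column), the same idea can be sketched without the temporary `counts` column. This is only an illustrative variant of the answer above, not the answerer's code; the variable name `count_the` is taken from the question's desired output, and the frame is rebuilt by hand from the question's sample.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    'index': [1, 1, 1, 2, 2],
    'sentence': ['bobby went to the gym', 'sally the bad', 'days are good',
                 'sunny side up', 'the weird'],
})

# Count occurrences of 'the' in each row, then sum those counts per index value.
count_the = (
    df['sentence'].str.count('the')
      .groupby(df['index'])
      .sum()
      .reset_index(name='count_the')
)
print(count_the)
```

Grouping the per-row counts by `df['index']` keeps the original frame untouched, which may be preferable when `df` is reused later.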
First use GroupBy.apply to join the sentences of each group, then use Series.str.count:
df = df.groupby('index').sentence.apply(' '.join).reset_index()
print(df)
index sentence
0 1 bobby went to the gym sally the bad days are good
1 2 sunny side up the weird
df['count_the'] = df.sentence.str.count('the')
print(df.drop(['sentence'],axis=1))
index count_the
0 1 2
1 2 1
answered Mar 27 at 18:28
Erfan
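One way is with findall; note that I treat the index column as the DataFrame index here (e.g. after df = df.set_index('index')):
df.sentence.str.findall(r'\bthe\b').str.len().groupby(level=0).sum()  # older pandas also accepted .sum(level=0), removed in 2.0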
Out[363]:
index
1 2
2 1
Name: sentence, dtype: int64
answered Mar 27 at 18:41
WeNYoBen
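The word-boundary pattern in the answer above matters when 'the' can occur inside longer words. The following is a small sketch of that difference; the sentences are hypothetical and were chosen only to make the two counts diverge, they are not from the question.

```python
import pandas as pd

# Hypothetical sentences in which 'the' also occurs inside longer words.
s = pd.Series(['the other theme', 'bathe in the sun'], index=[1, 2])

# Plain substring pattern: matches 'the' anywhere, including inside words.
substring_hits = s.str.count('the')
# Word-boundary pattern: matches only the standalone word 'the'.
whole_word_hits = s.str.count(r'\bthe\b')

print(substring_hits.tolist())   # [3, 2]
print(whole_word_hits.tolist())  # [1, 1]
```

On the question's sample data both patterns give the same result, so the choice only matters for text where such longer words appear.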
You can also use groupby() + apply():
df.groupby('index').apply(lambda x: x['sentence'].str.contains(r'.*the').sum()).reset_index(name='count_the')
or groupby() + agg():
df.groupby('index').agg({'sentence': lambda x: x.str.contains(r'.*the').sum()}).rename(columns={'sentence': 'count_the'}).reset_index()
edited Mar 27 at 20:24
answered Mar 27 at 19:57
Loochie
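Note that `str.contains` answers a slightly different question than `str.count`: it counts rows that contain the word at least once, which matches the question's title, while `str.count` totals every occurrence. A minimal sketch of the difference follows, assuming hypothetical data in which one sentence mentions 'the' twice; the names `rows_with_the` and `occurrences` are made up for illustration.

```python
import pandas as pd

# Hypothetical data: the first sentence contains 'the' twice.
df = pd.DataFrame({
    'index': [1, 1, 2],
    'sentence': ['the cat and the dog', 'days are good', 'the weird'],
})

grouped = df.groupby('index')['sentence']

# Number of rows per group that contain 'the' at least once.
rows_with_the = grouped.agg(lambda s: s.str.contains('the').sum())
# Total number of occurrences of 'the' per group.
occurrences = grouped.agg(lambda s: s.str.count('the').sum())

print(rows_with_the.tolist())  # [1, 1]
print(occurrences.tolist())    # [2, 1]
```

On the question's own sample every sentence contains 'the' at most once, so both approaches return the same numbers there.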