get dataframe row count based on conditions
I want to get the count of dataframe rows based on conditional selection. I tried the following code.
print(df[(df.IP == head.idxmax()) & (df.Method == 'HEAD') & (df.Referrer == '"-"')].count())
output:
IP 57
Time 57
Method 57
Resource 57
Status 57
Bytes 57
Referrer 57
Agent 57
dtype: int64
The output shows the count for each and every column in the dataframe. Instead, I need a single count of the rows where all of the above conditions are satisfied. How can I do this? If you need more explanation about my dataframe, please let me know.
python pandas
asked Jun 26 '13 at 13:56
Nilani Algiriyage
2 Answers
You want the number of rows for which all of the conditions are true, so the len of the filtered frame is the answer, unless I misunderstand what you are asking.
In [16]: import pandas as pd
import numpy as np
In [17]: df = pd.DataFrame(np.random.randn(20,4),columns=list('ABCD'))
In [18]: df[(df['A']>0) & (df['B']>0) & (df['C']>0)]
Out[18]:
           A         B         C         D
12  0.491683  0.137766  0.859753 -1.041487
13  0.376200  0.575667  1.534179  1.247358
14  0.428739  1.539973  1.057848 -1.254489
In [19]: df[(df['A']>0) & (df['B']>0) & (df['C']>0)].count()
Out[19]:
A    3
B    3
C    3
D    3
dtype: int64
In [20]: len(df[(df['A']>0) & (df['B']>0) & (df['C']>0)])
Out[20]: 3
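As a minimal sketch of why count() and len() differ, the snippet below builds a small, made-up access-log frame. The column names follow the question, but the values (and the extra NaN in Bytes) are assumptions for illustration only. count() reports non-null values per column, while len() and shape[0] report the number of matching rows.

```python
import numpy as np
import pandas as pd

# Hypothetical access-log rows; column names follow the question, values are made up.
logs = pd.DataFrame({
    "IP": ["10.0.0.1", "10.0.0.1", "10.0.0.2", "10.0.0.1"],
    "Method": ["HEAD", "HEAD", "GET", "HEAD"],
    "Referrer": ['"-"', '"-"', '"-"', '"http://example.com"'],
    "Bytes": [120.0, np.nan, 300.0, 50.0],
})

# Combine the conditions into one boolean mask, then filter once.
mask = (
    (logs["IP"] == "10.0.0.1")
    & (logs["Method"] == "HEAD")
    & (logs["Referrer"] == '"-"')
)
matched = logs[mask]

per_column = matched.count()   # Series: non-null values in each column
n_rows = len(matched)          # single int: number of matching rows
n_rows_alt = matched.shape[0]  # same number, read from the frame's shape

print(per_column)
print(n_rows, n_rows_alt)
```

Because count() skips missing values, its per-column numbers can be smaller than the row count when the filtered frame contains NaN, which is another reason to prefer len() or shape[0] for a plain row count.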
answered Jun 26 '13 at 14:14
Jeff
Yes! That is what I wanted :) Thanks very much!
– Nilani Algiriyage, Jun 26 '13 at 14:39
Which one is faster? len(df[(df['A']>0)]) or sum(df['A']>0)?
– Leandro Lima, Dec 25 '17 at 17:08
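One way to check the question in the comment above is to measure it on your own data. The sketch below is an assumption-laden illustration, not a benchmark: the helper name compare_counts, the frame size, and the repeat count are all hypothetical choices, and results will vary by machine and pandas version. It also times the Series method mask.sum(), which is not mentioned in the thread but is a standard pandas call.

```python
import timeit

import numpy as np
import pandas as pd


def compare_counts(n_rows=50_000, repeats=5, seed=0):
    """Return {label: (count, best_seconds)} for three ways of counting rows."""
    rng = np.random.default_rng(seed)
    df = pd.DataFrame(rng.standard_normal((n_rows, 4)), columns=list("ABCD"))

    candidates = {
        "len(df[mask])": lambda: len(df[df["A"] > 0]),
        "builtin sum(mask)": lambda: sum(df["A"] > 0),
        "mask.sum()": lambda: (df["A"] > 0).sum(),
    }

    results = {}
    for label, fn in candidates.items():
        # Best of several single runs, to reduce noise from one-off timings.
        best = min(timeit.repeat(fn, number=1, repeat=repeats))
        results[label] = (int(fn()), best)
    return results


for label, (count, seconds) in compare_counts().items():
    print(f"{label:>18}: count={count}, best={seconds * 1e3:.3f} ms")
```

All three expressions return the same count; only the time taken differs, so it is worth running this against a frame of realistic size before choosing one for performance reasons.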
For better performance, you do not need to index the DataFrame with your predicate. You can count the True values in the predicate's boolean result directly, as illustrated below:
In [1]: import pandas as pd
import numpy as np
df = pd.DataFrame(np.random.randn(20,4),columns=list('ABCD'))
In [2]: df.head()
Out[2]:
          A         B         C         D
0 -2.019868  1.227246 -0.489257  0.149053
1  0.223285 -0.087784 -0.053048 -0.108584
2 -0.140556 -0.299735 -1.765956  0.517803
3 -0.589489  0.400487  0.107856  0.194890
4  1.309088 -0.596996 -0.623519  0.020400
In [3]: %time sum((df['A']>0) & (df['B']>0))
CPU times: user 1.11 ms, sys: 53 µs, total: 1.16 ms
Wall time: 1.12 ms
Out[3]: 4
In [4]: %time len(df[(df['A']>0) & (df['B']>0)])
CPU times: user 1.38 ms, sys: 78 µs, total: 1.46 ms
Wall time: 1.42 ms
Out[4]: 4
Keep in mind that this technique only gives you the number of rows that satisfy your predicate; it does not return the rows themselves.
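The idea of counting on the mask itself can be sketched as follows. The seeded random frame and the variable names are assumptions for illustration; mask.sum() and np.count_nonzero are standard pandas and NumPy calls that are not used in the answer above but count the same True values.

```python
import numpy as np
import pandas as pd

# Seeded random data so the example is repeatable; the values are arbitrary.
rng = np.random.default_rng(42)
df = pd.DataFrame(rng.standard_normal((20, 4)), columns=list("ABCD"))

# The predicate evaluates to a boolean Series with one entry per row.
mask = (df["A"] > 0) & (df["B"] > 0)

n_via_filter = len(df[mask])                         # builds the filtered frame first
n_via_builtin = int(sum(mask))                       # Python-level sum over the booleans
n_via_method = int(mask.sum())                       # vectorized Series.sum()
n_via_numpy = int(np.count_nonzero(mask.to_numpy())) # NumPy count of True values

print(n_via_filter, n_via_builtin, n_via_method, n_via_numpy)
```

True is treated as 1 and False as 0 when summing, so every variant yields the same number. The mask-based variants skip building the filtered frame, which is why they only help when the count is all you need.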
answered Jun 27 '18 at 10:27
Enias Cailliau