get dataframe row count based on conditions

I want to count the DataFrame rows that match a conditional selection. I tried the following code:

print(df[(df.IP == head.idxmax()) & (df.Method == 'HEAD') & (df.Referrer == '"-"')].count())

output:

IP          57
Time        57
Method      57
Resource    57
Status      57
Bytes       57
Referrer    57
Agent       57
dtype: int64

The output shows the count for each and every column in the DataFrame. Instead, I need a single count of the rows where all of the above conditions are satisfied. How can I do this? If you need more explanation about my DataFrame, please let me know.

asked Jun 26 '13 at 13:56

Nilani Algiriyage

python pandas

2 Answers

You are asking for the number of rows where all the conditions are true,
so the len of the filtered frame is the answer, unless I misunderstand what you are asking.

In [17]: df = pd.DataFrame(np.random.randn(20, 4), columns=list('ABCD'))  # with numpy as np, pandas as pd

In [18]: df[(df['A']>0) & (df['B']>0) & (df['C']>0)]
Out[18]: 
           A         B         C         D
12  0.491683  0.137766  0.859753 -1.041487
13  0.376200  0.575667  1.534179  1.247358
14  0.428739  1.539973  1.057848 -1.254489

In [19]: df[(df['A']>0) & (df['B']>0) & (df['C']>0)].count()
Out[19]: 
A    3
B    3
C    3
D    3
dtype: int64

In [20]: len(df[(df['A']>0) & (df['B']>0) & (df['C']>0)])
Out[20]: 3
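The difference between count() and len can be made more visible with a small sketch. The frame below is hypothetical (it is not the asker's log data) and deliberately contains one missing value: count() reports the non-null entries per column, so it returns a Series and the numbers can differ between columns, while len and shape[0] report the number of rows.

```python
import numpy as np
import pandas as pd

# Hypothetical data for illustration: column B has one missing value.
df = pd.DataFrame({
    "A": [1.0, 2.0, -1.0, 3.0],
    "B": [0.5, np.nan, 2.0, 1.5],
})

# Boolean indexing keeps the rows where the condition holds (rows 0, 1, 3).
selected = df[df["A"] > 0]

per_column = selected.count()   # Series: non-null values in each column
n_rows = len(selected)          # plain int: number of selected rows
n_rows_alt = selected.shape[0]  # same number, read from the shape tuple

print(per_column)
print(n_rows, n_rows_alt)
```

Here per_column holds 3 for A but only 2 for B, because the missing value is not counted, whereas both row counts are 3. If a single number is what you need, len or shape[0] is probably the safer choice.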

answered Jun 26 '13 at 14:14

Jeff

Yes! That is what I wanted :) Thanks very much!

– Nilani Algiriyage
Jun 26 '13 at 14:39

Which one is faster? len(df[(df['A']>0)]) or sum(df['A']>0)?

– Leandro Lima
Dec 25 '17 at 17:08

For better performance, do not index the DataFrame with your predicate just to count rows. You can sum the boolean result of the predicate directly, as illustrated below:

In [1]: import pandas as pd
   ...: import numpy as np
   ...: df = pd.DataFrame(np.random.randn(20, 4), columns=list('ABCD'))

In [2]: df.head()
Out[2]:
          A         B         C         D
0 -2.019868  1.227246 -0.489257  0.149053
1  0.223285 -0.087784 -0.053048 -0.108584
2 -0.140556 -0.299735 -1.765956  0.517803
3 -0.589489  0.400487  0.107856  0.194890
4  1.309088 -0.596996 -0.623519  0.020400

In [3]: %time sum((df['A']>0) & (df['B']>0))
CPU times: user 1.11 ms, sys: 53 µs, total: 1.16 ms
Wall time: 1.12 ms
Out[3]: 4

In [4]: %time len(df[(df['A']>0) & (df['B']>0)])
CPU times: user 1.38 ms, sys: 78 µs, total: 1.46 ms
Wall time: 1.42 ms
Out[4]: 4

Keep in mind that this technique only works for counting the rows that satisfy your predicate; it does not give you the rows themselves.
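A single %time run on 20 rows is quite noisy, so the sketch below repeats the comparison on a larger frame with timeit. The frame size, the seed and the number of repetitions are arbitrary assumptions. It also includes two variants not shown above, the Series .sum() method and np.count_nonzero, which stay in vectorised code, whereas the builtin sum iterates over the Series element by element.

```python
import timeit

import numpy as np
import pandas as pd

# Assumed setup: 100,000 random rows, fixed seed so the run is repeatable.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.standard_normal((100_000, 4)), columns=list("ABCD"))

# The predicate on its own: a boolean Series with one entry per row.
mask = (df["A"] > 0) & (df["B"] > 0)

n_method = int(mask.sum())                        # vectorised Series.sum()
n_numpy = int(np.count_nonzero(mask.to_numpy()))  # NumPy count of True values
n_builtin = sum(mask)                             # Python-level loop over the Series
n_len = len(df[mask])                             # builds the filtered frame first

candidates = {
    "mask.sum()": lambda: mask.sum(),
    "np.count_nonzero": lambda: np.count_nonzero(mask.to_numpy()),
    "sum(mask)": lambda: sum(mask),
    "len(df[mask])": lambda: len(df[mask]),
}
for label, func in candidates.items():
    seconds = timeit.timeit(func, number=5)
    print(f"{label:18s} {seconds / 5 * 1e3:.3f} ms per call")
```

All four expressions return the same count. The timings depend on the machine, the pandas version and the frame size, so treat the printed numbers as a rough guide rather than a ranking.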

answered Jun 27 '18 at 10:27

Enias Cailliau
