Count and compare occurrences across different columns in different spreadsheets


I would like to know how to count occurrences in Python and compare values from different columns in different spreadsheets. After counting, I need to check whether those values fulfill a condition. For example, if Ana (a user) from the first spreadsheet appears 1 time in the second spreadsheet and 5 times in the third one, I would like to add 1 to a variable X.

I am new to Python, but I have tried getting the .values() after using Counter from collections. However, I am not sure whether the actual value (Ana) is taken into account when iterating over the results of the Counter. All in all, I need to iterate over each element of the first spreadsheet and check whether it appears once in the second spreadsheet and five times in the third; if so, X is incremented by one.

import csv
from collections import Counter

def XInputOutputs():
    # file1 and file2 are the paths of the CSV files, defined elsewhere

    list1 = []
    with open(file1, 'r') as fr:
        r = csv.reader(fr)
        for row in r:
            list1.append(row[1])
    number_of_occurrences_in_list_1 = Counter(list1)
    list1_ocurrences = number_of_occurrences_in_list_1.values()

    list2 = []
    with open(file2, 'r') as fr:
        r = csv.reader(fr)
        for row in r:
            list2.append(row[1])
    number_of_occurrences_in_list_2 = Counter(list2)
    list2_ocurrences = number_of_occurrences_in_list_2.values()

    X = 0

    # zip pairs the counts by position, not by the name they belong to
    for x, y in zip(list1_ocurrences, list2_ocurrences):
        if x == 1 and y == 5:
            X += 1

    return X

I tested this with small spreadsheets, but it only works for pre-ordered values. If Ana appears after 100000 rows, everything breaks. I think I need to iterate over each value (e.g. Ana), check it in all the spreadsheets at once, and then increment X.

Thanks.

asked Mar 25 at 15:43

Luis Puma

11 reputation, 1 bronze badge

1

I suggest looking into Pandas

– mauve
Mar 25 at 15:47

Tags: python, iteration, counter, spreadsheet


1 Answer

I am at work, so I can only write a full answer later.
If you can import modules, I suggest trying pandas: a very useful tool for managing data frames quickly and efficiently.
You can easily import a .csv spreadsheet with the read_csv method:

import pandas as pd

df = pd.read_csv("your_file.csv")  # placeholder path, replace it with your own file

and then perform almost any kind of operation on it.

Check out this answer; I have had little time to read it, but I hope it helps:

what is the most efficient way of counting occurrences in pandas?
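As a rough illustration of counting occurrences in pandas, the sketch below uses value_counts on a small in-memory CSV. The column name "user" and the sample rows are made up for the example, so adapt them to your own file.

```python
import io

import pandas as pd

# Hypothetical in-memory CSV standing in for a real file;
# the "user" column name is an assumption for this example.
csv_text = "id,user\n1,Ana\n2,Bob\n3,Ana\n"
df = pd.read_csv(io.StringIO(csv_text))

# value_counts() returns a Series indexed by the distinct values,
# so each count is looked up by name rather than by position.
counts = df["user"].value_counts()

ana_count = int(counts.get("Ana", 0))      # name present in the column
missing_count = int(counts.get("Zoe", 0))  # absent name falls back to 0

print(ana_count, missing_count)
```

Because the counts are keyed by name, the order of the rows in the file should not matter.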

UPDATE: then try this

# not tested but should work

import os
import pandas as pd

# read every CSV file in the folder - I assume your folder is named "CSVs"
folder = "CSVs"
files = [f for f in os.listdir(folder) if f.endswith(".csv")]

# build a list of dataframes, one per file
df_list = []
for file in files:
    df = pd.read_csv(os.path.join(folder, file))
    df_list.append(df)

name_i_wanna_count = ""  # this will be your query
column_name = ""  # insert the column you want to analyze here
count = 0

for df in df_list:
    # keep the rows matching your query, then count them
    matching_rows = df.loc[df[column_name] == name_i_wanna_count]
    partial_count = len(matching_rows)
    count = count + partial_count

print(count)
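The loop above totals one name across every file. For the check described in the question (a name from the first file that appears exactly once in the second file and five times in the third), a minimal sketch with csv and collections.Counter could look like the following. The helper names count_column and count_matches are hypothetical, and so are the assumptions that the name sits in the second column and that each distinct name is counted at most once.

```python
import csv
from collections import Counter


def count_column(path, column=1):
    """Count how often each value occurs in one column of a CSV file."""
    with open(path, newline="") as fh:
        # Rows too short to have the requested column are skipped.
        return Counter(row[column] for row in csv.reader(fh) if len(row) > column)


def count_matches(names, counts_2, counts_3, want_2=1, want_3=5):
    """Count the distinct names seen want_2 times in file 2 and want_3 times in file 3."""
    # A Counter returns 0 for a missing key, so names absent from a file
    # simply fail the comparison instead of raising KeyError.
    return sum(
        1
        for name in set(names)
        if counts_2[name] == want_2 and counts_3[name] == want_3
    )
```

With three paths file1, file2 and file3 (the third one is not in the question's code, so it is assumed here), the result would be X = count_matches(count_column(file1), count_column(file2), count_column(file3)). Each count is looked up by name, so the row order in the files does not affect the result.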

I hope it helps

edited Mar 26 at 0:09

answered Mar 25 at 15:51

Michele Rava

73 reputation, 1 silver badge, 10 bronze badges


