Count and compare occurrences across different columns in different spreadsheets


I would like to know how to count occurrences in Python and compare values from different columns in different spreadsheets. After counting, I need to check whether those values fulfill a condition. For example, if Ana (a user) from the first spreadsheet appears 1 time in the second spreadsheet and 5 times in the third one, I would like to add 1 to a variable X.



I am new to Python, but I have tried getting the .values() after using Counter from collections. However, I am not sure whether the actual value (Ana) is taken into account when iterating over the results of the Counter. All in all, I need to iterate over each element in the first spreadsheet and check whether it appears once in the second spreadsheet and five times in the third. If it does, X is incremented by one.



    import csv
    from collections import Counter

    def XInputOutputs():
        # file1 and file2 are the paths of the CSV files, defined elsewhere
        list1 = []
        with open(file1, 'r') as fr:
            r = csv.reader(fr)
            for row in r:
                list1.append(row[1])
        number_of_occurrences_in_list_1 = Counter(list1)
        list1_occurrences = number_of_occurrences_in_list_1.values()

        list2 = []
        with open(file2, 'r') as fr:
            r = csv.reader(fr)
            for row in r:
                list2.append(row[1])
        number_of_occurrences_in_list_2 = Counter(list2)
        list2_occurrences = number_of_occurrences_in_list_2.values()

        X = 0

        for x, y in zip(list1_occurrences, list2_occurrences):
            if x == 1 and y == 5:
                X += 1

        return X


I tested with small spreadsheets, but this only works when the values appear in the same order in every file. If Ana appears after 100000 rows, everything breaks. I think I need to take each value (Ana), check it in all the spreadsheets at once, and then update X.



Thanks.










I suggest looking into Pandas – mauve, Mar 25 at 15:47















python iteration counter spreadsheet






asked Mar 25 at 15:43 by Luis Puma







1 Answer
I am at work, so I can only write a full answer later.
If you can import modules, I suggest trying pandas, a very useful library for managing data frames quickly and efficiently.
You can import a .csv spreadsheet with the pd.read_csv function

    import pandas as pd

    # "spreadsheet.csv" is a placeholder; pass the path to your own file
    df = pd.read_csv("spreadsheet.csv")

and then perform almost any kind of operation on the resulting data frame.



Check out this answer. I have had little time to read it, but I hope it helps:

what is the most efficient way of counting occurrences in pandas?
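As a minimal sketch of what counting occurrences in pandas can look like, the snippet below uses value_counts() on one column. The column names ("id", "user") and the in-memory CSV text are assumptions made up for illustration; they are not taken from the question's files.

```python
import io

import pandas as pd

# Hypothetical in-memory CSV standing in for a real spreadsheet file.
csv_text = "id,user\n1,Ana\n2,Bob\n3,Ana\n4,Cid\n"
df = pd.read_csv(io.StringIO(csv_text))

# value_counts() returns a Series indexed by the distinct values,
# so every count stays attached to the name it belongs to.
counts = df["user"].value_counts()

# Look a name up by label; fall back to 0 if it never appears.
ana_count = int(counts.get("Ana", 0))
print(ana_count)
```

Because the counts are looked up by name rather than by position, the order of the rows in the file should not matter.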



UPDATE: try this instead:



    # not tested but should work

    import os
    import pandas as pd

    # read all csv sheets from folder - I assume your folder is named "CSVs"
    folder = "CSVs"

    # build a list of dataframes, one per CSV file in the folder
    df_list = []
    for file_name in sorted(os.listdir(folder)):
        if file_name.endswith(".csv"):
            df = pd.read_csv(os.path.join(folder, file_name))
            df_list.append(df)

    name_i_wanna_count = ""  # this will be your query
    column_name = ""  # here insert the column you want to analyze
    count = 0

    for df in df_list:
        # select the rows matching your query, then count them
        matching_rows = df.loc[df[column_name] == name_i_wanna_count]
        partial_count = len(matching_rows)
        count = count + partial_count

    print(count)


I hope it helps
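If pandas is not an option, the per-name check described in the question can also be sketched with Counter alone, by looking each name up in the counters instead of zipping their .values(). This is only a sketch under assumptions: the helper names count_column and count_matching_users are hypothetical, the sample data is invented, the name is assumed to sit in the second column (as in row[1]), and each distinct name from the first spreadsheet is assumed to be counted once.

```python
import csv
import io
from collections import Counter


def count_column(csv_file, column=1):
    """Count how often each value appears in one column of a CSV file object."""
    return Counter(row[column] for row in csv.reader(csv_file) if len(row) > column)


def count_matching_users(users, counts_2, counts_3, want_2=1, want_3=5):
    """Count distinct users seen want_2 times in sheet 2 and want_3 times in sheet 3."""
    return sum(
        1
        for user in set(users)
        # A Counter returns 0 for a missing key, so absent users never match.
        if counts_2[user] == want_2 and counts_3[user] == want_3
    )


# Hypothetical sample data; real code would pass open(...) file objects instead.
sheet_1 = io.StringIO("1,Ana\n2,Bob\n")
sheet_2 = io.StringIO("1,Bob\n2,Bob\n3,Ana\n")
sheet_3 = io.StringIO("1,Ana\n" * 5 + "2,Bob\n")

users = count_column(sheet_1)  # the keys are the distinct user names
X = count_matching_users(users, count_column(sheet_2), count_column(sheet_3))
print(X)
```

Here only Ana appears once in the second sheet and five times in the third, regardless of where her rows are located in the files.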






answered Mar 25 at 15:51 by Michele Rava, edited Mar 26 at 0:09















