How to merge values in columnB based on values in columnA

I have an xlsx file that looks like this:

    Company  N
    A        1234;878;3434
    A        5678;873
    B        539
    B        00;123
    C        155;741;655
    C        5377;454

I'm using pandas to import it into my program. Can I merge the N values by company?

Desired outcome: {'A': [1234, 878, 3434, 5678, 873], 'B': [539, 00, 123], 'C': [155, 741, 655, 5377, 454]}

Tags: python, excel, python-3.x, pandas


asked Mar 24 at 16:26 by Alex; edited Mar 24 at 17:24 by anky_91

2 Answers

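Group by Company, split each N string on ';', flatten the pieces into one list per company, and turn the result into a dict:

    import itertools

    # Split each N cell on ';', flatten per company, convert the pieces to int
    df.groupby('Company')['N'].apply(
        lambda s: list(map(int, itertools.chain.from_iterable(s.str.split(';'))))
    ).to_dict()

This gives:

    {'A': [1234, 878, 3434, 5678, 873],
     'B': [539, 0, 123],
     'C': [155, 741, 655, 5377, 454]}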


You can also use sum to concatenate the lists, but it is not recommended for large data: every addition copies the list built so far, so it gets slow as the data grows. itertools.chain is the better choice.
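A small sketch of the two ways to flatten, assuming "sum" here means list concatenation with the built-in `sum` and an empty-list start value. The `pieces` list is made-up sample input matching company A's two cells after splitting.

```python
import itertools

# Per-row lists, as produced by splitting company A's two cells on ';'.
pieces = [["1234", "878", "3434"], ["5678", "873"]]

# sum() with an empty-list start value concatenates, but it builds a
# brand-new list at every step, which is quadratic in the total length.
via_sum = sum(pieces, [])

# chain.from_iterable walks the inner lists lazily; list() copies each element once.
via_chain = list(itertools.chain.from_iterable(pieces))

print(via_sum == via_chain)
```

Both produce the same flat list; the difference only shows up in running time on larger inputs.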



EDIT: to keep only the first two digits of each element, slice each string before converting it:

    import itertools

    # k[:2] keeps the first two characters of each piece before int() is applied
    df.groupby('Company')['N'].apply(
        lambda s: list(map(int, [k[:2] for k in itertools.chain.from_iterable(s.str.split(';'))]))
    ).to_dict()

This outputs:

    {'A': [12, 87, 34, 56, 87], 'B': [53, 0, 12], 'C': [15, 74, 65, 53, 45]}
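The slicing step works on the strings, before the conversion to int. A minimal sketch on a single cell from the sample data (the variable names are illustrative only):

```python
cell = "5377;454"  # one N cell from the question's sample data

parts = cell.split(";")                # ['5377', '454']
first_two = [p[:2] for p in parts]     # slice the strings, not the numbers
as_ints = [int(p) for p in first_two]  # convert after slicing

print(as_ints)
```

Slicing a string shorter than two characters simply returns the whole string, and a piece such as "00" still converts to 0.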


Note the use of map() here: it converts the list elements from strings to ints. Because the column's original dtype is string and str.split() returns lists of strings, the elements stay strings until they are converted.
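To make the role of each call visible, here is a short pure-Python sketch on company B's two cells (the `rows` list is sample input, not part of the answer):

```python
import itertools

rows = ["539", "00;123"]  # company B's cells

# chain.from_iterable returns a lazy iterator over the split pieces.
flat = itertools.chain.from_iterable(r.split(";") for r in rows)

as_strings = list(flat)               # list() materializes the iterator
as_ints = list(map(int, as_strings))  # map() applies int to every element

print(as_strings)
print(as_ints)
```

If the values can stay as strings, `map()` can be dropped. `list()` is still needed to get an actual list, because `chain.from_iterable` returns an iterator rather than a list.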






• Hi, thanks for your great solution! Can you explain the map() part please? Also, how do I slice it if I only want to keep the first 2 digits? Ex. 'A': [12, 87, 34, 56, 87], 'B': [53, 0, 12], 'C': [15, 74, 65, 53, 45]? – Alex, Mar 25 at 14:11

• @Alex check updated answer under EDIT. Hope it helps. :) – anky_91, Mar 25 at 14:42

• It helps! Thank you so much! So if I don't need to convert string to int, I don't need to use map() and list() since it's already a list? – Alex, Mar 27 at 13:51

• @Alex yes, exactly. – anky_91, Mar 27 at 13:52

• Thanks for being so patient with me! Hope you have a blessed day! – Alex, Mar 27 at 13:58


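You can read the xlsx file and convert the DataFrame into a list of row dictionaries with the code below:

    import pandas as pd

    # 'companies.xlsx' is a placeholder file name; dtype=str keeps N as text
    xls_data = pd.read_excel('companies.xlsx', dtype=str)
    xls_dict = xls_data.to_dict('records')  # one dict per row
    print(xls_dict)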
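For reference, this is what the 'records' orientation produces. The sketch builds a two-row DataFrame in memory instead of reading a file, so the data here is illustrative only.

```python
import pandas as pd

# Two sample rows standing in for the spreadsheet contents.
xls_data = pd.DataFrame({"Company": ["A", "B"], "N": ["1234;878;3434", "539"]})

# 'records' gives a list with one {column: value} dict per row.
records = xls_data.to_dict("records")
print(records)
```

Each row becomes its own dictionary keyed by column name, which is why the loop that follows can look up 'Company' and 'N' on every item.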


Then you can generate the required output with the code below:

    output_dict = dict()

    for xls_dat in xls_dict:
        if 'N' in xls_dat:
            company = xls_dat.get('Company')
            # Split the ';'-separated string and convert each piece to int
            values = [int(i) for i in xls_dat.get('N').split(';')]
            if company in output_dict:
                # Company seen before: extend its existing list
                output_dict[company] = output_dict[company] + values
            else:
                # First row for this company
                output_dict[company] = values

Output:

    {'A': [1234, 878, 3434, 5678, 873], 'B': [539, 0, 123], 'C': [155, 741, 655, 5377, 454]}
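As a variation on the loop above (not the answer's own code), `collections.defaultdict` can replace the explicit if/else. The sketch below assumes the rows are already a list of dictionaries, as `to_dict('records')` would return, and hard-codes them so it runs without pandas.

```python
from collections import defaultdict

# Rows in the shape returned by DataFrame.to_dict('records').
rows = [
    {"Company": "A", "N": "1234;878;3434"},
    {"Company": "A", "N": "5678;873"},
    {"Company": "B", "N": "539"},
    {"Company": "B", "N": "00;123"},
    {"Company": "C", "N": "155;741;655"},
    {"Company": "C", "N": "5377;454"},
]

# A missing key starts out as an empty list, so no membership check is needed.
merged = defaultdict(list)
for row in rows:
    merged[row["Company"]].extend(int(piece) for piece in row["N"].split(";"))

output_dict = dict(merged)  # back to a plain dict for printing or comparison
print(output_dict)
```

Using `extend` also avoids building a new list with `+` on every row.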





Answer 1 (groupby with itertools.chain): answered Mar 24 at 16:39 by anky_91, edited Mar 25 at 14:42.

Answer 2 (loop over row records): answered Mar 24 at 17:33 by Dinesh.
