How to merge values in columnB based on values in columnA

I have an xlsx file that looks like this:

Company N
A 1234;878;3434
A 5678;873
B 539
B 00;123
C 155;741;655
C 5377;454

I'm using pandas to import it into my program. Can I merge the values in N by company?

Desired outcome: {'A': [1234, 878, 3434, 5678, 873], 'B': [539, 00, 123], 'C': [155, 741, 655, 5377, 454]}

asked Mar 24 at 16:26 by Alex, edited Mar 24 at 17:24 by anky_91

python excel python-3.x pandas


2 Answers

Group by Company, split each N string on ';', flatten the resulting lists, and convert the result to a dict:

import itertools

# Select N per company, flatten the split lists, and convert each item to int.
(df.groupby('Company')['N']
   .apply(lambda s: list(map(int, itertools.chain.from_iterable(s.str.split(';')))))
   .to_dict())

{'A': [1234, 878, 3434, 5678, 873],
 'B': [539, 0, 123],
 'C': [155, 741, 655, 5377, 454]}

You can also use sum to concatenate the lists, but this is not recommended for large data: each addition copies the accumulated list, so itertools scales better.
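For reference, here is a minimal self-contained sketch of the approach above, together with the sum variant for comparison. The DataFrame is built inline from the question's sample rows, on the assumption that N arrives as text (in practice it would come from the xlsx file), and the helper names merge_with_chain and merge_with_sum are hypothetical.

```python
import itertools

import pandas as pd

# Sample rows from the question, built inline (assumption: N is read as text).
df = pd.DataFrame({
    "Company": ["A", "A", "B", "B", "C", "C"],
    "N": ["1234;878;3434", "5678;873", "539", "00;123", "155;741;655", "5377;454"],
})


def merge_with_chain(frame):
    """Flatten each company's split lists with itertools.chain."""
    grouped = frame.groupby("Company")["N"]
    return grouped.apply(
        lambda s: [int(v) for v in itertools.chain.from_iterable(s.str.split(";"))]
    ).to_dict()


def merge_with_sum(frame):
    """Same result via built-in sum(); every + copies the running list."""
    grouped = frame.groupby("Company")["N"]
    return grouped.apply(
        lambda s: [int(v) for v in sum(s.str.split(";"), [])]
    ).to_dict()


result = merge_with_chain(df)
print(result)
```

Both helpers should return the same dictionary on this sample; the difference only matters when a company has many rows, where the repeated copying in sum becomes noticeable.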

EDIT: to keep only the first two characters of each element, use:

import itertools

# Slice each string to its first two characters, then convert to int.
(df.groupby('Company')['N']
   .apply(lambda s: list(map(int, [k[:2] for k in itertools.chain.from_iterable(s.str.split(';'))])))
   .to_dict())

This outputs:

{'A': [12, 87, 34, 56, 87], 'B': [53, 0, 12], 'C': [15, 74, 65, 53, 45]}

Note the use of map() in both snippets: it converts the list elements from strings to ints. The original dtype of N is string and str.split() returns lists of strings, so the elements have to be converted explicitly.
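To make the role of map() concrete, the sketch below walks through one company's values in plain Python, with no pandas involved. The variable names are illustrative only, and the nested list stands in for what str.split(';') produces for company A. It also shows that list() is still needed when the int conversion is skipped, because itertools.chain.from_iterable returns a lazy iterator rather than a list.

```python
import itertools

# Stand-in for what str.split(';') yields for company A's two rows.
split_rows = [["1234", "878", "3434"], ["5678", "873"]]

# chain.from_iterable flattens lazily; list() materializes the iterator.
as_strings = list(itertools.chain.from_iterable(split_rows))

# map(int, ...) applies int to every string; list() materializes it again.
as_ints = list(map(int, itertools.chain.from_iterable(split_rows)))

# Slice each string to its first two characters before converting.
first_two = [int(s[:2]) for s in as_strings]

print(as_strings)
print(as_ints)
print(first_two)
```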

answered Mar 24 at 16:39, edited Mar 25 at 14:42 by anky_91

Hi, thanks for your great solution! Can you explain the map() part please? Also, how do I slice it if I only want to keep the first 2 digits? Ex. 'A': [12, 87, 34, 56, 87], 'B': [53, 0, 12], 'C': [15, 74, 65, 53, 45]?

– Alex
Mar 25 at 14:11

@Alex check updated answer under EDIT. Hope it helps. :)

– anky_91
Mar 25 at 14:42

It helps! Thank you so much! So if I don't need to convert string to int, I don't need to use map() and list() since it's already a list?

– Alex
Mar 27 at 13:51

@Alex yes, exactly.

– anky_91
Mar 27 at 13:52

Thanks for being so patient with me! Hope you have a blessed day!

– Alex
Mar 27 at 13:58

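You can read the xlsx file and convert your DataFrame into a list of dictionaries (one per row) using the code below:

import pandas as pd

# 'data.xlsx' is a placeholder path; dtype=str keeps the N column as text.
xls_data = pd.read_excel('data.xlsx', dtype=str)
xls_dict = xls_data.to_dict('records')
print(xls_dict)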

Then you can generate the required output with the code below:

output_dict = {}

for xls_dat in xls_dict:
    if 'N' in xls_dat:
        # Parse this row's semicolon-separated values as integers.
        values = [int(i) for i in xls_dat['N'].split(';')]
        # Extend the company's list, creating it the first time the company appears.
        output_dict.setdefault(xls_dat['Company'], []).extend(values)

Output:

{'A': [1234, 878, 3434, 5678, 873], 'B': [539, 0, 123], 'C': [155, 741, 655, 5377, 454]}
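The same loop can be tried without an Excel file. The sketch below hard-codes the records that to_dict('records') would produce for the question's sample (an assumption, with N as text) and uses collections.defaultdict in place of the explicit membership check. The helper name merge_records is hypothetical.

```python
from collections import defaultdict

# Assumed shape of to_dict('records') for the question's sample data.
records = [
    {"Company": "A", "N": "1234;878;3434"},
    {"Company": "A", "N": "5678;873"},
    {"Company": "B", "N": "539"},
    {"Company": "B", "N": "00;123"},
    {"Company": "C", "N": "155;741;655"},
    {"Company": "C", "N": "5377;454"},
]


def merge_records(rows):
    """Collect every row's integers under its company key."""
    merged = defaultdict(list)
    for row in rows:
        if "N" in row:
            # str() guards against cells that were read as numbers.
            merged[row["Company"]].extend(int(v) for v in str(row["N"]).split(";"))
    return dict(merged)


output_dict = merge_records(records)
print(output_dict)
```

A defaultdict creates the empty list on first access, so the if/else branch on whether the company has been seen is no longer needed.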

answered Mar 24 at 17:33 by Dinesh
