How to join multiple rows in single pandas dataframe by common key column (fixed length limit)?

How can you join multiple rows of a single pandas DataFrame by a common key column, with a fixed limit on how many rows are combined into each output row (the number of rows sharing a given key is variable)?

I have a DataFrame of the form:

key x1 x2 x3
-------------
1 a1 a2 a3
1 b1 b2 b3
2 c1 c2 c3
3 d1 d2 d3
3 e1 e2 e3
3 f1 f2 f3
3 g1 g2 g3
....

and would like to change it to something like

key x11 x12 x13 x21 x22 x23 x31 x32 x33
-------------
1 a1 a2 a3 b1 b2 b3 NA NA NA
2 c1 c2 c3 NA NA NA NA NA NA
3 d1 d2 d3 e1 e2 e3 f1 f2 f3
....

where column xjk is the kth feature of the jth row sharing a given key. Each output row holds up to a fixed maximum number of original rows per key (manually set to 3 here). This limit may change later and may exceed the number of rows available for a key (e.g. 5 here), in which case the remaining columns should be filled with NA. So when a key has fewer rows than the limit, the missing values are filled with NA; when it has more, only the first rows up to the limit are kept and the rest are dropped. Also note that an individual row may itself contain missing values.

Any suggestions on how this could be done?
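
For anyone who wants to reproduce this, the example frame above can be built roughly as follows. This is only a sketch: the values are the placeholders from the table, and `rows_per_key` is a name introduced here for illustration.

```python
import pandas as pd

# Sample data mirroring the table in the question (placeholder values).
df = pd.DataFrame(
    {
        "key": [1, 1, 2, 3, 3, 3, 3],
        "x1": ["a1", "b1", "c1", "d1", "e1", "f1", "g1"],
        "x2": ["a2", "b2", "c2", "d2", "e2", "f2", "g2"],
        "x3": ["a3", "b3", "c3", "d3", "e3", "f3", "g3"],
    }
)

# The number of rows per key varies, which is why a fixed limit is needed.
rows_per_key = df.groupby("key").size()
print(rows_per_key.to_dict())
```

Key 3 has four rows, so with a limit of 3 its last row (g1 g2 g3) would be dropped.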

edited Mar 28 at 1:50

asked Mar 27 at 22:16

lampShadesDrifter


python pandas


1 Answer
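Using groupby and then ravel to flatten all values inside a group:

import numpy as np
import pandas as pd

lim = 5  # maximum number of original rows to combine per key

df = df.set_index('key')
k = len(df.columns)  # number of feature columns per original row

# Flatten the first `lim` rows of each group and pad with NaN up to lim*k values
# (the padding list is empty when a group has lim or more rows).
x = df.groupby(level=0).apply(
    lambda z: z.iloc[:lim].values.ravel().tolist() +
    [np.nan]*(lim*k-z.size))

x = pd.DataFrame(x.tolist(), x.index)

# Label columns x{row}{feature}, both 1-based (f-strings need Python 3.6+)
x.columns = [f'x{1+i//k}{1+i%k}' for i in x.columns]

print(x)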

Output:

     x11  x12  x13  x21  x22  x23  x31  x32  x33  x41  x42  x43  x51  x52  x53
key
1     a1   a2   a3   b1   b2   b3  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN
2     c1   c2   c3  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN
3     d1   d2   d3   e1   e2   e3   f1   f2   f3   g1   g2   g3  NaN  NaN  NaN
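
As a possible alternative (not part of the answer above), the same reshape can be sketched with `groupby().cumcount()` and `unstack`, which avoids building Python lists per group. The helper name `widen`, the temporary `_pos` column, and the sample frame are assumptions made for this illustration.

```python
import pandas as pd

def widen(df, key="key", lim=3):
    """Hypothetical helper: put up to `lim` rows per key side by side."""
    features = [c for c in df.columns if c != key]
    # Position of each row within its key group: 0, 1, 2, ...
    pos = df.groupby(key).cumcount()
    # Keep only the first `lim` rows of every group.
    kept = df.assign(_pos=pos).loc[pos < lim]
    # Move the position into the columns: (feature, position) pairs.
    wide = kept.set_index([key, "_pos"])[features].unstack("_pos")
    # Order columns row by row and create any missing slots as NaN.
    order = [(f, j) for j in range(lim) for f in features]
    wide = wide.reindex(columns=pd.MultiIndex.from_tuples(order))
    # Label as x{row}{feature}, both 1-based, as in the question.
    wide.columns = [f"x{j + 1}{features.index(f) + 1}" for f, j in order]
    return wide

# Sample data mirroring the question (placeholder values).
df = pd.DataFrame(
    {
        "key": [1, 1, 2, 3, 3, 3, 3],
        "x1": ["a1", "b1", "c1", "d1", "e1", "f1", "g1"],
        "x2": ["a2", "b2", "c2", "d2", "e2", "f2", "g2"],
        "x3": ["a3", "b3", "c3", "d3", "e3", "f3", "g3"],
    }
)
out = widen(df, lim=3)
print(out)
```

With `lim=3` the fourth row for key 3 is dropped; with a larger limit such as `lim=5` the extra slots are filled with NaN.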

edited Mar 28 at 6:19

answered Mar 27 at 22:24

perl



wow, amazing answer

– Yuca
Mar 27 at 22:25


Thanks. A note for others using this answer: the string formatting in the last line (the column labels) only works in Python 3.6+; on Python 2.7, use 'x{}{}'.format((1+i//len(x)), (1+i%len(x))) instead.

– lampShadesDrifter
Mar 27 at 22:50

Sorry, you're right, I missed that requirement. I've updated my answer with a lim variable that sets this limit. We basically need to take the first lim rows of each group in the apply, with .iloc[:lim]

– perl
Mar 27 at 22:53

@lampShadesDrifter: And thanks, it's a very good point about the f-strings in python 3.6

– perl
Mar 27 at 22:57

Oddly, this code does not seem to work for me (using Python 2.7) on a test dataframe made to match the one in the original question. I get the column labels x11 x12 x13 x14 x15 x16 x17 x18 x19. I think the last line of the given code should be something like: x.columns = [f'x{1+i//len(df.columns)}{1+i%len(df.columns)}' for i in x.columns]. That then gave me the results shown in this answer.

– lampShadesDrifter
Mar 28 at 2:17
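
The labelling arithmetic discussed in these comments can be sketched without f-strings, so it should also run on older Python versions. `make_labels` is a hypothetical helper written for this illustration; `k` stands for the number of feature columns per original row, as in the answer.

```python
def make_labels(n_slots, k, prefix="x"):
    """Hypothetical helper: label flattened column i as x{row}{feature}.

    i // k is the original row within the group, i % k the feature,
    both shifted to be 1-based.
    """
    return ["{}{}{}".format(prefix, 1 + i // k, 1 + i % k) for i in range(n_slots)]

# 3 rows per key, 3 features each -> 9 flattened columns
labels = make_labels(9, 3)
print(labels)

# Dividing by the total number of flattened columns instead of k
# keeps every label in "row 1".
wrong = make_labels(9, 9)
print(wrong)
```

The second call produces x11 through x19, the same pattern reported in the comment above, which suggests the divisor in that test was not the number of feature columns.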
