NaN in pd.DataFrame (symmetrical matrix)

I've got a DataFrame like the one below. I'd like to remove the NaNs and shift the remaining cells up, then add a date column and set it as the index.

 ciao google microsoft
Search Volume 368000 NaN NaN
Search Volume 368000 NaN NaN
Search Volume 450000 NaN NaN
Search Volume 450000 NaN NaN
Search Volume 450000 NaN NaN
Search Volume 450000 NaN NaN
Search Volume NaN 37200000 NaN
Search Volume NaN 37200000 NaN
Search Volume NaN 37200000 NaN
Search Volume NaN 37200000 NaN
Search Volume NaN 37200000 NaN
Search Volume NaN 37200000 NaN
Search Volume NaN NaN 135000
Search Volume NaN NaN 135000
Search Volume NaN NaN 110000
Search Volume NaN NaN 110000
Search Volume NaN NaN 110000
Search Volume NaN NaN 110000

The output should be like:

date = ['20140115', '20140215', '20140315', '20140415', '20140515', '20140615']

date ciao google microsoft
20140115 368000 37200000 135000
20140215 368000 37200000 135000
20140315 450000 37200000 110000
20140415 450000 37200000 110000
20140515 450000 37200000 110000
20140615 450000 37200000 110000

It looks simple, but I don't know how to do it. Thanks.

asked Mar 25 at 16:42 by SkuPak
edited Mar 25 at 16:44 by wpercy

python pandas


5 Answers

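You could use apply with dropna:

# Drop the NaNs in each column and rebuild it on a fresh 0..n-1 index;
# fillna('') only matters if the columns end up with different lengths
df = df.apply(lambda x: pd.Series(x.dropna().values)).fillna('')
df['date'] = date
print(df)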

output:

 ciao google microsoft date 
 368000.0 37200000.0 135000.0 20140115 
 368000.0 37200000.0 135000.0 20140215 
 450000.0 37200000.0 110000.0 20140315 
 450000.0 37200000.0 110000.0 20140415 
 450000.0 37200000.0 110000.0 20140515 
 450000.0 37200000.0 110000.0 20140615
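
Note that the output above still holds floats and keeps date as an ordinary column, whereas the question asks for integers with date as the index. A minimal sketch of the remaining steps, assuming the sample frame is rebuilt as below (the names `nan` and `out` are mine, not from the answer):

```python
import numpy as np
import pandas as pd

nan = np.nan
# Sample frame rebuilt from the question: 18 rows sharing one index label
df = pd.DataFrame(
    {
        "ciao": [368000, 368000, 450000, 450000, 450000, 450000] + [nan] * 12,
        "google": [nan] * 6 + [37200000] * 6 + [nan] * 6,
        "microsoft": [nan] * 12 + [135000, 135000, 110000, 110000, 110000, 110000],
    },
    index=["Search Volume"] * 18,
)
date = ["20140115", "20140215", "20140315", "20140415", "20140515", "20140615"]

# Compact each column onto a fresh 0..n-1 index, then restore the integer dtype
out = df.apply(lambda col: pd.Series(col.dropna().values)).astype("int64")
# Use the dates as the index instead of adding them as a regular column
out.index = pd.Index(date, name="date")
print(out)
```

The cast to int64 only works here because no NaN is left after compaction; with columns of unequal length it would fail and the floats would have to stay.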

answered Mar 25 at 17:06 by Frenchy

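You can also use dropna on each column as a Series:

# One array of non-NaN values per column; transpose so they become columns again
df1 = pd.DataFrame(data=[df[i].dropna().values for i in df.columns]).T
df1.columns = df.columns  # the list-of-arrays constructor drops the column names
df1.index = date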

answered Mar 25 at 17:00 by G. Anderson

One tricky solution, which works because you have a duplicate index:

pd.concat([df[x].dropna() for x in df.columns], axis=1)
Out[24]: 
 ciao google microsoft
SearchVolume 368000.0 37200000.0 135000.0
SearchVolume 368000.0 37200000.0 135000.0
SearchVolume 450000.0 37200000.0 110000.0
SearchVolume 450000.0 37200000.0 110000.0
SearchVolume 450000.0 37200000.0 110000.0
SearchVolume 450000.0 37200000.0 110000.0

answered Mar 25 at 17:03 by WeNYoBen

My proposition is:

pd.DataFrame(data={colName: df[colName].dropna().values for colName in df.columns},
             index=['20140115', '20140215', '20140315', '20140415', '20140515', '20140615'])

The main point is a dictionary comprehension, evaluated once for each column.

dropna removes the NaN items, and values discards the original index labels so that the new index can be applied.
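
To illustrate why values matters here, a small sketch with a hypothetical two-row column (the labels row0/row1 and the names `aligned` and `positional` are made up for the example). Passing a Series makes the constructor align on labels, while passing the bare array fills the new index by position:

```python
import pandas as pd

date = ["20140115", "20140215"]
# Hypothetical two-row column whose labels differ from the target index
col = pd.Series([368000.0, 368000.0], index=["row0", "row1"])

# A Series is aligned on its labels; none of them match the dates
aligned = pd.DataFrame({"ciao": col}, index=date)
# A plain array carries no labels, so it is placed by position
positional = pd.DataFrame({"ciao": col.values}, index=date)

print(aligned)
print(positional)
```

In the first frame every cell is NaN because no label matches; in the second the two values land on the two dates in order.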

answered Mar 25 at 17:02 by Valdi_Bo
edited Mar 25 at 17:08

This should work:

# Keep only the non-null entries of each column, as plain arrays
denulled = {col: df.loc[df[col].notnull(), col].values for col in df.columns}

df_out = pd.DataFrame(denulled, index=date)
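
This relies on every column having exactly as many non-null entries as there are dates. A hedged sketch of a guard around the same idea; `compact_columns` and the toy frame are hypothetical, not part of the answer:

```python
import numpy as np
import pandas as pd


def compact_columns(df, index):
    """Drop NaNs per column and re-index by position (hypothetical helper)."""
    counts = df.notna().sum()
    # Fail early with a readable message instead of a constructor error
    if not (counts == len(index)).all():
        raise ValueError(
            f"non-null counts {counts.to_dict()} do not match "
            f"{len(index)} index labels"
        )
    data = {col: df.loc[df[col].notna(), col].to_numpy() for col in df.columns}
    return pd.DataFrame(data, index=index)


nan = np.nan
# Toy frame with a duplicated index label, mimicking the question's layout
toy = pd.DataFrame(
    {"a": [1.0, nan, nan, 2.0], "b": [nan, 3.0, 4.0, nan]},
    index=["x"] * 4,
)
out = compact_columns(toy, ["d1", "d2"])
print(out)
```

Calling the helper with an index of the wrong length raises ValueError up front rather than failing inside the DataFrame constructor.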

answered Mar 25 at 16:54 by ags29
edited Mar 25 at 17:30
