NaN in pd.DataFrame (symmetrical matrix)

















I have a DataFrame like the one below. I'd like to remove the NaNs and shift the remaining cells up, then add a date column and set it as the index.



                 ciao    google  microsoft
Search Volume  368000       NaN        NaN
Search Volume  368000       NaN        NaN
Search Volume  450000       NaN        NaN
Search Volume  450000       NaN        NaN
Search Volume  450000       NaN        NaN
Search Volume  450000       NaN        NaN
Search Volume     NaN  37200000        NaN
Search Volume     NaN  37200000        NaN
Search Volume     NaN  37200000        NaN
Search Volume     NaN  37200000        NaN
Search Volume     NaN  37200000        NaN
Search Volume     NaN  37200000        NaN
Search Volume     NaN       NaN     135000
Search Volume     NaN       NaN     135000
Search Volume     NaN       NaN     110000
Search Volume     NaN       NaN     110000
Search Volume     NaN       NaN     110000
Search Volume     NaN       NaN     110000


The output should look like this:



date = ['20140115', '20140215', '20140315', '20140415', '20140515', '20140615']

date        ciao    google  microsoft
20140115  368000  37200000     135000
20140215  368000  37200000     135000
20140315  450000  37200000     110000
20140415  450000  37200000     110000
20140515  450000  37200000     110000
20140615  450000  37200000     110000


Looks simple but I don't know how to do it. Thanks
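To reproduce the frame above, here is a minimal sketch. It assumes the repeated Search Volume label is the row index and that the numbers are stored as floats (the NaNs force that); the construction itself is illustrative and not how the data was originally loaded.

```python
import numpy as np
import pandas as pd

nan = np.nan

# Values copied from the table above: each column holds six non-NaN
# entries, placed in a different block of rows.
df = pd.DataFrame(
    {
        "ciao": [368000, 368000, 450000, 450000, 450000, 450000] + [nan] * 12,
        "google": [nan] * 6 + [37200000] * 6 + [nan] * 6,
        "microsoft": [nan] * 12 + [135000, 135000, 110000, 110000, 110000, 110000],
    },
    index=["Search Volume"] * 18,  # assumption: the same label on every row
)

date = ['20140115', '20140215', '20140315', '20140415', '20140515', '20140615']
```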










python pandas

edited Mar 25 at 16:44 by wpercy
asked Mar 25 at 16:42 by SkuPak




















          5 Answers
You could use apply with dropna:

# Drop the NaNs in each column and rebuild it with a fresh 0..n-1 index.
df = df.apply(lambda x: pd.Series(x.dropna().values)).fillna('')
df['date'] = date
print(df)

Output:

    ciao      google  microsoft      date
368000.0  37200000.0   135000.0  20140115
368000.0  37200000.0   135000.0  20140215
450000.0  37200000.0   110000.0  20140315
450000.0  37200000.0   110000.0  20140415
450000.0  37200000.0   110000.0  20140515
450000.0  37200000.0   110000.0  20140615
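The question also asks for the dates to become the index rather than an extra column, and the expected output shows integers. A minimal sketch of that variant follows, using a shortened stand-in frame; the name `out` and the sample values are illustrative assumptions, not part of the answer.

```python
import numpy as np
import pandas as pd

# Shortened stand-in for the question's frame (assumption: same staggered layout).
df = pd.DataFrame(
    {"ciao": [1.0, 2.0, np.nan, np.nan], "google": [np.nan, np.nan, 3.0, 4.0]},
    index=["Search Volume"] * 4,
)
date = ["20140115", "20140215"]

# Compact each column: drop the NaNs and renumber the survivors from 0.
out = df.apply(lambda col: pd.Series(col.dropna().values))

# Use the dates as the index instead of adding them as a column.
out.index = pd.Index(date, name="date")

# With no NaNs left, the float columns can be cast back to integers.
out = out.astype(int)
```

The cast to int only works once every NaN is gone, so it would raise if the columns had different numbers of non-NaN values.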






answered Mar 25 at 17:06 by Frenchy





















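You can also call dropna on each column as a Series:

# One row of non-NaN values per column, labelled with the column name, then transposed.
df1 = pd.DataFrame(data=[df[i].dropna().values for i in df.columns], index=df.columns).T
df1.index = date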






answered Mar 25 at 17:00 by G. Anderson





















One tricky solution, which works because every row has the same (duplicate) index label:

pd.concat([df[x].dropna() for x in df.columns], axis=1)
Out[24]:
                  ciao      google  microsoft
SearchVolume  368000.0  37200000.0   135000.0
SearchVolume  368000.0  37200000.0   135000.0
SearchVolume  450000.0  37200000.0   110000.0
SearchVolume  450000.0  37200000.0   110000.0
SearchVolume  450000.0  37200000.0   110000.0
SearchVolume  450000.0  37200000.0   110000.0
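This relies on every column keeping identical index labels after dropna; if the labels differed between columns, the aligned concat may raise or pad with NaN instead. A sketch that sidesteps alignment by renumbering each column first is shown here, assuming a shortened stand-in frame (the name `compacted` is illustrative):

```python
import numpy as np
import pandas as pd

# Shortened stand-in for the question's frame (assumption: same staggered layout).
df = pd.DataFrame(
    {
        "ciao": [368000, 450000, np.nan, np.nan],
        "google": [np.nan, np.nan, 37200000, 37200000],
    },
    index=["Search Volume"] * 4,
)
date = ["20140115", "20140215"]

# reset_index(drop=True) gives every column the same 0..n-1 labels,
# so concat lines the values up by position rather than by the old index.
compacted = pd.concat(
    [df[x].dropna().reset_index(drop=True) for x in df.columns], axis=1
)
compacted.index = date
```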






answered Mar 25 at 17:03 by WeNYoBen





















My suggestion is:

pd.DataFrame(data={colName: df[colName].dropna().values for colName in df.columns},
             index=['20140115', '20140215', '20140315', '20140415', '20140515', '20140615'])

The main point is a dictionary comprehension, evaluated once per column.

dropna removes the NaN items, and values discards the original index
labels, so the remaining entries can be relabelled with the new index.
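If real dates are preferred over date strings in the index (an assumption about intent; the question only shows strings), the same comprehension can be combined with pd.to_datetime. A minimal sketch on a shortened stand-in frame, with `result` as an illustrative name:

```python
import numpy as np
import pandas as pd

# Shortened stand-in for the question's frame (assumption: same staggered layout).
df = pd.DataFrame(
    {
        "ciao": [368000, 450000, np.nan, np.nan],
        "google": [np.nan, np.nan, 37200000, 37200000],
    },
    index=["Search Volume"] * 4,
)
date = ["20140115", "20140215"]

# Parse the YYYYMMDD strings into a DatetimeIndex before using them as labels.
result = pd.DataFrame(
    {col: df[col].dropna().values for col in df.columns},
    index=pd.to_datetime(date, format="%Y%m%d"),
)
result.index.name = "date"
```

A DatetimeIndex allows date-based slicing and resampling later, at the cost of no longer matching the literal strings shown in the question.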







answered Mar 25 at 17:02 by Valdi_Bo (edited Mar 25 at 17:08)





















This should work:

denulled = {col: df.loc[df[col].notnull(), col].values for col in df.columns}

df_out = pd.DataFrame(denulled, index=date)
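Building a DataFrame from a dict of arrays only works when every array has the same length as the index. A hedged sketch that checks this up front is given here; `compact_columns` is a hypothetical helper name introduced for illustration, not something from pandas or from the answer.

```python
import numpy as np
import pandas as pd


def compact_columns(df, index):
    """Hypothetical helper: drop NaNs per column and relabel the rows with `index`."""
    counts = df.notnull().sum()
    # Every column must contribute exactly len(index) values.
    if counts.nunique() != 1 or counts.iloc[0] != len(index):
        raise ValueError(
            f"non-null counts {counts.to_dict()} do not match {len(index)} index labels"
        )
    denulled = {col: df.loc[df[col].notnull(), col].values for col in df.columns}
    return pd.DataFrame(denulled, index=index)


# Usage on a shortened stand-in frame (assumption: same staggered layout).
sample = pd.DataFrame(
    {"ciao": [368000, np.nan], "google": [np.nan, 37200000]},
    index=["Search Volume"] * 2,
)
df_out = compact_columns(sample, ["20140115"])
```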






answered Mar 25 at 16:54 by ags29 (edited Mar 25 at 17:30)


























