How can I concatenate multiple rows of excel data into one?How can I represent an 'Enum' in Python?How can I safely create a nested directory?How can I remove a trailing newline?How can I make a time delay in Python?How can you profile a Python script?How do I concatenate two lists in Python?How can I count the occurrences of a list item?“Large data” work flows using pandasHow to iterate over rows in a DataFrame in Pandas?Importing Multiple Excel Files using OpenPyXL

Why is the UH-60 tail rotor canted?

Considerations when providing money to one child now, and the other later?

Can't understand how static works exactly

How often should alkaline batteries be checked when they are in a device?

Where can I find maps and other historical resources / references of Calcutta / Kolkata in the Victorian era?

Adding gears to my grandson's 12" bike

Origin of the suffix in hippocampus

Why are Oscar, India, and X-Ray (O, I, and X) not used as taxiway identifiers?

Could I play forever with this loop combination?

Why does the salt in the oceans not sink to the bottom?

What kind of world would drive brains to evolve high-throughput sensory?

How can Kazakhstan perform MITM attacks on all HTTPS traffic?

Found old paper shares of Motorola Inc that has since been broken up

Are stackless C++20 coroutines a problem?

How can I show that the speed of light in vacuum is the same in all reference frames?

How am I supposed to put out fires?

Is an easily guessed plot twist a good plot twist?

Book in which the "mountain" in the distance was a hole in the flat world

Do you have to hide in order for other creatures to not know your location inside a fog cloud?

Why is DC so, so, so Democratic?

Create a dropshadow only layer

How does mathematics work?

Is it better to merge "often" or only after completion do a big merge of feature branches?

Is the apartment I want to rent a scam?



How can I concatenate multiple rows of excel data into one?


How can I represent an 'Enum' in Python?How can I safely create a nested directory?How can I remove a trailing newline?How can I make a time delay in Python?How can you profile a Python script?How do I concatenate two lists in Python?How can I count the occurrences of a list item?“Large data” work flows using pandasHow to iterate over rows in a DataFrame in Pandas?Importing Multiple Excel Files using OpenPyXL






.everyoneloves__top-leaderboard:empty,.everyoneloves__mid-leaderboard:empty,.everyoneloves__bot-mid-leaderboard:empty margin-bottom:0;








3















I'm currently facing an issue where I need to bring all of the data shown in the images below into one line only.



enter image description here



So using Python and Openpyxl, I tried to write a parsing script that reads the line and only copies when values are non-null or non-identical, into a new workbook.



I get out of range errors, and the code does not keep just the data I want. I've spent multiple hours on it, so I thought I would ask here to see if I can get unstuck.



I've read some documentation on Openpyxl and about making lists in python, tried a couple of videos on youtube, but none of them did exactly what I was trying to achieve.



import openpyxl
from openpyxl import Workbook

path = "sample.xlsx"
wb = openpyxl.load_workbook(path)
ws = wb.active
path2 = "output.xlsx"
wb2 = Workbook()
ws2 = wb2.active

listab = []
rows = ws.max_row
columns = ws.max_column

for i in range (1, rows+1):
listab.append([])

cellValue = " "
prevCell = " "

for c in range (1, rows+1):
for r in range(1, columns+1):
cellValue = ws.cell(row=r, column=c).value
if cellValue == prevCell:
listab[r-1].append(prevCell)
elif cellValue == "NULL":
listab[r-1].append(prevCell)
elif cellValue != prevCell:
listab[r-1].append(cellValue)
prevCell = cellValue

for r in range(1, rows+1):
for c in range (1, columns+1):
j = ws2.cell(row = r, column=c)
j.value = listab[r-1][c-1]

print(listab)

wb2.save("output.xlsx")


There should be one line with the below information:



ods_service_id | service_name| service_plan_name| CPU | RAM | NIC | DRIVE |










share|improve this question
























  • It's not entirely clear what you want to do. Please describe what you want to do with individual rows.

    – Charlie Clark
    Mar 26 at 14:40

















3















I'm currently facing an issue where I need to bring all of the data shown in the images below into one line only.



enter image description here



So using Python and Openpyxl, I tried to write a parsing script that reads the line and only copies when values are non-null or non-identical, into a new workbook.



I get out of range errors, and the code does not keep just the data I want. I've spent multiple hours on it, so I thought I would ask here to see if I can get unstuck.



I've read some documentation on Openpyxl and about making lists in python, tried a couple of videos on youtube, but none of them did exactly what I was trying to achieve.



import openpyxl
from openpyxl import Workbook

path = "sample.xlsx"
wb = openpyxl.load_workbook(path)
ws = wb.active
path2 = "output.xlsx"
wb2 = Workbook()
ws2 = wb2.active

listab = []
rows = ws.max_row
columns = ws.max_column

for i in range (1, rows+1):
listab.append([])

cellValue = " "
prevCell = " "

for c in range (1, rows+1):
for r in range(1, columns+1):
cellValue = ws.cell(row=r, column=c).value
if cellValue == prevCell:
listab[r-1].append(prevCell)
elif cellValue == "NULL":
listab[r-1].append(prevCell)
elif cellValue != prevCell:
listab[r-1].append(cellValue)
prevCell = cellValue

for r in range(1, rows+1):
for c in range (1, columns+1):
j = ws2.cell(row = r, column=c)
j.value = listab[r-1][c-1]

print(listab)

wb2.save("output.xlsx")


There should be one line with the below information:



ods_service_id | service_name| service_plan_name| CPU | RAM | NIC | DRIVE |










share|improve this question
























  • It's not entirely clear what you want to do. Please describe what you want to do with individual rows.

    – Charlie Clark
    Mar 26 at 14:40













3












3








3








I'm currently facing an issue where I need to bring all of the data shown in the images below into one line only.



enter image description here



So using Python and Openpyxl, I tried to write a parsing script that reads the line and only copies when values are non-null or non-identical, into a new workbook.



I get out of range errors, and the code does not keep just the data I want. I've spent multiple hours on it, so I thought I would ask here to see if I can get unstuck.



I've read some documentation on Openpyxl and about making lists in python, tried a couple of videos on youtube, but none of them did exactly what I was trying to achieve.



import openpyxl
from openpyxl import Workbook

path = "sample.xlsx"
wb = openpyxl.load_workbook(path)
ws = wb.active
path2 = "output.xlsx"
wb2 = Workbook()
ws2 = wb2.active

listab = []
rows = ws.max_row
columns = ws.max_column

for i in range (1, rows+1):
listab.append([])

cellValue = " "
prevCell = " "

for c in range (1, rows+1):
for r in range(1, columns+1):
cellValue = ws.cell(row=r, column=c).value
if cellValue == prevCell:
listab[r-1].append(prevCell)
elif cellValue == "NULL":
listab[r-1].append(prevCell)
elif cellValue != prevCell:
listab[r-1].append(cellValue)
prevCell = cellValue

for r in range(1, rows+1):
for c in range (1, columns+1):
j = ws2.cell(row = r, column=c)
j.value = listab[r-1][c-1]

print(listab)

wb2.save("output.xlsx")


There should be one line with the below information:



ods_service_id | service_name| service_plan_name| CPU | RAM | NIC | DRIVE |










share|improve this question
















I'm currently facing an issue where I need to bring all of the data shown in the images below into one line only.



enter image description here



So using Python and Openpyxl, I tried to write a parsing script that reads the line and only copies when values are non-null or non-identical, into a new workbook.



I get out of range errors, and the code does not keep just the data I want. I've spent multiple hours on it, so I thought I would ask here to see if I can get unstuck.



I've read some documentation on Openpyxl and about making lists in python, tried a couple of videos on youtube, but none of them did exactly what I was trying to achieve.



import openpyxl
from openpyxl import Workbook

path = "sample.xlsx"
wb = openpyxl.load_workbook(path)
ws = wb.active
path2 = "output.xlsx"
wb2 = Workbook()
ws2 = wb2.active

listab = []
rows = ws.max_row
columns = ws.max_column

for i in range (1, rows+1):
listab.append([])

cellValue = " "
prevCell = " "

for c in range (1, rows+1):
for r in range(1, columns+1):
cellValue = ws.cell(row=r, column=c).value
if cellValue == prevCell:
listab[r-1].append(prevCell)
elif cellValue == "NULL":
listab[r-1].append(prevCell)
elif cellValue != prevCell:
listab[r-1].append(cellValue)
prevCell = cellValue

for r in range(1, rows+1):
for c in range (1, columns+1):
j = ws2.cell(row = r, column=c)
j.value = listab[r-1][c-1]

print(listab)

wb2.save("output.xlsx")


There should be one line with the below information:



ods_service_id | service_name| service_plan_name| CPU | RAM | NIC | DRIVE |







python openpyxl






share|improve this question















share|improve this question













share|improve this question




share|improve this question








edited Mar 26 at 14:45









Mailerdaimon

4,7651 gold badge26 silver badges40 bronze badges




4,7651 gold badge26 silver badges40 bronze badges










asked Mar 26 at 14:08









Mathieu MoquinMathieu Moquin

192 bronze badges




192 bronze badges












  • It's not entirely clear what you want to do. Please describe what you want to do with individual rows.

    – Charlie Clark
    Mar 26 at 14:40

















  • It's not entirely clear what you want to do. Please describe what you want to do with individual rows.

    – Charlie Clark
    Mar 26 at 14:40
















It's not entirely clear what you want to do. Please describe what you want to do with individual rows.

– Charlie Clark
Mar 26 at 14:40





It's not entirely clear what you want to do. Please describe what you want to do with individual rows.

– Charlie Clark
Mar 26 at 14:40












3 Answers
3






active

oldest

votes


















1














Personally I would go with pandas.



import pandas as pd

#Loading into pandas
df_data = pd.read_excel('sample.xlsx')
df_data.fillna("NO DATA",inplace=True) ## Replaced nan values with "NO DATA"
unique_ids = df_data.ods_service_ids.unique()

#Storing pd into a list
records_list = df_data.to_dict('records')
keys_to_check = ['service_name', 'service_plan_name', 'CPU','RAM','NIC','DRIVE']
processed =

#Go through unique ids
for key in unique_ids:
processed[key] =

#Get related records
matching_records = [y for y in records_list if y['ods_service_ids'] == key]
#Loop through records
for record in matching_records:
#For each key to check, save in dict if non null
processed[key]['ods_service_ids'] = key
for detail_key in keys_to_check:
if record[detail_key] != "NO DATA" :
processed[key][detail_key] = record[detail_key]
##Note : doesn't handle duplicate values for different keys so far


#Records are put back in list
output_data = [processed[x] for x in processed.keys()]
# -> to Pandas
df = pd.DataFrame(output_data)[['ods_service_ids','service_name', 'service_plan_name', 'CPU','RAM','NIC','DRIVE']]

#Export to Excel
df.to_excel("output.xlsx",sheet_name='Sheet_name_1', index=False)


The above should work but I wasn't really sure on how you wanted to save duplicated records for the same id. Do you look to store them as DRIVE_0, DRIVE_1, DRIVE_2 ?



EDIT:



df could be exported in a different way. Replaced below #export to Excel with the following :



df.to_excel("output.xlsx",sheet_name='Sheet_name_1')


EDIT 2:



with no input data it was hard to see any flows. Corrected the code above with fake data






share|improve this answer

























  • I think doing the processing Pandas is fine but then I'd write rows directly with openpyxl.

    – Charlie Clark
    Mar 26 at 16:28











  • I tried running this and I get: ```` >>> xcel Traceback (most recent call last): File "<stdin>", line 1, in <module> NameError: name 'xcel' is not defined >>> >>> writer = pd.ExcelWriter('output.xlsx') Traceback (most recent call last): File "<stdin>", line 1, in <module> NameError: name 'pd' is not defined >>> df.to_excel(writer,'Sheet1') Traceback (most recent call last): File "<stdin>", line 1, in <module> NameError: name 'df' is not defined >>> writer.save() File "<stdin>", line 1, in <module> NameError: name 'writer' is not defined >>> ````

    – Mathieu Moquin
    Mar 26 at 16:52












  • Just edited the code for exporting

    – Sebastien D
    Mar 26 at 17:01











  • @SebastienD it says that df is not defined, which is weird given the fact that you actually define its values. Could it be an error with my Visual Code configuration? I set Environment variables for Python, do I need to do it for each library I add?

    – Mathieu Moquin
    Mar 26 at 17:10







  • 1





    Mathieu, I just ran the code and indeed there were a couple errors. Corrected them, hope it will work. Anyway, do not focus on my approach, please consider what @CharlieClarck suggested which looks interesting

    – Sebastien D
    Mar 26 at 17:35



















1














To be honest, I think you've managed to get confused by data structures and come up with something far more complicated than you need.



One approach that would suit would be to use Python dictionaries for each service, updating them row by row.



wb = load_workbook("sample.xlsx")
ws = wb.active
objs =
headers = next(ws.iter_rows(min_row=1, max_row=1, values_only=True))
for row in ws.iter_rows(min_row=2, values_only=True):
if row[0] not in objs:
obj = key:value for key, value in zip(headers, row)
objs[obj['ods_service_id']] = obj
else:# update dict with non-None values
extra = key:value for key, value in zip(headers[3:], row[3:]) if value != "NULL"
obj.update(extra)

# write to new workbook
wb2 = Workbook()
ws2 = wb2.active
ws2.append(headers)
for row in objs.values(): # do they need sorting?
ws2.append([obj[key] for key in headers])


Note how you can do everything without using counters.






share|improve this answer

























  • For the most part this runs, but it is telling me that for "header", in zip(header, row), header is not defined. because of that error though, the script fails to write anything in output.xlsx

    – Mathieu Moquin
    Mar 26 at 19:02












  • seems the header issue was indeed the S missing. So the code continues running here and I'm getting a keyerror on line 13 : objs[obj['ods_service_id']] = obj

    – Mathieu Moquin
    Mar 26 at 19:11












  • What does obj.keys() give you? Or what's the first value of ´headers`?

    – Charlie Clark
    Mar 27 at 8:31


















-1














I would suggest using the pandas library for this and then you can easily do any kind of transformation.



import pandas as pd

exceldata = pd.read_excel('tmp.xlsx', index_col=0)

print(exceldata)


You can easily remove null/na value or you can replace it and export it into excel format.



Reference For Help:



Reading Excel



Drop Na Value



Replace NA Value






share|improve this answer



























    Your Answer






    StackExchange.ifUsing("editor", function ()
    StackExchange.using("externalEditor", function ()
    StackExchange.using("snippets", function ()
    StackExchange.snippets.init();
    );
    );
    , "code-snippets");

    StackExchange.ready(function()
    var channelOptions =
    tags: "".split(" "),
    id: "1"
    ;
    initTagRenderer("".split(" "), "".split(" "), channelOptions);

    StackExchange.using("externalEditor", function()
    // Have to fire editor after snippets, if snippets enabled
    if (StackExchange.settings.snippets.snippetsEnabled)
    StackExchange.using("snippets", function()
    createEditor();
    );

    else
    createEditor();

    );

    function createEditor()
    StackExchange.prepareEditor(
    heartbeatType: 'answer',
    autoActivateHeartbeat: false,
    convertImagesToLinks: true,
    noModals: true,
    showLowRepImageUploadWarning: true,
    reputationToPostImages: 10,
    bindNavPrevention: true,
    postfix: "",
    imageUploader:
    brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
    contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
    allowUrls: true
    ,
    onDemand: true,
    discardSelector: ".discard-answer"
    ,immediatelyShowMarkdownHelp:true
    );



    );













    draft saved

    draft discarded


















    StackExchange.ready(
    function ()
    StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f55359200%2fhow-can-i-concatenate-multiple-rows-of-excel-data-into-one%23new-answer', 'question_page');

    );

    Post as a guest















    Required, but never shown

























    3 Answers
    3






    active

    oldest

    votes








    3 Answers
    3






    active

    oldest

    votes









    active

    oldest

    votes






    active

    oldest

    votes









    1














    Personally I would go with pandas.



    import pandas as pd

    #Loading into pandas
    df_data = pd.read_excel('sample.xlsx')
    df_data.fillna("NO DATA",inplace=True) ## Replaced nan values with "NO DATA"
    unique_ids = df_data.ods_service_ids.unique()

    #Storing pd into a list
    records_list = df_data.to_dict('records')
    keys_to_check = ['service_name', 'service_plan_name', 'CPU','RAM','NIC','DRIVE']
    processed =

    #Go through unique ids
    for key in unique_ids:
    processed[key] =

    #Get related records
    matching_records = [y for y in records_list if y['ods_service_ids'] == key]
    #Loop through records
    for record in matching_records:
    #For each key to check, save in dict if non null
    processed[key]['ods_service_ids'] = key
    for detail_key in keys_to_check:
    if record[detail_key] != "NO DATA" :
    processed[key][detail_key] = record[detail_key]
    ##Note : doesn't handle duplicate values for different keys so far


    #Records are put back in list
    output_data = [processed[x] for x in processed.keys()]
    # -> to Pandas
    df = pd.DataFrame(output_data)[['ods_service_ids','service_name', 'service_plan_name', 'CPU','RAM','NIC','DRIVE']]

    #Export to Excel
    df.to_excel("output.xlsx",sheet_name='Sheet_name_1', index=False)


    The above should work but I wasn't really sure on how you wanted to save duplicated records for the same id. Do you look to store them as DRIVE_0, DRIVE_1, DRIVE_2 ?



    EDIT:



    df could be exported in a different way. Replaced below #export to Excel with the following :



    df.to_excel("output.xlsx",sheet_name='Sheet_name_1')


    EDIT 2:



    with no input data it was hard to see any flows. Corrected the code above with fake data






    share|improve this answer

























    • I think doing the processing Pandas is fine but then I'd write rows directly with openpyxl.

      – Charlie Clark
      Mar 26 at 16:28











    • I tried running this and I get: ```` >>> xcel Traceback (most recent call last): File "<stdin>", line 1, in <module> NameError: name 'xcel' is not defined >>> >>> writer = pd.ExcelWriter('output.xlsx') Traceback (most recent call last): File "<stdin>", line 1, in <module> NameError: name 'pd' is not defined >>> df.to_excel(writer,'Sheet1') Traceback (most recent call last): File "<stdin>", line 1, in <module> NameError: name 'df' is not defined >>> writer.save() File "<stdin>", line 1, in <module> NameError: name 'writer' is not defined >>> ````

      – Mathieu Moquin
      Mar 26 at 16:52












    • Just edited the code for exporting

      – Sebastien D
      Mar 26 at 17:01











    • @SebastienD it says that df is not defined, which is weird given the fact that you actually define its values. Could it be an error with my Visual Code configuration? I set Environment variables for Python, do I need to do it for each library I add?

      – Mathieu Moquin
      Mar 26 at 17:10







    • 1





      Mathieu, I just ran the code and indeed there were a couple errors. Corrected them, hope it will work. Anyway, do not focus on my approach, please consider what @CharlieClarck suggested which looks interesting

      – Sebastien D
      Mar 26 at 17:35
















    1














    Personally I would go with pandas.



    import pandas as pd

    #Loading into pandas
    df_data = pd.read_excel('sample.xlsx')
    df_data.fillna("NO DATA",inplace=True) ## Replaced nan values with "NO DATA"
    unique_ids = df_data.ods_service_ids.unique()

    #Storing pd into a list
    records_list = df_data.to_dict('records')
    keys_to_check = ['service_name', 'service_plan_name', 'CPU','RAM','NIC','DRIVE']
    processed =

    #Go through unique ids
    for key in unique_ids:
    processed[key] =

    #Get related records
    matching_records = [y for y in records_list if y['ods_service_ids'] == key]
    #Loop through records
    for record in matching_records:
    #For each key to check, save in dict if non null
    processed[key]['ods_service_ids'] = key
    for detail_key in keys_to_check:
    if record[detail_key] != "NO DATA" :
    processed[key][detail_key] = record[detail_key]
    ##Note : doesn't handle duplicate values for different keys so far


    #Records are put back in list
    output_data = [processed[x] for x in processed.keys()]
    # -> to Pandas
    df = pd.DataFrame(output_data)[['ods_service_ids','service_name', 'service_plan_name', 'CPU','RAM','NIC','DRIVE']]

    #Export to Excel
    df.to_excel("output.xlsx",sheet_name='Sheet_name_1', index=False)


    The above should work but I wasn't really sure on how you wanted to save duplicated records for the same id. Do you look to store them as DRIVE_0, DRIVE_1, DRIVE_2 ?



    EDIT:



    df could be exported in a different way. Replaced below #export to Excel with the following :



    df.to_excel("output.xlsx",sheet_name='Sheet_name_1')


    EDIT 2:



    with no input data it was hard to see any flows. Corrected the code above with fake data






    share|improve this answer

























    • I think doing the processing Pandas is fine but then I'd write rows directly with openpyxl.

      – Charlie Clark
      Mar 26 at 16:28











    • I tried running this and I get: ```` >>> xcel Traceback (most recent call last): File "<stdin>", line 1, in <module> NameError: name 'xcel' is not defined >>> >>> writer = pd.ExcelWriter('output.xlsx') Traceback (most recent call last): File "<stdin>", line 1, in <module> NameError: name 'pd' is not defined >>> df.to_excel(writer,'Sheet1') Traceback (most recent call last): File "<stdin>", line 1, in <module> NameError: name 'df' is not defined >>> writer.save() File "<stdin>", line 1, in <module> NameError: name 'writer' is not defined >>> ````

      – Mathieu Moquin
      Mar 26 at 16:52












    • Just edited the code for exporting

      – Sebastien D
      Mar 26 at 17:01











    • @SebastienD it says that df is not defined, which is weird given the fact that you actually define its values. Could it be an error with my Visual Code configuration? I set Environment variables for Python, do I need to do it for each library I add?

      – Mathieu Moquin
      Mar 26 at 17:10







    • 1





      Mathieu, I just ran the code and indeed there were a couple errors. Corrected them, hope it will work. Anyway, do not focus on my approach, please consider what @CharlieClarck suggested which looks interesting

      – Sebastien D
      Mar 26 at 17:35














    1












    1








    1







    Personally I would go with pandas.



    import pandas as pd

    #Loading into pandas
    df_data = pd.read_excel('sample.xlsx')
    df_data.fillna("NO DATA",inplace=True) ## Replaced nan values with "NO DATA"
    unique_ids = df_data.ods_service_ids.unique()

    #Storing pd into a list
    records_list = df_data.to_dict('records')
    keys_to_check = ['service_name', 'service_plan_name', 'CPU','RAM','NIC','DRIVE']
    processed =

    #Go through unique ids
    for key in unique_ids:
    processed[key] =

    #Get related records
    matching_records = [y for y in records_list if y['ods_service_ids'] == key]
    #Loop through records
    for record in matching_records:
    #For each key to check, save in dict if non null
    processed[key]['ods_service_ids'] = key
    for detail_key in keys_to_check:
    if record[detail_key] != "NO DATA" :
    processed[key][detail_key] = record[detail_key]
    ##Note : doesn't handle duplicate values for different keys so far


    #Records are put back in list
    output_data = [processed[x] for x in processed.keys()]
    # -> to Pandas
    df = pd.DataFrame(output_data)[['ods_service_ids','service_name', 'service_plan_name', 'CPU','RAM','NIC','DRIVE']]

    #Export to Excel
    df.to_excel("output.xlsx",sheet_name='Sheet_name_1', index=False)


    The above should work but I wasn't really sure on how you wanted to save duplicated records for the same id. Do you look to store them as DRIVE_0, DRIVE_1, DRIVE_2 ?



    EDIT:



    df could be exported in a different way. Replaced below #export to Excel with the following :



    df.to_excel("output.xlsx",sheet_name='Sheet_name_1')


    EDIT 2:



    with no input data it was hard to see any flows. Corrected the code above with fake data






    share|improve this answer















    Personally I would go with pandas.



    import pandas as pd

    #Loading into pandas
    df_data = pd.read_excel('sample.xlsx')
    df_data.fillna("NO DATA",inplace=True) ## Replaced nan values with "NO DATA"
    unique_ids = df_data.ods_service_ids.unique()

    #Storing pd into a list
    records_list = df_data.to_dict('records')
    keys_to_check = ['service_name', 'service_plan_name', 'CPU','RAM','NIC','DRIVE']
    processed =

    #Go through unique ids
    for key in unique_ids:
    processed[key] =

    #Get related records
    matching_records = [y for y in records_list if y['ods_service_ids'] == key]
    #Loop through records
    for record in matching_records:
    #For each key to check, save in dict if non null
    processed[key]['ods_service_ids'] = key
    for detail_key in keys_to_check:
    if record[detail_key] != "NO DATA" :
    processed[key][detail_key] = record[detail_key]
    ##Note : doesn't handle duplicate values for different keys so far


    #Records are put back in list
    output_data = [processed[x] for x in processed.keys()]
    # -> to Pandas
    df = pd.DataFrame(output_data)[['ods_service_ids','service_name', 'service_plan_name', 'CPU','RAM','NIC','DRIVE']]

    #Export to Excel
    df.to_excel("output.xlsx",sheet_name='Sheet_name_1', index=False)


    The above should work but I wasn't really sure on how you wanted to save duplicated records for the same id. Do you look to store them as DRIVE_0, DRIVE_1, DRIVE_2 ?



    EDIT:



    df could be exported in a different way. Replaced below #export to Excel with the following :



    df.to_excel("output.xlsx",sheet_name='Sheet_name_1')


    EDIT 2:



    with no input data it was hard to see any flows. Corrected the code above with fake data







    share|improve this answer














    share|improve this answer



    share|improve this answer








    edited Mar 26 at 17:34

























    answered Mar 26 at 15:21









    Sebastien DSebastien D

    2,2072 gold badges8 silver badges29 bronze badges




    2,2072 gold badges8 silver badges29 bronze badges












    • I think doing the processing Pandas is fine but then I'd write rows directly with openpyxl.

      – Charlie Clark
      Mar 26 at 16:28











    • I tried running this and I get: ```` >>> xcel Traceback (most recent call last): File "<stdin>", line 1, in <module> NameError: name 'xcel' is not defined >>> >>> writer = pd.ExcelWriter('output.xlsx') Traceback (most recent call last): File "<stdin>", line 1, in <module> NameError: name 'pd' is not defined >>> df.to_excel(writer,'Sheet1') Traceback (most recent call last): File "<stdin>", line 1, in <module> NameError: name 'df' is not defined >>> writer.save() File "<stdin>", line 1, in <module> NameError: name 'writer' is not defined >>> ````

      – Mathieu Moquin
      Mar 26 at 16:52












    • Just edited the code for exporting

      – Sebastien D
      Mar 26 at 17:01











    • @SebastienD it says that df is not defined, which is weird given the fact that you actually define its values. Could it be an error with my Visual Code configuration? I set Environment variables for Python, do I need to do it for each library I add?

      – Mathieu Moquin
      Mar 26 at 17:10







    • 1





      Mathieu, I just ran the code and indeed there were a couple errors. Corrected them, hope it will work. Anyway, do not focus on my approach, please consider what @CharlieClarck suggested which looks interesting

      – Sebastien D
      Mar 26 at 17:35


















    • I think doing the processing Pandas is fine but then I'd write rows directly with openpyxl.

      – Charlie Clark
      Mar 26 at 16:28











    • I tried running this and I get: ```` >>> xcel Traceback (most recent call last): File "<stdin>", line 1, in <module> NameError: name 'xcel' is not defined >>> >>> writer = pd.ExcelWriter('output.xlsx') Traceback (most recent call last): File "<stdin>", line 1, in <module> NameError: name 'pd' is not defined >>> df.to_excel(writer,'Sheet1') Traceback (most recent call last): File "<stdin>", line 1, in <module> NameError: name 'df' is not defined >>> writer.save() File "<stdin>", line 1, in <module> NameError: name 'writer' is not defined >>> ````

      – Mathieu Moquin
      Mar 26 at 16:52












    • Just edited the code for exporting

      – Sebastien D
      Mar 26 at 17:01











    • @SebastienD it says that df is not defined, which is weird given the fact that you actually define its values. Could it be an error with my Visual Code configuration? I set Environment variables for Python, do I need to do it for each library I add?

      – Mathieu Moquin
      Mar 26 at 17:10







    • 1





      Mathieu, I just ran the code and indeed there were a couple errors. Corrected them, hope it will work. Anyway, do not focus on my approach, please consider what @CharlieClarck suggested which looks interesting

      – Sebastien D
      Mar 26 at 17:35

















    I think doing the processing Pandas is fine but then I'd write rows directly with openpyxl.

    – Charlie Clark
    Mar 26 at 16:28





    I think doing the processing Pandas is fine but then I'd write rows directly with openpyxl.

    – Charlie Clark
    Mar 26 at 16:28













    I tried running this and I get: ```` >>> xcel Traceback (most recent call last): File "<stdin>", line 1, in <module> NameError: name 'xcel' is not defined >>> >>> writer = pd.ExcelWriter('output.xlsx') Traceback (most recent call last): File "<stdin>", line 1, in <module> NameError: name 'pd' is not defined >>> df.to_excel(writer,'Sheet1') Traceback (most recent call last): File "<stdin>", line 1, in <module> NameError: name 'df' is not defined >>> writer.save() File "<stdin>", line 1, in <module> NameError: name 'writer' is not defined >>> ````

    – Mathieu Moquin
    Mar 26 at 16:52






    I tried running this and I get: ```` >>> xcel Traceback (most recent call last): File "<stdin>", line 1, in <module> NameError: name 'xcel' is not defined >>> >>> writer = pd.ExcelWriter('output.xlsx') Traceback (most recent call last): File "<stdin>", line 1, in <module> NameError: name 'pd' is not defined >>> df.to_excel(writer,'Sheet1') Traceback (most recent call last): File "<stdin>", line 1, in <module> NameError: name 'df' is not defined >>> writer.save() File "<stdin>", line 1, in <module> NameError: name 'writer' is not defined >>> ````

    – Mathieu Moquin
    Mar 26 at 16:52














    Just edited the code for exporting

    – Sebastien D
    Mar 26 at 17:01





    Just edited the code for exporting

    – Sebastien D
    Mar 26 at 17:01













    @SebastienD it says that df is not defined, which is weird given the fact that you actually define its values. Could it be an error with my Visual Code configuration? I set Environment variables for Python, do I need to do it for each library I add?

    – Mathieu Moquin
    Mar 26 at 17:10






    @SebastienD it says that df is not defined, which is weird given the fact that you actually define its values. Could it be an error with my Visual Code configuration? I set Environment variables for Python, do I need to do it for each library I add?

    – Mathieu Moquin
    Mar 26 at 17:10





    1




    1





    Mathieu, I just ran the code and indeed there were a couple errors. Corrected them, hope it will work. Anyway, do not focus on my approach, please consider what @CharlieClarck suggested which looks interesting

    – Sebastien D
    Mar 26 at 17:35






    Mathieu, I just ran the code and indeed there were a couple errors. Corrected them, hope it will work. Anyway, do not focus on my approach, please consider what @CharlieClarck suggested which looks interesting

    – Sebastien D
    Mar 26 at 17:35














    1














    To be honest, I think you've managed to get confused by data structures and come up with something far more complicated than you need.



    One approach that would suit would be to use Python dictionaries for each service, updating them row by row.



    wb = load_workbook("sample.xlsx")
    ws = wb.active
    objs =
    headers = next(ws.iter_rows(min_row=1, max_row=1, values_only=True))
    for row in ws.iter_rows(min_row=2, values_only=True):
    if row[0] not in objs:
    obj = key:value for key, value in zip(headers, row)
    objs[obj['ods_service_id']] = obj
    else:# update dict with non-None values
    extra = key:value for key, value in zip(headers[3:], row[3:]) if value != "NULL"
    obj.update(extra)

    # write to new workbook
    wb2 = Workbook()
    ws2 = wb2.active
    ws2.append(headers)
    for row in objs.values(): # do they need sorting?
    ws2.append([obj[key] for key in headers])


    Note how you can do everything without using counters.






    share|improve this answer

























    • For the most part this runs, but it is telling me that for "header", in zip(header, row), header is not defined. because of that error though, the script fails to write anything in output.xlsx

      – Mathieu Moquin
      Mar 26 at 19:02












    • seems the header issue was indeed the S missing. So the code continues running here and I'm getting a keyerror on line 13 : objs[obj['ods_service_id']] = obj

      – Mathieu Moquin
      Mar 26 at 19:11












    • What does obj.keys() give you? Or what's the first value of ´headers`?

      – Charlie Clark
      Mar 27 at 8:31















    1














    To be honest, I think you've managed to get confused by data structures and come up with something far more complicated than you need.



    One approach that would suit would be to use Python dictionaries for each service, updating them row by row.



    wb = load_workbook("sample.xlsx")
    ws = wb.active
    objs =
    headers = next(ws.iter_rows(min_row=1, max_row=1, values_only=True))
    for row in ws.iter_rows(min_row=2, values_only=True):
    if row[0] not in objs:
    obj = key:value for key, value in zip(headers, row)
    objs[obj['ods_service_id']] = obj
    else:# update dict with non-None values
    extra = key:value for key, value in zip(headers[3:], row[3:]) if value != "NULL"
    obj.update(extra)

    # write to new workbook
    wb2 = Workbook()
    ws2 = wb2.active
    ws2.append(headers)
    for row in objs.values(): # do they need sorting?
    ws2.append([obj[key] for key in headers])


    Note how you can do everything without using counters.






    share|improve this answer

























    • For the most part this runs, but it is telling me that for "header", in zip(header, row), header is not defined. because of that error though, the script fails to write anything in output.xlsx

      – Mathieu Moquin
      Mar 26 at 19:02












    • seems the header issue was indeed the S missing. So the code continues running here and I'm getting a keyerror on line 13 : objs[obj['ods_service_id']] = obj

      – Mathieu Moquin
      Mar 26 at 19:11












    • What does obj.keys() give you? Or what's the first value of ´headers`?

      – Charlie Clark
      Mar 27 at 8:31













    1












    1








    1







    To be honest, I think you've managed to get confused by data structures and come up with something far more complicated than you need.



    One approach that would suit would be to use Python dictionaries for each service, updating them row by row.



    wb = load_workbook("sample.xlsx")
    ws = wb.active
    objs =
    headers = next(ws.iter_rows(min_row=1, max_row=1, values_only=True))
    for row in ws.iter_rows(min_row=2, values_only=True):
    if row[0] not in objs:
    obj = key:value for key, value in zip(headers, row)
    objs[obj['ods_service_id']] = obj
    else:# update dict with non-None values
    extra = key:value for key, value in zip(headers[3:], row[3:]) if value != "NULL"
    obj.update(extra)

    # write to new workbook
    wb2 = Workbook()
    ws2 = wb2.active
    ws2.append(headers)
    for row in objs.values(): # do they need sorting?
    ws2.append([obj[key] for key in headers])


    Note how you can do everything without using counters.






    share|improve this answer















    To be honest, I think you've managed to get confused by data structures and come up with something far more complicated than you need.



    One approach that would suit would be to use Python dictionaries for each service, updating them row by row.



    wb = load_workbook("sample.xlsx")
    ws = wb.active
    objs =
    headers = next(ws.iter_rows(min_row=1, max_row=1, values_only=True))
    for row in ws.iter_rows(min_row=2, values_only=True):
    if row[0] not in objs:
    obj = key:value for key, value in zip(headers, row)
    objs[obj['ods_service_id']] = obj
    else:# update dict with non-None values
    extra = key:value for key, value in zip(headers[3:], row[3:]) if value != "NULL"
    obj.update(extra)

    # write to new workbook
    wb2 = Workbook()
    ws2 = wb2.active
    ws2.append(headers)
    for row in objs.values(): # do they need sorting?
    ws2.append([obj[key] for key in headers])


    Note how you can do everything without using counters.







    share|improve this answer














    share|improve this answer



    share|improve this answer








    edited Mar 27 at 8:33

























    answered Mar 26 at 16:44









    Charlie ClarkCharlie Clark

    11k2 gold badges26 silver badges37 bronze badges




    11k2 gold badges26 silver badges37 bronze badges












    • For the most part this runs, but it is telling me that for "header", in zip(header, row), header is not defined. because of that error though, the script fails to write anything in output.xlsx

      – Mathieu Moquin
      Mar 26 at 19:02












    • seems the header issue was indeed the S missing. So the code continues running here and I'm getting a keyerror on line 13 : objs[obj['ods_service_id']] = obj

      – Mathieu Moquin
      Mar 26 at 19:11












    • What does obj.keys() give you? Or what's the first value of ´headers`?

      – Charlie Clark
      Mar 27 at 8:31

















    • For the most part this runs, but it is telling me that for "header", in zip(header, row), header is not defined. because of that error though, the script fails to write anything in output.xlsx

      – Mathieu Moquin
      Mar 26 at 19:02












    • seems the header issue was indeed the S missing. So the code continues running here and I'm getting a keyerror on line 13 : objs[obj['ods_service_id']] = obj

      – Mathieu Moquin
      Mar 26 at 19:11












    • What does obj.keys() give you? Or what's the first value of ´headers`?

      – Charlie Clark
      Mar 27 at 8:31
















    For the most part this runs, but it is telling me that for "header", in zip(header, row), header is not defined. because of that error though, the script fails to write anything in output.xlsx

    – Mathieu Moquin
    Mar 26 at 19:02






    For the most part this runs, but it is telling me that for "header", in zip(header, row), header is not defined. because of that error though, the script fails to write anything in output.xlsx

    – Mathieu Moquin
    Mar 26 at 19:02














    seems the header issue was indeed the S missing. So the code continues running here and I'm getting a keyerror on line 13 : objs[obj['ods_service_id']] = obj

    – Mathieu Moquin
    Mar 26 at 19:11






    seems the header issue was indeed the S missing. So the code continues running here and I'm getting a keyerror on line 13 : objs[obj['ods_service_id']] = obj

    – Mathieu Moquin
    Mar 26 at 19:11














    What does obj.keys() give you? Or what's the first value of ´headers`?

    – Charlie Clark
    Mar 27 at 8:31





    What does obj.keys() give you? Or what's the first value of ´headers`?

    – Charlie Clark
    Mar 27 at 8:31











    -1














    I would suggest using the pandas library for this and then you can easily do any kind of transformation.



    import pandas as pd

    exceldata = pd.read_excel('tmp.xlsx', index_col=0)

    print(exceldata)


    You can easily remove null/na value or you can replace it and export it into excel format.



    Reference For Help:



    Reading Excel



    Drop Na Value



    Replace NA Value






    share|improve this answer





























      -1














      I would suggest using the pandas library for this and then you can easily do any kind of transformation.



      import pandas as pd

      exceldata = pd.read_excel('tmp.xlsx', index_col=0)

      print(exceldata)


      You can easily remove null/na value or you can replace it and export it into excel format.



      Reference For Help:



      Reading Excel



      Drop Na Value



      Replace NA Value






      share|improve this answer



























        -1












        -1








        -1







        I would suggest using the pandas library for this and then you can easily do any kind of transformation.



        import pandas as pd

        exceldata = pd.read_excel('tmp.xlsx', index_col=0)

        print(exceldata)


        You can easily remove null/na value or you can replace it and export it into excel format.



        Reference For Help:



        Reading Excel



        Drop Na Value



        Replace NA Value






        share|improve this answer















        I would suggest using the pandas library for this and then you can easily do any kind of transformation.



        import pandas as pd

        exceldata = pd.read_excel('tmp.xlsx', index_col=0)

        print(exceldata)


        You can easily remove null/na value or you can replace it and export it into excel format.



        Reference For Help:



        Reading Excel



        Drop Na Value



        Replace NA Value







        share|improve this answer














        share|improve this answer



        share|improve this answer








        edited Mar 26 at 15:31









        baduker

        1,2224 gold badges11 silver badges21 bronze badges




        1,2224 gold badges11 silver badges21 bronze badges










        answered Mar 26 at 14:59









        Prakash ChoudharyPrakash Choudhary

        849 bronze badges




        849 bronze badges



























            draft saved

            draft discarded
















































            Thanks for contributing an answer to Stack Overflow!


            • Please be sure to answer the question. Provide details and share your research!

            But avoid


            • Asking for help, clarification, or responding to other answers.

            • Making statements based on opinion; back them up with references or personal experience.

            To learn more, see our tips on writing great answers.




            draft saved


            draft discarded














            StackExchange.ready(
            function ()
            StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f55359200%2fhow-can-i-concatenate-multiple-rows-of-excel-data-into-one%23new-answer', 'question_page');

            );

            Post as a guest















            Required, but never shown





















































            Required, but never shown














            Required, but never shown












            Required, but never shown







            Required, but never shown

































            Required, but never shown














            Required, but never shown












            Required, but never shown







            Required, but never shown







            Popular posts from this blog

            SQL error code 1064 with creating Laravel foreign keysForeign key constraints: When to use ON UPDATE and ON DELETEDropping column with foreign key Laravel error: General error: 1025 Error on renameLaravel SQL Can't create tableLaravel Migration foreign key errorLaravel php artisan migrate:refresh giving a syntax errorSQLSTATE[42S01]: Base table or view already exists or Base table or view already exists: 1050 Tableerror in migrating laravel file to xampp serverSyntax error or access violation: 1064:syntax to use near 'unsigned not null, modelName varchar(191) not null, title varchar(191) not nLaravel cannot create new table field in mysqlLaravel 5.7:Last migration creates table but is not registered in the migration table

            용인 삼성생명 블루밍스 목차 통계 역대 감독 선수단 응원단 경기장 같이 보기 외부 링크 둘러보기 메뉴samsungblueminx.comeh선수 명단용인 삼성생명 블루밍스용인 삼성생명 블루밍스ehsamsungblueminx.comeheheheh

            155 수학 과학 기타 둘러보기 메뉴eh추가해eh문서를 완성해