Is there a possibility to convert complex json to pandas dataframe?What is the difference between json.load() and json.loads() functionsHow do I format a Microsoft JSON date?Can comments be used in JSON?How can I pretty-print JSON in a shell script?What is the correct JSON content type?Why does Google prepend while(1); to their JSON responses?Convert JS object to JSON stringParse JSON in JavaScript?How do I POST JSON data with Curl from a terminal/commandline to Test Spring REST?How to iterate over rows in a DataFrame in Pandas?Select rows from a DataFrame based on values in a column in pandas
Should I have shared a document with a former employee?
Why would word of Princess Leia's capture generate sympathy for the Rebellion in the Senate?
Do higher dimensions have axes?
What's a German word for »Sandbagger«?
Is it possible to target 2 allies with the Warding Bond spell using the Sorcerer's Twinned Spell metamagic option?
Which modern firearm should a time traveler bring to be easily reproducible for a historic civilization?
Linux ext4 restore file and directory access rights after bad backup/restore
What makes MOVEQ quicker than a normal MOVE in 68000 assembly?
PLINQ code analysis in terms of efficiency of cryptographic hashes generation
Extract the attribute names from a large number of Shapefiles
Why are there few or no black super GMs?
A spacecraft is travelling at X units per hour. But relative to what exactly? Does it depend on orbit? How?
Locked-up DOS computer beeped on keypress. What mechanism caused that?
Applying for jobs with an obvious scar
I want light controlled by one switch, not two
Is encryption still applied if you ignore the SSL certificate warning for self-signed certs?
Suggestions for how to track down the source of this force:source:push error?
Why is this guy handcuffed censored?
Masyu-making game
What would be the safest way to drop thousands of small, hard objects from a typical, high wing, GA airplane?
What is this green alien supposed to be on the American covers of the "Hitchhiker's Guide to the Galaxy"?
How do you send money when you're not sure it's not a scam?
Why does a tetrahedral molecule like methane have a dipole moment of zero?
Function over a list that depends on the index
Is there a possibility to convert complex json to pandas dataframe?
What is the difference between json.load() and json.loads() functionsHow do I format a Microsoft JSON date?Can comments be used in JSON?How can I pretty-print JSON in a shell script?What is the correct JSON content type?Why does Google prepend while(1); to their JSON responses?Convert JS object to JSON stringParse JSON in JavaScript?How do I POST JSON data with Curl from a terminal/commandline to Test Spring REST?How to iterate over rows in a DataFrame in Pandas?Select rows from a DataFrame based on values in a column in pandas
.everyoneloves__top-leaderboard:empty,.everyoneloves__mid-leaderboard:empty,.everyoneloves__bot-mid-leaderboard:empty margin-bottom:0;
I know there are quite similar questions so far, but I just don't get a right idea sorry.
I would like to convert a complex json to a clean pandas dataframe.
My Code so far:
with open('JSON_Input.json', 'r') as json_file:
json_data = json.load(json_file)
json_data = json.loads(json_data)
This creates the following complex, nested json object:
json_data
'time': 0,
'day1': ['time': 0,
'coordinates': ['x': 1202.5, 'y': 486, 'time': 3276,
'x': 1162.5, 'y': 484, 'time': 3331,
'x': 742.5, 'y': 492.5, 'time': 3487,
'x': 673.5, 'y': 501.5, 'time': 3514,
'x': 636, 'y': 508.5, 'time': 3539],
'path': 'path1',
'time': 3558,
'coordinates': ['x': 1237, 'y': 173, 'time': 5437,
'x': 1240, 'y': 182, 'time': 5601,
'x': 1260, 'y': 161, 'time': 7289,
'x': 1263, 'y': 165, 'time': 7465,
'x': 1482, 'y': 114.5, 'time': 8072,
'x': 1482, 'y': 114, 'time': 8197,
'x': 1482, 'y': 126.5, 'time': 9539],
'path': 'path2',
'time': 23620,
'coordinates': ['x': 227.5, 'y': 420, 'time': 25228,
'x': 235, 'y': 418, 'time': 25426],
'path': 'path3',
'time': 35891,
'coordinates': ['x': 681.5, 'y': 431, 'time': 36648,
'x': 704.5, 'y': 427.5, 'time': 36661,
'x': 874.5, 'y': 420.5, 'time': 36714,
'x': 909.5, 'y': 422, 'time': 36734]],
'day2': 'path': 'path4',
'time': 36743,
'coordinates': ['x': 600, 'y': 622.5, 'time': 37390,
'x': 603, 'y': 594.5, 'time': 37448,
'x': 605, 'y': 541.5, 'time': 37478,
'x': 608.5, 'y': 481.5, 'time': 37495,
'x': 620, 'y': 369, 'time': 37530,
'x': 624.5, 'y': 329, 'time': 37547,
'x': 636, 'y': 366, 'time': 38043]
Now how can get a clean dataframe out of this json file?
python json
|
show 4 more comments
I know there are quite similar questions so far, but I just don't get a right idea sorry.
I would like to convert a complex json to a clean pandas dataframe.
My Code so far:
with open('JSON_Input.json', 'r') as json_file:
json_data = json.load(json_file)
json_data = json.loads(json_data)
This creates the following complex, nested json object:
json_data
'time': 0,
'day1': ['time': 0,
'coordinates': ['x': 1202.5, 'y': 486, 'time': 3276,
'x': 1162.5, 'y': 484, 'time': 3331,
'x': 742.5, 'y': 492.5, 'time': 3487,
'x': 673.5, 'y': 501.5, 'time': 3514,
'x': 636, 'y': 508.5, 'time': 3539],
'path': 'path1',
'time': 3558,
'coordinates': ['x': 1237, 'y': 173, 'time': 5437,
'x': 1240, 'y': 182, 'time': 5601,
'x': 1260, 'y': 161, 'time': 7289,
'x': 1263, 'y': 165, 'time': 7465,
'x': 1482, 'y': 114.5, 'time': 8072,
'x': 1482, 'y': 114, 'time': 8197,
'x': 1482, 'y': 126.5, 'time': 9539],
'path': 'path2',
'time': 23620,
'coordinates': ['x': 227.5, 'y': 420, 'time': 25228,
'x': 235, 'y': 418, 'time': 25426],
'path': 'path3',
'time': 35891,
'coordinates': ['x': 681.5, 'y': 431, 'time': 36648,
'x': 704.5, 'y': 427.5, 'time': 36661,
'x': 874.5, 'y': 420.5, 'time': 36714,
'x': 909.5, 'y': 422, 'time': 36734]],
'day2': 'path': 'path4',
'time': 36743,
'coordinates': ['x': 600, 'y': 622.5, 'time': 37390,
'x': 603, 'y': 594.5, 'time': 37448,
'x': 605, 'y': 541.5, 'time': 37478,
'x': 608.5, 'y': 481.5, 'time': 37495,
'x': 620, 'y': 369, 'time': 37530,
'x': 624.5, 'y': 329, 'time': 37547,
'x': 636, 'y': 366, 'time': 38043]
Now how can get a clean dataframe out of this json file?
python json
show the desired dataframe output please
– aws_apprentice
Mar 26 at 11:32
That's the next problem. I was just given this json file and was told to analyze it and to make a dataframe that can be shown in Excel for example. Neither me nor my collegue knows the expected dataframe output =/
– Larsus123
Mar 26 at 11:42
that makes it kind of hard to help. so you can either provide context for exactly what it is you have to analyze or you go back to your boss and ask for more information
– aws_apprentice
Mar 26 at 11:44
Yes I will do that. Thanks for your time. When I have more information, I will write again.
– Larsus123
Mar 26 at 11:45
@Larsus123 any purticular reason why you are doingjson_data = json.load(json_file) & json_data = json.loads(json_data)
– penta
Mar 26 at 12:10
|
show 4 more comments
I know there are quite similar questions so far, but I just don't get a right idea sorry.
I would like to convert a complex json to a clean pandas dataframe.
My Code so far:
with open('JSON_Input.json', 'r') as json_file:
json_data = json.load(json_file)
json_data = json.loads(json_data)
This creates the following complex, nested json object:
json_data
'time': 0,
'day1': ['time': 0,
'coordinates': ['x': 1202.5, 'y': 486, 'time': 3276,
'x': 1162.5, 'y': 484, 'time': 3331,
'x': 742.5, 'y': 492.5, 'time': 3487,
'x': 673.5, 'y': 501.5, 'time': 3514,
'x': 636, 'y': 508.5, 'time': 3539],
'path': 'path1',
'time': 3558,
'coordinates': ['x': 1237, 'y': 173, 'time': 5437,
'x': 1240, 'y': 182, 'time': 5601,
'x': 1260, 'y': 161, 'time': 7289,
'x': 1263, 'y': 165, 'time': 7465,
'x': 1482, 'y': 114.5, 'time': 8072,
'x': 1482, 'y': 114, 'time': 8197,
'x': 1482, 'y': 126.5, 'time': 9539],
'path': 'path2',
'time': 23620,
'coordinates': ['x': 227.5, 'y': 420, 'time': 25228,
'x': 235, 'y': 418, 'time': 25426],
'path': 'path3',
'time': 35891,
'coordinates': ['x': 681.5, 'y': 431, 'time': 36648,
'x': 704.5, 'y': 427.5, 'time': 36661,
'x': 874.5, 'y': 420.5, 'time': 36714,
'x': 909.5, 'y': 422, 'time': 36734]],
'day2': 'path': 'path4',
'time': 36743,
'coordinates': ['x': 600, 'y': 622.5, 'time': 37390,
'x': 603, 'y': 594.5, 'time': 37448,
'x': 605, 'y': 541.5, 'time': 37478,
'x': 608.5, 'y': 481.5, 'time': 37495,
'x': 620, 'y': 369, 'time': 37530,
'x': 624.5, 'y': 329, 'time': 37547,
'x': 636, 'y': 366, 'time': 38043]
Now how can get a clean dataframe out of this json file?
python json
I know there are quite similar questions so far, but I just don't get a right idea sorry.
I would like to convert a complex json to a clean pandas dataframe.
My Code so far:
with open('JSON_Input.json', 'r') as json_file:
json_data = json.load(json_file)
json_data = json.loads(json_data)
This creates the following complex, nested json object:
json_data
'time': 0,
'day1': ['time': 0,
'coordinates': ['x': 1202.5, 'y': 486, 'time': 3276,
'x': 1162.5, 'y': 484, 'time': 3331,
'x': 742.5, 'y': 492.5, 'time': 3487,
'x': 673.5, 'y': 501.5, 'time': 3514,
'x': 636, 'y': 508.5, 'time': 3539],
'path': 'path1',
'time': 3558,
'coordinates': ['x': 1237, 'y': 173, 'time': 5437,
'x': 1240, 'y': 182, 'time': 5601,
'x': 1260, 'y': 161, 'time': 7289,
'x': 1263, 'y': 165, 'time': 7465,
'x': 1482, 'y': 114.5, 'time': 8072,
'x': 1482, 'y': 114, 'time': 8197,
'x': 1482, 'y': 126.5, 'time': 9539],
'path': 'path2',
'time': 23620,
'coordinates': ['x': 227.5, 'y': 420, 'time': 25228,
'x': 235, 'y': 418, 'time': 25426],
'path': 'path3',
'time': 35891,
'coordinates': ['x': 681.5, 'y': 431, 'time': 36648,
'x': 704.5, 'y': 427.5, 'time': 36661,
'x': 874.5, 'y': 420.5, 'time': 36714,
'x': 909.5, 'y': 422, 'time': 36734]],
'day2': 'path': 'path4',
'time': 36743,
'coordinates': ['x': 600, 'y': 622.5, 'time': 37390,
'x': 603, 'y': 594.5, 'time': 37448,
'x': 605, 'y': 541.5, 'time': 37478,
'x': 608.5, 'y': 481.5, 'time': 37495,
'x': 620, 'y': 369, 'time': 37530,
'x': 624.5, 'y': 329, 'time': 37547,
'x': 636, 'y': 366, 'time': 38043]
Now how can get a clean dataframe out of this json file?
python json
python json
edited Mar 26 at 12:32
penta
1,3232 gold badges11 silver badges33 bronze badges
1,3232 gold badges11 silver badges33 bronze badges
asked Mar 26 at 11:31
Larsus123Larsus123
11 bronze badge
11 bronze badge
show the desired dataframe output please
– aws_apprentice
Mar 26 at 11:32
That's the next problem. I was just given this json file and was told to analyze it and to make a dataframe that can be shown in Excel for example. Neither me nor my collegue knows the expected dataframe output =/
– Larsus123
Mar 26 at 11:42
that makes it kind of hard to help. so you can either provide context for exactly what it is you have to analyze or you go back to your boss and ask for more information
– aws_apprentice
Mar 26 at 11:44
Yes I will do that. Thanks for your time. When I have more information, I will write again.
– Larsus123
Mar 26 at 11:45
@Larsus123 any purticular reason why you are doingjson_data = json.load(json_file) & json_data = json.loads(json_data)
– penta
Mar 26 at 12:10
|
show 4 more comments
show the desired dataframe output please
– aws_apprentice
Mar 26 at 11:32
That's the next problem. I was just given this json file and was told to analyze it and to make a dataframe that can be shown in Excel for example. Neither me nor my collegue knows the expected dataframe output =/
– Larsus123
Mar 26 at 11:42
that makes it kind of hard to help. so you can either provide context for exactly what it is you have to analyze or you go back to your boss and ask for more information
– aws_apprentice
Mar 26 at 11:44
Yes I will do that. Thanks for your time. When I have more information, I will write again.
– Larsus123
Mar 26 at 11:45
@Larsus123 any purticular reason why you are doingjson_data = json.load(json_file) & json_data = json.loads(json_data)
– penta
Mar 26 at 12:10
show the desired dataframe output please
– aws_apprentice
Mar 26 at 11:32
show the desired dataframe output please
– aws_apprentice
Mar 26 at 11:32
That's the next problem. I was just given this json file and was told to analyze it and to make a dataframe that can be shown in Excel for example. Neither me nor my collegue knows the expected dataframe output =/
– Larsus123
Mar 26 at 11:42
That's the next problem. I was just given this json file and was told to analyze it and to make a dataframe that can be shown in Excel for example. Neither me nor my collegue knows the expected dataframe output =/
– Larsus123
Mar 26 at 11:42
that makes it kind of hard to help. so you can either provide context for exactly what it is you have to analyze or you go back to your boss and ask for more information
– aws_apprentice
Mar 26 at 11:44
that makes it kind of hard to help. so you can either provide context for exactly what it is you have to analyze or you go back to your boss and ask for more information
– aws_apprentice
Mar 26 at 11:44
Yes I will do that. Thanks for your time. When I have more information, I will write again.
– Larsus123
Mar 26 at 11:45
Yes I will do that. Thanks for your time. When I have more information, I will write again.
– Larsus123
Mar 26 at 11:45
@Larsus123 any purticular reason why you are doing
json_data = json.load(json_file) & json_data = json.loads(json_data)
– penta
Mar 26 at 12:10
@Larsus123 any purticular reason why you are doing
json_data = json.load(json_file) & json_data = json.loads(json_data)
– penta
Mar 26 at 12:10
|
show 4 more comments
1 Answer
1
active
oldest
votes
This is tricky. You're going to end up with lots of nulls, and I also don't know exactly how you want the end datframe to look like. But maybe this can get you going in the right direction:
jsonObj = 'time': 0,
'day1': ['time': 0,
'coordinates': ['x': 1202.5, 'y': 486, 'time': 3276,
'x': 1162.5, 'y': 484, 'time': 3331,
'x': 742.5, 'y': 492.5, 'time': 3487,
'x': 673.5, 'y': 501.5, 'time': 3514,
'x': 636, 'y': 508.5, 'time': 3539],
'path': 'path1',
'time': 3558,
'coordinates': ['x': 1237, 'y': 173, 'time': 5437,
'x': 1240, 'y': 182, 'time': 5601,
'x': 1260, 'y': 161, 'time': 7289,
'x': 1263, 'y': 165, 'time': 7465,
'x': 1482, 'y': 114.5, 'time': 8072,
'x': 1482, 'y': 114, 'time': 8197,
'x': 1482, 'y': 126.5, 'time': 9539],
'path': 'path2',
'time': 23620,
'coordinates': ['x': 227.5, 'y': 420, 'time': 25228,
'x': 235, 'y': 418, 'time': 25426],
'path': 'path3',
'time': 35891,
'coordinates': ['x': 681.5, 'y': 431, 'time': 36648,
'x': 704.5, 'y': 427.5, 'time': 36661,
'x': 874.5, 'y': 420.5, 'time': 36714,
'x': 909.5, 'y': 422, 'time': 36734]],
'day2': 'path': 'path4',
'time': 36743,
'coordinates': ['x': 600, 'y': 622.5, 'time': 37390,
'x': 603, 'y': 594.5, 'time': 37448,
'x': 605, 'y': 541.5, 'time': 37478,
'x': 608.5, 'y': 481.5, 'time': 37495,
'x': 620, 'y': 369, 'time': 37530,
'x': 624.5, 'y': 329, 'time': 37547,
'x': 636, 'y': 366, 'time': 38043]
import pandas as pd
import re
def flatten_json(y):
out =
def flatten(x, name=''):
if type(x) is dict:
for a in x:
flatten(x[a], name + a + '_')
elif type(x) is list:
i = 0
for a in x:
flatten(a, name + str(i) + '_')
i += 1
else:
out[name[:-1]] = x
flatten(y)
return out
results = pd.DataFrame()
for k in jsonObj:
flat = flatten_json(jsonObj[k])
temp_df = pd.DataFrame()
special_cols = []
columns_list = list(flat.keys())
for item in columns_list:
try:
row_idx = re.findall(r'_(d+)_', item )[0]
except:
special_cols.append(item)
continue
column = re.findall(r'_d+_(.*)', item )[0]
column = column.replace('_', '')
row_idx = int(row_idx)
value = flat[item]
temp_df.loc[row_idx, column] = value
for item in special_cols:
temp_df[item] = flat[item]
if 'day' in k:
temp_df['day'] = k
results = results.append(temp_df).reset_index(drop=True)
results = results.dropna(axis=1, how='all')
Thank you for your support. I found a way program a little bit around and get some insights. If this problem scales up, I am gonna try your code, which looks a lot more efficient than mine. Thanks again!
– Larsus123
Mar 27 at 12:52
add a comment |
Your Answer
StackExchange.ifUsing("editor", function ()
StackExchange.using("externalEditor", function ()
StackExchange.using("snippets", function ()
StackExchange.snippets.init();
);
);
, "code-snippets");
StackExchange.ready(function()
var channelOptions =
tags: "".split(" "),
id: "1"
;
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function()
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled)
StackExchange.using("snippets", function()
createEditor();
);
else
createEditor();
);
function createEditor()
StackExchange.prepareEditor(
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader:
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
,
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
);
);
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f55356104%2fis-there-a-possibility-to-convert-complex-json-to-pandas-dataframe%23new-answer', 'question_page');
);
Post as a guest
Required, but never shown
1 Answer
1
active
oldest
votes
1 Answer
1
active
oldest
votes
active
oldest
votes
active
oldest
votes
This is tricky. You're going to end up with lots of nulls, and I also don't know exactly how you want the end datframe to look like. But maybe this can get you going in the right direction:
jsonObj = 'time': 0,
'day1': ['time': 0,
'coordinates': ['x': 1202.5, 'y': 486, 'time': 3276,
'x': 1162.5, 'y': 484, 'time': 3331,
'x': 742.5, 'y': 492.5, 'time': 3487,
'x': 673.5, 'y': 501.5, 'time': 3514,
'x': 636, 'y': 508.5, 'time': 3539],
'path': 'path1',
'time': 3558,
'coordinates': ['x': 1237, 'y': 173, 'time': 5437,
'x': 1240, 'y': 182, 'time': 5601,
'x': 1260, 'y': 161, 'time': 7289,
'x': 1263, 'y': 165, 'time': 7465,
'x': 1482, 'y': 114.5, 'time': 8072,
'x': 1482, 'y': 114, 'time': 8197,
'x': 1482, 'y': 126.5, 'time': 9539],
'path': 'path2',
'time': 23620,
'coordinates': ['x': 227.5, 'y': 420, 'time': 25228,
'x': 235, 'y': 418, 'time': 25426],
'path': 'path3',
'time': 35891,
'coordinates': ['x': 681.5, 'y': 431, 'time': 36648,
'x': 704.5, 'y': 427.5, 'time': 36661,
'x': 874.5, 'y': 420.5, 'time': 36714,
'x': 909.5, 'y': 422, 'time': 36734]],
'day2': 'path': 'path4',
'time': 36743,
'coordinates': ['x': 600, 'y': 622.5, 'time': 37390,
'x': 603, 'y': 594.5, 'time': 37448,
'x': 605, 'y': 541.5, 'time': 37478,
'x': 608.5, 'y': 481.5, 'time': 37495,
'x': 620, 'y': 369, 'time': 37530,
'x': 624.5, 'y': 329, 'time': 37547,
'x': 636, 'y': 366, 'time': 38043]
import pandas as pd
import re
def flatten_json(y):
out =
def flatten(x, name=''):
if type(x) is dict:
for a in x:
flatten(x[a], name + a + '_')
elif type(x) is list:
i = 0
for a in x:
flatten(a, name + str(i) + '_')
i += 1
else:
out[name[:-1]] = x
flatten(y)
return out
results = pd.DataFrame()
for k in jsonObj:
flat = flatten_json(jsonObj[k])
temp_df = pd.DataFrame()
special_cols = []
columns_list = list(flat.keys())
for item in columns_list:
try:
row_idx = re.findall(r'_(d+)_', item )[0]
except:
special_cols.append(item)
continue
column = re.findall(r'_d+_(.*)', item )[0]
column = column.replace('_', '')
row_idx = int(row_idx)
value = flat[item]
temp_df.loc[row_idx, column] = value
for item in special_cols:
temp_df[item] = flat[item]
if 'day' in k:
temp_df['day'] = k
results = results.append(temp_df).reset_index(drop=True)
results = results.dropna(axis=1, how='all')
Thank you for your support. I found a way program a little bit around and get some insights. If this problem scales up, I am gonna try your code, which looks a lot more efficient than mine. Thanks again!
– Larsus123
Mar 27 at 12:52
add a comment |
This is tricky. You're going to end up with lots of nulls, and I also don't know exactly how you want the end datframe to look like. But maybe this can get you going in the right direction:
jsonObj = 'time': 0,
'day1': ['time': 0,
'coordinates': ['x': 1202.5, 'y': 486, 'time': 3276,
'x': 1162.5, 'y': 484, 'time': 3331,
'x': 742.5, 'y': 492.5, 'time': 3487,
'x': 673.5, 'y': 501.5, 'time': 3514,
'x': 636, 'y': 508.5, 'time': 3539],
'path': 'path1',
'time': 3558,
'coordinates': ['x': 1237, 'y': 173, 'time': 5437,
'x': 1240, 'y': 182, 'time': 5601,
'x': 1260, 'y': 161, 'time': 7289,
'x': 1263, 'y': 165, 'time': 7465,
'x': 1482, 'y': 114.5, 'time': 8072,
'x': 1482, 'y': 114, 'time': 8197,
'x': 1482, 'y': 126.5, 'time': 9539],
'path': 'path2',
'time': 23620,
'coordinates': ['x': 227.5, 'y': 420, 'time': 25228,
'x': 235, 'y': 418, 'time': 25426],
'path': 'path3',
'time': 35891,
'coordinates': ['x': 681.5, 'y': 431, 'time': 36648,
'x': 704.5, 'y': 427.5, 'time': 36661,
'x': 874.5, 'y': 420.5, 'time': 36714,
'x': 909.5, 'y': 422, 'time': 36734]],
'day2': 'path': 'path4',
'time': 36743,
'coordinates': ['x': 600, 'y': 622.5, 'time': 37390,
'x': 603, 'y': 594.5, 'time': 37448,
'x': 605, 'y': 541.5, 'time': 37478,
'x': 608.5, 'y': 481.5, 'time': 37495,
'x': 620, 'y': 369, 'time': 37530,
'x': 624.5, 'y': 329, 'time': 37547,
'x': 636, 'y': 366, 'time': 38043]
import pandas as pd
import re
def flatten_json(y):
out =
def flatten(x, name=''):
if type(x) is dict:
for a in x:
flatten(x[a], name + a + '_')
elif type(x) is list:
i = 0
for a in x:
flatten(a, name + str(i) + '_')
i += 1
else:
out[name[:-1]] = x
flatten(y)
return out
results = pd.DataFrame()
for k in jsonObj:
flat = flatten_json(jsonObj[k])
temp_df = pd.DataFrame()
special_cols = []
columns_list = list(flat.keys())
for item in columns_list:
try:
row_idx = re.findall(r'_(d+)_', item )[0]
except:
special_cols.append(item)
continue
column = re.findall(r'_d+_(.*)', item )[0]
column = column.replace('_', '')
row_idx = int(row_idx)
value = flat[item]
temp_df.loc[row_idx, column] = value
for item in special_cols:
temp_df[item] = flat[item]
if 'day' in k:
temp_df['day'] = k
results = results.append(temp_df).reset_index(drop=True)
results = results.dropna(axis=1, how='all')
Thank you for your support. I found a way program a little bit around and get some insights. If this problem scales up, I am gonna try your code, which looks a lot more efficient than mine. Thanks again!
– Larsus123
Mar 27 at 12:52
add a comment |
This is tricky. You're going to end up with lots of nulls, and I also don't know exactly how you want the end datframe to look like. But maybe this can get you going in the right direction:
jsonObj = 'time': 0,
'day1': ['time': 0,
'coordinates': ['x': 1202.5, 'y': 486, 'time': 3276,
'x': 1162.5, 'y': 484, 'time': 3331,
'x': 742.5, 'y': 492.5, 'time': 3487,
'x': 673.5, 'y': 501.5, 'time': 3514,
'x': 636, 'y': 508.5, 'time': 3539],
'path': 'path1',
'time': 3558,
'coordinates': ['x': 1237, 'y': 173, 'time': 5437,
'x': 1240, 'y': 182, 'time': 5601,
'x': 1260, 'y': 161, 'time': 7289,
'x': 1263, 'y': 165, 'time': 7465,
'x': 1482, 'y': 114.5, 'time': 8072,
'x': 1482, 'y': 114, 'time': 8197,
'x': 1482, 'y': 126.5, 'time': 9539],
'path': 'path2',
'time': 23620,
'coordinates': ['x': 227.5, 'y': 420, 'time': 25228,
'x': 235, 'y': 418, 'time': 25426],
'path': 'path3',
'time': 35891,
'coordinates': ['x': 681.5, 'y': 431, 'time': 36648,
'x': 704.5, 'y': 427.5, 'time': 36661,
'x': 874.5, 'y': 420.5, 'time': 36714,
'x': 909.5, 'y': 422, 'time': 36734]],
'day2': 'path': 'path4',
'time': 36743,
'coordinates': ['x': 600, 'y': 622.5, 'time': 37390,
'x': 603, 'y': 594.5, 'time': 37448,
'x': 605, 'y': 541.5, 'time': 37478,
'x': 608.5, 'y': 481.5, 'time': 37495,
'x': 620, 'y': 369, 'time': 37530,
'x': 624.5, 'y': 329, 'time': 37547,
'x': 636, 'y': 366, 'time': 38043]
import pandas as pd
import re
def flatten_json(y):
out =
def flatten(x, name=''):
if type(x) is dict:
for a in x:
flatten(x[a], name + a + '_')
elif type(x) is list:
i = 0
for a in x:
flatten(a, name + str(i) + '_')
i += 1
else:
out[name[:-1]] = x
flatten(y)
return out
results = pd.DataFrame()
for k in jsonObj:
flat = flatten_json(jsonObj[k])
temp_df = pd.DataFrame()
special_cols = []
columns_list = list(flat.keys())
for item in columns_list:
try:
row_idx = re.findall(r'_(d+)_', item )[0]
except:
special_cols.append(item)
continue
column = re.findall(r'_d+_(.*)', item )[0]
column = column.replace('_', '')
row_idx = int(row_idx)
value = flat[item]
temp_df.loc[row_idx, column] = value
for item in special_cols:
temp_df[item] = flat[item]
if 'day' in k:
temp_df['day'] = k
results = results.append(temp_df).reset_index(drop=True)
results = results.dropna(axis=1, how='all')
This is tricky. You're going to end up with lots of nulls, and I also don't know exactly how you want the end datframe to look like. But maybe this can get you going in the right direction:
jsonObj = 'time': 0,
'day1': ['time': 0,
'coordinates': ['x': 1202.5, 'y': 486, 'time': 3276,
'x': 1162.5, 'y': 484, 'time': 3331,
'x': 742.5, 'y': 492.5, 'time': 3487,
'x': 673.5, 'y': 501.5, 'time': 3514,
'x': 636, 'y': 508.5, 'time': 3539],
'path': 'path1',
'time': 3558,
'coordinates': ['x': 1237, 'y': 173, 'time': 5437,
'x': 1240, 'y': 182, 'time': 5601,
'x': 1260, 'y': 161, 'time': 7289,
'x': 1263, 'y': 165, 'time': 7465,
'x': 1482, 'y': 114.5, 'time': 8072,
'x': 1482, 'y': 114, 'time': 8197,
'x': 1482, 'y': 126.5, 'time': 9539],
'path': 'path2',
'time': 23620,
'coordinates': ['x': 227.5, 'y': 420, 'time': 25228,
'x': 235, 'y': 418, 'time': 25426],
'path': 'path3',
'time': 35891,
'coordinates': ['x': 681.5, 'y': 431, 'time': 36648,
'x': 704.5, 'y': 427.5, 'time': 36661,
'x': 874.5, 'y': 420.5, 'time': 36714,
'x': 909.5, 'y': 422, 'time': 36734]],
'day2': 'path': 'path4',
'time': 36743,
'coordinates': ['x': 600, 'y': 622.5, 'time': 37390,
'x': 603, 'y': 594.5, 'time': 37448,
'x': 605, 'y': 541.5, 'time': 37478,
'x': 608.5, 'y': 481.5, 'time': 37495,
'x': 620, 'y': 369, 'time': 37530,
'x': 624.5, 'y': 329, 'time': 37547,
'x': 636, 'y': 366, 'time': 38043]
import pandas as pd
import re
def flatten_json(y):
out =
def flatten(x, name=''):
if type(x) is dict:
for a in x:
flatten(x[a], name + a + '_')
elif type(x) is list:
i = 0
for a in x:
flatten(a, name + str(i) + '_')
i += 1
else:
out[name[:-1]] = x
flatten(y)
return out
results = pd.DataFrame()
for k in jsonObj:
flat = flatten_json(jsonObj[k])
temp_df = pd.DataFrame()
special_cols = []
columns_list = list(flat.keys())
for item in columns_list:
try:
row_idx = re.findall(r'_(d+)_', item )[0]
except:
special_cols.append(item)
continue
column = re.findall(r'_d+_(.*)', item )[0]
column = column.replace('_', '')
row_idx = int(row_idx)
value = flat[item]
temp_df.loc[row_idx, column] = value
for item in special_cols:
temp_df[item] = flat[item]
if 'day' in k:
temp_df['day'] = k
results = results.append(temp_df).reset_index(drop=True)
results = results.dropna(axis=1, how='all')
answered Mar 26 at 14:57
chitown88chitown88
7,6591 gold badge6 silver badges28 bronze badges
7,6591 gold badge6 silver badges28 bronze badges
Thank you for your support. I found a way program a little bit around and get some insights. If this problem scales up, I am gonna try your code, which looks a lot more efficient than mine. Thanks again!
– Larsus123
Mar 27 at 12:52
add a comment |
Thank you for your support. I found a way program a little bit around and get some insights. If this problem scales up, I am gonna try your code, which looks a lot more efficient than mine. Thanks again!
– Larsus123
Mar 27 at 12:52
Thank you for your support. I found a way program a little bit around and get some insights. If this problem scales up, I am gonna try your code, which looks a lot more efficient than mine. Thanks again!
– Larsus123
Mar 27 at 12:52
Thank you for your support. I found a way program a little bit around and get some insights. If this problem scales up, I am gonna try your code, which looks a lot more efficient than mine. Thanks again!
– Larsus123
Mar 27 at 12:52
add a comment |
Got a question that you can’t ask on public Stack Overflow? Learn more about sharing private information with Stack Overflow for Teams.
Got a question that you can’t ask on public Stack Overflow? Learn more about sharing private information with Stack Overflow for Teams.
Thanks for contributing an answer to Stack Overflow!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f55356104%2fis-there-a-possibility-to-convert-complex-json-to-pandas-dataframe%23new-answer', 'question_page');
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
show the desired dataframe output please
– aws_apprentice
Mar 26 at 11:32
That's the next problem. I was just given this json file and was told to analyze it and to make a dataframe that can be shown in Excel for example. Neither me nor my collegue knows the expected dataframe output =/
– Larsus123
Mar 26 at 11:42
that makes it kind of hard to help. so you can either provide context for exactly what it is you have to analyze or you go back to your boss and ask for more information
– aws_apprentice
Mar 26 at 11:44
Yes I will do that. Thanks for your time. When I have more information, I will write again.
– Larsus123
Mar 26 at 11:45
@Larsus123 any purticular reason why you are doing
json_data = json.load(json_file) & json_data = json.loads(json_data)
– penta
Mar 26 at 12:10