How can I parse CSV file while preserving dictionary values within rowsHow do I sort a list of dictionaries by a value of the dictionary?How do I sort a dictionary by value?How can I check for NaN values?How can I read and parse CSV files in C++?How can I parse a YAML file in PythonParsing CSV files in C#, with headerHow to import CSV file data into a PostgreSQL table?CSV file written with Python has blank lines between each rowReading CSV file and storing values into an arrayHow can I sort a dictionary by key?
Has the United States ever had a non-Christian President?
Why does sound not move through a wall?
How do I, as a DM, handle a party that decides to set up an ambush in a dungeon?
Hostile Divisor Numbers
Understanding ties
Install Firefox without updates
As black, how should one respond to 4. Qe2 by white in the Russian Game, Damiano Variation?
Should homeowners insurance cover the cost of the home?
Motion-trail-like lines
Firefox 66.0.3 (64-bit) addons broken
Does running exec do anything?
In linear regression why does regularisation penalise the parameter values as well?
Why would one crossvalidate the random state number?
How can a hefty sand storm happen in a thin atmosphere like Martian?
Sci-fi/fantasy book - ships on steel runners skating across ice sheets
As a GM, is it bad form to ask for a moment to think when improvising?
What happens if I accidentally leave an app running and click "Install Now" in Software Updater?
GitLab account hacked and repo wiped
Is disk brake effectiveness mitigated by tyres losing traction under strong braking?
Dirichlet series with a single zero
Sparring against two opponents test
Clarification of algebra in moment generating functions
What is the closest airport to the center of the city it serves?
Which US defense organization would respond to an invasion like this?
How can I parse CSV file while preserving dictionary values within rows
How do I sort a list of dictionaries by a value of the dictionary?How do I sort a dictionary by value?How can I check for NaN values?How can I read and parse CSV files in C++?How can I parse a YAML file in PythonParsing CSV files in C#, with headerHow to import CSV file data into a PostgreSQL table?CSV file written with Python has blank lines between each rowReading CSV file and storing values into an arrayHow can I sort a dictionary by key?
.everyoneloves__top-leaderboard:empty,.everyoneloves__mid-leaderboard:empty,.everyoneloves__bot-mid-leaderboard:empty height:90px;width:728px;box-sizing:border-box;
*If anyone has suggestions on how to format this better I am more than willing to do so.
This is for a homework assignment, the end goal is to normalize the data and insert into a schema that is in 2NF (Second Normal Form) then perform queries on the database. I am thinking I could store the data in a bunch of lists and then insert into the schema from there, how to setup the schema in a way to ensure 2NF is separate question but any suggestions would really help.
As far as the main question regarding parsing goes I am not expecting someone to just code it up for me but I am very much stumped on how this can be done in a way that preserves the dictionary like formatting of certain values.
Below is a snippet from the CSV file (each row is a movie):
budget , genres, homepage, id, keywords, original_language, original_title, overview,popularity, production_companies, production_countries, release_date, revenue, runtime, spoken_languages, status, tagline, title, vote_average, vote_count
237000000,"[""id"": 28, ""name"": ""Action"", ""id"": 12, ""name"": ""Adventure"", ""id"": 14, ""name"": ""Fantasy"", ""id"": 878, ""name"": ""Science Fiction""]",http://www.avatarmovie.com/,19995,"[""id"": 1463, ""name"": ""culture clash"", ""id"": 2964, ""name"": ""future"", ""id"": 3386, ""name"": ""space war"", ""id"": 3388, ""name"": ""space colony"", ""id"": 3679, ""name"": ""society"", ""id"": 3801, ""name"": ""space travel"", ""id"": 9685, ""name"": ""futuristic"", ""id"": 9840, ""name"": ""romance"", ""id"": 9882, ""name"": ""space"", ""id"": 9951, ""name"": ""alien"", ""id"": 10148, ""name"": ""tribe"", ""id"": 10158, ""name"": ""alien planet"", ""id"": 10987, ""name"": ""cgi"", ""id"": 11399, ""name"": ""marine"", ""id"": 13065, ""name"": ""soldier"", ""id"": 14643, ""name"": ""battle"", ""id"": 14720, ""name"": ""love affair"", ""id"": 165431, ""name"": ""anti war"", ""id"": 193554, ""name"": ""power relations"", ""id"": 206690, ""name"": ""mind and soul"", ""id"": 209714, ""name"": ""3d""]",en,Avatar,"In the 22nd century, a paraplegic Marine is dispatched to the moon Pandora on a unique mission, but becomes torn between following orders and protecting an alien civilization.",150.437577,"[""name"": ""Ingenious Film Partners"", ""id"": 289, ""name"": ""Twentieth Century Fox Film Corporation"", ""id"": 306, ""name"": ""Dune Entertainment"", ""id"": 444, ""name"": ""Lightstorm Entertainment"", ""id"": 574]","[""iso_3166_1"": ""US"", ""name"": ""United States of America"", ""iso_3166_1"": ""GB"", ""name"": ""United Kingdom""]",2009-12-10,2787965087,162,"[""iso_639_1"": ""en"", ""name"": ""English"", ""iso_639_1"": ""es"", ""name"": ""Espau00f1ol""]",Released,Enter the World of Pandora.,Avatar,7.2,11800
300000000,"[""id"": 12, ""name"": ""Adventure"", ""id"": 14, ""name"": ""Fantasy"", ""id"": 28, ""name"": ""Action""]",http://disney.go.com/disneypictures/pirates/,285,"[""id"": 270, ""name"": ""ocean"", ""id"": 726, ""name"": ""drug abuse"", ""id"": 911, ""name"": ""exotic island"", ""id"": 1319, ""name"": ""east india trading company"", ""id"": 2038, ""name"": ""love of one's life"", ""id"": 2052, ""name"": ""traitor"", ""id"": 2580, ""name"": ""shipwreck"", ""id"": 2660, ""name"": ""strong woman"", ""id"": 3799, ""name"": ""ship"", ""id"": 5740, ""name"": ""alliance"", ""id"": 5941, ""name"": ""calypso"", ""id"": 6155, ""name"": ""afterlife"", ""id"": 6211, ""name"": ""fighter"", ""id"": 12988, ""name"": ""pirate"", ""id"": 157186, ""name"": ""swashbuckler"", ""id"": 179430, ""name"": ""aftercreditsstinger""]",en,Pirates of the Caribbean: At World's End,"Captain Barbossa, long believed to be dead, has come back to life and is headed to the edge of the Earth with Will Turner and Elizabeth Swann. But nothing is quite as it seems.",139.082615,"[""name"": ""Walt Disney Pictures"", ""id"": 2, ""name"": ""Jerry Bruckheimer Films"", ""id"": 130, ""name"": ""Second Mate Productions"", ""id"": 19936]","[""iso_3166_1"": ""US"", ""name"": ""United States of America""]",2007-05-19,961000000,169,"[""iso_639_1"": ""en"", ""name"": ""English""]",Released,"At the end of the world, the adventure begins.",Pirates of the Caribbean: At World's End,6.9,4500
245000000,"[""id"": 28, ""name"": ""Action"", ""id"": 12, ""name"": ""Adventure"", ""id"": 80, ""name"": ""Crime""]",http://www.sonypictures.com/movies/spectre/,206647,"[""id"": 470, ""name"": ""spy"", ""id"": 818, ""name"": ""based on novel"", ""id"": 4289, ""name"": ""secret agent"", ""id"": 9663, ""name"": ""sequel"", ""id"": 14555, ""name"": ""mi6"", ""id"": 156095, ""name"": ""british secret service"", ""id"": 158431, ""name"": ""united kingdom""]",en,Spectre,"A cryptic message from Bond’s past sends him on a trail to uncover a sinister organization. While M battles political forces to keep the secret service alive, Bond peels back the layers of deceit to reveal the terrible truth behind SPECTRE.",107.376788,"[""name"": ""Columbia Pictures"", ""id"": 5, ""name"": ""Danjaq"", ""id"": 10761, ""name"": ""B24"", ""id"": 69434]","[""iso_3166_1"": ""GB"", ""name"": ""United Kingdom"", ""iso_3166_1"": ""US"", ""name"": ""United States of America""]",2015-10-26,880674609,148,"[""iso_639_1"": ""fr"", ""name"": ""Franu00e7ais"", ""iso_639_1"": ""en"", ""name"": ""English"", ""iso_639_1"": ""es"", ""name"": ""Espau00f1ol"", ""iso_639_1"": ""it"", ""name"": ""Italiano"", ""iso_639_1"": ""de"", ""name"": ""Deutsch""]",Released,A Plan No One Escapes,Spectre,6.3,4466
250000000,"[""id"": 28, ""name"": ""Action"", ""id"": 80, ""name"": ""Crime"", ""id"": 18, ""name"": ""Drama"", ""id"": 53, ""name"": ""Thriller""]",http://www.thedarkknightrises.com/,49026,"[""id"": 849, ""name"": ""dc comics"", ""id"": 853, ""name"": ""crime fighter"", ""id"": 949, ""name"": ""terrorist"", ""id"": 1308, ""name"": ""secret identity"", ""id"": 1437, ""name"": ""burglar"", ""id"": 3051, ""name"": ""hostage drama"", ""id"": 3562, ""name"": ""time bomb"", ""id"": 6969, ""name"": ""gotham city"", ""id"": 7002, ""name"": ""vigilante"", ""id"": 9665, ""name"": ""cover-up"", ""id"": 9715, ""name"": ""superhero"", ""id"": 9990, ""name"": ""villainess"", ""id"": 10044, ""name"": ""tragic hero"", ""id"": 13015, ""name"": ""terrorism"", ""id"": 14796, ""name"": ""destruction"", ""id"": 18933, ""name"": ""catwoman"", ""id"": 156082, ""name"": ""cat burglar"", ""id"": 156395, ""name"": ""imax"", ""id"": 173272, ""name"": ""flood"", ""id"": 179093, ""name"": ""criminal underworld"", ""id"": 230775, ""name"": ""batman""]",en,The Dark Knight Rises,"Following the death of District Attorney Harvey Dent, Batman assumes responsibility for Dent's crimes to protect the late attorney's reputation and is subsequently hunted by the Gotham City Police Department. Eight years later, Batman encounters the mysterious Selina Kyle and the villainous Bane, a new terrorist leader who overwhelms Gotham's finest. The Dark Knight resurfaces to protect a city that has branded him an enemy.",112.31295,"[""name"": ""Legendary Pictures"", ""id"": 923, ""name"": ""Warner Bros."", ""id"": 6194, ""name"": ""DC Entertainment"", ""id"": 9993, ""name"": ""Syncopy"", ""id"": 9996]","[""iso_3166_1"": ""US"", ""name"": ""United States of America""]",2012-07-16,1084939099,165,"[""iso_639_1"": ""en"", ""name"": ""English""]",Released,The Legend Ends,The Dark Knight Rises,7.6,9106
There are 19 columns (column attributes are shown in row 1), the 4th column attribute "id" is the unique identifier for each row i.e. movie. Certain columns contain a list of dictionaries, for example the genres column in row 2 contains:
[""id"": 28, ""name"": ""Action"", ""id"": 12, ""name"": ""Adventure"", ""id"": 14, ""name"": ""Fantasy"", ""id"": 878, ""name"": ""Science Fiction""]
Below is what I am currently doing which is just resulting in a list of strings for each attribute column:
import csv
budgets = []
genres = []
homepage = []
mainID = []
keywords = []
original_language = []
original_title = []
overview = []
popularity = []
production_companies = []
production_countries = []
release_date = []
revenue = []
runtime = []
spoken_languages = []
status = []
tag_line = []
title = []
vote_avg = []
vote_count = []
key_id = []
with open('tmdb_5000_movies.csv', 'r', encoding='utf-8') as csv_file:
csv_reader = csv.reader(csv_file)
skip_attributes_row = next(csv_reader)
for item in csv_reader:
budgets.append(item[0])
genres.append(item[1])
homepage.append(item[2])
mainID.append(item[3])
keywords.append(item[4])
original_language.append(item[5])
original_title.append(item[6])
overview.append(item[7])
popularity.append(item[8])
production_companies.append(item[9])
production_countries.append(item[10])
release_date.append(item[11])
revenue.append(item[12])
runtime.append(item[13])
spoken_languages.append(item[14])
status.append(item[15])
tag_line.append(item[16])
title.append(item[17])
vote_avg.append(item[18])
vote_count.append(item[19])
python csv
add a comment |
*If anyone has suggestions on how to format this better I am more than willing to do so.
This is for a homework assignment, the end goal is to normalize the data and insert into a schema that is in 2NF (Second Normal Form) then perform queries on the database. I am thinking I could store the data in a bunch of lists and then insert into the schema from there, how to setup the schema in a way to ensure 2NF is separate question but any suggestions would really help.
As far as the main question regarding parsing goes I am not expecting someone to just code it up for me but I am very much stumped on how this can be done in a way that preserves the dictionary like formatting of certain values.
Below is a snippet from the CSV file (each row is a movie):
budget , genres, homepage, id, keywords, original_language, original_title, overview,popularity, production_companies, production_countries, release_date, revenue, runtime, spoken_languages, status, tagline, title, vote_average, vote_count
237000000,"[""id"": 28, ""name"": ""Action"", ""id"": 12, ""name"": ""Adventure"", ""id"": 14, ""name"": ""Fantasy"", ""id"": 878, ""name"": ""Science Fiction""]",http://www.avatarmovie.com/,19995,"[""id"": 1463, ""name"": ""culture clash"", ""id"": 2964, ""name"": ""future"", ""id"": 3386, ""name"": ""space war"", ""id"": 3388, ""name"": ""space colony"", ""id"": 3679, ""name"": ""society"", ""id"": 3801, ""name"": ""space travel"", ""id"": 9685, ""name"": ""futuristic"", ""id"": 9840, ""name"": ""romance"", ""id"": 9882, ""name"": ""space"", ""id"": 9951, ""name"": ""alien"", ""id"": 10148, ""name"": ""tribe"", ""id"": 10158, ""name"": ""alien planet"", ""id"": 10987, ""name"": ""cgi"", ""id"": 11399, ""name"": ""marine"", ""id"": 13065, ""name"": ""soldier"", ""id"": 14643, ""name"": ""battle"", ""id"": 14720, ""name"": ""love affair"", ""id"": 165431, ""name"": ""anti war"", ""id"": 193554, ""name"": ""power relations"", ""id"": 206690, ""name"": ""mind and soul"", ""id"": 209714, ""name"": ""3d""]",en,Avatar,"In the 22nd century, a paraplegic Marine is dispatched to the moon Pandora on a unique mission, but becomes torn between following orders and protecting an alien civilization.",150.437577,"[""name"": ""Ingenious Film Partners"", ""id"": 289, ""name"": ""Twentieth Century Fox Film Corporation"", ""id"": 306, ""name"": ""Dune Entertainment"", ""id"": 444, ""name"": ""Lightstorm Entertainment"", ""id"": 574]","[""iso_3166_1"": ""US"", ""name"": ""United States of America"", ""iso_3166_1"": ""GB"", ""name"": ""United Kingdom""]",2009-12-10,2787965087,162,"[""iso_639_1"": ""en"", ""name"": ""English"", ""iso_639_1"": ""es"", ""name"": ""Espau00f1ol""]",Released,Enter the World of Pandora.,Avatar,7.2,11800
300000000,"[""id"": 12, ""name"": ""Adventure"", ""id"": 14, ""name"": ""Fantasy"", ""id"": 28, ""name"": ""Action""]",http://disney.go.com/disneypictures/pirates/,285,"[""id"": 270, ""name"": ""ocean"", ""id"": 726, ""name"": ""drug abuse"", ""id"": 911, ""name"": ""exotic island"", ""id"": 1319, ""name"": ""east india trading company"", ""id"": 2038, ""name"": ""love of one's life"", ""id"": 2052, ""name"": ""traitor"", ""id"": 2580, ""name"": ""shipwreck"", ""id"": 2660, ""name"": ""strong woman"", ""id"": 3799, ""name"": ""ship"", ""id"": 5740, ""name"": ""alliance"", ""id"": 5941, ""name"": ""calypso"", ""id"": 6155, ""name"": ""afterlife"", ""id"": 6211, ""name"": ""fighter"", ""id"": 12988, ""name"": ""pirate"", ""id"": 157186, ""name"": ""swashbuckler"", ""id"": 179430, ""name"": ""aftercreditsstinger""]",en,Pirates of the Caribbean: At World's End,"Captain Barbossa, long believed to be dead, has come back to life and is headed to the edge of the Earth with Will Turner and Elizabeth Swann. But nothing is quite as it seems.",139.082615,"[""name"": ""Walt Disney Pictures"", ""id"": 2, ""name"": ""Jerry Bruckheimer Films"", ""id"": 130, ""name"": ""Second Mate Productions"", ""id"": 19936]","[""iso_3166_1"": ""US"", ""name"": ""United States of America""]",2007-05-19,961000000,169,"[""iso_639_1"": ""en"", ""name"": ""English""]",Released,"At the end of the world, the adventure begins.",Pirates of the Caribbean: At World's End,6.9,4500
245000000,"[""id"": 28, ""name"": ""Action"", ""id"": 12, ""name"": ""Adventure"", ""id"": 80, ""name"": ""Crime""]",http://www.sonypictures.com/movies/spectre/,206647,"[""id"": 470, ""name"": ""spy"", ""id"": 818, ""name"": ""based on novel"", ""id"": 4289, ""name"": ""secret agent"", ""id"": 9663, ""name"": ""sequel"", ""id"": 14555, ""name"": ""mi6"", ""id"": 156095, ""name"": ""british secret service"", ""id"": 158431, ""name"": ""united kingdom""]",en,Spectre,"A cryptic message from Bond’s past sends him on a trail to uncover a sinister organization. While M battles political forces to keep the secret service alive, Bond peels back the layers of deceit to reveal the terrible truth behind SPECTRE.",107.376788,"[""name"": ""Columbia Pictures"", ""id"": 5, ""name"": ""Danjaq"", ""id"": 10761, ""name"": ""B24"", ""id"": 69434]","[""iso_3166_1"": ""GB"", ""name"": ""United Kingdom"", ""iso_3166_1"": ""US"", ""name"": ""United States of America""]",2015-10-26,880674609,148,"[""iso_639_1"": ""fr"", ""name"": ""Franu00e7ais"", ""iso_639_1"": ""en"", ""name"": ""English"", ""iso_639_1"": ""es"", ""name"": ""Espau00f1ol"", ""iso_639_1"": ""it"", ""name"": ""Italiano"", ""iso_639_1"": ""de"", ""name"": ""Deutsch""]",Released,A Plan No One Escapes,Spectre,6.3,4466
250000000,"[""id"": 28, ""name"": ""Action"", ""id"": 80, ""name"": ""Crime"", ""id"": 18, ""name"": ""Drama"", ""id"": 53, ""name"": ""Thriller""]",http://www.thedarkknightrises.com/,49026,"[""id"": 849, ""name"": ""dc comics"", ""id"": 853, ""name"": ""crime fighter"", ""id"": 949, ""name"": ""terrorist"", ""id"": 1308, ""name"": ""secret identity"", ""id"": 1437, ""name"": ""burglar"", ""id"": 3051, ""name"": ""hostage drama"", ""id"": 3562, ""name"": ""time bomb"", ""id"": 6969, ""name"": ""gotham city"", ""id"": 7002, ""name"": ""vigilante"", ""id"": 9665, ""name"": ""cover-up"", ""id"": 9715, ""name"": ""superhero"", ""id"": 9990, ""name"": ""villainess"", ""id"": 10044, ""name"": ""tragic hero"", ""id"": 13015, ""name"": ""terrorism"", ""id"": 14796, ""name"": ""destruction"", ""id"": 18933, ""name"": ""catwoman"", ""id"": 156082, ""name"": ""cat burglar"", ""id"": 156395, ""name"": ""imax"", ""id"": 173272, ""name"": ""flood"", ""id"": 179093, ""name"": ""criminal underworld"", ""id"": 230775, ""name"": ""batman""]",en,The Dark Knight Rises,"Following the death of District Attorney Harvey Dent, Batman assumes responsibility for Dent's crimes to protect the late attorney's reputation and is subsequently hunted by the Gotham City Police Department. Eight years later, Batman encounters the mysterious Selina Kyle and the villainous Bane, a new terrorist leader who overwhelms Gotham's finest. The Dark Knight resurfaces to protect a city that has branded him an enemy.",112.31295,"[""name"": ""Legendary Pictures"", ""id"": 923, ""name"": ""Warner Bros."", ""id"": 6194, ""name"": ""DC Entertainment"", ""id"": 9993, ""name"": ""Syncopy"", ""id"": 9996]","[""iso_3166_1"": ""US"", ""name"": ""United States of America""]",2012-07-16,1084939099,165,"[""iso_639_1"": ""en"", ""name"": ""English""]",Released,The Legend Ends,The Dark Knight Rises,7.6,9106
There are 19 columns (column attributes are shown in row 1), the 4th column attribute "id" is the unique identifier for each row i.e. movie. Certain columns contain a list of dictionaries, for example the genres column in row 2 contains:
[""id"": 28, ""name"": ""Action"", ""id"": 12, ""name"": ""Adventure"", ""id"": 14, ""name"": ""Fantasy"", ""id"": 878, ""name"": ""Science Fiction""]
Below is what I am currently doing which is just resulting in a list of strings for each attribute column:
import csv
budgets = []
genres = []
homepage = []
mainID = []
keywords = []
original_language = []
original_title = []
overview = []
popularity = []
production_companies = []
production_countries = []
release_date = []
revenue = []
runtime = []
spoken_languages = []
status = []
tag_line = []
title = []
vote_avg = []
vote_count = []
key_id = []
with open('tmdb_5000_movies.csv', 'r', encoding='utf-8') as csv_file:
csv_reader = csv.reader(csv_file)
skip_attributes_row = next(csv_reader)
for item in csv_reader:
budgets.append(item[0])
genres.append(item[1])
homepage.append(item[2])
mainID.append(item[3])
keywords.append(item[4])
original_language.append(item[5])
original_title.append(item[6])
overview.append(item[7])
popularity.append(item[8])
production_companies.append(item[9])
production_countries.append(item[10])
release_date.append(item[11])
revenue.append(item[12])
runtime.append(item[13])
spoken_languages.append(item[14])
status.append(item[15])
tag_line.append(item[16])
title.append(item[17])
vote_avg.append(item[18])
vote_count.append(item[19])
python csv
Please provide a sample of expected output.
– gmds
Mar 23 at 3:04
I can edit the post, essentially I want each column in its own list but columns like "genres" are causing problems and need to be separated into two lists, one storing genre_id and the other storing genre_name.
– SSJ
Mar 23 at 3:17
"each column in its own list": As I don't see the benefit of doing so you have to flatten the columngenres list
doinggenres.extend(<column generes list>)
. Are you aware that you are dealing withjson
column data?
– stovfl
Mar 23 at 9:21
add a comment |
*If anyone has suggestions on how to format this better I am more than willing to do so.
This is for a homework assignment, the end goal is to normalize the data and insert into a schema that is in 2NF (Second Normal Form) then perform queries on the database. I am thinking I could store the data in a bunch of lists and then insert into the schema from there, how to setup the schema in a way to ensure 2NF is separate question but any suggestions would really help.
As far as the main question regarding parsing goes I am not expecting someone to just code it up for me but I am very much stumped on how this can be done in a way that preserves the dictionary like formatting of certain values.
Below is a snippet from the CSV file (each row is a movie):
budget , genres, homepage, id, keywords, original_language, original_title, overview,popularity, production_companies, production_countries, release_date, revenue, runtime, spoken_languages, status, tagline, title, vote_average, vote_count
237000000,"[""id"": 28, ""name"": ""Action"", ""id"": 12, ""name"": ""Adventure"", ""id"": 14, ""name"": ""Fantasy"", ""id"": 878, ""name"": ""Science Fiction""]",http://www.avatarmovie.com/,19995,"[""id"": 1463, ""name"": ""culture clash"", ""id"": 2964, ""name"": ""future"", ""id"": 3386, ""name"": ""space war"", ""id"": 3388, ""name"": ""space colony"", ""id"": 3679, ""name"": ""society"", ""id"": 3801, ""name"": ""space travel"", ""id"": 9685, ""name"": ""futuristic"", ""id"": 9840, ""name"": ""romance"", ""id"": 9882, ""name"": ""space"", ""id"": 9951, ""name"": ""alien"", ""id"": 10148, ""name"": ""tribe"", ""id"": 10158, ""name"": ""alien planet"", ""id"": 10987, ""name"": ""cgi"", ""id"": 11399, ""name"": ""marine"", ""id"": 13065, ""name"": ""soldier"", ""id"": 14643, ""name"": ""battle"", ""id"": 14720, ""name"": ""love affair"", ""id"": 165431, ""name"": ""anti war"", ""id"": 193554, ""name"": ""power relations"", ""id"": 206690, ""name"": ""mind and soul"", ""id"": 209714, ""name"": ""3d""]",en,Avatar,"In the 22nd century, a paraplegic Marine is dispatched to the moon Pandora on a unique mission, but becomes torn between following orders and protecting an alien civilization.",150.437577,"[""name"": ""Ingenious Film Partners"", ""id"": 289, ""name"": ""Twentieth Century Fox Film Corporation"", ""id"": 306, ""name"": ""Dune Entertainment"", ""id"": 444, ""name"": ""Lightstorm Entertainment"", ""id"": 574]","[""iso_3166_1"": ""US"", ""name"": ""United States of America"", ""iso_3166_1"": ""GB"", ""name"": ""United Kingdom""]",2009-12-10,2787965087,162,"[""iso_639_1"": ""en"", ""name"": ""English"", ""iso_639_1"": ""es"", ""name"": ""Espau00f1ol""]",Released,Enter the World of Pandora.,Avatar,7.2,11800
300000000,"[""id"": 12, ""name"": ""Adventure"", ""id"": 14, ""name"": ""Fantasy"", ""id"": 28, ""name"": ""Action""]",http://disney.go.com/disneypictures/pirates/,285,"[""id"": 270, ""name"": ""ocean"", ""id"": 726, ""name"": ""drug abuse"", ""id"": 911, ""name"": ""exotic island"", ""id"": 1319, ""name"": ""east india trading company"", ""id"": 2038, ""name"": ""love of one's life"", ""id"": 2052, ""name"": ""traitor"", ""id"": 2580, ""name"": ""shipwreck"", ""id"": 2660, ""name"": ""strong woman"", ""id"": 3799, ""name"": ""ship"", ""id"": 5740, ""name"": ""alliance"", ""id"": 5941, ""name"": ""calypso"", ""id"": 6155, ""name"": ""afterlife"", ""id"": 6211, ""name"": ""fighter"", ""id"": 12988, ""name"": ""pirate"", ""id"": 157186, ""name"": ""swashbuckler"", ""id"": 179430, ""name"": ""aftercreditsstinger""]",en,Pirates of the Caribbean: At World's End,"Captain Barbossa, long believed to be dead, has come back to life and is headed to the edge of the Earth with Will Turner and Elizabeth Swann. But nothing is quite as it seems.",139.082615,"[""name"": ""Walt Disney Pictures"", ""id"": 2, ""name"": ""Jerry Bruckheimer Films"", ""id"": 130, ""name"": ""Second Mate Productions"", ""id"": 19936]","[""iso_3166_1"": ""US"", ""name"": ""United States of America""]",2007-05-19,961000000,169,"[""iso_639_1"": ""en"", ""name"": ""English""]",Released,"At the end of the world, the adventure begins.",Pirates of the Caribbean: At World's End,6.9,4500
245000000,"[""id"": 28, ""name"": ""Action"", ""id"": 12, ""name"": ""Adventure"", ""id"": 80, ""name"": ""Crime""]",http://www.sonypictures.com/movies/spectre/,206647,"[""id"": 470, ""name"": ""spy"", ""id"": 818, ""name"": ""based on novel"", ""id"": 4289, ""name"": ""secret agent"", ""id"": 9663, ""name"": ""sequel"", ""id"": 14555, ""name"": ""mi6"", ""id"": 156095, ""name"": ""british secret service"", ""id"": 158431, ""name"": ""united kingdom""]",en,Spectre,"A cryptic message from Bond’s past sends him on a trail to uncover a sinister organization. While M battles political forces to keep the secret service alive, Bond peels back the layers of deceit to reveal the terrible truth behind SPECTRE.",107.376788,"[""name"": ""Columbia Pictures"", ""id"": 5, ""name"": ""Danjaq"", ""id"": 10761, ""name"": ""B24"", ""id"": 69434]","[""iso_3166_1"": ""GB"", ""name"": ""United Kingdom"", ""iso_3166_1"": ""US"", ""name"": ""United States of America""]",2015-10-26,880674609,148,"[""iso_639_1"": ""fr"", ""name"": ""Franu00e7ais"", ""iso_639_1"": ""en"", ""name"": ""English"", ""iso_639_1"": ""es"", ""name"": ""Espau00f1ol"", ""iso_639_1"": ""it"", ""name"": ""Italiano"", ""iso_639_1"": ""de"", ""name"": ""Deutsch""]",Released,A Plan No One Escapes,Spectre,6.3,4466
250000000,"[""id"": 28, ""name"": ""Action"", ""id"": 80, ""name"": ""Crime"", ""id"": 18, ""name"": ""Drama"", ""id"": 53, ""name"": ""Thriller""]",http://www.thedarkknightrises.com/,49026,"[""id"": 849, ""name"": ""dc comics"", ""id"": 853, ""name"": ""crime fighter"", ""id"": 949, ""name"": ""terrorist"", ""id"": 1308, ""name"": ""secret identity"", ""id"": 1437, ""name"": ""burglar"", ""id"": 3051, ""name"": ""hostage drama"", ""id"": 3562, ""name"": ""time bomb"", ""id"": 6969, ""name"": ""gotham city"", ""id"": 7002, ""name"": ""vigilante"", ""id"": 9665, ""name"": ""cover-up"", ""id"": 9715, ""name"": ""superhero"", ""id"": 9990, ""name"": ""villainess"", ""id"": 10044, ""name"": ""tragic hero"", ""id"": 13015, ""name"": ""terrorism"", ""id"": 14796, ""name"": ""destruction"", ""id"": 18933, ""name"": ""catwoman"", ""id"": 156082, ""name"": ""cat burglar"", ""id"": 156395, ""name"": ""imax"", ""id"": 173272, ""name"": ""flood"", ""id"": 179093, ""name"": ""criminal underworld"", ""id"": 230775, ""name"": ""batman""]",en,The Dark Knight Rises,"Following the death of District Attorney Harvey Dent, Batman assumes responsibility for Dent's crimes to protect the late attorney's reputation and is subsequently hunted by the Gotham City Police Department. Eight years later, Batman encounters the mysterious Selina Kyle and the villainous Bane, a new terrorist leader who overwhelms Gotham's finest. The Dark Knight resurfaces to protect a city that has branded him an enemy.",112.31295,"[""name"": ""Legendary Pictures"", ""id"": 923, ""name"": ""Warner Bros."", ""id"": 6194, ""name"": ""DC Entertainment"", ""id"": 9993, ""name"": ""Syncopy"", ""id"": 9996]","[""iso_3166_1"": ""US"", ""name"": ""United States of America""]",2012-07-16,1084939099,165,"[""iso_639_1"": ""en"", ""name"": ""English""]",Released,The Legend Ends,The Dark Knight Rises,7.6,9106
There are 19 columns (column attributes are shown in row 1), the 4th column attribute "id" is the unique identifier for each row i.e. movie. Certain columns contain a list of dictionaries, for example the genres column in row 2 contains:
[""id"": 28, ""name"": ""Action"", ""id"": 12, ""name"": ""Adventure"", ""id"": 14, ""name"": ""Fantasy"", ""id"": 878, ""name"": ""Science Fiction""]
Below is what I am currently doing which is just resulting in a list of strings for each attribute column:
import csv
budgets = []
genres = []
homepage = []
mainID = []
keywords = []
original_language = []
original_title = []
overview = []
popularity = []
production_companies = []
production_countries = []
release_date = []
revenue = []
runtime = []
spoken_languages = []
status = []
tag_line = []
title = []
vote_avg = []
vote_count = []
key_id = []
with open('tmdb_5000_movies.csv', 'r', encoding='utf-8') as csv_file:
csv_reader = csv.reader(csv_file)
skip_attributes_row = next(csv_reader)
for item in csv_reader:
budgets.append(item[0])
genres.append(item[1])
homepage.append(item[2])
mainID.append(item[3])
keywords.append(item[4])
original_language.append(item[5])
original_title.append(item[6])
overview.append(item[7])
popularity.append(item[8])
production_companies.append(item[9])
production_countries.append(item[10])
release_date.append(item[11])
revenue.append(item[12])
runtime.append(item[13])
spoken_languages.append(item[14])
status.append(item[15])
tag_line.append(item[16])
title.append(item[17])
vote_avg.append(item[18])
vote_count.append(item[19])
python csv
*If anyone has suggestions on how to format this better I am more than willing to do so.
This is for a homework assignment, the end goal is to normalize the data and insert into a schema that is in 2NF (Second Normal Form) then perform queries on the database. I am thinking I could store the data in a bunch of lists and then insert into the schema from there, how to setup the schema in a way to ensure 2NF is separate question but any suggestions would really help.
As far as the main question regarding parsing goes I am not expecting someone to just code it up for me but I am very much stumped on how this can be done in a way that preserves the dictionary like formatting of certain values.
Below is a snippet from the CSV file (each row is a movie):
budget , genres, homepage, id, keywords, original_language, original_title, overview,popularity, production_companies, production_countries, release_date, revenue, runtime, spoken_languages, status, tagline, title, vote_average, vote_count
237000000,"[""id"": 28, ""name"": ""Action"", ""id"": 12, ""name"": ""Adventure"", ""id"": 14, ""name"": ""Fantasy"", ""id"": 878, ""name"": ""Science Fiction""]",http://www.avatarmovie.com/,19995,"[""id"": 1463, ""name"": ""culture clash"", ""id"": 2964, ""name"": ""future"", ""id"": 3386, ""name"": ""space war"", ""id"": 3388, ""name"": ""space colony"", ""id"": 3679, ""name"": ""society"", ""id"": 3801, ""name"": ""space travel"", ""id"": 9685, ""name"": ""futuristic"", ""id"": 9840, ""name"": ""romance"", ""id"": 9882, ""name"": ""space"", ""id"": 9951, ""name"": ""alien"", ""id"": 10148, ""name"": ""tribe"", ""id"": 10158, ""name"": ""alien planet"", ""id"": 10987, ""name"": ""cgi"", ""id"": 11399, ""name"": ""marine"", ""id"": 13065, ""name"": ""soldier"", ""id"": 14643, ""name"": ""battle"", ""id"": 14720, ""name"": ""love affair"", ""id"": 165431, ""name"": ""anti war"", ""id"": 193554, ""name"": ""power relations"", ""id"": 206690, ""name"": ""mind and soul"", ""id"": 209714, ""name"": ""3d""]",en,Avatar,"In the 22nd century, a paraplegic Marine is dispatched to the moon Pandora on a unique mission, but becomes torn between following orders and protecting an alien civilization.",150.437577,"[""name"": ""Ingenious Film Partners"", ""id"": 289, ""name"": ""Twentieth Century Fox Film Corporation"", ""id"": 306, ""name"": ""Dune Entertainment"", ""id"": 444, ""name"": ""Lightstorm Entertainment"", ""id"": 574]","[""iso_3166_1"": ""US"", ""name"": ""United States of America"", ""iso_3166_1"": ""GB"", ""name"": ""United Kingdom""]",2009-12-10,2787965087,162,"[""iso_639_1"": ""en"", ""name"": ""English"", ""iso_639_1"": ""es"", ""name"": ""Espau00f1ol""]",Released,Enter the World of Pandora.,Avatar,7.2,11800
300000000,"[""id"": 12, ""name"": ""Adventure"", ""id"": 14, ""name"": ""Fantasy"", ""id"": 28, ""name"": ""Action""]",http://disney.go.com/disneypictures/pirates/,285,"[""id"": 270, ""name"": ""ocean"", ""id"": 726, ""name"": ""drug abuse"", ""id"": 911, ""name"": ""exotic island"", ""id"": 1319, ""name"": ""east india trading company"", ""id"": 2038, ""name"": ""love of one's life"", ""id"": 2052, ""name"": ""traitor"", ""id"": 2580, ""name"": ""shipwreck"", ""id"": 2660, ""name"": ""strong woman"", ""id"": 3799, ""name"": ""ship"", ""id"": 5740, ""name"": ""alliance"", ""id"": 5941, ""name"": ""calypso"", ""id"": 6155, ""name"": ""afterlife"", ""id"": 6211, ""name"": ""fighter"", ""id"": 12988, ""name"": ""pirate"", ""id"": 157186, ""name"": ""swashbuckler"", ""id"": 179430, ""name"": ""aftercreditsstinger""]",en,Pirates of the Caribbean: At World's End,"Captain Barbossa, long believed to be dead, has come back to life and is headed to the edge of the Earth with Will Turner and Elizabeth Swann. But nothing is quite as it seems.",139.082615,"[""name"": ""Walt Disney Pictures"", ""id"": 2, ""name"": ""Jerry Bruckheimer Films"", ""id"": 130, ""name"": ""Second Mate Productions"", ""id"": 19936]","[""iso_3166_1"": ""US"", ""name"": ""United States of America""]",2007-05-19,961000000,169,"[""iso_639_1"": ""en"", ""name"": ""English""]",Released,"At the end of the world, the adventure begins.",Pirates of the Caribbean: At World's End,6.9,4500
245000000,"[""id"": 28, ""name"": ""Action"", ""id"": 12, ""name"": ""Adventure"", ""id"": 80, ""name"": ""Crime""]",http://www.sonypictures.com/movies/spectre/,206647,"[""id"": 470, ""name"": ""spy"", ""id"": 818, ""name"": ""based on novel"", ""id"": 4289, ""name"": ""secret agent"", ""id"": 9663, ""name"": ""sequel"", ""id"": 14555, ""name"": ""mi6"", ""id"": 156095, ""name"": ""british secret service"", ""id"": 158431, ""name"": ""united kingdom""]",en,Spectre,"A cryptic message from Bond’s past sends him on a trail to uncover a sinister organization. While M battles political forces to keep the secret service alive, Bond peels back the layers of deceit to reveal the terrible truth behind SPECTRE.",107.376788,"[""name"": ""Columbia Pictures"", ""id"": 5, ""name"": ""Danjaq"", ""id"": 10761, ""name"": ""B24"", ""id"": 69434]","[""iso_3166_1"": ""GB"", ""name"": ""United Kingdom"", ""iso_3166_1"": ""US"", ""name"": ""United States of America""]",2015-10-26,880674609,148,"[""iso_639_1"": ""fr"", ""name"": ""Franu00e7ais"", ""iso_639_1"": ""en"", ""name"": ""English"", ""iso_639_1"": ""es"", ""name"": ""Espau00f1ol"", ""iso_639_1"": ""it"", ""name"": ""Italiano"", ""iso_639_1"": ""de"", ""name"": ""Deutsch""]",Released,A Plan No One Escapes,Spectre,6.3,4466
250000000,"[""id"": 28, ""name"": ""Action"", ""id"": 80, ""name"": ""Crime"", ""id"": 18, ""name"": ""Drama"", ""id"": 53, ""name"": ""Thriller""]",http://www.thedarkknightrises.com/,49026,"[""id"": 849, ""name"": ""dc comics"", ""id"": 853, ""name"": ""crime fighter"", ""id"": 949, ""name"": ""terrorist"", ""id"": 1308, ""name"": ""secret identity"", ""id"": 1437, ""name"": ""burglar"", ""id"": 3051, ""name"": ""hostage drama"", ""id"": 3562, ""name"": ""time bomb"", ""id"": 6969, ""name"": ""gotham city"", ""id"": 7002, ""name"": ""vigilante"", ""id"": 9665, ""name"": ""cover-up"", ""id"": 9715, ""name"": ""superhero"", ""id"": 9990, ""name"": ""villainess"", ""id"": 10044, ""name"": ""tragic hero"", ""id"": 13015, ""name"": ""terrorism"", ""id"": 14796, ""name"": ""destruction"", ""id"": 18933, ""name"": ""catwoman"", ""id"": 156082, ""name"": ""cat burglar"", ""id"": 156395, ""name"": ""imax"", ""id"": 173272, ""name"": ""flood"", ""id"": 179093, ""name"": ""criminal underworld"", ""id"": 230775, ""name"": ""batman""]",en,The Dark Knight Rises,"Following the death of District Attorney Harvey Dent, Batman assumes responsibility for Dent's crimes to protect the late attorney's reputation and is subsequently hunted by the Gotham City Police Department. Eight years later, Batman encounters the mysterious Selina Kyle and the villainous Bane, a new terrorist leader who overwhelms Gotham's finest. The Dark Knight resurfaces to protect a city that has branded him an enemy.",112.31295,"[""name"": ""Legendary Pictures"", ""id"": 923, ""name"": ""Warner Bros."", ""id"": 6194, ""name"": ""DC Entertainment"", ""id"": 9993, ""name"": ""Syncopy"", ""id"": 9996]","[""iso_3166_1"": ""US"", ""name"": ""United States of America""]",2012-07-16,1084939099,165,"[""iso_639_1"": ""en"", ""name"": ""English""]",Released,The Legend Ends,The Dark Knight Rises,7.6,9106
There are 19 columns (column attributes are shown in row 1), the 4th column attribute "id" is the unique identifier for each row i.e. movie. Certain columns contain a list of dictionaries, for example the genres column in row 2 contains:
[""id"": 28, ""name"": ""Action"", ""id"": 12, ""name"": ""Adventure"", ""id"": 14, ""name"": ""Fantasy"", ""id"": 878, ""name"": ""Science Fiction""]
Below is what I am currently doing which is just resulting in a list of strings for each attribute column:
import csv
budgets = []
genres = []
homepage = []
mainID = []
keywords = []
original_language = []
original_title = []
overview = []
popularity = []
production_companies = []
production_countries = []
release_date = []
revenue = []
runtime = []
spoken_languages = []
status = []
tag_line = []
title = []
vote_avg = []
vote_count = []
key_id = []
with open('tmdb_5000_movies.csv', 'r', encoding='utf-8') as csv_file:
csv_reader = csv.reader(csv_file)
skip_attributes_row = next(csv_reader)
for item in csv_reader:
budgets.append(item[0])
genres.append(item[1])
homepage.append(item[2])
mainID.append(item[3])
keywords.append(item[4])
original_language.append(item[5])
original_title.append(item[6])
overview.append(item[7])
popularity.append(item[8])
production_companies.append(item[9])
production_countries.append(item[10])
release_date.append(item[11])
revenue.append(item[12])
runtime.append(item[13])
spoken_languages.append(item[14])
status.append(item[15])
tag_line.append(item[16])
title.append(item[17])
vote_avg.append(item[18])
vote_count.append(item[19])
python csv
python csv
asked Mar 23 at 2:39
SSJSSJ
1
1
Please provide a sample of expected output.
– gmds
Mar 23 at 3:04
I can edit the post, essentially I want each column in its own list but columns like "genres" are causing problems and need to be separated into two lists, one storing genre_id and the other storing genre_name.
– SSJ
Mar 23 at 3:17
"each column in its own list": As I don't see the benefit of doing so you have to flatten the columngenres list
doinggenres.extend(<column generes list>)
. Are you aware that you are dealing withjson
column data?
– stovfl
Mar 23 at 9:21
add a comment |
Please provide a sample of expected output.
– gmds
Mar 23 at 3:04
I can edit the post, essentially I want each column in its own list but columns like "genres" are causing problems and need to be separated into two lists, one storing genre_id and the other storing genre_name.
– SSJ
Mar 23 at 3:17
"each column in its own list": As I don't see the benefit of doing so you have to flatten the columngenres list
doinggenres.extend(<column generes list>)
. Are you aware that you are dealing withjson
column data?
– stovfl
Mar 23 at 9:21
Please provide a sample of expected output.
– gmds
Mar 23 at 3:04
Please provide a sample of expected output.
– gmds
Mar 23 at 3:04
I can edit the post, essentially I want each column in its own list but columns like "genres" are causing problems and need to be separated into two lists, one storing genre_id and the other storing genre_name.
– SSJ
Mar 23 at 3:17
I can edit the post, essentially I want each column in its own list but columns like "genres" are causing problems and need to be separated into two lists, one storing genre_id and the other storing genre_name.
– SSJ
Mar 23 at 3:17
"each column in its own list": As I don't see the benefit of doing so you have to flatten the column
genres list
doing genres.extend(<column generes list>)
. Are you aware that you are dealing with json
column data?– stovfl
Mar 23 at 9:21
"each column in its own list": As I don't see the benefit of doing so you have to flatten the column
genres list
doing genres.extend(<column generes list>)
. Are you aware that you are dealing with json
column data?– stovfl
Mar 23 at 9:21
add a comment |
0
active
oldest
votes
Your Answer
StackExchange.ifUsing("editor", function ()
StackExchange.using("externalEditor", function ()
StackExchange.using("snippets", function ()
StackExchange.snippets.init();
);
);
, "code-snippets");
StackExchange.ready(function()
var channelOptions =
tags: "".split(" "),
id: "1"
;
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function()
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled)
StackExchange.using("snippets", function()
createEditor();
);
else
createEditor();
);
function createEditor()
StackExchange.prepareEditor(
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader:
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
,
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
);
);
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f55310092%2fhow-can-i-parse-csv-file-while-preserving-dictionary-values-within-rows%23new-answer', 'question_page');
);
Post as a guest
Required, but never shown
0
active
oldest
votes
0
active
oldest
votes
active
oldest
votes
active
oldest
votes
Thanks for contributing an answer to Stack Overflow!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f55310092%2fhow-can-i-parse-csv-file-while-preserving-dictionary-values-within-rows%23new-answer', 'question_page');
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Please provide a sample of expected output.
– gmds
Mar 23 at 3:04
I can edit the post, essentially I want each column in its own list but columns like "genres" are causing problems and need to be separated into two lists, one storing genre_id and the other storing genre_name.
– SSJ
Mar 23 at 3:17
"each column in its own list": As I don't see the benefit of doing so you have to flatten the column
genres list
doinggenres.extend(<column generes list>)
. Are you aware that you are dealing withjson
column data?– stovfl
Mar 23 at 9:21