Dataset import using pandas
I have been importing a dataset (JSON) from GitHub. The dataset itself is a folder that contains many sub-folders, and each sub-folder holds a number of document files. I have now downloaded the dataset to my local drive, but I don't know how to import the dataset folder from there. I know how to import a CSV file using pandas, but my dataset is a folder, as described above. Could somebody please tell me how to import it from my local drive without breaking the following code? I am working with Python. The code below shows the dataset being imported from GitHub; '20_newsgroup' is the name of the folder on my local drive.
import pandas as pd

# Import dataset
df = pd.read_json('https://raw.githubusercontent.com/selva86/datasets/master/newsgroups.json')
# Keep only the rows that belong to the four newsgroups of interest
df = df.loc[df.target_names.isin(['soc.religion.christian', 'rec.sport.hockey', 'talk.politics.mideast', 'rec.motorcycles']), :]
print(df.shape)  #> (2361, 3)
df.head()

# Convert to list
data = df.content.values.tolist()
# Note: sent_to_words is not defined in this snippet
data_words = list(sent_to_words(data))
print(data_words[:1])
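The snippet above calls `sent_to_words` without showing its definition. Purely as a hypothetical stand-in (the real helper may behave differently), a tokenizer with the same call shape could look like this, assuming each document is a string and that lowercase alphabetic tokens are enough:

```python
import re


def sent_to_words(sentences):
    """Yield one list of lowercase word tokens per input document.

    Hypothetical stand-in: the question does not show the real helper.
    """
    for sentence in sentences:
        # Keep runs of ASCII letters only; digits and punctuation are dropped.
        yield re.findall(r"[a-z]+", str(sentence).lower())
```

For example, `list(sent_to_words(["Hello, World!"]))` gives `[['hello', 'world']]`.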
python python-3.x pandas
pandas read_json(path_or_buf, ...) takes a file path, a URL, or a buffer. Valid URL schemes include http, ftp, s3, gcs, and file. https is not a supported scheme in your case.
– MUNGAI NJOROGE
Mar 26 at 17:53
Ah, yes, that was a mistake: a dead link. Corrected.
– Kenneth Flank
Mar 27 at 4:54
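To illustrate the point in the comment that `read_json` also accepts a buffer, here is a small sketch. The two records are made-up sample data, not rows from the real dataset; only the column names `content` and `target_names` are taken from the question.

```python
import io
import json

import pandas as pd

# Hypothetical sample records standing in for the real dataset.
records = [
    {"content": "first post", "target_names": "rec.motorcycles"},
    {"content": "second post", "target_names": "sci.space"},
]

# Wrap the JSON text in a file-like buffer and hand it to read_json.
buffer = io.StringIO(json.dumps(records))
df_from_buffer = pd.read_json(buffer)

print(df_from_buffer.shape)
```

The same call works with a local file path in place of the buffer.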
asked Mar 26 at 15:50 by Kenneth Flank, edited Mar 27 at 4:56
2 Answers
df = pd.read_json('newsgroups.json')
should suffice.
(Or pd.read_json('some/directory/newsgroups.json')
if it's not in the current directory.)
answered Mar 26 at 17:41 by J_H
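As a minimal sketch of this answer combined with the filter from the question: the function name `load_selected_groups` is made up for illustration, and the sketch assumes the downloaded file has a `target_names` column like the GitHub version.

```python
from pathlib import Path

import pandas as pd


def load_selected_groups(json_path, groups):
    """Read a local JSON file and keep rows whose target_names is in groups."""
    path = Path(json_path).expanduser()
    if not path.is_file():
        raise FileNotFoundError(f"No JSON file at {path}")
    df = pd.read_json(str(path))
    # Same row filter as in the question, applied to the local copy.
    return df.loc[df["target_names"].isin(groups), :]


# Possible usage, assuming newsgroups.json sits in the current directory:
# df = load_selected_groups("newsgroups.json", ["rec.sport.hockey", "rec.motorcycles"])
```

The rest of the question's code (`df.content.values.tolist()` and so on) should then work unchanged on the returned frame.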
Thanks a lot. Actually, my case is that I downloaded the 'newsgroups' dataset from the UCI dataset site. It is a folder like any other folder, and I want to import that folder. Your method also works just fine, though: I downloaded the JSON file from the GitHub website and it works. Thanks!
– Kenneth Flank
Mar 27 at 4:53
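For the folder case described in this comment, one possible sketch is to walk the folder and build the DataFrame by hand. It assumes a layout in which each sub-folder is named after a newsgroup and each file inside it is one document; the function name `folder_to_frame` and the `latin-1` default encoding are assumptions, so adjust them to the actual download.

```python
from pathlib import Path

import pandas as pd


def folder_to_frame(root, encoding="latin-1"):
    """Build a DataFrame from files laid out as root/<group>/<document>.

    Assumed layout: the folder that directly contains a file names its group.
    """
    rows = []
    for doc_path in sorted(Path(root).rglob("*")):
        if doc_path.is_file():
            rows.append(
                {
                    "content": doc_path.read_text(encoding=encoding),
                    "target_names": doc_path.parent.name,
                }
            )
    # Fix the column order so an empty folder still yields the expected columns.
    return pd.DataFrame(rows, columns=["content", "target_names"])


# Possible usage with the folder name from the question:
# df = folder_to_frame("20_newsgroup")
```

The resulting frame has the `content` and `target_names` columns that the question's filtering code relies on, but not the numeric `target` column.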
For loading multiple files from a directory, I would see if this answers your question: https://stackoverflow.com/a/30540662/9524722
answered Mar 26 at 17:50 by Keenan
Helpful. Thanks
– Kenneth Flank
Mar 27 at 4:57
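One common way to read several JSON files from a directory into a single DataFrame is sketched below. This is not necessarily the approach taken in the linked answer; the function name `read_json_folder` is made up, and the sketch assumes every file shares the same columns.

```python
from pathlib import Path

import pandas as pd


def read_json_folder(folder, pattern="*.json"):
    """Read every JSON file matching pattern in folder into one DataFrame."""
    paths = sorted(Path(folder).glob(pattern))
    if not paths:
        raise FileNotFoundError(f"No files matching {pattern} in {folder}")
    frames = [pd.read_json(str(path)) for path in paths]
    # ignore_index gives the combined frame a fresh 0..n-1 index.
    return pd.concat(frames, ignore_index=True)
```

Sorting the paths keeps the row order reproducible between runs.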