BeautifulSoup find all table empty
I am trying to scrape a very simple table from a NOAA website: https://www1.ncdc.noaa.gov/pub/data/cdo/samples/PRECIP_HLY_sample_ascii.dat
The table is a ".dat" file, and the page appears to be HTML. When I read the content with BeautifulSoup, I can see it just fine. However, when I then search for the table with "find_all" or "find", I get nothing, i.e., [].
Here is my initial code:
import requests
from bs4 import BeautifulSoup

page = requests.get('https://www1.ncdc.noaa.gov/pub/data/cdo/samples/PRECIP_HLY_sample_ascii.dat')
soup = BeautifulSoup(page.content, 'html.parser')  # also tried 'html5lib' and 'lxml'
table = soup.find_all('table')
When I type soup, I get the following:
However, when I try to get the info into a table, it comes up blank:
table
>> []
I have tried the following variations:
import urllib.request

page = urllib.request.urlopen('https://www1.ncdc.noaa.gov/pub/data/cdo/samples/PRECIP_HLY_sample_ascii.dat').read()
soup = BeautifulSoup(page, 'lxml')
soup = BeautifulSoup(page, 'html5lib')
table = soup.findAll('table')
table = soup.findAll("div", {"class": "line-gutter-backdrop"})
table = soup.find_all(True)
However, table still comes up blank.
I found this question that appears to be similar: Cannot find table using Python BeautifulSoup
But my table is not in javascript (as far as I know). It is just text.
I am very new to data scraping and really have no idea why this simple example is not working. Any help is very appreciated. Thank you.
python-3.x web-scraping beautifulsoup
There was another answer to this question yesterday, but now it is gone. Could you please repost it? Thank you.
– LCook
Mar 29 at 9:38
asked Mar 28 at 12:55
LCook
1 Answer
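You cannot find a table tag because there isn't one: the file is plain text, not an HTML table (a browser merely displays it inside a pre tag). When the text is parsed with lxml, the parser wraps it in a p tag, so that is the tag to look for.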
You can try this snippet, it will get the text inside the table:
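from bs4 import BeautifulSoup as soup
import urllib.request  # a bare "import urllib" does not reliably expose urllib.request

url = 'https://www1.ncdc.noaa.gov/pub/data/cdo/samples/PRECIP_HLY_sample_ascii.dat'
# Download the raw bytes of the file
response = urllib.request.urlopen(url)
html = response.read()
# lxml wraps bare text as <html><body><p>...</p></body></html>
page_soup = soup(html, 'lxml')
table = page_soup.find('p')
print(table.text)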
OUTPUT:
STATION STATION_NAME ELEVATION LATITUDE LONGITUDE DATE HPCP Measurement Flag Quality Flag
----------------- -------------------------------------------------- ---------- ---------- ---------- -------------- -------- ---------------- ------------
COOP:310301 ASHEVILLE NC US 682.1 35.5954 -82.5568 20100101 00:00 99999 ]
COOP:310301 ASHEVILLE NC US 682.1 35.5954 -82.5568 20100101 01:00 0 g
COOP:310301 ASHEVILLE NC US 682.1 35.5954 -82.5568 20100102 06:00 1
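To see why find_all('table') comes back empty, it may help to look at what an HTML parser actually finds in plain text. The sketch below uses Python's built-in html.parser.HTMLParser (the class that BeautifulSoup's 'html.parser' option builds on) and a short hypothetical stand-in for the file contents. TagCollector, tags_in, plain_text and html_text are illustrative names, not part of the original answer.

```python
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Records the name of every start tag the parser encounters."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def tags_in(text):
    """Return the list of start-tag names found in `text`."""
    collector = TagCollector()
    collector.feed(text)
    collector.close()
    return collector.tags


# Hypothetical stand-in for the downloaded .dat payload (shortened).
plain_text = (
    "STATION STATION_NAME ELEVATION\n"
    "COOP:310301 ASHEVILLE NC US 682.1\n"
)
# For comparison: the same value inside a real HTML table.
html_text = "<table><tr><td>682.1</td></tr></table>"

print(tags_in(plain_text))  # nothing for find_all to match
print(tags_in(html_text))
```

If the first list is empty for the real response as well, no tag-based search can succeed, and the text has to be handled as text.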
Thank you. Is there a way to get this into a pandas dataframe using this method?
– LCook
Mar 28 at 16:06
I think pandas is used to get data from a table tag, so I don't think it is possible here. But I'm not an expert in pandas :-)
– Maaz
Mar 29 at 12:52
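On the pandas question: pandas is not limited to table tags, and its read_fwf function reads fixed-width text. What follows is a minimal sketch, assuming the downloaded text keeps its fixed-width layout with the dashed ruler line directly under the header, as in the printed output. The sample data, its column widths, and the names WIDTHS, fixed_width_line, sample_text and text_table_to_frame are all hypothetical.

```python
import io
import re

import pandas as pd

# Hypothetical sample mimicking the layout of the NOAA file: a header line,
# a dashed ruler line, then fixed-width rows. Widths are made up for brevity.
WIDTHS = (11, 15, 9, 14, 5)


def fixed_width_line(*cells):
    """Pad each cell to its column width and join with single spaces."""
    return " ".join(cell.ljust(width) for cell, width in zip(cells, WIDTHS))


sample_text = "\n".join([
    fixed_width_line("STATION", "STATION_NAME", "ELEVATION", "DATE", "HPCP"),
    " ".join("-" * width for width in WIDTHS),
    fixed_width_line("COOP:310301", "ASHEVILLE NC US", "682.1", "20100101 00:00", "99999"),
    fixed_width_line("COOP:310301", "ASHEVILLE NC US", "682.1", "20100101 01:00", "0"),
    fixed_width_line("COOP:310301", "ASHEVILLE NC US", "682.1", "20100102 06:00", "1"),
])


def text_table_to_frame(text):
    """Parse a fixed-width text table whose second line is a dashed ruler."""
    body = text.strip("\n")
    ruler = body.splitlines()[1]
    # Each run of dashes marks the (start, end) character span of one column.
    colspecs = [match.span() for match in re.finditer(r"-+", ruler)]
    return pd.read_fwf(
        io.StringIO(body),
        colspecs=colspecs,
        skiprows=[1],  # drop the dashed ruler; line 0 stays as the header
    )


df = text_table_to_frame(sample_text)
print(df)
```

With the real file, the text passed to text_table_to_frame would be whatever the download returns, for example table.text from the answer's snippet. Deriving the column spans from the ruler line avoids splitting values such as "ASHEVILLE NC US" on their internal spaces.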
answered Mar 28 at 13:13
Maaz