BeautifulSoup: Getting empty variables
I have been trying to get the values of some variables from a web page:

    from urllib.request import urlopen
    from bs4 import BeautifulSoup

    itemPage = 'https://dadosabertos.camara.leg.br/api/v2/legislaturas/1'
    url = urlopen(itemPage)
    soupItem = BeautifulSoup(url, 'lxml')
    dataInicio = soupItem.find('dataInicio')
    dataFim = soupItem.find('dataFim')

However, dataInicio and dataFim are empty. What am I doing wrong?
python beautifulsoup
edited Mar 27 at 4:28 by ggorlen
asked Mar 27 at 1:13 by DanielTheRocketMan
1 Answer
There are a couple of issues here. First, soup expects a string as input; check your url and see that it's actually <http.client.HTTPResponse object at 0x036D7770>. You can read() it, which produces a usable JSON byte string. But if you'd prefer to stick with XML parsing, I'd recommend using Python's requests library to obtain a raw XML string (pass in the correct headers to specify XML).

Secondly, when you create your soup object, you need to pass in features="xml" instead of "lxml".

Putting it all together:

    import requests
    from bs4 import BeautifulSoup

    item_page = "https://dadosabertos.camara.leg.br/api/v2/legislaturas/1"
    # Ask the API for XML; without this header it returns JSON
    response = requests.get(item_page, headers={"accept": "application/xml"})
    # "xml" selects the XML parser instead of the lxml HTML parser
    soup = BeautifulSoup(response.text, "xml")
    data_inicio = soup.find("dataInicio")
    data_fim = soup.find("dataFim")
    print(data_inicio)
    print(data_fim)

Output:

    <dataInicio>1826-04-29</dataInicio>
    <dataFim>1830-04-24</dataFim>
answered Mar 27 at 1:45 by ggorlen
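The reason the parser choice matters can be sketched offline. An XML parser keeps tag names exactly as written, so a lookup for a mixed-case name such as dataInicio is case-sensitive. The sketch below swaps in the standard library's xml.etree.ElementTree for Beautiful Soup so that it runs without extra packages. The sample document is hand-written: its wrapper elements (`root`, `dados`) are assumptions, not the API's real layout, and `find_text` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Hand-written sample; the wrapper elements are assumptions, not the real API layout.
SAMPLE_XML = """
<root>
  <dados>
    <id>1</id>
    <dataInicio>1826-04-29</dataInicio>
    <dataFim>1830-04-24</dataFim>
  </dados>
</root>
"""


def find_text(xml_text, tag):
    """Return the text of the first descendant named `tag`, or None if absent."""
    root = ET.fromstring(xml_text.strip())
    element = root.find(f".//{tag}")  # the tag name is matched case-sensitively
    return None if element is None else element.text


print(find_text(SAMPLE_XML, "dataInicio"))  # 1826-04-29
print(find_text(SAMPLE_XML, "datainicio"))  # None: the case does not match
```

An HTML-oriented parser that normalizes tag names to lowercase would behave the other way around, which is consistent with a mixed-case lookup coming back empty.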
Nice answer, just a minor detail - BeautifulSoup accepts strings and file-like objects. HTTPResponse is such an object and we can pass it to BeautifulSoup without reading it.
– t.m.adam Mar 27 at 12:11
Sure, that seems accurate, but then OP's code should work by simply changing "lxml" to "xml" in the bs object, but it doesn't--curious why.
– ggorlen Mar 27 at 15:09
It does if you set the Accept header (we can set headers with urllib.request.Request()). It works with "lxml" too if you use lowercase, because lxml converts all tag names to lowercase. But it may be best not to set the Accept header; the API will return JSON and we can get the data with response.json()['dados']['dataFim'], without using Beautiful Soup.
– t.m.adam Mar 27 at 22:20
Nice, thanks for the tip and clarification.
– ggorlen Mar 27 at 22:22
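The last suggestion in the comments, skipping Beautiful Soup and reading the JSON directly, might look like the sketch below. It uses only the standard library (urllib.request.Request in place of requests). The function names `parse_legislatura` and `fetch_legislatura` are hypothetical, the sample payload is hand-written, and placing dataInicio next to dataFim under 'dados' is an assumption based on the key path quoted in the comment.

```python
import json
from urllib.request import Request, urlopen

API_URL = "https://dadosabertos.camara.leg.br/api/v2/legislaturas/1"


def parse_legislatura(raw):
    """Extract (dataInicio, dataFim) from a JSON body given as str or bytes."""
    dados = json.loads(raw)["dados"]
    # Assumption: both dates sit directly under the 'dados' key.
    return dados["dataInicio"], dados["dataFim"]


def fetch_legislatura(url=API_URL):
    """Request the resource as JSON and return its two dates (needs network access)."""
    request = Request(url, headers={"Accept": "application/json"})
    with urlopen(request) as response:
        return parse_legislatura(response.read())


# Offline check with a hand-written payload shaped like the assumed response.
sample = b'{"dados": {"dataInicio": "1826-04-29", "dataFim": "1830-04-24"}}'
print(parse_legislatura(sample))  # ('1826-04-29', '1830-04-24')
```

Keeping the parsing step separate from the network call means the extraction logic can be checked without contacting the API; calling `fetch_legislatura()` performs the actual request.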