BeautifulSoup: Getting empty variables


I have been trying to get the values of some fields from a web page:

    from urllib.request import urlopen
    from bs4 import BeautifulSoup

    itemPage = 'https://dadosabertos.camara.leg.br/api/v2/legislaturas/1'
    url = urlopen(itemPage)
    soupItem = BeautifulSoup(url, 'lxml')
    dataInicio = soupItem.find('dataInicio')
    dataFim = soupItem.find('dataFim')

However, dataInicio and dataFim are empty. What am I doing wrong?

Tags: python, beautifulsoup

edited Mar 27 at 4:28 by ggorlen
asked Mar 27 at 1:13 by DanielTheRocketMan

1 Answer

There are a couple of issues here. First, soup expects a string as input; check your url and you'll see that it's actually <http.client.HTTPResponse object at 0x036D7770>. You can read() it, which produces a JSON byte string, which is usable. But if you'd prefer to stick with XML parsing, I'd recommend using the requests library to obtain a raw XML string (pass in the correct headers to request XML).

Second, when you create your soup object, you need to pass in features="xml" instead of "lxml".

Putting it all together:

    import requests
    from bs4 import BeautifulSoup

    item_page = "https://dadosabertos.camara.leg.br/api/v2/legislaturas/1"
    # Ask the API for XML; it returns JSON by default
    response = requests.get(item_page, headers={"accept": "application/xml"})
    # The "xml" parser keeps the original case of tag names
    soup = BeautifulSoup(response.text, "xml")

    data_inicio = soup.find("dataInicio")
    data_fim = soup.find("dataFim")
    print(data_inicio)
    print(data_fim)

Output:

    <dataInicio>1826-04-29</dataInicio>
    <dataFim>1830-04-24</dataFim>
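
To see why the parser choice matters, here is a minimal offline sketch. It uses Python's built-in html.parser module as a stand-in for an HTML parser such as the one selected by "lxml" (an assumption, made so the example needs no third-party packages), and a hand-written sample payload rather than the real API response. HTML parsers normalize tag names to lowercase, so a case-sensitive search for dataInicio finds nothing, while an XML parser keeps the original case. The helper names are hypothetical.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Hypothetical sample payload; the real API response has more fields.
SAMPLE = (
    "<dados>"
    "<dataInicio>1826-04-29</dataInicio>"
    "<dataFim>1830-04-24</dataFim>"
    "</dados>"
)


class TagCollector(HTMLParser):
    """Records every start tag name exactly as the HTML parser reports it."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def html_tag_names(markup):
    """Tag names as seen by an HTML parser (lowercased)."""
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return collector.tags


def xml_tag_names(markup):
    """Tag names as seen by an XML parser (case preserved)."""
    return [element.tag for element in ET.fromstring(markup).iter()]


print(html_tag_names(SAMPLE))  # ['dados', 'datainicio', 'datafim']
print(xml_tag_names(SAMPLE))   # ['dados', 'dataInicio', 'dataFim']
```

With an HTML parser, a lookup therefore has to use the lowercase name (for example "datainicio"); with an XML parser, the camelCase name from the document works as written.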






answered Mar 27 at 1:45 by ggorlen

• Nice answer, just a minor detail: BeautifulSoup accepts strings and file-like objects. HTTPResponse is such an object, and we can pass it to BeautifulSoup without reading it.
  – t.m.adam, Mar 27 at 12:11

• Sure, that seems accurate, but then OP's code should work by simply changing "lxml" to "xml" in the bs object, and it doesn't. Curious why.
  – ggorlen, Mar 27 at 15:09

• It does if you set the Accept header (we can set headers with urllib.request.Request()). It works with "lxml" too if you search for the lowercase tag names, because lxml converts all tag names to lowercase. But it may be best not to set the Accept header; the API will then return JSON and we can get the data with response.json()['dados']['dataFim'], without using Beautiful Soup.
  – t.m.adam, Mar 27 at 22:20

• Nice, thanks for the tip and clarification.
  – ggorlen, Mar 27 at 22:22
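
The last suggestion in the comments can be sketched as follows, using only the standard library. The payload shape (a top-level dados object) is taken from the comment's response.json()['dados']['dataFim']; the sample values come from the answer's output, and the helper names build_request and extract_dates are hypothetical. The network call is left commented out so the sketch runs offline.

```python
import json
from urllib.request import Request, urlopen

ITEM_PAGE = "https://dadosabertos.camara.leg.br/api/v2/legislaturas/1"


def build_request(url, accept="application/json"):
    """Build a request with an explicit Accept header (hypothetical helper)."""
    return Request(url, headers={"Accept": accept})


def extract_dates(payload):
    """Pull the two dates out of a decoded JSON payload."""
    dados = payload["dados"]
    return dados["dataInicio"], dados["dataFim"]


# Assumed sample body, trimmed to the two fields discussed in this thread.
SAMPLE_BODY = b'{"dados": {"dataInicio": "1826-04-29", "dataFim": "1830-04-24"}}'

request = build_request(ITEM_PAGE)
# With network access, the body would come from:
#     with urlopen(request) as response:
#         body = response.read()
body = SAMPLE_BODY
print(extract_dates(json.loads(body)))
```

Passing accept="application/xml" to build_request would ask for XML instead, which is the route the answer takes with requests.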
















