


Screen scraping using Beautiful soup

















I am trying to extract some information from a website. I need to click a link that is inside an 'a' tag. I am able to get to the tag, but when I try to click it I get the error 'NoneType' object is not callable.



    from selenium import webdriver
    import time
    from bs4 import BeautifulSoup
    import pandas as pd

    browser = webdriver.Chrome()
    browser.get("url")
    browser.find_element_by_class_name('formButton').click()
    soup = BeautifulSoup(browser.page_source, 'html.parser')

    embargo = soup.find_all(class_="dataOff")

    for row in embargo:
        cells = row.find_all("td")
        rail = cells[0].get_text().strip()
        embargo = cells[1].find_element_by_class_name('dataOff').click()


Here is the HTML containing the link I want Beautiful Soup to click on.



    <table class="dataLiquidTable">
      <tr id = "headerRow> .... </tr>
      <tr class = "dataOff">
        <td> AO </td>
        <td> <a href="url"> </a> </td>


The code should click the link which is inside the 'a' tag.


































  • It won't help that the HTML is broken with "headerRow. Also, it looks like you are trying to find "dataOff" within "dataOff". This will fail: according to your fragment, "dataOff" is only on the row.

    – Andy G
    Mar 26 at 15:22












  • The header row contains the heading of the table.

    – Unicorn-17
    Mar 26 at 15:32
























python html web-scraping beautifulsoup














asked Mar 26 at 15:20 by Unicorn-17
edited Mar 26 at 15:26 by QHarr
























1 Answer














Try the following, which targets the first a tag inside an element with class dataOff in the table:



    browser.find_element_by_css_selector(".dataLiquidTable .dataOff a").click()


It looks like you may want multiple links, in which case try extracting the links first (hopefully they are valid URLs):



    links = [item.get_attribute('href') for item in browser.find_elements_by_css_selector(".dataLiquidTable .dataOff a")]
    for link in links:
        browser.get(link)


You would then join the info you get from those pages with the info from the start of your code, assuming the returned lists are the same length.
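As a rough sketch of that join (not part of the original code): assuming you have collected the rail codes and the links into two lists, you can pair them with zip once you have checked that the lengths match. The function name join_rows and every sample value other than AO are made up for illustration.

```python
def join_rows(rails, links):
    """Pair each rail code with the link scraped from the same table row."""
    # zip() silently drops extra items, so check the lengths explicitly first.
    if len(rails) != len(links):
        raise ValueError(
            f"expected equal lengths, got {len(rails)} rails and {len(links)} links"
        )
    return list(zip(rails, links))


# Made-up sample data standing in for the scraped values.
rails = ["AO", "BN"]
links = ["https://example.com/ao", "https://example.com/bn"]

pairs = join_rows(rails, links)
for rail, link in pairs:
    print(rail, link)
```

Raising on a length mismatch is a deliberate choice here: a row without a link would otherwise shift every later pairing without any warning.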



I am not sure that

    embargo = cells[1].find_element_by_class_name('dataOff').click()

is valid, as it performs an action yet you attempt to assign the result. I assume you want to go to a new page; please clarify if that is not the case. That is the step I am replacing by gathering the links from the a tag elements to use as required.
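For what it's worth, here is a small sketch of where the 'NoneType' object is not callable message most likely comes from. cells[1] is a Beautiful Soup tag, not a Selenium element, and Beautiful Soup treats an attribute name it does not recognise as a search for a child tag with that name, returning None when there is no such child. FakeTag below is a hypothetical stand-in that mimics only that behaviour so the snippet runs without any libraries; it is not the real Beautiful Soup class.

```python
class FakeTag:
    """Hypothetical stand-in for a Beautiful Soup tag (not the real class)."""

    def __init__(self, children):
        # Maps a child tag name to some child object.
        self.children = children

    def find(self, name):
        # Return the named child, or None when there is no such child.
        return self.children.get(name)

    def __getattr__(self, name):
        # Only reached for attributes that do not exist on the object:
        # the unknown name is treated as a child-tag lookup.
        return self.find(name)


cell = FakeTag({"a": "<a href='url'>"})

# There is no child tag called "find_element_by_class_name", so this is None.
lookup = cell.find_element_by_class_name

try:
    lookup("dataOff")  # calling None raises TypeError
except TypeError as exc:
    message = str(exc)
    print(message)
```

So the Selenium method has to be called on something that came from browser, not on something that came from soup.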



Otherwise, you can always gather the webElements with



    elems = browser.find_elements_by_css_selector(".dataLiquidTable .dataOff a")






























  • Where would this line go in the code?

    – Unicorn-17
    Mar 26 at 15:30











  • erm... what is your question please?

    – QHarr
    Mar 26 at 16:50











  • <td> <a href="url"> sample text </a> </td>. I want the code to click on this, e.g. click on the sample text, which will take me to the new page. Also, embargo = cells[1].find_element_by_class_name('dataOff').click() is not working; it fails with the error 'NoneType' object is not callable.

    – Unicorn-17
    Mar 26 at 16:54











  • Then see the section of my answer where I extract all the links into a list that you can try to .get.

    – QHarr
    Mar 26 at 16:57











  • How can I click on the first link stored in the list?

    – Unicorn-17
    Mar 26 at 17:06


















answered Mar 26 at 15:24 by QHarr
edited Mar 26 at 16:44











