
Screen scraping using Beautiful soup


I am trying to extract some information from a website. I need to click on a link that is inside an 'a' tag. I am able to get to the tag, but when I try to click on it I get the error 'NoneType' object is not callable.

from selenium import webdriver
import time
from bs4 import BeautifulSoup
import pandas as pd

browser = webdriver.Chrome()
browser.get("url")
browser.find_element_by_class_name('formButton').click()
soup = BeautifulSoup(browser.page_source, 'html.parser')

embargo = soup.find_all(class_="dataOff")

for row in embargo:
 cells = row.find_all("td")
 rail = cells[0].get_text().strip()
 embargo = cells[1].find_element_by_class_name('dataOff').click()

Here is the HTML containing the tag I want Beautiful Soup to click on.

<table class="dataLiquidTable">
<tr id = "headerRow> .... </tr>
<tr class = "dataOff">
<td> AO </td>
<td> <a href="url"> </a> </td>

The code should click the link which is inside the 'a' tag.

edited Mar 26 at 15:26

QHarr

49.2k reputation, 9 gold badges, 28 silver badges, 51 bronze badges

asked Mar 26 at 15:20

Unicorn-17

1 reputation, 1 bronze badge

It won't help that the HTML is broken at "headerRow (the closing quote is missing). Also, it looks like you are trying to find "dataOff" within "dataOff". That will fail: according to your fragment, "dataOff" is only on the row.

– Andy G
Mar 26 at 15:22

The header row contains the heading of the table.

– Unicorn-17
Mar 26 at 15:32

python html web-scraping beautifulsoup


1 Answer

Try the following, which targets the first a tag inside an element with class dataOff in the table:

browser.find_element_by_css_selector(".dataLiquidTable .dataOff a").click()

It looks like you may want multiple links, in which case try extracting the links first (hopefully they are valid URLs):

links = [item.get_attribute('href') for item in browser.find_elements_by_css_selector(".dataLiquidTable .dataOff a")]
for link in links:
 browser.get(link)

You would then join the info you get from those pages with the info from the start of your code, assuming the returned lists have the same length.
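One way to keep the two lists the same length is to read the rail code and the link from the same row. This is only a rough sketch, assuming each dataOff row holds the code in its first cell and the link in its second. The `PAGE_SOURCE` string stands in for `browser.page_source`, and the markup, `BASE_URL`, and the `rail_links` helper are invented for illustration.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

# Stand-in for browser.page_source; markup and URLs are made up for this sketch.
PAGE_SOURCE = """
<table class="dataLiquidTable">
  <tr id="headerRow"><td>Rail</td><td>Embargo</td></tr>
  <tr class="dataOff"><td> AO </td><td><a href="/embargo?id=1">view</a></td></tr>
  <tr class="dataOff"><td> BN </td><td><a href="/embargo?id=2">view</a></td></tr>
</table>
"""
BASE_URL = "https://example.com/list"  # hypothetical address of the listing page


def rail_links(html, base_url):
    """Return (rail, absolute_url) pairs, one per dataOff row that has a link."""
    soup = BeautifulSoup(html, "html.parser")
    pairs = []
    for row in soup.find_all("tr", class_="dataOff"):
        cells = row.find_all("td")
        if len(cells) < 2:
            continue  # not a data row in the assumed layout
        link = cells[1].find("a", href=True)
        if link is None:
            continue  # row without a link
        rail = cells[0].get_text(strip=True)
        # Resolve relative hrefs against the page address.
        pairs.append((rail, urljoin(base_url, link["href"])))
    return pairs


pairs = rail_links(PAGE_SOURCE, BASE_URL)
for rail, url in pairs:
    print(rail, url)
    # With a live Selenium session, browser.get(url) would go here.
```

Because each pair comes from a single row, a row without a link is skipped rather than shifting the two lists out of step. The first stored link is then simply `pairs[0][1]`.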

I am not sure

embargo = cells[1].find_element_by_class_name('dataOff').click()

is valid, as it performs an action yet you attempt to assign its result. I assume you want to go to a new page; please clarify if that is not the case. That is the step I am replacing by gathering the links from the a tag elements to use as required.

Otherwise, you can always gather the WebElements with:

elems = browser.find_elements_by_css_selector(".dataLiquidTable .dataOff a")
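As for the 'NoneType' object is not callable error in the question: as far as I can tell, it happens because `cells[1]` is a BeautifulSoup `Tag`, not a Selenium element. A `Tag` treats an unknown attribute name as a search for a child tag of that name, so `find_element_by_class_name` quietly comes back as `None`, and calling it raises the error. A minimal sketch that reproduces this without a browser, using an invented one-row HTML string:

```python
from bs4 import BeautifulSoup

# One row in the shape of the question's fragment (made up for this sketch).
ROW = '<tr class="dataOff"><td> AO </td><td><a href="url">sample text</a></td></tr>'

soup = BeautifulSoup(ROW, "html.parser")
cell = soup.find_all("td")[1]

# Unknown attribute names on a Tag are looked up as child tags, so this is
# equivalent to cell.find("find_element_by_class_name") and yields None.
lookup = cell.find_element_by_class_name
print(lookup)

try:
    lookup("dataOff")  # calling None
    error = None
except TypeError as exc:
    error = str(exc)
print(error)

# BeautifulSoup can read the link, but it has no notion of clicking.
href = cell.find("a")["href"]
print(href)
```

BeautifulSoup only parses the HTML it is given, so clicking has to be done through the Selenium `browser` object, or the `href` can be read and passed to `browser.get`.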

edited Mar 26 at 16:44

answered Mar 26 at 15:24

QHarr

49.2k reputation, 9 gold badges, 28 silver badges, 51 bronze badges

Where would this line go in the code?

– Unicorn-17
Mar 26 at 15:30

erm... what is your question please?

– QHarr
Mar 26 at 16:50

<td> <a href="url"> sample text </a> </td>. I want the code to click on this, e.g. click on the sample text, which will take me to the new page. Also, embargo = cells[1].find_element_by_class_name('dataOff').click() is not working; it fails with the error 'NoneType' object is not callable.

– Unicorn-17
Mar 26 at 16:54

Then see my section where I extract all the links into a list that you can try to .get.

– QHarr
Mar 26 at 16:57

How can I click on the first link stored in the list?

– Unicorn-17
Mar 26 at 17:06


