soup.findAll returning empty list

I am trying to scrape a page with BeautifulSoup, but findAll returns an empty list:

from urllib.request import urlopen as uReq
from bs4 import BeautifulSoup as soup

my_url='https://www.sainsburys.co.uk/webapp/wcs/stores/servlet/SearchDisplayView?catalogId=10123&langId=44&storeId=10151&krypto=70KutR16JmLgr7Ka%2F385RFXrzDpOkSqx%2FRC3DnlU09%2BYcw0pR5cfIfC0kOlQywiD%2BTEe7ppq8ENXglbpqA8sDUtif1h3ZjrEoQkV29%2B90iqljHi2gm2T%2BDZHH2%2FCNeKB%2BkVglbz%2BNx1bKsSfE5L6SVtckHxg%2FM%2F%2FVieWp8vgaJTan0k1WrPjCrVuDs5WnbRN#langId=44&storeId=10151&catalogId=10123&categoryId=&parent_category_rn=&top_category=&pageSize=60&orderBy=RELEVANCE&searchTerm=milk&beginIndex=0&hideFilters=true&categoryFacetId1='

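# Download the raw HTML of the search results page
uClient = uReq(my_url)
page_html = uClient.read()
uClient.close()

# Parse the response and look for the product containers
page_soup = soup(page_html, 'html.parser')

containers = page_soup.findAll("div", {"class": "product"})
containers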

I also got empty results after trying the suggestions in these questions:
findAll returning empty for html

and BeautifulSoup find_all() returns no data

Can anyone offer any help?

edited Mar 28 at 9:32

asked Mar 27 at 22:55

alex

I think you just got unlucky. Look at the page source. You'll notice that for "product" there is a rogue space after the name: class="product ", which means you are referencing a class that doesn't exist. If you press Ctrl+F and search for class="product", you'll find 0 results, but for class="product " you'll find 54.

– Recessive
Mar 27 at 23:17

Please don't post pictures of code. Use the snippet tool (via edit) to include HTML; for Python code, paste it in, select it, and press Ctrl + K.

– QHarr
Mar 28 at 3:02

Noted. I have removed the pictures of code.

– alex
Mar 28 at 9:33
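
On the trailing space raised in the comments: as far as I know, BeautifulSoup treats class as a multi-valued attribute and splits it on whitespace, so a trailing space on its own should not stop "product" from matching. Here is a minimal sketch to check this. The HTML below is made up for illustration and is not the actual Sainsbury's markup.

```python
from bs4 import BeautifulSoup

# Hypothetical markup imitating the class values mentioned in this thread
html = """
<div class="product ">Whole milk</div>
<div class="product hl-product highlighted">Skimmed milk</div>
<div class="productInfo">Not a product container</div>
"""

page_soup = BeautifulSoup(html, "html.parser")

# The class attribute is compared token by token, so both
# "product " and "product hl-product highlighted" contain the token "product"
containers = page_soup.find_all("div", {"class": "product"})
print(len(containers))
for div in containers:
    print(div["class"], div.get_text())
```

If the first two divs are reported and the "productInfo" one is not, the empty result on the real page is more likely caused by something other than the space.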

1 Answer

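The page content is loaded with JavaScript, so the HTML that urlopen returns does not contain the product markup and BeautifulSoup alone has nothing to find. You have to use another module such as Selenium, which drives a real browser that executes the JavaScript, and then parse the rendered page source.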

Here is an example:

from bs4 import BeautifulSoup as soup
from selenium import webdriver

url='https://www.sainsburys.co.uk/webapp/wcs/stores/servlet/SearchDisplayView?catalogId=10123&langId=44&storeId=10151&krypto=70KutR16JmLgr7Ka%2F385RFXrzDpOkSqx%2FRC3DnlU09%2BYcw0pR5cfIfC0kOlQywiD%2BTEe7ppq8ENXglbpqA8sDUtif1h3ZjrEoQkV29%2B90iqljHi2gm2T%2BDZHH2%2FCNeKB%2BkVglbz%2BNx1bKsSfE5L6SVtckHxg%2FM%2F%2FVieWp8vgaJTan0k1WrPjCrVuDs5WnbRN#langId=44&storeId=10151&catalogId=10123&categoryId=&parent_category_rn=&top_category=&pageSize=60&orderBy=RELEVANCE&searchTerm=milk&beginIndex=0&hideFilters=true&categoryFacetId1='

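# Launch Firefox (requires geckodriver on PATH) and load the page
driver = webdriver.Firefox()
driver.get(url)

# Grab the HTML as rendered by the browser, then close the browser
page = driver.page_source
driver.quit()
page_soup = soup(page, 'html.parser')

containers = page_soup.findAll("div", {"class": "product"})
print(containers)
print(len(containers))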

OUTPUT:

[
<div class="product "> ...
...,
<div class="product hl-product hookLogic highlighted straplineRow" ... 
]

64
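
To see why the plain urlopen version comes back empty, it can help to compare what the parser finds in the static response with what it finds in the rendered page. The sketch below uses two hypothetical HTML strings as stand-ins (static_html for the urlopen response, rendered_html for driver.page_source), and count_matches is a helper name made up for this example.

```python
from bs4 import BeautifulSoup

def count_matches(html, tag, css_class):
    """Count the elements of the given tag that carry the given class token."""
    parsed = BeautifulSoup(html, "html.parser")
    return len(parsed.find_all(tag, {"class": css_class}))

# Hypothetical static response: an empty placeholder plus a script that fills it later
static_html = (
    '<html><body><div id="results"></div>'
    '<script src="app.js"></script></body></html>'
)

# Hypothetical rendered page: the script has inserted the product containers
rendered_html = (
    '<html><body><div id="results">'
    '<div class="product ">Milk</div>'
    '</div></body></html>'
)

static_count = count_matches(static_html, "div", "product")
rendered_count = count_matches(rendered_html, "div", "product")
print("static:", static_count, "rendered:", rendered_count)
```

If the count is zero for the downloaded HTML but non-zero for the browser's page source, the markup is being added by JavaScript.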

answered Mar 28 at 8:40

Maaz

Unfortunately, I am having issues installing selenium: WebDriverException: Message: 'geckodriver' executable needs to be in PATH. Hoping once I solve that, I can accept your answer

– alex
Mar 28 at 23:29

You can check here for this: stackoverflow.com/questions/40208051/…

– Maaz
Mar 29 at 7:45
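
The error in the comment above means the geckodriver executable could not be located through the PATH environment variable. As a quick check before starting Selenium, the standard library can report whether it is discoverable. This is only a sketch, and find_driver is a helper name made up for this example.

```python
import shutil

def find_driver(name="geckodriver"):
    """Return the full path of the executable if it is on PATH, otherwise None."""
    return shutil.which(name)

driver_path = find_driver()
if driver_path is None:
    print("geckodriver was not found on PATH; add its folder to PATH first")
else:
    print("geckodriver found at", driver_path)
```

If this reports that the driver was not found, webdriver.Firefox() will most likely raise the same WebDriverException.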
