Web scraping google flight prices
I am trying to learn the Python library BeautifulSoup. As an example, I would like to scrape the price of a flight on Google Flights.
I opened Google Flights, for example at this link, and I want to get the cheapest flight price.
That means getting the value inside the div with the class "gws-flights-results__itinerary-price" (as in the figure).
Here is the simple code I wrote:
from bs4 import BeautifulSoup
import urllib.request
url = 'https://www.google.com/flights?hl=it#flt=/m/07_pf./m/05qtj.2019-04-27;c:EUR;e:1;sd:1;t:f;tt:o'
page = urllib.request.urlopen(url)
soup = BeautifulSoup(page, 'html.parser')
div = soup.find('div', attrs={'class': 'gws-flights-results__itinerary-price'})
But the resulting div is None (its type is NoneType).
I also tried find_all('div'), but none of the divs found this way was the one I was interested in.
Can someone help me?
python web-scraping beautifulsoup
edited Mar 28 at 23:00 by Hoppeduppeanut
asked Mar 28 at 21:39 by Andrea Barnabò
3 Answers
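It looks like JavaScript needs to run, so use a tool like Selenium:
from selenium import webdriver
from selenium.webdriver.common.by import By
url = 'https://www.google.com/flights?hl=it#flt=/m/07_pf./m/05qtj.2019-04-27;c:EUR;e:1;sd:1;t:f;tt:o'
driver = webdriver.Chrome()  # launches a local Chrome browser
driver.get(url)
# Selenium 4 removed find_element_by_css_selector; find_element(By.CSS_SELECTOR, ...) replaces it
print(driver.find_element(By.CSS_SELECTOR, '.gws-flights-results__cheapest-price').text)
driver.quit()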
answered Mar 28 at 21:49 by QHarr
Thanks for your answer! I did not consider Selenium. It works, thanks.
– Andrea Barnabò
Mar 28 at 23:13
It's great that you are learning web scraping! You are getting None because the site you are scraping loads its content dynamically. When urllib.request fetches the URL, the response it gets back is essentially just JavaScript, and the div with the class "gws-flights-results__itinerary-price" has not been rendered yet. So the approach you are using cannot scrape this page.
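One way to check this yourself is to test whether the price div exists in the HTML that a plain HTTP request receives. The sketch below uses a small hand-written stand-in for such a response (the `static_html` string and the `has_price_div` helper are assumptions for illustration, not Google's actual markup) and shows that the lookup finds nothing when the element is only created later by a script.

```python
from bs4 import BeautifulSoup

# Class name taken from the question
PRICE_CLASS = 'gws-flights-results__itinerary-price'

# Hypothetical stand-in for what a plain HTTP request receives:
# a page shell plus a script, with no price element in the markup yet.
static_html = """
<html>
  <body>
    <div id="app"></div>
    <script>/* JavaScript that would build the results in a browser */</script>
  </body>
</html>
"""


def has_price_div(html):
    """Return True if a div with the price class exists in the given HTML."""
    soup = BeautifulSoup(html, 'html.parser')
    return soup.find('div', class_=PRICE_CLASS) is not None


print(has_price_div(static_html))  # False: the div is not in the static markup
```

Running the same check on the HTML you actually downloaded tells you whether BeautifulSoup alone can ever find the element.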
However, you can use a tool such as Selenium or Splash to fetch the page, render the JavaScript, and then parse the resulting content.
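A minimal sketch of this render-then-parse approach, assuming Selenium 4 and Chrome are installed. The helper names `extract_prices`, `wait_for_prices`, and `scrape`, the timeout, and the polling interval are illustrative choices, and the class name is the one from the question, which Google may have changed since. It polls `driver.page_source` until BeautifulSoup finds at least one price div or the timeout passes.

```python
import time

from bs4 import BeautifulSoup

# Class name taken from the question; it may no longer match the live site
PRICE_CLASS = 'gws-flights-results__itinerary-price'


def extract_prices(html):
    """Return the text of every price div found in already-rendered HTML."""
    soup = BeautifulSoup(html, 'html.parser')
    return [div.get_text(strip=True)
            for div in soup.find_all('div', class_=PRICE_CLASS)]


def wait_for_prices(driver, timeout=15.0, poll=0.5):
    """Poll the browser's current DOM until prices appear or the timeout passes."""
    deadline = time.monotonic() + timeout
    while True:
        prices = extract_prices(driver.page_source)
        if prices or time.monotonic() >= deadline:
            return prices  # empty list if nothing showed up in time
        time.sleep(poll)


def scrape(url):
    """Open the page in Chrome, let JavaScript render, then parse with BeautifulSoup."""
    # Imported here so the parsing helpers above work without Selenium installed
    from selenium import webdriver
    driver = webdriver.Chrome()
    try:
        driver.get(url)
        return wait_for_prices(driver)
    finally:
        driver.quit()  # always close the browser, even on errors
```

Calling `scrape(url)` with the URL from the question would open a Chrome window and return a list of price strings, or an empty list if no matching div appeared before the timeout.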
edited Mar 28 at 22:15
answered Mar 28 at 22:06 by Hoossain Khan
Hi Khan, thanks for your comment. I didn't know about Splash. I will look into it.
– Andrea Barnabò
Mar 30 at 18:30
BeautifulSoup is a great tool for extracting parts of HTML or XML, but here it looks like you only need to find the URL of another GET request that returns a JSON object.
(I am not by a computer now, can update with an example tomorrow.)
answered Mar 28 at 22:00 by Punnerud
Thanks for your answer. I resolved my question with the code posted by QHarr. Thanks again.
– Andrea Barnabò
Mar 28 at 23:14