Web scraping from remax.com


I am trying to scrape data from Remax.com, such as the lot size or square footage of a property. However, I get the following errors:



---------------------------------------------------------------------------
Error                                     Traceback (most recent call last)
~\AppData\Local\Continuum\anaconda3\lib\site-packages\urllib3\contrib\pyopenssl.py in wrap_socket(self, sock, server_side, do_handshake_on_connect, suppress_ragged_eofs, server_hostname)
    440 try:
--> 441 cnx.do_handshake()
    442 except OpenSSL.SSL.WantReadError:

~\AppData\Local\Continuum\anaconda3\lib\site-packages\OpenSSL\SSL.py in do_handshake(self)
   1715 result = _lib.SSL_do_handshake(self._ssl)
-> 1716 self._raise_ssl_error(self._ssl, result)
   1717

~\AppData\Local\Continuum\anaconda3\lib\site-packages\OpenSSL\SSL.py in _raise_ssl_error(self, ssl, result)
   1455 else:
-> 1456 _raise_current_error()
   1457

~\AppData\Local\Continuum\anaconda3\lib\site-packages\OpenSSL\_util.py in exception_from_error_queue(exception_type)
     53
---> 54 raise exception_type(errors)
     55

Error: [('SSL routines', 'ssl3_get_server_certificate', 'certificate verify failed')]

During handling of the above exception, another exception occurred:

SSLError                                  Traceback (most recent call last)
~\AppData\Local\Continuum\anaconda3\lib\site-packages\urllib3\connectionpool.py in urlopen(self, method, url, body, headers, retries, redirect, assert_same_host, timeout, pool_timeout, release_conn, chunked, body_pos, **response_kw)
    600 body=body, headers=headers,
--> 601 chunked=chunked)
    602

~\AppData\Local\Continuum\anaconda3\lib\site-packages\urllib3\connectionpool.py in _make_request(self, conn, method, url, timeout, chunked, **httplib_request_kw)
    345 try:
--> 346 self._validate_conn(conn)
    347 except (SocketTimeout, BaseSSLError) as e:

~\AppData\Local\Continuum\anaconda3\lib\site-packages\urllib3\connectionpool.py in _validate_conn(self, conn)
    849 if not getattr(conn, 'sock', None): # AppEngine might not have `.sock`
--> 850 conn.connect()
    851

~\AppData\Local\Continuum\anaconda3\lib\site-packages\urllib3\connection.py in connect(self)
    325 server_hostname=hostname,
--> 326 ssl_context=context)
    327

~\AppData\Local\Continuum\anaconda3\lib\site-packages\urllib3\util\ssl_.py in ssl_wrap_socket(sock, keyfile, certfile, cert_reqs, ca_certs, server_hostname, ssl_version, ciphers, ssl_context, ca_cert_dir)
    328 if HAS_SNI: # Platform-specific: OpenSSL with enabled SNI
--> 329 return context.wrap_socket(sock, server_hostname=server_hostname)
    330

~\AppData\Local\Continuum\anaconda3\lib\site-packages\urllib3\contrib\pyopenssl.py in wrap_socket(self, sock, server_side, do_handshake_on_connect, suppress_ragged_eofs, server_hostname)
    447 except OpenSSL.SSL.Error as e:
--> 448 raise ssl.SSLError('bad handshake: %r' % e)
    449 break

SSLError: ("bad handshake: Error([('SSL routines', 'ssl3_get_server_certificate', 'certificate verify failed')],)",)

During handling of the above exception, another exception occurred:

MaxRetryError                             Traceback (most recent call last)
~\AppData\Local\Continuum\anaconda3\lib\site-packages\requests\adapters.py in send(self, request, stream, timeout, verify, cert, proxies)
    439 retries=self.max_retries,
--> 440 timeout=timeout
    441 )

~\AppData\Local\Continuum\anaconda3\lib\site-packages\urllib3\connectionpool.py in urlopen(self, method, url, body, headers, retries, redirect, assert_same_host, timeout, pool_timeout, release_conn, chunked, body_pos, **response_kw)
    638 retries = retries.increment(method, url, error=e, _pool=self,
--> 639 _stacktrace=sys.exc_info()[2])
    640 retries.sleep()

~\AppData\Local\Continuum\anaconda3\lib\site-packages\urllib3\util\retry.py in increment(self, method, url, response, error, _pool, _stacktrace)
    387 if new_retry.is_exhausted():
--> 388 raise MaxRetryError(_pool, url, error or ResponseError(cause))
    389

MaxRetryError: HTTPSConnectionPool(host='www.remax.com', port=443): Max retries exceeded with url: /api/listings?nwlat=33.8426971435546875&nwlong=-118.3811187744140625&selat=33.8426971435546875&selong=-118.3783721923828125&Count=100&pagenumber=1&SiteID=68000000&pageCount=10&tab=map&sh=true&forcelatlong=true&maplistings=1&maplistcards=0&sv=true&sortorder=newest&view=forsale (Caused by SSLError(SSLError("bad handshake: Error([('SSL routines', 'ssl3_get_server_certificate', 'certificate verify failed')],)",),))

During handling of the above exception, another exception occurred:

SSLError                                  Traceback (most recent call last)
<ipython-input-22-bcfdfdfb0a4e> in <module>()
----> 1 get_info('119 S IRENA AVE B, Redondo Beach, CA 90277')

<ipython-input-21-f3c942a87400> in get_info(address)
     32 }
     33 # proxies = {'http': 'http://user:pass@10.10.1.10:3128/'}
---> 34 req_properties = requests.get("https://www.remax.com/api/listings", params=params)
     35 matching_properties_json = req_properties.json()
     36 for p in matching_properties_json[0]:

~\AppData\Local\Continuum\anaconda3\lib\site-packages\requests\api.py in get(url, params, **kwargs)
     70
     71 kwargs.setdefault('allow_redirects', True)
---> 72 return request('get', url, params=params, **kwargs)
     73
     74

~\AppData\Local\Continuum\anaconda3\lib\site-packages\requests\api.py in request(method, url, **kwargs)
     56 # cases, and look like a memory leak in others.
     57 with sessions.Session() as session:
---> 58 return session.request(method=method, url=url, **kwargs)
     59
     60

~\AppData\Local\Continuum\anaconda3\lib\site-packages\requests\sessions.py in request(self, method, url, params, data, headers, cookies, files, auth, timeout, allow_redirects, proxies, hooks, stream, verify, cert, json)
    506 }
    507 send_kwargs.update(settings)
--> 508 resp = self.send(prep, **send_kwargs)
    509
    510 return resp

~\AppData\Local\Continuum\anaconda3\lib\site-packages\requests\sessions.py in send(self, request, **kwargs)
    616
    617 # Send the request
--> 618 r = adapter.send(request, **kwargs)
    619
    620 # Total elapsed time of the request (approximately)

~\AppData\Local\Continuum\anaconda3\lib\site-packages\requests\adapters.py in send(self, request, stream, timeout, verify, cert, proxies)
    504 if isinstance(e.reason, _SSLError):
    505 # This branch is for urllib3 v1.22 and later.
--> 506 raise SSLError(e, request=request)
    507
    508 raise ConnectionError(e, request=request)

SSLError: HTTPSConnectionPool(host='www.remax.com', port=443): Max retries exceeded with url: /api/listings?nwlat=33.8426971435546875&nwlong=-118.3811187744140625&selat=33.8426971435546875&selong=-118.3783721923828125&Count=100&pagenumber=1&SiteID=68000000&pageCount=10&tab=map&sh=true&forcelatlong=true&maplistings=1&maplistcards=0&sv=true&sortorder=newest&view=forsale (Caused by SSLError(SSLError("bad handshake: Error([('SSL routines', 'ssl3_get_server_certificate', 'certificate verify failed')],)",),))
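
The traceback is a single chain of four exceptions, each raised while the previous one was being handled. The first one (`certificate verify failed`) is the root cause; the final `SSLError` only wraps it. A minimal sketch of walking such a chain back to its root, using only the standard library. `root_cause` and `simulate_chain` are hypothetical helper names, and the exceptions raised here are stand-ins, not the real urllib3 or requests classes:

```python
def root_cause(exc):
    """Follow __cause__ / __context__ links back to the first exception raised."""
    seen = set()
    while id(exc) not in seen:  # guard against a cyclic chain
        seen.add(id(exc))
        previous = exc.__cause__ or exc.__context__
        if previous is None:
            break
        exc = previous
    return exc


def simulate_chain():
    """Raise one error while handling another, as the traceback above does."""
    try:
        try:
            raise OSError("certificate verify failed")
        except OSError:
            # Raising inside an except block links the two via __context__.
            raise ConnectionError("Max retries exceeded")
    except ConnectionError as outer:
        return outer


outer = simulate_chain()
print(type(root_cause(outer)).__name__, "-", root_cause(outer))
```

With a real failure, the same helper could be applied to the exception caught from `requests.get(...)` to surface the innermost message first.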


Here is my code:



import urllib
from bs4 import BeautifulSoup
import pandas as pd
import geopy
from geopy.geocoders import Nominatim
import geolib
from geolib import geohash
from geopy.extra.rate_limiter import RateLimiter
import requests

geolocator = Nominatim(timeout=None)

def get_dir(address):
    # Geocode the address, then use neighbouring geohash cells as box corners.
    location = geolocator.geocode(address)
    lat = location.latitude
    lng = location.longitude
    h = geolib.geohash.encode(lat, lng, 7)
    hashes = geolib.geohash.neighbours(h)
    NW = geohash.decode(hashes.nw)
    # Note: hashes.ne is the north-east neighbour, so NW and SE share a latitude.
    SE = geohash.decode(hashes.ne)
    nwlat = NW.lat
    nwlon = NW.lon
    selat = SE.lat
    selon = SE.lon
    return nwlat, nwlon, selat, selon

def get_info(address):
    try:
        nwlat, nwlon, selat, selon = get_dir(address)
        params = {
            "nwlat" : nwlat,
            "nwlong" : nwlon,
            "selat" : selat,
            "selong" : selon,
            "Count" : 100,
            "pagenumber" : 1,
            "SiteID" : "68000000",
            "pageCount" : "10",
            "tab" : "map",
            "sh" : "true",
            "forcelatlong" : "true",
            "maplistings" : "1",
            "maplistcards" : "0",
            "sv" : "true",
            "sortorder" : "newest",
            "view" : "homeestimates",
        }
        proxies = {'http': 'http://user:pass@10.10.1.10:3128/'}
        req_properties = requests.get("https://www.remax.com/api/listings", params=params, proxies=proxies, verify=False)
        matching_properties_json = req_properties.json()
        # Print one line per listing in the first element of the response.
        for p in matching_properties_json[0]:
            print(f"{p['Address']:<40} {p.get('BedRooms', 0)} beds | {int(p.get('BathRooms',0))} baths | {p['SqFt']} sqft")
    except (AttributeError):
        return 'NaN'

x = get_info('693 Bluebird Canyon Drive, Laguna Beach CA, 92651')
print(x)
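
To check what the request actually asks for, it can help to build the bounding box and query string without any network call. The sketch below is an illustration under assumptions: it swaps the geohash neighbours for a plain fixed offset in degrees, `bounding_box` and its `delta` default are hypothetical, and the coordinates are rounded from the URL in the traceback. The printed URL should resemble the one requests builds from `params`:

```python
from urllib.parse import urlencode


def bounding_box(lat, lng, delta=0.001):
    """Return (nwlat, nwlon, selat, selon) for a square of half-width delta degrees."""
    # North-west corner: larger latitude, smaller (more westerly) longitude.
    # South-east corner: smaller latitude, larger (more easterly) longitude.
    return lat + delta, lng - delta, lat - delta, lng + delta


nwlat, nwlon, selat, selon = bounding_box(33.8427, -118.3811)
params = {
    "nwlat": nwlat,
    "nwlong": nwlon,
    "selat": selat,
    "selong": selon,
    "Count": 100,
    "pagenumber": 1,
}
print("https://www.remax.com/api/listings?" + urlencode(params))
```

A box whose two corners have the same latitude, as in the URL in the traceback, has zero height, which is worth ruling out when a request succeeds but returns nothing.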


I am new to web scraping and am not sure how to fix this problem. I tried adding a proxy in the code, but I still get the same errors as shown above.
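
For context on the error itself: `certificate verify failed` means the certificate chain presented during the TLS handshake could not be validated against the CA certificates that Python trusts. A minimal sketch with the standard-library `ssl` module (not requests, which the code above uses) that shows where the default trust store lives and builds a context that keeps verification on. `make_context` is a hypothetical helper name, and `cafile` is whatever PEM bundle you trust:

```python
import ssl


def make_context(cafile=None):
    """Build a client TLS context that verifies certificates and host names.

    cafile: optional path to a PEM bundle of trusted CAs; None uses the defaults.
    """
    return ssl.create_default_context(cafile=cafile)


ctx = make_context()
print("verifies certificates:", ctx.verify_mode == ssl.CERT_REQUIRED)
print("checks host name:", ctx.check_hostname)
print(ssl.get_default_verify_paths())  # where the default CA files are looked up
```

In requests, the comparable setting is the `verify` argument, which accepts a path to a CA bundle as an alternative to `verify=False`.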



Update:

Adding

proxies = {'http': 'http://user:pass@10.10.1.10:3128/'}
req_properties = requests.get("https://www.remax.com/api/listings", params=params, proxies=proxies, verify=False)

yields no errors, but also no output at all.
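
One way to tell "the request returned nothing" apart from "nothing was printed" is to separate formatting from printing. In `get_info` as written, the loop prints each listing but the function returns nothing on success, so `print(x)` would show `None`. The sketch below assumes the response is a list whose first element is a list of listing dicts with the keys used in the question (`Address`, `BedRooms`, `BathRooms`, `SqFt`). `format_listings` is a hypothetical helper, and the sample record is made up, not real Remax output:

```python
def format_listings(payload):
    """Return one formatted line per listing in payload[0]; empty list if none."""
    if not payload or not payload[0]:
        return []
    return [
        f"{p['Address']:<40} {p.get('BedRooms', 0)} beds | "
        f"{int(p.get('BathRooms', 0))} baths | {p['SqFt']} sqft"
        for p in payload[0]
    ]


# Made-up record for illustration only.
sample = [[{"Address": "1 Example St", "BedRooms": 3, "BathRooms": 2.0, "SqFt": 1500}]]
for line in format_listings(sample):
    print(line)

print(format_listings([[]]))  # an empty result yields an empty list, not silence
```

Returning the lines (or the empty list) makes it visible whether the API answered with zero listings.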










  • Are you using a proxy?

    – Nipun Wijerathne
    Mar 25 at 16:41

  • @NipunWijerathne I do not believe so. How would I know?

    – Wolfy
    Mar 25 at 16:41

  • Take a look at this answer, which pertains to the error you are getting: stackoverflow.com/questions/10667960/…

    – Bert
    Mar 25 at 17:18

  • @Bert I tried following the suggestions, but they still don't solve the problem. Thank you for the comment, though.

    – Wolfy
    Mar 25 at 17:24

  • Does the code in the original question return results for you? If so, the issue should not be proxy related.

    – Martin Evans
    Mar 26 at 10:13
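
On the "how would I know?" question in the comments: the standard library can report any proxy configured through environment variables or system settings. The sketch below also uses a simplified scheme-only lookup to show why a mapping with just an `'http'` key does not apply to an `https://` URL; requests selects proxies by URL scheme too, though it accepts more specific keys as well. `proxy_for` is a hypothetical helper name:

```python
from urllib.parse import urlsplit
from urllib.request import getproxies


def proxy_for(url, proxies=None):
    """Return the proxy URL configured for url's scheme, or None if there is none."""
    if proxies is None:
        proxies = getproxies()  # environment variables / system settings
    return proxies.get(urlsplit(url).scheme)


# Proxies detected on this machine (an empty dict means none were found).
print(getproxies())

# The mapping from the question has no 'https' entry, so nothing is selected.
question_proxies = {"http": "http://user:pass@10.10.1.10:3128/"}
print(proxy_for("https://www.remax.com/api/listings", question_proxies))
```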















I am trying to scrape some data from Remax.com for information like lotsize or square feet of property. Although I am get the following errors:



---------------------------------------------------------------------------
Error Traceback (most recent call last)
~AppDataLocalContinuumanaconda3libsite-packagesurllib3contribpyopenssl.py in wrap_socket(self, sock, server_side, do_handshake_on_connect, suppress_ragged_eofs, server_hostname)
440 try:
--> 441 cnx.do_handshake()
442 except OpenSSL.SSL.WantReadError:

~AppDataLocalContinuumanaconda3libsite-packagesOpenSSLSSL.py in do_handshake(self)
1715 result = _lib.SSL_do_handshake(self._ssl)
-> 1716 self._raise_ssl_error(self._ssl, result)
1717

~AppDataLocalContinuumanaconda3libsite-packagesOpenSSLSSL.py in _raise_ssl_error(self, ssl, result)
1455 else:
-> 1456 _raise_current_error()
1457

~AppDataLocalContinuumanaconda3libsite-packagesOpenSSL_util.py in exception_from_error_queue(exception_type)
53
---> 54 raise exception_type(errors)
55

Error: [('SSL routines', 'ssl3_get_server_certificate', 'certificate verify failed')]

During handling of the above exception, another exception occurred:

SSLError Traceback (most recent call last)
~\AppData\Local\Continuum\anaconda3\lib\site-packages\urllib3\connectionpool.py in urlopen(self, method, url, body, headers, retries, redirect, assert_same_host, timeout, pool_timeout, release_conn, chunked, body_pos, **response_kw)
600 body=body, headers=headers,
--> 601 chunked=chunked)
602

~\AppData\Local\Continuum\anaconda3\lib\site-packages\urllib3\connectionpool.py in _make_request(self, conn, method, url, timeout, chunked, **httplib_request_kw)
345 try:
--> 346 self._validate_conn(conn)
347 except (SocketTimeout, BaseSSLError) as e:

~\AppData\Local\Continuum\anaconda3\lib\site-packages\urllib3\connectionpool.py in _validate_conn(self, conn)
849 if not getattr(conn, 'sock', None): # AppEngine might not have `.sock`
--> 850 conn.connect()
851

~\AppData\Local\Continuum\anaconda3\lib\site-packages\urllib3\connection.py in connect(self)
325 server_hostname=hostname,
--> 326 ssl_context=context)
327

~\AppData\Local\Continuum\anaconda3\lib\site-packages\urllib3\util\ssl_.py in ssl_wrap_socket(sock, keyfile, certfile, cert_reqs, ca_certs, server_hostname, ssl_version, ciphers, ssl_context, ca_cert_dir)
328 if HAS_SNI: # Platform-specific: OpenSSL with enabled SNI
--> 329 return context.wrap_socket(sock, server_hostname=server_hostname)
330

~\AppData\Local\Continuum\anaconda3\lib\site-packages\urllib3\contrib\pyopenssl.py in wrap_socket(self, sock, server_side, do_handshake_on_connect, suppress_ragged_eofs, server_hostname)
447 except OpenSSL.SSL.Error as e:
--> 448 raise ssl.SSLError('bad handshake: %r' % e)
449 break

SSLError: ("bad handshake: Error([('SSL routines', 'ssl3_get_server_certificate', 'certificate verify failed')],)",)

During handling of the above exception, another exception occurred:

MaxRetryError Traceback (most recent call last)
~\AppData\Local\Continuum\anaconda3\lib\site-packages\requests\adapters.py in send(self, request, stream, timeout, verify, cert, proxies)
439 retries=self.max_retries,
--> 440 timeout=timeout
441 )

~\AppData\Local\Continuum\anaconda3\lib\site-packages\urllib3\connectionpool.py in urlopen(self, method, url, body, headers, retries, redirect, assert_same_host, timeout, pool_timeout, release_conn, chunked, body_pos, **response_kw)
638 retries = retries.increment(method, url, error=e, _pool=self,
--> 639 _stacktrace=sys.exc_info()[2])
640 retries.sleep()

~\AppData\Local\Continuum\anaconda3\lib\site-packages\urllib3\util\retry.py in increment(self, method, url, response, error, _pool, _stacktrace)
387 if new_retry.is_exhausted():
--> 388 raise MaxRetryError(_pool, url, error or ResponseError(cause))
389

MaxRetryError: HTTPSConnectionPool(host='www.remax.com', port=443): Max retries exceeded with url: /api/listings?nwlat=33.8426971435546875&nwlong=-118.3811187744140625&selat=33.8426971435546875&selong=-118.3783721923828125&Count=100&pagenumber=1&SiteID=68000000&pageCount=10&tab=map&sh=true&forcelatlong=true&maplistings=1&maplistcards=0&sv=true&sortorder=newest&view=forsale (Caused by SSLError(SSLError("bad handshake: Error([('SSL routines', 'ssl3_get_server_certificate', 'certificate verify failed')],)",),))

During handling of the above exception, another exception occurred:

SSLError Traceback (most recent call last)
<ipython-input-22-bcfdfdfb0a4e> in <module>()
----> 1 get_info('119 S IRENA AVE B, Redondo Beach, CA 90277')

<ipython-input-21-f3c942a87400> in get_info(address)
32 }
33 # proxies = 'http': 'http://user:pass@10.10.1.10:3128/'
---> 34 req_properties = requests.get("https://www.remax.com/api/listings", params=params)
35 matching_properties_json = req_properties.json()
36 for p in matching_properties_json[0]:

~\AppData\Local\Continuum\anaconda3\lib\site-packages\requests\api.py in get(url, params, **kwargs)
70
71 kwargs.setdefault('allow_redirects', True)
---> 72 return request('get', url, params=params, **kwargs)
73
74

~\AppData\Local\Continuum\anaconda3\lib\site-packages\requests\api.py in request(method, url, **kwargs)
56 # cases, and look like a memory leak in others.
57 with sessions.Session() as session:
---> 58 return session.request(method=method, url=url, **kwargs)
59
60

~\AppData\Local\Continuum\anaconda3\lib\site-packages\requests\sessions.py in request(self, method, url, params, data, headers, cookies, files, auth, timeout, allow_redirects, proxies, hooks, stream, verify, cert, json)
506 }
507 send_kwargs.update(settings)
--> 508 resp = self.send(prep, **send_kwargs)
509
510 return resp

~\AppData\Local\Continuum\anaconda3\lib\site-packages\requests\sessions.py in send(self, request, **kwargs)
616
617 # Send the request
--> 618 r = adapter.send(request, **kwargs)
619
620 # Total elapsed time of the request (approximately)

~\AppData\Local\Continuum\anaconda3\lib\site-packages\requests\adapters.py in send(self, request, stream, timeout, verify, cert, proxies)
504 if isinstance(e.reason, _SSLError):
505 # This branch is for urllib3 v1.22 and later.
--> 506 raise SSLError(e, request=request)
507
508 raise ConnectionError(e, request=request)

SSLError: HTTPSConnectionPool(host='www.remax.com', port=443): Max retries exceeded with url: /api/listings?nwlat=33.8426971435546875&nwlong=-118.3811187744140625&selat=33.8426971435546875&selong=-118.3783721923828125&Count=100&pagenumber=1&SiteID=68000000&pageCount=10&tab=map&sh=true&forcelatlong=true&maplistings=1&maplistcards=0&sv=true&sortorder=newest&view=forsale (Caused by SSLError(SSLError("bad handshake: Error([('SSL routines', 'ssl3_get_server_certificate', 'certificate verify failed')],)",),))


Here is my code:



    import urllib
    from bs4 import BeautifulSoup
    import pandas as pd
    import geopy
    from geopy.geocoders import Nominatim
    import geolib
    from geolib import geohash
    from geopy.extra.rate_limiter import RateLimiter
    import requests

    geolocator = Nominatim(timeout=None)

    def get_dir(address):
        location = geolocator.geocode(address)
        lat = location.latitude
        lng = location.longitude
        h = geolib.geohash.encode(lat, lng, 7)
        hashes = geolib.geohash.neighbours(h)
        NW = geohash.decode(hashes.nw)
        SE = geohash.decode(hashes.ne)
        nwlat = NW.lat
        nwlon = NW.lon
        selat = SE.lat
        selon = SE.lon
        return nwlat, nwlon, selat, selon

    def get_info(address):
        try:
            nwlat, nwlon, selat, selon = get_dir(address)
            params = {
                "nwlat" : nwlat,
                "nwlong" : nwlon,
                "selat" : selat,
                "selong" : selon,
                "Count" : 100,
                "pagenumber" : 1,
                "SiteID" : "68000000",
                "pageCount" : "10",
                "tab" : "map",
                "sh" : "true",
                "forcelatlong" : "true",
                "maplistings" : "1",
                "maplistcards" : "0",
                "sv" : "true",
                "sortorder" : "newest",
                "view" : "homeestimates",
            }
            proxies = {'http': 'http://user:pass@10.10.1.10:3128/'}
            req_properties = requests.get("https://www.remax.com/api/listings", params=params, proxies=proxies, verify=False)
            matching_properties_json = req_properties.json()
            for p in matching_properties_json[0]:
                print(f"{p['Address']:<40} {p.get('BedRooms', 0)} beds | {int(p.get('BathRooms',0))} baths | {p['SqFt']} sqft")
        except (AttributeError):
            return 'NaN'

    x = get_info('693 Bluebird Canyon Drive, Laguna Beach CA, 92651')
    print(x)


I am not sure how to fix this problem, as I am new to web scraping. I tried adding a proxy to the code, but I still get the same errors as in the traceback above.



Update:



Adding

    proxies = {'http': 'http://user:pass@10.10.1.10:3128/'}
    req_properties = requests.get("https://www.remax.com/api/listings", params=params, proxies=proxies, verify=False)

yields no errors but also no output at all.
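
One thing worth checking in the update above: as far as I know, requests picks a proxy by matching the keys of the proxies dict against the scheme of the request URL, so a dict that only has an 'http' key would not be consulted for an https:// URL. The sketch below imitates that lookup with the standard library only. pick_proxy is a hypothetical helper written for illustration, not part of requests, and it ignores the per-host keys and environment variables that requests also takes into account.

```python
from urllib.parse import urlparse


def pick_proxy(url, proxies):
    """Return the proxy URL that applies to `url`, or None.

    Simplified, hypothetical imitation of scheme-based proxy selection:
    look for a key equal to the URL scheme, then fall back to 'all'.
    """
    scheme = urlparse(url).scheme
    return proxies.get(scheme, proxies.get("all"))


proxies = {"http": "http://user:pass@10.10.1.10:3128/"}

# The listings endpoint is https, so the 'http' entry is never selected.
print(pick_proxy("https://www.remax.com/api/listings", proxies))
# A plain http URL would select it.
print(pick_proxy("http://www.remax.com/api/listings", proxies))
```

If that reading is right, the proxy entry in the update had no effect on the request, and the change in behaviour came from verify=False alone.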







python web-scraping






asked Mar 25 at 16:39 by Wolfy (edited Mar 25 at 21:55 by Wolfy)












  • Are you using a proxy?

    – Nipun Wijerathne
    Mar 25 at 16:41











  • @NipunWijerathne I do not believe so, how would I know?

    – Wolfy
    Mar 25 at 16:41











  • Take a look at this answer which pertains to the error you are getting - stackoverflow.com/questions/10667960/…

    – Bert
    Mar 25 at 17:18











  • @Bert I tried following the suggestions there, but they still don't solve the problem. Thank you for the comment, though.

    – Wolfy
    Mar 25 at 17:24











  • Does the code in the original question return results for you? If so, the issue should not be proxy-related.

    – Martin Evans
    Mar 26 at 10:13

















1 Answer














There appear to be a number of issues:



  1. A proxy is not the issue: you said the code in your previous question works without one being configured.

  2. Your geohash.decode(hashes.ne) call uses ne (north-east) where the SE corner needs se (south-east).

  3. The resulting coordinates do not match any valid properties. In that case the API appears to return a different kind of response, which lacks the values you want, although it does include the price.

  4. Make sure that verify=False is passed to requests.get(). The InsecureRequestWarning this triggers can be suppressed.
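
Issue 2 can be checked against the request URL logged in the question's traceback, where nwlat and selat carry the same value. A minimal sketch using only those logged numbers; box_size is a hypothetical helper for illustration, not part of geolib or the remax API.

```python
def box_size(nwlat, nwlong, selat, selong):
    """Return (height, width) in degrees of a NW/SE bounding box."""
    height = nwlat - selat   # north edge minus south edge
    width = selong - nwlong  # east edge minus west edge
    return height, width


# Values copied from the query string in the traceback.
height, width = box_size(
    nwlat=33.8426971435546875,
    nwlong=-118.3811187744140625,
    selat=33.8426971435546875,
    selong=-118.3783721923828125,
)

print(height)     # 0.0: the "SE" corner sits on the same latitude as NW
print(width > 0)  # True: only the longitude differs
```

With hashes.ne the search box collapses to a horizontal line, which is consistent with the API finding no properties inside it. Using hashes.se gives the box a non-zero height.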


If the search square is increased slightly in size, it does return results:



    import urllib
    import urllib3
    from bs4 import BeautifulSoup
    import pandas as pd
    import geopy
    from geopy.geocoders import Nominatim
    import geolib
    from geolib import geohash
    from geopy.extra.rate_limiter import RateLimiter
    import requests


    # Disable the certificate warning
    urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)
    # Newer geopy releases may also require a user_agent argument here.
    geolocator = Nominatim(timeout=None)


    def get_dir(address):
        location = geolocator.geocode(address)
        lat = location.latitude
        lng = location.longitude
        h = geolib.geohash.encode(lat, lng, 7)
        hashes = geolib.geohash.neighbours(h)
        # Opposite corners of the search box: north-west and south-east
        NW = geohash.decode(hashes.nw)
        SE = geohash.decode(hashes.se)

        return NW, SE


    def get_info(address):
        try:
            NW, SE = get_dir(address)
            # Grow the box outwards by this many degrees on every side
            square_size = 0.001

            params = {
                "nwlat" : float(NW.lat) + square_size,
                "nwlong" : float(NW.lon) - square_size,
                "selat" : float(SE.lat) - square_size,
                "selong" : float(SE.lon) + square_size,
                "Count" : 100,
                "pagenumber" : 1,
                "SiteID" : "68000000",
                "pageCount" : "10",
                "tab" : "map",
                "sh" : "true",
                "forcelatlong" : "true",
                "maplistings" : "1",
                "maplistcards" : "0",
                "sv" : "true",
                "sortorder" : "newest",
                "view" : "homeestimates",
            }

            req_properties = requests.get("https://www.remax.com/api/listings", params=params, verify=False)
            matching_properties_json = req_properties.json()

            for p in matching_properties_json[0]:
                address = f"{p['Address']}, {p['City']}, {p['State']}, {p['Zip']}"

                try:
                    print(f" {address:<50} | {p.get('BedRooms', 0)} beds | {int(p.get('BathRooms',0))} baths | {p['SqFt']} sqft")
                except KeyError:
                    # Some responses lack SqFt but still carry the price
                    print(f"None found - {address} - ${p['PriceFormatted']}")

        except (AttributeError):
            return 'NaN'

    get_info('693 Bluebird Canyon Drive, Laguna Beach CA, 92651')


This displays:



    1566 Glenneyre Street, Laguna Beach, CA, 92651 | 0 beds | 0 baths | sqft
    1585 S Coast 4, Laguna Beach, CA, 92651 | 3 beds | 2 baths | 1448 sqft
    429 Shadow Lane, Laguna Beach, CA, 92651 | 2 beds | 2 baths | 1102 sqft
    243 Calliope Street 1, Laguna Beach, CA, 92651 | 2 beds | 2 baths | 1350 sqft
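
The first row above has an empty square-footage column, so the records evidently do not all carry the same fields. A hedged sketch of formatting that tolerates such gaps: format_listing is a hypothetical helper, and the two sample records are made up to mirror the output above (field names taken from the answer's code), not real API responses.

```python
def format_listing(p):
    """Format one listing dict, falling back when fields are missing."""
    address = ", ".join(
        str(p.get(key, "?")) for key in ("Address", "City", "State", "Zip")
    )
    beds = p.get("BedRooms") or 0
    baths = int(p.get("BathRooms") or 0)
    sqft = p.get("SqFt") or "n/a"
    return f"{address:<50} | {beds} beds | {baths} baths | {sqft} sqft"


# Made-up records shaped like the output above.
samples = [
    {"Address": "429 Shadow Lane", "City": "Laguna Beach", "State": "CA",
     "Zip": "92651", "BedRooms": 2, "BathRooms": 2.0, "SqFt": "1102"},
    {"Address": "1566 Glenneyre Street", "City": "Laguna Beach", "State": "CA",
     "Zip": "92651", "SqFt": ""},
]

for record in samples:
    print(format_listing(record))
```

Using .get() with a fallback for every field keeps one incomplete record from aborting the loop, at the cost of hiding which fields were actually absent.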





  • I get this: None found - 1566 Glenneyre Street, Laguna Beach, CA, 92651 - $2,595,000

    – Wolfy
    Mar 26 at 15:58











  • Set square_size = 0.001 and try again.

    – Martin Evans
    Mar 26 at 15:58











  • Perfect. Any idea how to get the lot size? I noticed that the JSON doesn't include a field named lotsize, so it's difficult to retrieve.

    – Wolfy
    Mar 26 at 15:59






  • I suggest you add print(p). You can then see all of the available data for each property. I could only see SqFt.

    – Martin Evans
    Mar 26 at 16:02











  • Thanks for your help, I really appreciate it.

    – Wolfy
    Mar 26 at 16:03
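
Following the print(p) suggestion in the comments, one way to hunt for a lot-size field is to list every key that appears across the returned records instead of reading each dict by eye. A small sketch: field_names is a hypothetical helper, and the sample records (including the LotSize key) are invented for illustration, because the real response is not shown in this thread.

```python
def field_names(records):
    """Return the sorted union of keys found in a list of dicts."""
    names = set()
    for record in records:
        names.update(record.keys())
    return sorted(names)


# Invented sample data; the real API fields may differ.
records = [
    {"Address": "429 Shadow Lane", "SqFt": "1102"},
    {"Address": "1585 S Coast 4", "SqFt": "1448", "LotSize": "0.1"},
]

all_names = field_names(records)
print(all_names)

# Keys that look size-related, matched case-insensitively.
print([n for n in all_names if "lot" in n.lower() or "size" in n.lower()])
```

In the answer's code the equivalent call would be field_names(matching_properties_json[0]), assuming that element is the list of listing dicts the loop iterates over.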










answered Mar 26 at 15:48 by Martin Evans (edited Mar 26 at 15:58)











