Scrapy 1.5.1 CrawlSpider: handle 302 redirect

I am trying to handle the HTTP 302 status code in a Scrapy CrawlSpider. I searched Google, this site, and the Scrapy documentation (https://docs.scrapy.org/en/latest/topics/downloader-middleware.html?highlight=302), and tried the following:



    handle_httpstatus_list = [302]
    meta = {'dont_redirect': True, "handle_httpstatus_list": [302]}
    # and
    custom_settings = {'REDIRECT_ENABLED': False}


None of these worked for me.
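For reference, here is a simplified model of how these switches interact, as far as I understand the behaviour of the redirect middleware in Scrapy 1.5: a 3xx response is left unfollowed if any one of them applies. The function name `skips_redirect` and its parameters are made up for illustration; this is a sketch, not Scrapy's own code.

```python
def skips_redirect(status, meta=None, spider_statuses=(), redirect_enabled=True):
    """Return True if the redirect step would leave a response with this
    status alone instead of following it (simplified model, not Scrapy code)."""
    meta = meta or {}
    if not redirect_enabled:
        # Models REDIRECT_ENABLED = False: the redirect middleware is off.
        return True
    return bool(
        meta.get('dont_redirect', False)
        or status in spider_statuses  # models spider.handle_httpstatus_list
        or status in meta.get('handle_httpstatus_list', [])
        or meta.get('handle_httpstatus_all', False)
    )


if __name__ == "__main__":
    print(skips_redirect(302))                                # False
    print(skips_redirect(302, meta={'dont_redirect': True}))  # True
    print(skips_redirect(302, spider_statuses=[302]))         # True
    print(skips_redirect(302, redirect_enabled=False))        # True
```

Not being redirected is only half of it. The response still has to get past HttpErrorMiddleware, which drops non-2xx responses unless the status is listed in `handle_httpstatus_list` (or `HTTPERROR_ALLOWED_CODES`), so `dont_redirect` or `REDIRECT_ENABLED = False` on their own would not be expected to deliver a 302 to a callback.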



Here is my code:

    import time

    import scrapy
    from scrapy.linkextractors import LinkExtractor
    from scrapy.spiders import CrawlSpider, Rule
    from selenium import webdriver


    class LagouSpider(CrawlSpider):
        name = 'lagou'
        allowed_domains = ['www.lagou.com']
        start_urls = ['https://www.lagou.com']
        login_url = "https://passport.lagou.com/login/login.html"
        # Three attempts at letting 302 responses through unredirected.
        handle_httpstatus_list = [302]
        meta = {'dont_redirect': True, "handle_httpstatus_list": [302]}
        custom_settings = {'REDIRECT_ENABLED': False}
        rules = (
            Rule(LinkExtractor(allow=(r"zhaopin/.*",)), follow=True),
            Rule(LinkExtractor(allow=(r"gongsi/j\d+.html",)), follow=True),
            # Only responses matched by this rule are passed to parse_job.
            Rule(LinkExtractor(allow=r'jobs/\d+.html'), callback='parse_job', follow=True),
        )
        headers = {
            'Accept': 'application/json, text/javascript, */*; q=0.01',
            'Accept-Language': 'zh-CN,zh;q=0.9',
            'Connection': 'keep-alive',
            'Host': 'www.lagou.com',
            'Referer': 'https://www.lagou.com/',
            'X-Anit-Forge-Code': '0',
            'X-Anit-Forge-Token': 'None',
            'Accept-Encoding': 'gzip, deflate, br',
            'X-Requested-With': 'XMLHttpRequest',
        }

        def start_requests(self):
            global rc, im
            # Log in through a real browser to obtain the session cookies.
            browser = webdriver.Chrome(executable_path="/home/wqh/下载/chromedriver")
            browser.get(self.login_url)
            # ··········(some code)
            return [scrapy.Request(self.start_urls[0], cookies=cookie_dict,
                                   dont_filter=True)]
            # I have tried to use meta in scrapy.Request and it failed.
            # return [scrapy.Request(self.start_urls[0], cookies=cookie_dict,
            #                        meta=self.meta)]

        def parse_job(self, response):
            if response.status == 302:
                print("302")
                time.sleep(100)


It never prints 302, even when a page responds with a 302 status.
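One thing worth ruling out is whether the pages in question are ones the third rule sends to `parse_job` at all, since only those responses reach that callback. The `allow` patterns are regular expressions searched against each extracted URL, so they can be checked with plain `re`. The sketch below assumes the job pattern is meant to contain `\d+`, and the sample URLs are made up for illustration.

```python
import re

# Pattern of the only rule with callback='parse_job'
# (assumption: \d+ is intended for the numeric job id).
JOB_PATTERN = re.compile(r'jobs/\d+.html')


def goes_to_parse_job(url):
    """True if the URL matches the rule whose callback is parse_job."""
    return JOB_PATTERN.search(url) is not None


if __name__ == "__main__":
    # Hypothetical URLs, for illustration only.
    samples = [
        'https://www.lagou.com/jobs/1234567.html',
        'https://www.lagou.com/zhaopin/Python/',
        'https://www.lagou.com/',
    ]
    for url in samples:
        print(url, goes_to_parse_job(url))
```

Under these assumptions only the first URL would be handled by `parse_job`. The response for the start URL itself goes through the CrawlSpider's default `parse` and `parse_start_url`, so a 302 on that request would not show up in `parse_job` either.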










  • It works. Thank you! I had written the wrong code. My apologies.

    – qihuan wu
    Mar 23 at 3:16

selenium redirect scrapy http-status-code-302






edited Mar 23 at 3:11 by qihuan wu
asked Mar 22 at 3:32 by qihuan wu