Use JavaScript to crawl a website -> Possible and which IP is shown on the crawled site

Is it possible to crawl a website from within an Angular app? I mean calling a website from Angular, not crawling an Angular app. If so, which IP will the crawled website see? Since JavaScript runs client-side, I would guess it's the client's IP, not the server's (as it probably would be with Node.js). But as far as I know, what we can use in JS is mostly browser-implemented functionality, so is it even possible to crawl websites with JavaScript (or Angular) methods?



Best Regards
Buzz










  • You probably can't, because of CORS.

    – Chris Li
    Mar 22 at 14:01

















javascript angular web-crawler






asked Mar 22 at 14:00









BuzzBenny

1 Answer
In theory, you can create an AJAX request that fetches the page as text. That gives you the remote document as a string. The browser won't load or run the JavaScript and CSS referenced in that document, though. That might not be a problem, but CORS is. For security reasons, browsers enforce the same-origin policy: a script can't read a response from another origin unless that server explicitly allows it (otherwise, a malicious page could read data from any site the user's browser can reach). See here for details: https://developer.mozilla.org/en-US/docs/Web/HTTP/CORS
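To make the first step concrete, here is a minimal sketch of fetching a page as a string. It is not tied to Angular: `fetchPageAsText` and the `FetchLike` type are hypothetical names, and the fetch function is passed in as a parameter (in a browser you could pass the global `fetch`). When the target site doesn't allow your origin, the request is rejected and the script never sees the response.

```typescript
// Minimal shape of the response this sketch needs; the browser's Response fits it.
interface PageResponse {
  ok: boolean;
  status: number;
  text(): Promise<string>;
}

// Hypothetical type for an injectable fetch function.
type FetchLike = (url: string) => Promise<PageResponse>;

// Hypothetical helper: download a page and return its HTML as a string.
async function fetchPageAsText(url: string, fetchImpl: FetchLike): Promise<string> {
  let response: PageResponse;
  try {
    response = await fetchImpl(url);
  } catch {
    // In a browser, a cross-origin request blocked by CORS rejects here,
    // and the script gets no access to the response body.
    throw new Error(`Request to ${url} failed (network error or blocked by CORS)`);
  }
  if (!response.ok) {
    throw new Error(`Request to ${url} returned HTTP ${response.status}`);
  }
  // The document arrives as plain text; nothing in it is loaded or executed.
  return response.text();
}
```

In a browser, `fetchPageAsText("https://example.com/", fetch)` would only resolve if `example.com` sends CORS headers that permit your origin.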



If you have control over the second domain, you can configure the server there to send Access-Control-Allow-Origin headers to the browser to allow access from the Angular App.
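As a rough illustration of what the second server would have to do, the sketch below computes the CORS response headers for an incoming request. The function name `corsHeadersFor` and the origin `https://app.example.com` are assumptions made up for this example; how you attach the headers depends on the server software you use.

```typescript
// Hypothetical allowlist: the origin(s) the Angular app is served from.
const ALLOWED_ORIGINS = ["https://app.example.com"];

// Build the CORS response headers for the value of a request's Origin header.
function corsHeadersFor(
  requestOrigin: string | undefined,
  allowed: string[] = ALLOWED_ORIGINS
): Record<string, string> {
  if (requestOrigin === undefined || allowed.indexOf(requestOrigin) === -1) {
    // No CORS headers: the browser keeps blocking cross-origin reads.
    return {};
  }
  return {
    // Tell the browser that scripts from this origin may read the response.
    "Access-Control-Allow-Origin": requestOrigin,
    // The response differs per Origin, so caches must key on that header.
    "Vary": "Origin",
  };
}
```

Echoing back only allowlisted origins keeps the rule narrow; sending `*` instead would let any page read the responses.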



Note: You could use an iframe to load the other website but when the domains of the current document and the one in the iframe don't match, then you can't access the contents of the iframe from JavaScript.



One way to work around this is to install a proxy on your server. The browser can then ask your server for the pages in question. In this case, the remote web site will see the IP of your server, whereas a request sent directly from the browser would show the client's IP.
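The core of such a proxy can be sketched as a single function that your server calls for each request from the browser. Everything named here (`proxyPage`, `ALLOWED_HOSTS`, the `example.com` hosts) is hypothetical, and the fetch function is injected so the sketch doesn't depend on a particular server framework.

```typescript
// Minimal shape of the upstream response this sketch needs.
interface UpstreamResponse {
  status: number;
  text(): Promise<string>;
}

// Hypothetical type for an injectable fetch function.
type FetchLike = (url: string) => Promise<UpstreamResponse>;

interface ProxyResult {
  status: number;
  body: string;
}

// Hypothetical allowlist of hosts the proxy is willing to fetch.
const ALLOWED_HOSTS = ["example.com", "www.example.com"];

// Fetch a page on behalf of the browser and hand back status and body.
async function proxyPage(
  targetUrl: string,
  fetchImpl: FetchLike,
  allowedHosts: string[] = ALLOWED_HOSTS
): Promise<ProxyResult> {
  let target: URL;
  try {
    target = new URL(targetUrl);
  } catch {
    return { status: 400, body: "Invalid URL" };
  }
  const isHttp = target.protocol === "http:" || target.protocol === "https:";
  if (!isHttp || allowedHosts.indexOf(target.hostname) === -1) {
    // Refuse anything off the allowlist so the proxy can't be abused
    // to reach arbitrary hosts.
    return { status: 403, body: "Target not allowed" };
  }
  // This request leaves from the server, so the remote site sees the
  // server's IP, not the IP of the browser that asked for the page.
  const upstream = await fetchImpl(target.toString());
  return { status: upstream.status, body: await upstream.text() };
}
```

Since the browser talks only to your own server, CORS is no longer in the way, as long as the proxy endpoint is served from the same origin as the Angular app or sends its own CORS headers.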






  • Thank you for the verbose response. So I'll let a Java backend do the crawling and use a proxy to hide the server IP. Then I'll pass the data as JSON to my Angular app.

    – BuzzBenny
    Mar 22 at 15:47











answered Mar 22 at 14:48









Aaron Digulla
