Use JavaScript to crawl a website -> Possible and which IP is shown on the crawled site
Is it possible to crawl a website from within an Angular app? I mean fetching another website from Angular, not crawling an Angular app. If so, which IP address will the crawled website see? Since JavaScript runs on the client, I would guess it is the client's IP, not the server's (as it probably would be with Node.js). As far as I know, most of what we can use in JavaScript is implemented by the browser, so is it even possible to crawl websites with methods from JavaScript (or Angular)?
Best Regards
Buzz
javascript

asked Mar 22 at 14:00
BuzzBenny
You probably can't, because of CORS.
– Chris Li
Mar 22 at 14:01
1 Answer
In theory, you can send an AJAX request for the page and read the response as text (the server will normally send it as `text/html`). That gives you the remote document as a string. The browser won't load or run the JavaScript and CSS referenced in that document, though. That might not be a problem, but CORS is. For security reasons, browsers apply the same-origin policy: a script cannot read a response from another origin unless that server explicitly allows it (otherwise, a malicious page could read data from other sites on the user's behalf). When the browser is allowed to make the request directly, the remote site sees the client's IP. See here for details: https://developer.mozilla.org/en-US/docs/Web/HTTP/CORS
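
As a rough illustration of the request described above, here is a minimal sketch using the standard `fetch` API. The function name `fetchPageAsText` and the injectable `fetchImpl` parameter are assumptions made for this example (the parameter only exists so the function can be tried without a network); nothing here is specific to Angular.

```javascript
// Fetch a page and return its HTML source as a string.
// `fetchImpl` defaults to the global fetch; it is injectable for testing only.
async function fetchPageAsText(url, fetchImpl = fetch) {
  let response;
  try {
    response = await fetchImpl(url, { mode: "cors" });
  } catch (err) {
    // In a browser, a request blocked by CORS rejects with a generic
    // TypeError, so it cannot be told apart from a network failure here.
    throw new Error(
      `Request to ${url} failed (network error or blocked by CORS): ${err.message}`
    );
  }
  if (!response.ok) {
    throw new Error(`Request to ${url} returned HTTP ${response.status}`);
  }
  // The body is plain text; nothing in it is loaded or executed.
  return response.text();
}
```

In a browser, the returned string could then be parsed with `new DOMParser().parseFromString(html, "text/html")` to look for links without running the page's scripts. For a cross-origin URL that does not send CORS headers, the call above is expected to reject.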
If you have control over the second domain, you can configure the server there to send `Access-Control-Allow-Origin` headers to the browser to allow access from the Angular app.
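
A minimal sketch of what sending that header could look like, assuming the second domain runs a Node.js server. The origin `https://my-angular-app.example` and the names `corsHeadersFor` and `handleRequest` are hypothetical and chosen only for this example.

```javascript
// Hypothetical origin of the Angular app; replace with the real one.
const ALLOWED_ORIGINS = new Set(["https://my-angular-app.example"]);

// Returns the CORS headers to attach for a given request Origin header.
function corsHeadersFor(origin, allowed = ALLOWED_ORIGINS) {
  if (!origin || !allowed.has(origin)) {
    // No header: the browser will not expose the response to the script.
    return {};
  }
  return {
    "Access-Control-Allow-Origin": origin,
    // The response differs per origin, so caches must key on it.
    "Vary": "Origin",
  };
}

// Request handler in the shape Node's http.createServer expects.
function handleRequest(req, res) {
  const headers = corsHeadersFor(req.headers.origin);
  res.writeHead(200, { "Content-Type": "text/html; charset=utf-8", ...headers });
  res.end("<!doctype html><title>demo</title><p>Hello</p>");
}
```

The handler could be passed to `require("node:http").createServer(handleRequest)`. Echoing back only allow-listed origins is a design choice of this sketch; a server may also send `*` if the content is public.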
Note: You could use an `iframe` to load the other website, but when the domains of the current document and the one in the `iframe` don't match, you can't access the contents of the `iframe` from JavaScript.
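
The `iframe` restriction can be observed with a small helper. This is only a sketch; the function name `readFrameHtml` is made up for this example, and it relies on the `contentDocument` property being `null` for a cross-origin frame.

```javascript
// Returns the HTML of the document inside an iframe element, or null when
// the frame's document is not accessible (for example, because it is
// cross-origin).
function readFrameHtml(frame) {
  try {
    const doc = frame.contentDocument; // null for a cross-origin frame
    return doc ? doc.documentElement.outerHTML : null;
  } catch (err) {
    // Some access paths throw a SecurityError instead of returning null.
    return null;
  }
}
```

In a browser this would be called with an element such as `document.querySelector("iframe")` after the frame has loaded.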
One way to work around this is to install a proxy on your server. The browser can then ask your server for the pages in question. In this case, the remote website will see the IP of your server, not the client's.
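
A proxy of this kind could be sketched as follows, assuming a Node.js server with a global `fetch` (Node 18 or later). The `url` query parameter, the host allow-list, and the name `createProxyHandler` are assumptions for illustration, not part of the answer.

```javascript
// Hosts the proxy is willing to fetch from (hypothetical value).
// Without a restriction like this, anyone could use the server to reach arbitrary URLs.
const ALLOWED_HOSTS = new Set(["example.com"]);

// Builds a request handler in the shape Node's http.createServer expects.
// `fetchImpl` is injectable so the handler can be tried without a network.
function createProxyHandler(fetchImpl = fetch, allowedHosts = ALLOWED_HOSTS) {
  return async function handle(req, res) {
    // req.url is a path such as "/fetch?url=https%3A%2F%2Fexample.com%2F".
    const target = new URL(req.url, "http://localhost").searchParams.get("url");
    let targetUrl;
    try {
      targetUrl = new URL(target);
    } catch {
      res.writeHead(400, { "Content-Type": "text/plain" });
      res.end("Missing or invalid url parameter");
      return;
    }
    if (!allowedHosts.has(targetUrl.hostname)) {
      res.writeHead(403, { "Content-Type": "text/plain" });
      res.end("Host not allowed");
      return;
    }
    try {
      // This request leaves from the server, so the remote site sees the
      // server's IP address.
      const upstream = await fetchImpl(targetUrl.href);
      const body = await upstream.text();
      res.writeHead(upstream.status, { "Content-Type": "text/html; charset=utf-8" });
      res.end(body);
    } catch (err) {
      res.writeHead(502, { "Content-Type": "text/plain" });
      res.end("Upstream request failed");
    }
  };
}
```

Because the Angular app would call its own server, the browser request is same-origin (or covered by that server's own CORS configuration), and the cross-origin fetch happens server-side where the same-origin policy does not apply.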
answered Mar 22 at 14:48
Aaron Digulla
Thank you for the detailed response. So I will let a Java backend do the crawling and use a proxy to hide the server IP. Then I pass the data to my Angular app as JSON.
– BuzzBenny
Mar 22 at 15:47