Haskell scalpel scraping from web link
I got pretty far with a web scraper that I tested by pasting page source into a function as a giant string.
I am now trying to make it actually read the source from the web link itself, but I am stuck on an error that I don't understand.
    targetUrl :: IO (Maybe [String])
    targetUrl = scrapeURL "https://lehd.ces.census.gov/data/lodes/LODES7/" linkToData

    linkToData :: Scraper String [String]
    linkToData =
      chroots "a" $ do
        dataURL <- attr "href" $ anySelector
        pure dataURL

    imgWithAltDir :: Selector
    imgWithAltDir = (TagString "img") @: [(AttributeString "alt") @= "[DIR]"]

    allLinks :: Scraper String [[String]]
    allLinks = chroots "tr" $ do
      attributes <- html imgWithAltDir
      guard (not $ null attributes)
      linkToData

    runWeb = scrapeStringLike targetUrl
This is the error I'm getting (in reference to the last line above):
    • No instance for (Ord (IO (Maybe [String])))
        arising from a use of ‘scrapeStringLike’
    • In the expression: scrapeStringLike targetUrl
      In an equation for ‘runWeb’: runWeb = scrapeStringLike targetUrl
haskell
You're using scrapeStringLike in the wrong way. If you look at the type signature, scrapeStringLike :: StringLike str => str -> Scraper str a -> Maybe a, you need to give it a String and a scraper, and it'll return a Maybe a. You're only giving it an IO (Maybe [String]). Unfortunately, I have no idea how to use it either, so this is all the help I can give you.
– Lorenzo
Mar 21 at 23:49
Are you sure runWeb = targetUrl is not enough?
– Lorenzo
Mar 21 at 23:50
So that does produce a list that includes what I wanted, but it also seems to skip the filtering I did in imgWithAltDir and allLinks, which removed some elements I did not want. And when I change the targetUrl function to scrapeURL "www.urlhere.com" allLinks, I get an error.
– reallymemorable
Mar 21 at 23:57
What error exactly?
– Lorenzo
Mar 22 at 0:48
The IO Monad for People who Simply Don't Care
– Daniel Wagner
Mar 22 at 1:15
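To make the comments above concrete, here is a minimal sketch, not a confirmed fix for the asker's exact setup. It assumes the scalpel package is installed and the OverloadedStrings extension is enabled. The names sampleHtml, scrapedFromString and scrapedFromWeb, and the HTML fragment itself, are hypothetical and only there for illustration. The idea: scrapeStringLike takes the page source plus a scraper and returns a plain Maybe, while scrapeURL takes a URL plus a scraper and returns its result inside IO. Changing the result type to IO (Maybe [[String]]) is what allows allLinks to be passed to scrapeURL. In some scalpel versions scrapeStringLike also carries an Ord constraint on the string type, which would be consistent with the "No instance for (Ord (IO (Maybe [String])))" message when an IO action is passed where the page source is expected.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main where

import Control.Monad (guard)
import Text.HTML.Scalpel

-- Selector for <img alt="[DIR]">, written with OverloadedStrings.
imgWithAltDir :: Selector
imgWithAltDir = "img" @: ["alt" @= "[DIR]"]

-- Collect the href of every <a> below the current node.
linkToData :: Scraper String [String]
linkToData = chroots "a" (attr "href" anySelector)

-- For every <tr> that contains a [DIR] icon, collect its links.
allLinks :: Scraper String [[String]]
allLinks = chroots "tr" $ do
  icon <- html imgWithAltDir
  guard (not (null icon))
  linkToData

-- Hypothetical page fragment, standing in for the pasted page source.
sampleHtml :: String
sampleHtml =
  "<table>\
  \<tr><td><img alt=\"[DIR]\"></td><td><a href=\"ak/\">ak/</a></td></tr>\
  \<tr><td><img alt=\"[TXT]\"></td><td><a href=\"readme.txt\">readme</a></td></tr>\
  \</table>"

-- Pure: page source first, scraper second, result is a plain Maybe.
scrapedFromString :: Maybe [[String]]
scrapedFromString = scrapeStringLike sampleHtml allLinks

-- Effectful: scrapeURL downloads the page, so the result lives in IO.
-- The element type is [[String]] because the scraper is allLinks.
scrapedFromWeb :: IO (Maybe [[String]])
scrapedFromWeb =
  scrapeURL "https://lehd.ces.census.gov/data/lodes/LODES7/" allLinks

main :: IO ()
main = do
  print scrapedFromString
  -- Bind the IO action to get at the Maybe it produces.
  webResult <- scrapedFromWeb
  case webResult of
    Nothing   -> putStrLn "Scrape failed (no match or download problem)."
    Just rows -> mapM_ print rows
```

The filtering is kept because allLinks, not linkToData, is the scraper handed to both functions; only the way the page source is obtained differs.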
asked Mar 21 at 23:40 by reallymemorable