How to extract underlined text from a PDF
I tried pdfminer, pdfquery, and other libraries. I can get bold, italic, fonts, etc., but I cannot get underlined text. For example, when converting to HTML with pdfminer, it creates DIVs with borders, but they are not associated with the underlined word.
Any idea how I can identify underlined text in a PDF, if possible using Python?
Thank you!
python parsing pdfminer
Just asking, why the -1? What other information can I add, or what is required? Thank you!
– Alejandro
Mar 28 at 12:51
asked Mar 22 at 14:57
Alejandro
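The bordered DIVs mentioned in the question suggest that the underline is drawn as a separate thin rectangle or line rather than stored as a text attribute. If so, one possible approach is to match those shapes to words by position. The following is only a minimal sketch under that assumption: it works on plain `(x0, y0, x1, y1)` bounding boxes (which, as far as I know, pdfminer's layout objects such as `LTRect`, `LTLine`, and `LTTextLine` expose as `.bbox`), and the function names and all thresholds are hypothetical values that would need tuning per document.

```python
from typing import List, Tuple

# (x0, y0, x1, y1) in PDF coordinates, where y grows upward.
BBox = Tuple[float, float, float, float]


def is_thin_horizontal(shape: BBox, max_thickness: float = 2.0) -> bool:
    """Return True if the shape looks like a horizontal rule (assumed threshold)."""
    x0, y0, x1, y1 = shape
    thickness = y1 - y0
    return thickness <= max_thickness and (x1 - x0) > thickness


def find_underlined(
    words: List[Tuple[str, BBox]],
    shapes: List[BBox],
    max_gap: float = 3.0,       # how far below the word box a rule may sit (assumed)
    min_overlap: float = 0.8,   # fraction of the word width the rule must cover (assumed)
) -> List[str]:
    """Return the words that have a thin horizontal shape near their bottom edge."""
    rules = [s for s in shapes if is_thin_horizontal(s)]
    underlined = []
    for text, (wx0, wy0, wx1, wy1) in words:
        width = wx1 - wx0
        height = wy1 - wy0
        if width <= 0:
            continue
        for sx0, sy0, sx1, sy1 in rules:
            # Horizontal extent shared by the word and the rule.
            overlap = min(wx1, sx1) - max(wx0, sx0)
            # Vertical centre of the rule; it must lie near the bottom of the
            # word box so that strike-through lines are not counted.
            rule_mid = (sy0 + sy1) / 2
            near_bottom = wy0 - max_gap <= rule_mid <= wy0 + 0.35 * height
            if overlap >= min_overlap * width and near_bottom:
                underlined.append(text)
                break
    return underlined


# Made-up boxes for illustration only, not taken from a real PDF.
words = [("plain", (10, 100, 40, 112)), ("marked", (50, 100, 90, 112))]
shapes = [
    (50, 99, 90, 99.5),      # thin rule just under "marked"
    (0, 0, 200, 300),        # page-sized box, ignored
    (10, 105.5, 40, 106),    # strike-through across "plain", ignored
]
result = find_underlined(words, shapes)
```

With real files the word and shape boxes would have to be collected from the parsed page first, and the heuristic will miss underlines that are drawn in other ways (for example as part of an image).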