Why does Tesseract OCR get affected by a top row of two pixels, and how can I get consistent results?
I am trying to read a value from a popup dialog.
I use OpenCV to locate the info icon and the Close button, then crop the area in between.
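The crop step can be sketched roughly as follows, assuming OpenCV has already produced a bounding box for each control in `(x, y, width, height)` form. The function name `region_between`, the box layout, and the sample coordinates are hypothetical and chosen for illustration; the post does not show its own cropping code.

```python
def region_between(icon_box, close_box):
    """Return the (left, top, right, bottom) rectangle between two boxes.

    Each box is (x, y, width, height), the layout used by cv2.boundingRect.
    Assumes the info icon sits to the left of the Close button on the same row.
    """
    ix, iy, iw, ih = icon_box
    cx, cy, cw, ch = close_box
    left = ix + iw                      # right edge of the info icon
    right = cx                          # left edge of the Close button
    top = min(iy, cy)                   # start at the higher of the two controls
    bottom = max(iy + ih, cy + ch)      # end at the lower of the two controls
    if right <= left:
        raise ValueError("Close button is not to the right of the info icon")
    return left, top, right, bottom


# Hypothetical coordinates, for illustration only.
box = region_between((10, 5, 16, 16), (120, 4, 18, 18))
print(box)
```

With a NumPy image as returned by OpenCV, the crop itself would then be `image[top:bottom, left:right]`.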
These are my popup dialogs:
Popup90409.png and Popup90411.png
I cropped the images below from the popup dialogs:
Following another SO question, I use ImageMagick to enhance the images for better results (Is there any way to improve tesseract OCR with small fonts?).
I then run OCR on the enhanced images:
Tesseract reads 90409 correctly but doesn't read 90411 (from respective images).
Looking at the images closely, there is a 2-pixel line along the top that extends to the right.
To check whether this line is causing the problem, I cropped the images again, leaving out the top 2 pixels.
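Dropping the top rows could look something like the sketch below. It assumes the crop is held as a Pillow image (the post opens files with `Image.open`); `trim_top_box` and `trim_top` are hypothetical helper names, not part of Pillow.

```python
def trim_top_box(width, height, rows=2):
    """Crop box (left, upper, right, lower) that drops the top `rows` pixel rows."""
    if not 0 <= rows < height:
        raise ValueError("rows must be smaller than the image height")
    return (0, rows, width, height)


def trim_top(image, rows=2):
    """Return a copy of a Pillow image without its top `rows` pixel rows."""
    width, height = image.size          # Pillow reports size as (width, height)
    return image.crop(trim_top_box(width, height, rows))


# Box for a hypothetical 200x40 crop with the top 2 rows removed.
print(trim_top_box(200, 40))
```

Usage would be along the lines of `trim_top(Image.open(filename))`, with the result passed on to the enhancement step.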
These are the new images cropped from the popup dialogs:
Again, I enhance them for better OCR results with the ImageMagick method above:
Now Tesseract OCR reads 90411 correctly.
However, now it can't read 90409.
How should I process the images to read both 90409 and 90411?
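One possible workaround, rather than a fix for the underlying behaviour, is to OCR both crops and keep whichever result looks like the expected value. The sketch below assumes the value is always a run of digits, which is only a guess based on the two examples 90409 and 90411; `pick_reading` is a hypothetical helper, and the strings stand in for real OCR output.

```python
import re


def pick_reading(candidates, pattern=r"\d+"):
    """Return the first OCR result that matches `pattern` in full, else None."""
    for text in candidates:
        cleaned = text.strip()          # OCR output often ends with a newline
        if re.fullmatch(pattern, cleaned):
            return cleaned
    return None


# Simulated OCR outputs: one crop fails, the other succeeds.
print(pick_reading(["", "90411\n"]))
print(pick_reading(["9O4O9", "90409"]))
```

This does not explain why the 2-pixel line changes the result; it only makes the pipeline tolerant of one crop failing.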
P.S.
Tesseract version:
tesseract 4.0.0-beta.3
leptonica-1.77.0 (Sep 10 2018, 11:35:46) [MSC v.1915 LIB Release x64]
libgif 5.1.4 : libjpeg 9b : libpng 1.6.35 : libtiff 4.0.9 : zlib 1.2.11 : libwebp 1.0.0 : libopenjp2 2.3.0
Found AVX
Found SSE
I am using Tesseract from Python on Windows as follows:
import pytesseract
from PIL import Image
config = '--tessdata-dir "tessdata" -l eng --oem 1 --psm 3'
text = pytesseract.image_to_string(Image.open(filename), config=config)
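The config above uses `--psm 3`, Tesseract's fully automatic page segmentation. Since each crop holds one short line, it may be worth comparing against `--psm 7` (treat the image as a single text line); whether that helps with these particular images is an assumption, not something confirmed here. `build_config` is a hypothetical helper that only varies the segmentation mode.

```python
def build_config(psm, oem=1, lang="eng", tessdata_dir="tessdata"):
    """Build a pytesseract config string in the same form as the one above."""
    return f'--tessdata-dir "{tessdata_dir}" -l {lang} --oem {oem} --psm {psm}'


# 3 = fully automatic page segmentation, 7 = treat the image as a single text line
for psm in (3, 7):
    print(build_config(psm))
```

Each string can be passed as the `config` argument of `pytesseract.image_to_string` to compare results on the same image.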
tesseract python-tesseract
asked Mar 25 at 9:38
Deepak Garud