OCR With Tesseract

Amuda Adelou shows how to use Tesseract’s Java API to perform character recognition in images:

Extracting text from an image here means processing a flowchart image to extract its text components and then its geometric shape components. The text components are extracted together with the geometric components. The relationships between components are established by tracing the flow lines that connect them. The extracted components are written out as machine-readable metadata in XML format, which can be archived, stored in a knowledge base, or shared with others.
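
To make the last step concrete, here is a minimal sketch of how extracted components and their flow-line connections might be written out as XML using only the Java standard library. This is not the code from the linked article: the class name `FlowchartMetadata`, the `Component` and `Connection` types, and the element and attribute names (`flowchart`, `component`, `connection`, `id`, `shape`, `from`, `to`) are all hypothetical, and the OCR and shape-detection steps are assumed to have already produced the values.

```java
import java.io.StringWriter;
import java.util.Arrays;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class FlowchartMetadata {

    /** Hypothetical type: one recognized shape plus the text found inside it. */
    static final class Component {
        final String id;
        final String shape;
        final String text;

        Component(String id, String shape, String text) {
            this.id = id;
            this.shape = shape;
            this.text = text;
        }
    }

    /** Hypothetical type: one traced flow line between two components. */
    static final class Connection {
        final String from;
        final String to;

        Connection(String from, String to) {
            this.from = from;
            this.to = to;
        }
    }

    /** Serializes components and connections to an XML string (assumed schema). */
    static String toXml(List<Component> components, List<Connection> connections) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element root = doc.createElement("flowchart");
            doc.appendChild(root);

            // One <component> element per recognized shape; the OCR text is its content.
            for (Component c : components) {
                Element e = doc.createElement("component");
                e.setAttribute("id", c.id);
                e.setAttribute("shape", c.shape);
                e.setTextContent(c.text);
                root.appendChild(e);
            }

            // One <connection> element per flow line, referencing component ids.
            for (Connection k : connections) {
                Element e = doc.createElement("connection");
                e.setAttribute("from", k.from);
                e.setAttribute("to", k.to);
                root.appendChild(e);
            }

            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception ex) {
            throw new IllegalStateException("Could not serialize flowchart metadata", ex);
        }
    }

    public static void main(String[] args) {
        // Sample values standing in for real OCR and shape-detection results.
        List<Component> components = Arrays.asList(
                new Component("c1", "rectangle", "Read input"),
                new Component("c2", "diamond", "Valid?"));
        List<Connection> connections = Arrays.asList(new Connection("c1", "c2"));
        System.out.println(toXml(components, connections));
    }
}
```

Keeping connections as separate elements that refer to component ids, rather than nesting components inside each other, is one way to represent a flowchart whose lines can branch and loop.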

Click through for a demo app and code.

