OCR With Tesseract

Amuda Adelou shows how to use Tesseract’s Java API to perform character recognition in images:

Extracting text from an image here means processing flowchart imagery to extract the text components and then the geometrical shape components. The text components are extracted along with the geometrical components. The internal relationship between the components is set up by tracing the flow lines that connect them. The extracted components are output as machine-readable metadata in XML format. This metadata can be archived, stored in a knowledge base, or shared with others.
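
To make the last step concrete, here is a minimal sketch of how extracted components and their flow-line connections might be written out as XML. The excerpt does not show the actual schema, so the class names (`FlowchartMetadata`, `Component`, `Connection`) and the element and attribute names (`flowchart`, `component`, `connection`, `shape`, `from`, `to`) are all assumptions made for illustration. The OCR step itself is left out; the `text` values stand in for whatever the recognizer returned.

```java
import java.io.StringWriter;
import java.util.Arrays;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class FlowchartMetadata {

    // Hypothetical holder for one extracted component:
    // a geometrical shape plus the text recognized with it.
    static final class Component {
        final String id, shape, text;
        Component(String id, String shape, String text) {
            this.id = id; this.shape = shape; this.text = text;
        }
    }

    // Hypothetical holder for one traced flow line between two components.
    static final class Connection {
        final String fromId, toId;
        Connection(String fromId, String toId) {
            this.fromId = fromId; this.toId = toId;
        }
    }

    // Builds a DOM tree from the extracted data and serializes it to a string.
    // The element and attribute names are assumed, not taken from the original post.
    static String toXml(List<Component> components, List<Connection> connections)
            throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
        Element root = doc.createElement("flowchart");
        doc.appendChild(root);

        for (Component c : components) {
            Element e = doc.createElement("component");
            e.setAttribute("id", c.id);
            e.setAttribute("shape", c.shape);
            e.setTextContent(c.text); // special characters are escaped on output
            root.appendChild(e);
        }
        for (Connection k : connections) {
            Element e = doc.createElement("connection");
            e.setAttribute("from", k.fromId);
            e.setAttribute("to", k.toId);
            root.appendChild(e);
        }

        Transformer t = TransformerFactory.newInstance().newTransformer();
        t.setOutputProperty(OutputKeys.INDENT, "yes");
        StringWriter out = new StringWriter();
        t.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        List<Component> components = Arrays.asList(
                new Component("c1", "oval", "Start"),
                new Component("c2", "rectangle", "Read & parse input"));
        List<Connection> connections = Arrays.asList(new Connection("c1", "c2"));
        System.out.println(toXml(components, connections));
    }
}
```

Keeping the connections as separate elements that refer to component ids, rather than nesting components inside each other, is one way to represent the traced flow lines without assuming the flowchart is a tree.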

Click through for a demo app and code.

