According to a recent Gartner report, AI-driven automation is one of the leading innovation trends of this era, and it is being applied to both mundane and complex business process tasks.


Every day, billions of documents such as orders, invoices, payrolls, business cards, handwritten letters, tax documents, legal filings, and even boarding passes are exchanged between disparate systems in enterprise applications. These documents come in a variety of formats, such as DOC, PDF, plain text, and scanned images. Formats like PDF, DOC, and plain text are easier to process than scanned document images. Images often contain textual details that must be read to make documents searchable and to support other key business workflows. OCR (optical character recognition) is one of the most widely used techniques for extracting textual information from images.

OCR plays a major role in this automation. It lets individuals convert hard-copy content into digital files, and it helps the data entry industry with easy text search and processing.

OCR support on mobile devices opens up many more possibilities. For example, smart glasses can read serial numbers, which could be very useful for courier and logistics companies that equip their employees with them for faster freight handling.


Let’s create a small app that uses the power of OCR to read text from images. We are going to use Google Cloud Vision to achieve this.

In my previous article, we set up a Google Cloud Vision account and the credentials required to access the API, and we created a script that identifies objects in an image. In this article, we will learn how to use the OCR capability provided by Google Cloud Vision.

I assume that you have followed all the steps required to set up Google Cloud Vision from my previous post, so let’s move forward. We are going to use the same gem, google-cloud-vision, in this script as well.

Here is the sample image from which we will extract text:

We are going to use Google::Cloud::Vision::ImageAnnotator#document_text_detection from Google Cloud Vision’s Ruby API wrapper.

Please make sure the GOOGLE_APPLICATION_CREDENTIALS environment variable is set before running the script.

$ export GOOGLE_APPLICATION_CREDENTIALS="<path to your key.json file>"
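If you want the script to fail early with a readable message, a small check like the sketch below may help. The helper name credentials_problem is made up for this example and is not part of the google-cloud-vision gem; it only checks that the variable is set and points at an existing file, not that the key itself is valid.

```ruby
# Hypothetical helper (not part of any gem): returns a message describing
# what is wrong with the credentials setup, or nil if it looks fine.
def credentials_problem(env = ENV)
  path = env["GOOGLE_APPLICATION_CREDENTIALS"].to_s.strip
  return "GOOGLE_APPLICATION_CREDENTIALS is not set" if path.empty?
  return "No key file found at #{path}" unless File.file?(path)

  nil
end

# Report the result without stopping the program.
puts(credentials_problem || "Credentials file found")
```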

You can use the #document_text_detection method to detect text in the image. Here is a super simple script that extracts text using the google-cloud-vision Ruby gem.
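A minimal sketch of such a script could look like the following. It assumes the 0.x series of the google-cloud-vision gem, where the client is created with Google::Cloud::Vision::ImageAnnotator.new (newer releases create it with Google::Cloud::Vision.image_annotator instead). The helper names extract_text and detect_document_text are illustrative and not part of the gem.

```ruby
# Joins the full text found in each per-image response of a batch response.
# Responses without any detected text are skipped.
def extract_text(batch_response)
  batch_response.responses
                .map { |res| res.full_text_annotation&.text.to_s }
                .reject(&:empty?)
                .join("\n")
end

# Runs document text detection on one image and returns the detected text.
# `image_annotator` is any object that responds to #document_text_detection.
def detect_document_text(image_annotator, image_path)
  response = image_annotator.document_text_detection image: image_path
  extract_text(response)
end

# Runs only when the file is executed directly with an image path argument,
# e.g. `ruby ocr.rb sample.png` (file names here are placeholders).
if __FILE__ == $PROGRAM_NAME && !ARGV.empty?
  require "google/cloud/vision"

  # Assumption: 0.x gem API. Credentials are read from the
  # GOOGLE_APPLICATION_CREDENTIALS environment variable.
  image_annotator = Google::Cloud::Vision::ImageAnnotator.new
  puts detect_document_text(image_annotator, ARGV.first)
end
```

Passing the client in as an argument keeps the text-handling logic separate from the API call, so it can be exercised with a stand-in object without touching the network.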

You should be able to see the following output by running the above script.

Extracting text from image

Voila! That is how easy it is to extract text from images. We can index the extracted text to make images searchable. I hope you enjoyed this post.
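To give a rough idea of the indexing step mentioned above, here is a small in-memory sketch. It assumes the OCR text for each image is already available as a string; the file names and texts are made-up sample data, and build_index and search are illustrative helpers. A real application would more likely hand this job to a search engine or a database.

```ruby
require "set"

# Builds an inverted index: lowercase word => set of image names containing it.
def build_index(texts_by_image)
  index = Hash.new { |hash, word| hash[word] = Set.new }
  texts_by_image.each do |image_name, text|
    text.downcase.scan(/[a-z0-9]+/).each { |word| index[word] << image_name }
  end
  index
end

# Returns the sorted names of images whose text contains every query word.
def search(index, query)
  words = query.downcase.scan(/[a-z0-9]+/)
  return [] if words.empty?

  words.map { |word| index.fetch(word, Set.new) }.reduce(:&).to_a.sort
end

# Hypothetical OCR results keyed by image file name.
texts = {
  "invoice.png" => "Invoice No 1042 Total due: 250 USD",
  "card.jpg" => "Jane Doe, Total Logistics Ltd"
}

index = build_index(texts)
p search(index, "total")          # => ["card.jpg", "invoice.png"]
p search(index, "invoice total")  # => ["invoice.png"]
```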

I will keep posting similar articles, so keep following me on Twitter.

Happy coding!


