In case you missed it, here are the prequels to this article about the Mobile Vision API. The first post was on the Face Detection API, while the second was on the Barcode Detection API.

Text Detection API

According to the overview, the Text Detection API detects text in images and videos and breaks that text down into blocks (paragraphs/columns), lines (sets of words on the same vertical axis) and words (sets of alphanumeric characters on the same vertical axis). The API recognizes text in various Latin-based languages.
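To make the block / line / word nesting concrete, here is a minimal plain-Java sketch of walking such a hierarchy. The `Node` class is a hypothetical stand-in I made up so the example runs without Android; in the real API, each detected item exposes `getValue()` and `getComponents()` in much the same way, so the same recursive walk should carry over.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class TextHierarchySketch {

    /** Hypothetical stand-in for a detected item (a block, a line or a word). */
    static final class Node {
        final String value;
        final List<Node> components;

        Node(String value, Node... components) {
            this.value = value;
            this.components = Arrays.asList(components);
        }
    }

    /** Adds one indented entry per node: depth 0 = block, 1 = line, 2 = word. */
    static List<String> outline(Node node, int depth, List<String> out) {
        StringBuilder indent = new StringBuilder();
        for (int i = 0; i < depth; i++) {
            indent.append("  ");
        }
        out.add(indent + node.value);
        for (Node child : node.components) {
            outline(child, depth + 1, out);
        }
        return out;
    }

    public static void main(String[] args) {
        // One block containing one line, which contains two words.
        Node block = new Node("STOP AHEAD",
                new Node("STOP AHEAD", new Node("STOP"), new Node("AHEAD")));
        for (String entry : outline(block, 0, new ArrayList<>())) {
            System.out.println(entry);
        }
    }
}
```

Each level simply lists the level below it, which is why a single recursive method is enough to visit blocks, lines and words.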

Potential applications

I’ll write about what’s possible with this API before I explain how to use it.

  1. Imagine you were invited to attend the Firebase Dev Summit in Berlin and you got all excited, but you didn’t know a word of German. The only foreign language you understand (apart from English) is some Spanish you picked up on Duolingo two years ago. Tough, huh? How would you communicate if all the signs were in German? Typing the text into Google Translate ALL THE TIME would be out of the question, because German words have a reputation for being unusually long. A better option would be some way to detect any text you want translated and receive the translation in your preferred language immediately. The Text Detection API would handle detecting the text, while the Google Translate API would handle the translation.
  2. Another possibility is converting a very large amount of text (e.g. from a book) into digital format. The traditional method would be to scan each page, which might damage the book. With the Text Detection API, all that’ll be needed is a device with a camera for focusing on the text and maybe some API that uploads the recognized text to a server.

Getting started

Here, we are going to detect text from a default image preloaded in an app using the Text Detection API. I initially wanted to take this a step further by translating that text into a specified language, as described in the scenario above, but I left that part out when I discovered that the Google Translate API is billed per usage.

Here we go (again)…

compile 'com.google.android.gms:play-services-vision:9.8.0'

<meta-data android:name="com.google.android.gms.vision.DEPENDENCIES" android:value="ocr"/>

Bitmap textBitmap = BitmapFactory.decodeResource(getResources(), R.drawable.cute_cat_image);

TextRecognizer textRecognizer = new TextRecognizer.Builder(this).build();

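// isOperational() is false while the dependencies the detector needs are not yet available on the device.
if (!textRecognizer.isOperational()) {
    new AlertDialog.Builder(this)
            .setMessage("Text recognizer could not be set up on your device :(")
            .show();
    return;
}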

Frame frame = new Frame.Builder().setBitmap(textBitmap).build();
SparseArray<TextBlock> text = textRecognizer.detect(frame);

for (int i = 0; i < text.size(); i++) {
    TextBlock textBlock = text.valueAt(i);
    if (textBlock != null && textBlock.getValue() != null) {
        detectedText += textBlock.getValue();
    }
}
detectedTextView.setText(detectedText);

textRecognizer.release();
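One thing to watch in the loop above: it appends each block’s value directly, so the last word of one block runs straight into the first word of the next. Here is a minimal sketch of one way around that, written in plain Java so it runs outside Android. The `joinBlocks` helper and the `List<String>` input are my own assumptions for illustration; in the app, the values would come from `textBlock.getValue()`.

```java
import java.util.Arrays;
import java.util.List;

public class BlockJoiner {

    /** Joins the non-null block values with newlines, keeping their order. */
    static String joinBlocks(List<String> blockValues) {
        StringBuilder detectedText = new StringBuilder();
        boolean first = true;
        for (String value : blockValues) {
            if (value == null) {
                continue; // same null check as in the loop above
            }
            if (!first) {
                detectedText.append('\n');
            }
            detectedText.append(value);
            first = false;
        }
        return detectedText.toString();
    }

    public static void main(String[] args) {
        System.out.println(joinBlocks(Arrays.asList("First block", null, "Second block")));
    }
}
```

A `StringBuilder` also avoids creating a new `String` on every `+=`, which starts to matter when an image contains many blocks.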

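Please note that the detection work (decoding the bitmap and calling detect()) should be carried out on a background thread, since it can be slow enough to freeze the UI. Only the final update of the TextView needs to happen on the main thread. The code from this article is on GitHub here.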
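Here is a minimal sketch of moving the slow part off the calling thread, again in plain Java so it runs outside Android. The `detectInBackground` helper and the `Function<String, String>` detector are hypothetical placeholders of mine, standing in for the call to `textRecognizer.detect(frame)`; this is one possible arrangement, not the only way to do it.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

public class BackgroundDetection {

    /** Runs the (possibly slow) detector on a worker thread and returns a handle to its result. */
    static Future<String> detectInBackground(ExecutorService worker,
                                             Function<String, String> detector,
                                             String image) {
        Callable<String> task = () -> detector.apply(image);
        return worker.submit(task);
    }

    public static void main(String[] args) throws Exception {
        ExecutorService worker = Executors.newSingleThreadExecutor();
        try {
            // Hypothetical detector: pretends the "image" is a string and upper-cases it.
            Future<String> result =
                    detectInBackground(worker, image -> image.toUpperCase(), "ausgang");
            // get() blocks until the worker finishes. In an app you would instead hand the
            // result back to the main thread before touching any views.
            System.out.println(result.get());
        } finally {
            worker.shutdown();
        }
    }
}
```

In an Android activity, the hand-off back to the main thread could be done with something like `runOnUiThread`, so that `setText` is still called where views are allowed to be touched.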

References:

Text Recognition API Overview | Mobile Vision | Google Developers. "Text recognition is the process of detecting text in images and video streams and recognizing the text contained…" (developers.google.com)

See and Understand Text using OCR with Mobile Vision Text API for Android. "Optical Character Recognition (OCR) gives a computer the ability to read text that appears in an image, letting…" (codelabs.developers.google.com)

moyheen/text-detector. "text-detector - This application contains all the code from my article on the Text Detector API." (github.com)
