So, the first week of my internship at freshlybuilt has just ended. I am very excited about this project. The most exciting part is that we’ll be building a Python library of our own.
The week started with a lot of introductions and discussions about the project. We were asked to present our ideas related to the project. Our first task was to test various OCR tools and look for good text extraction accuracy across all kinds of text images.
I had some familiarity with the well-known OCR tool Tesseract. But its performance usually drops on camera-captured images because of noise and poor image quality.
So, we looked into various other OCR tools such as Calamari, OCRopus, Kraken etc. But sadly, none of these could compete with Tesseract.
I decided to try out text extraction on a few images with different types of text. I grouped the text into 4 categories:
- Printed Text : This is the easiest case to deal with, as various OCR tools perform well on such text because of its structured and well-formatted nature. All kinds of printed text, from PDFs to photographs of book pages or newspapers, fall in this category.
- Handwritten Text : It consists of handwritten text and documents. This is one of the hardest cases because handwriting styles vary from person to person.
- Cards : It consists of all kinds of cards such as ID cards, Credit/Debit cards etc. They need some image processing and are relatively easy for text extraction.
- Text in Wild : This is the most challenging case for OCR, as all kinds of noise are present in such images. The text has no structure, lighting conditions vary, and obstacles may hide parts of it.
For now, I am starting by applying some image processing to the images and then using Tesseract for text extraction.
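Here is a minimal sketch of what such a preprocessing step could look like. The exact steps are not spelled out above, so Otsu binarization is only an assumption, picked because it is a common way to clean up an image before OCR. `otsu_threshold` and `binarize` are hypothetical helper names written in plain Python to show the idea, and `ocr_with_tesseract` does the same thing with the `opencv-python` and `pytesseract` packages (both assumed to be installed, along with the Tesseract binary).

```python
def otsu_threshold(gray):
    """Return the Otsu threshold for a 2-D list of integer grey values (0-255)."""
    hist = [0] * 256
    for row in gray:
        for value in row:
            hist[value] += 1

    total = sum(hist)
    sum_all = sum(i * count for i, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_bg, sum_bg = 0, 0
    for t in range(256):
        w_bg += hist[t]              # pixels at or below t (background)
        if w_bg == 0:
            continue
        w_fg = total - w_bg          # pixels above t (foreground)
        if w_fg == 0:
            break
        sum_bg += t * hist[t]
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        # Between-class variance; Otsu picks the t that maximises it.
        var = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t


def binarize(gray):
    """Map every pixel to 0 or 255 using the Otsu threshold."""
    t = otsu_threshold(gray)
    return [[255 if value > t else 0 for value in row] for row in gray]


def ocr_with_tesseract(path):
    """Same idea with real libraries (assumes opencv-python and pytesseract)."""
    import cv2
    import pytesseract

    image = cv2.imread(path)
    if image is None:
        raise FileNotFoundError(path)
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)
    return pytesseract.image_to_string(binary)
```

Binarization tends to help on clean, evenly lit scans. On photos with shadows or uneven lighting a single global threshold may hurt more than it helps, so it is worth comparing the OCR output with and without it.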
This approach performed well on a few document-style images :
As we know, text extraction is a two-step process:
- Text detection
- Text recognition
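The two steps can be kept as separate, swappable pieces. The sketch below is only an illustration of that split: `extract_text`, `crop` and the `Box` layout are hypothetical names, and the image is assumed to be a row-major 2-D structure (a list of rows here, though slicing a NumPy array would work in a similar way).

```python
from typing import Callable, List, Sequence, Tuple

# Assumed box layout: (start_x, start_y, end_x, end_y), end-exclusive.
Box = Tuple[int, int, int, int]


def crop(image: Sequence[Sequence], box: Box) -> List[Sequence]:
    """Cut the region covered by `box` out of a row-major image."""
    start_x, start_y, end_x, end_y = box
    return [row[start_x:end_x] for row in image[start_y:end_y]]


def extract_text(
    image: Sequence[Sequence],
    detect: Callable[[Sequence[Sequence]], List[Box]],
    recognize: Callable[[List[Sequence]], str],
) -> List[Tuple[Box, str]]:
    """Step 1: find text regions. Step 2: read each region separately."""
    results = []
    # Sort top-to-bottom, then left-to-right, for a rough reading order.
    for box in sorted(detect(image), key=lambda b: (b[1], b[0])):
        results.append((box, recognize(crop(image, box))))
    return results
```

With this shape, the detector and the recognizer can be replaced independently, for example a scene-text detector for step 1 and an OCR engine for step 2.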
I found that in text in wild images, Tesseract was not able to detect the text, so I tried using the EAST detector model through OpenCV for text detection and Tesseract for text recognition.
The EAST (Efficient and Accurate Scene Text) detector, which OpenCV can run through its dnn module, is a deep learning model based on a novel architecture and training pattern. It is capable of
- running at near real-time, at 13 FPS on 720p images, and
- obtaining state-of-the-art text detection accuracy.
OpenCV’s text detector implementation of EAST is quite robust, capable of localizing text even when it’s blurred, reflective, or partially obscured.
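For reference, here is a rough sketch of how EAST is commonly run through OpenCV's `dnn` module. It is not taken from our library. The model file name `frozen_east_text_detection.pb`, the two output layer names and the mean values are those of the commonly distributed pre-trained model and are assumptions here. `east_outputs` and `decode_east` are hypothetical helper names, and the decoding only produces axis-aligned boxes, which is an approximation for rotated text.

```python
import math


def east_outputs(image_bgr, model_path="frozen_east_text_detection.pb",
                 size=(320, 320)):
    """Run EAST and return (scores, geometry). Assumes opencv-python is installed.

    `size` is (width, height); both must be multiples of 32.
    """
    import cv2

    net = cv2.dnn.readNet(model_path)
    blob = cv2.dnn.blobFromImage(image_bgr, 1.0, size,
                                 (123.68, 116.78, 103.94),
                                 swapRB=True, crop=False)
    net.setInput(blob)
    scores, geometry = net.forward(["feature_fusion/Conv_7/Sigmoid",
                                    "feature_fusion/concat_3"])
    # scores: (1, 1, H/4, W/4), geometry: (1, 5, H/4, W/4)
    return scores[0, 0], geometry[0]


def decode_east(scores, geometry, min_confidence=0.5):
    """Turn EAST score/geometry maps into axis-aligned boxes.

    scores[y][x] is the text probability of a cell; geometry[c][y][x] holds
    the distances to the top, right, bottom and left box edges (c = 0..3)
    and the rotation angle (c = 4). Boxes are in the resized input's frame.
    """
    boxes, confidences = [], []
    for y in range(len(scores)):
        for x in range(len(scores[0])):
            score = float(scores[y][x])
            if score < min_confidence:
                continue
            # The output maps are 4x smaller than the network input.
            offset_x, offset_y = x * 4.0, y * 4.0
            top, right, bottom, left = (float(geometry[c][y][x]) for c in range(4))
            angle = float(geometry[4][y][x])
            cos_a, sin_a = math.cos(angle), math.sin(angle)
            end_x = int(offset_x + cos_a * right + sin_a * bottom)
            end_y = int(offset_y - sin_a * right + cos_a * bottom)
            start_x = int(end_x - (right + left))
            start_y = int(end_y - (top + bottom))
            boxes.append((start_x, start_y, end_x, end_y))
            confidences.append(score)
    return boxes, confidences
```

Neighbouring cells usually predict almost the same box, so the result would normally go through non-maximum suppression (for example `cv2.dnn.NMSBoxes`) and be scaled back to the original image size before the crops are passed on for recognition.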
EAST did quite a good job of detecting text in natural scene images (text in wild).
There were some bad results too :
Finally, I would like to mention the first release (beta release) of our library. It comes with some cool function and module names, which I loved the most.
You can get an idea of it from the description of our library: “Image bhi bol uthegi” (Hindi, roughly “even the image will start speaking”).