Tesseract OCR with Python using PyTesseract

Introduction

What's up guys! In the last series we talked about NPM packages, and I hope you liked it. Today we will have some fun with Tesseract, an open-source OCR (optical character recognition) engine. This post covers Tesseract for Python, and the next one will cover Tesseract for JavaScript and maybe TypeScript.

Installation

To get started with PyTesseract, we first need the Tesseract executable, because PyTesseract is a wrapper that calls it. On macOS, run brew install tesseract; on Linux (Debian or Ubuntu), run sudo apt-get install tesseract-ocr. After that, install the PyTesseract library with the command pip install pytesseract.
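If you want to double-check the install before going further, here is a small sketch that looks for the executable on your PATH. The helper names find_tesseract and tesseract_version_line are made up for this post; the only assumption is that the binary is called tesseract and accepts --version.

```python
import shutil
import subprocess


def find_tesseract():
    """Return the full path of the tesseract executable on PATH, or None."""
    return shutil.which("tesseract")


def tesseract_version_line(executable):
    """Return the first line printed by `tesseract --version` (may be empty)."""
    result = subprocess.run(
        [executable, "--version"], capture_output=True, text=True
    )
    # Some builds print the version banner on stderr, so fall back to it.
    output = result.stdout or result.stderr
    return output.splitlines()[0] if output else ""


path = find_tesseract()
if path is None:
    print("tesseract not found on PATH; install it before using pytesseract")
else:
    print(path, "->", tesseract_version_line(path))
```

If this prints a path and a version line, PyTesseract should be able to find the executable too.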

Usage

Okay, so PyTesseract is very simple to use: it needs almost no configuration to get started, but it is still very powerful. Start by writing the following code:

from PIL import Image
import pytesseract

The first import comes from Pillow, the maintained fork of PIL (the Python Imaging Library), which is a Python image manipulation library. You need to install it with the command pip install pillow. Now, to actually convert an image to a string, write the following code:

print(pytesseract.image_to_string(Image.open('<image_path>')))
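For anything longer than a one-liner, it can help to wrap that call in a function. The sketch below is just one way to do it: ocr_file is a name made up for this post, and the file check and the strip() at the end are my own additions, not something PyTesseract requires.

```python
from pathlib import Path


def ocr_file(image_path):
    """Return the text Tesseract recognises in the image at image_path."""
    path = Path(image_path)
    if not path.is_file():
        raise FileNotFoundError(f"No image found at {path}")

    # Imported here so a missing install only raises ImportError
    # when OCR is actually requested.
    from PIL import Image
    import pytesseract

    # The context manager closes the image file once OCR is done.
    with Image.open(path) as image:
        text = pytesseract.image_to_string(image)

    # Drop leading and trailing whitespace from the recognised text.
    return text.strip()
```

You would call it as print(ocr_file('<image_path>')), with the same placeholder path as above.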

The above code opens the image with the Image module we imported from PIL, PyTesseract converts the opened image to a string, and we print that string to the terminal. By default PyTesseract uses English, but you can specify another language with the lang parameter. Our code would then look like this:

print(pytesseract.image_to_string(Image.open('<image_path>'), lang='<lang_code>'))

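The lang parameter tells Tesseract which language's trained data to use when recognising the text. The code is one of Tesseract's own language codes, such as eng for English or deu for German, and the data for that language has to be installed; otherwise Tesseract cannot load it and the call fails.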
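Tesseract also accepts several languages at once, joined with a plus sign (for example eng+deu), and PyTesseract passes the lang string straight through. Here is a minimal sketch of that; build_lang and ocr_with_languages are hypothetical helper names, and the language data for every code you pass is assumed to be installed.

```python
def build_lang(codes):
    """Join Tesseract language codes such as ["eng", "deu"] into "eng+deu"."""
    cleaned = [code.strip() for code in codes if code.strip()]
    if not cleaned:
        raise ValueError("at least one language code is required")
    return "+".join(cleaned)


def ocr_with_languages(image_path, codes):
    """Run OCR on image_path using every language in codes."""
    # Imported here so build_lang works even without pytesseract installed.
    from PIL import Image
    import pytesseract

    with Image.open(image_path) as image:
        return pytesseract.image_to_string(image, lang=build_lang(codes))


print(build_lang(["eng", "deu"]))  # prints eng+deu
```

So ocr_with_languages('<image_path>', ['eng', 'deu']) would ask Tesseract to consider both English and German when reading the image.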

Conclusion

Okay, so that wasn't too hard. I know you don't want to believe it, but that was it. There are a few more methods and properties, but I want you to play with them yourself: check out the docs for PyTesseract here and comment down below listing the different methods it supports, so that I (too lazy to learn them) can learn from your comments. Well, that was it for today. I hope you liked it, and I don't need to remind you to like this post and react with some other emojis as well.

P.S. - Share this post

Web Tech