Introduction
What's up, guys! In this post, we will have fun with Tesseract.js. There are plenty of other OCR packages out there, but this one wraps the original Tesseract OCR engine. You can experiment with those other packages, but I won't be covering them here. And if you do try them, tell me which one you liked more.
Installation
Before we get into the code, we need to install the package. You will need Node.js and npm for this. Open up your console and run:
npm init -y
npm i tesseract.js
Make sure the package installed properly and then create a file named index.js.
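If you want a quick way to confirm the install, here is a minimal sketch. The helper name isInstalled is made up for this example, and it assumes you run the file from the project folder where you installed the package.

```javascript
// Hypothetical helper (not part of Tesseract.js): returns true when
// Node.js can resolve the given package from the current folder.
function isInstalled(packageName) {
  try {
    require.resolve(packageName);
    return true;
  } catch (err) {
    // require.resolve throws when the package cannot be found.
    return false;
  }
}

console.log('tesseract.js installed:', isInstalled('tesseract.js'));
```

Save it under any name you like, run it with node, and delete it once you have seen the result.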
Get into code
The first thing we need is the package itself, so add this import statement at the top of index.js:
const { createWorker } = require('tesseract.js')
This line imports a single function, createWorker, from the package we installed. After this, add the following code:
const tessWorker = createWorker()
async function ocrwithtesseractjs() {
await tessWorker.load()
await tessWorker.loadLanguage('eng')
await tessWorker.initialize('eng')
}
The code above creates a Tesseract worker and then sets it up inside an async function we named ocrwithtesseractjs. Inside the function we load the worker, load the English language data, and then initialize the worker with that language. Note that this matches the older Tesseract.js API, in which createWorker() returns the worker directly; in later major versions it returns a promise, so check which version you installed. The worker has a recognize method which we can use to read the text from images. Append the following line to the function:
const { data: { text } } = await tessWorker.recognize('<your_image_path>');
This line runs the recognition and returns a result whose data object holds the recognized text in its text property. Now we can terminate our worker with the following line of code:
await tessWorker.terminate()
Let's return the text we got from the recognize function with this line of code:
return text
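One thing to keep in mind: if recognize throws, the terminate line is never reached. Here is a minimal sketch of one way around that using try/finally. The helper name recognizeAndTerminate is made up for this example, and it only assumes that the worker you pass in has the recognize and terminate methods used above.

```javascript
// Hypothetical helper: runs OCR on one image and always terminates the
// worker afterwards. `worker` is assumed to expose recognize() and
// terminate() like the Tesseract.js worker used in this post.
async function recognizeAndTerminate(worker, imagePath) {
  try {
    const { data: { text } } = await worker.recognize(imagePath);
    return text;
  } finally {
    // Runs whether recognize() succeeded or threw.
    await worker.terminate();
  }
}
```

With the worker from this post you would call it as recognizeAndTerminate(tessWorker, '<your_image_path>') after the load and initialize steps.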
Test it
Now let's test our code by adding the following line of code in your file:
ocrwithtesseractjs().then(console.log)
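The line above only handles the happy path. If the image path is wrong, for example, the promise rejects and nothing catches it. Here is a minimal sketch of a wrapper that reports both outcomes. The name runAndReport and its parameters are made up for this example; task stands in for ocrwithtesseractjs.

```javascript
// Hypothetical wrapper: runs an async task, logs its result or its error,
// and resolves to true on success and false on failure.
function runAndReport(task, log = console.log, logError = console.error) {
  // Promise.resolve().then(task) also catches errors thrown synchronously.
  return Promise.resolve()
    .then(task)
    .then((text) => {
      log(text);
      return true;
    })
    .catch((err) => {
      logError('OCR failed:', err.message);
      return false;
    });
}
```

You could then replace the test line with runAndReport(ocrwithtesseractjs) and get a readable message instead of an unhandled rejection.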
Okay then, run your script by typing node index or node index.js. This will run the script and, if we are lucky, print the recognized text. If it worked for you, comment down below; if it didn't and you hit an error, post the error in the comments and I will help you. If you look closely, though, it may not have found all the text. That is because Tesseract.js works by default in what is known as SINGLE_BLOCK mode, which treats the image as one uniform block of text. Fortunately, a worker instance has a setParameters function which changes the default behaviour of Tesseract.
Changing tesseract parameters
Here I will be using the tessedit_pageseg_mode parameter, which sets the page segmentation mode. To work with this mode we need to import PSM. We can do that with this import line:
const PSM = require('tesseract.js/src/constants/PSM.js')
Finally, we can call the setParameters function. For this example, we will use AUTO mode and let the engine find all the lines. Your complete function should look something like this:
async function ocrwithtesseractjs() {
  await tessWorker.load()
  await tessWorker.loadLanguage('eng')
  await tessWorker.initialize('eng')
  // Switch from the default SINGLE_BLOCK mode to automatic page segmentation
  await tessWorker.setParameters({
    tessedit_pageseg_mode: PSM.AUTO,
  })
  // The recognized text lives in the text property of the data object
  const { data: { text } } = await tessWorker.recognize('<your_image_path>');
  await tessWorker.terminate()
  return text
}
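If you want to try more than one mode, here is a minimal sketch of a small helper that builds the object passed to setParameters. The table PAGE_SEG_MODES and the function pageSegParams are made up for this example. The numbers are my own copy of Tesseract's page segmentation mode numbering, written as strings on the assumption that this is the form the library's PSM constant uses; in real code, prefer PSM itself.

```javascript
// Local copy of a few page segmentation mode numbers (assumption: these
// mirror Tesseract's PSM numbering; prefer the library's PSM constant).
const PAGE_SEG_MODES = {
  AUTO: '3',         // fully automatic page segmentation
  SINGLE_BLOCK: '6', // a single uniform block of text
  SINGLE_LINE: '7',  // a single line of text
};

// Hypothetical helper: returns a parameters object for the given mode name.
function pageSegParams(modeName) {
  if (!Object.prototype.hasOwnProperty.call(PAGE_SEG_MODES, modeName)) {
    throw new Error(`Unknown page segmentation mode: ${modeName}`);
  }
  return { tessedit_pageseg_mode: PAGE_SEG_MODES[modeName] };
}
```

Inside the function above, await tessWorker.setParameters(pageSegParams('SINGLE_LINE')) would then be one way to try a different mode on an image that holds a single line of text.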
Conclusion
Hopefully, it worked for you and you learnt a useful skill from this post. If you did then react to this post with a couple of emojis and if you didn't learn anything then still react to this post with a couple of emojis. Thank you for reading this post and I will see you in the next one.
Here is the complete code:
const { createWorker } = require('tesseract.js')
const PSM = require('tesseract.js/src/constants/PSM.js')
const tessWorker = createWorker()
async function ocrwithtesseractjs() {
  await tessWorker.load()
  await tessWorker.loadLanguage('eng')
  await tessWorker.initialize('eng')
  await tessWorker.setParameters({
    tessedit_pageseg_mode: PSM.AUTO,
  })
  const { data: { text } } = await tessWorker.recognize('<your_image_path>');
  await tessWorker.terminate()
  return text
}
ocrwithtesseractjs().then(console.log)