Podcast: Hacker Public Radio
Episode: HPR3315: tesseract optical character recognition

Category: Technology
Duration: 00:00:00
Publish Date: 2021-04-16 00:00:00
Description:

Tesseract (software)

From Wikipedia, the free encyclopedia

Tesseract is an optical character recognition (OCR) engine for various operating systems. It is free software, released under the Apache License. Originally developed by Hewlett-Packard as proprietary software in the 1980s, it was released as open source in 2005, and Google has sponsored its development since 2006.
In 2006, Tesseract was considered one of the most accurate open-source OCR engines then available.


$ tesseract -l eng english-page.jpg english
$ tesseract -l nld dutch-page.jpg dutch
$ ls
dutch.txt english.txt 
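
The two commands above follow one pattern: `tesseract -l LANG IMAGE OUTPUTBASE`, where tesseract appends `.txt` to the output base. As a rough sketch of how that pattern might be applied to a whole directory of scans, the snippet below defines two helpers. The function names `ocr_outbase` and `ocr_all` are made up for illustration and are not part of tesseract, and the sketch assumes the images are `.jpg` files in the current directory and that the requested language data is installed.

```shell
#!/bin/sh
# Hypothetical helper (not part of tesseract): derive the output base
# name from an image path by dropping the directory and the extension,
# e.g. "scans/dutch-page.jpg" becomes "dutch-page".
ocr_outbase() {
    name=$(basename "$1")
    printf '%s\n' "${name%.*}"
}

# Hypothetical helper: run tesseract on every .jpg in the current
# directory, using the language code given as the first argument
# (for example "eng" or "nld", as in the commands above).
ocr_all() {
    lang=$1
    for img in ./*.jpg; do
        # Skip the literal pattern when no .jpg files are present.
        [ -e "$img" ] || continue
        # Same invocation as above: tesseract -l LANG IMAGE OUTPUTBASE
        tesseract -l "$lang" "$img" "$(ocr_outbase "$img")"
    done
}
```

With these definitions loaded, `ocr_all nld` would write one `.txt` file per image next to the scans, assuming every page in the directory is in the same language.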
