Tesseract is a free optical character recognition (OCR) engine originally developed by HP and currently maintained by Google. It has been rated one of the best OCR engines in the world. It is a recognition engine only: it performs no layout analysis and no output formatting, and it has no GUI. It has been trained to recognize many languages, including English, French and German, and it can be trained to recognize others. Currently it can read only TIFF and BMP images.
module load tesseract
tesseract imagename textfileoutputname
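
When several scanned images need to be converted, the single command above can be wrapped in a loop. The following is only a minimal sketch, not a recommended workflow: the function name ocr_dir, the TESSERACT override variable and the .tif file extension are assumptions made for illustration. It assumes that module load tesseract has already been run in the same shell, and it relies on Tesseract adding the .txt extension to the output name itself.

```shell
#!/bin/sh
# Sketch: run Tesseract on every .tif image in one directory.
# ocr_dir and TESSERACT are hypothetical names, not part of Tesseract.

ocr_dir() {
    dir=$1
    for img in "$dir"/*.tif; do
        # If the directory holds no .tif files the pattern stays unexpanded; skip it.
        [ -e "$img" ] || continue
        # Output base name: the image path without .tif (Tesseract appends .txt).
        base=${img%.tif}
        # TESSERACT may point at another binary; it defaults to "tesseract".
        "${TESSERACT:-tesseract}" "$img" "$base"
    done
}
```

With this function defined, ocr_dir scans would process each .tif file in a directory named scans and write the recognized text next to each image, for example scans/page1.txt for scans/page1.tif.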