With the Tesseract OCR engine, the pdf, pdfa, and txt formats are supported.

Supported languages: Afrikaans (afr), Albanian (sqi), Azerbaijani (aze), Belarusian (bel), Bosnian (bos), Breton (bre), Bulgarian (bul), Catalan (cat), Cebuano (ceb), Corsican (cos), Croatian (hrv), Czech (ces), Danish (dan), Dutch/Flemish (nld), English (eng), Middle English 1100-1500 (enm), Esperanto (epo), Estonian (est), Faroese (fao), Filipino (fil), Finnish (fin), French (fra), Gaelic (gla), Galician (glg), German (deu), Haitian (hat), Hebrew (heb), Hungarian (hun), Icelandic (isl), Indonesian (ind), Irish (gle), Italian (ita), Japanese (jpn), Javanese (jav), Kyrgyz (kir), Latin (lat), Latvian (lav), Lithuanian (lit), Macedonian (mkd), Malay (msa), Maltese (mlt), Maori (mri), Norwegian (nor), Occitan (oci), Polish (pol), Portuguese (por), Quechua (que), Romanian/Moldovan (ron), Russian (rus), Serbian (srp), Serbian Latin (srp_latn), Slovak (slk), Slovenian (slv), Spanish (spa), Sundanese (sun), Swahili (swa), Swedish (swe), Tajik (tgk), Tonga (ton), Turkish (tur), Ukrainian (ukr), Uzbek (uzb), Uzbek Cyrillic (uzb_cyrl), Vietnamese (vie), Welsh (cym), Western Frisian (fry), Yoruba (yor), Azerbaijani Cyrillic (Азəрбајҹан), Georgian (ქართული ენა).

If several languages are selected, processing the files will take considerably longer.
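
For readers who call Tesseract directly, the sketch below shows one way a multi-language selection might be turned into a command line. It assumes the standalone `tesseract` command-line tool, where language codes are joined with `+` after `-l` and output formats such as `pdf` and `txt` are given as trailing config names. The function name `build_tesseract_command` and the file names are hypothetical, and this is not necessarily how the application itself invokes the engine.

```python
from typing import List, Sequence


def build_tesseract_command(
    image_path: str,
    output_base: str,
    languages: Sequence[str],
    formats: Sequence[str] = ("txt",),
) -> List[str]:
    """Build an argument list for the tesseract CLI (hypothetical helper).

    Every extra language adds recognition work, so keep the list as short
    as the documents allow.
    """
    if not languages:
        raise ValueError("at least one language code is required")
    # Tesseract expects multiple languages as a single '+'-joined value.
    lang_arg = "+".join(languages)
    # Output formats are passed as trailing config names, e.g. 'pdf', 'txt'.
    return ["tesseract", image_path, output_base, "-l", lang_arg, *formats]


# Example (not executed here): English and Spanish, PDF and text output.
# import subprocess
# subprocess.run(build_tesseract_command("scan.png", "scan", ["eng", "spa"], ["pdf", "txt"]), check=True)
```

Listing only the languages that actually appear in the documents keeps the `-l` value short and limits the extra processing time.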

For more information about the engine, see its developer's dedicated documentation.