An OCR (Optical Character Recognition) application that extracts text from images using the Tesseract OCR engine, with a user-friendly GUI.
Paradorn Katananon
- 📷 Single Image Processing: Extract text from individual images
- 📁 Batch Processing: Process multiple images in a folder at once
- ✏️ Editable Preview: Review and edit extracted text before saving
- 🌍 Multi-language Support: English, Chinese (Simplified/Traditional), Spanish, French, German, Japanese, Korean
- ⚙️ Configurable OCR Settings: Adjust page segmentation modes for better accuracy
- 💾 Flexible Saving: Save individual files or batch process with automatic file naming
- 🔄 Responsive UI: Multi-threaded processing keeps the interface responsive
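The responsive-UI point above can be illustrated with a minimal sketch of the usual worker-thread pattern: slow work runs on a background thread and hands its result back through a queue. This is not taken from img2txt.py; the names `start_worker` and `poll` are hypothetical and only show the general idea.

```python
import queue
import threading


def start_worker(task, results):
    """Run task() on a background thread and put its outcome on results."""
    def run():
        try:
            results.put(("ok", task()))
        except Exception as exc:  # report failures instead of losing them
            results.put(("error", exc))

    thread = threading.Thread(target=run, daemon=True)
    thread.start()
    return thread


def poll(results):
    """Return the next finished outcome, or None if nothing is ready yet."""
    try:
        return results.get_nowait()
    except queue.Empty:
        return None
```

In a tkinter application, the main thread would typically call something like `poll` on a timer via the widget `after` method, so the event loop never blocks while OCR is running.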
- Python 3.x
- Tesseract OCR engine
- Required Python packages:
  - pytesseract
  - Pillow (PIL)
  - tkinter (usually included with Python)
- Install Tesseract OCR:
  - Windows: Download from GitHub Tesseract Releases
  - macOS: brew install tesseract
  - Linux: sudo apt-get install tesseract-ocr
- Install Python dependencies:
  pip install pytesseract Pillow
- Configure Tesseract path (if needed): If Tesseract is not in your system PATH, add this line to the code:
  pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe'
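As an alternative to hard-coding the path, the lookup can be done at startup. The following is a sketch only, not part of img2txt.py: `find_tesseract` and `configure_pytesseract` are hypothetical helpers, and the fallback path is simply the Windows default quoted above.

```python
import os
import shutil


def find_tesseract(fallbacks=(r"C:\Program Files\Tesseract-OCR\tesseract.exe",)):
    """Return a Tesseract executable path, or None if none can be found."""
    on_path = shutil.which("tesseract")  # honours the system PATH first
    if on_path:
        return on_path
    for candidate in fallbacks:
        if os.path.isfile(candidate):
            return candidate
    return None


def configure_pytesseract():
    """Point pytesseract at the located executable and return its path."""
    import pytesseract  # imported here so find_tesseract works without it

    cmd = find_tesseract()
    if cmd is None:
        raise FileNotFoundError("Tesseract executable not found")
    pytesseract.pytesseract.tesseract_cmd = cmd
    return cmd
```

Calling `configure_pytesseract()` once before any OCR call would then cover both the PATH case and the default Windows install location.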
python img2txt.py
- Click "📷 Select Single Image"
- Choose an image file (PNG, JPG, JPEG, TIFF, BMP, GIF)
- Review the extracted text in the preview area
- Edit the text if needed
- Click "💾 Save Text to File" to save
- Click "📁 Select Folder (Batch)"
- Select a folder containing multiple images
- All images will be processed automatically
- Text files are saved in an extracted_texts subfolder
- Review the processing summary
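The batch steps above can be sketched roughly as follows. This is an illustration under assumptions, not the code in img2txt.py: `batch_extract` is a hypothetical name, the OCR step is passed in as a callable so the loop stays independent of Tesseract, and the extension set mirrors the formats listed for single images.

```python
from pathlib import Path

# Extensions taken from the supported formats listed in this README.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".tiff", ".bmp", ".gif"}


def batch_extract(folder, ocr):
    """OCR every supported image in folder.

    ocr is any callable that takes an image path and returns text.
    Returns (processed, failed) lists of file names for a summary.
    """
    folder = Path(folder)
    out_dir = folder / "extracted_texts"
    out_dir.mkdir(exist_ok=True)

    processed, failed = [], []
    for image_path in sorted(folder.iterdir()):
        if not image_path.is_file():
            continue
        if image_path.suffix.lower() not in IMAGE_EXTENSIONS:
            continue
        try:
            text = ocr(image_path)
        except Exception:
            failed.append(image_path.name)  # keep going after a bad image
            continue
        # Output name is derived from the image name: scan.png -> scan.txt
        (out_dir / f"{image_path.stem}.txt").write_text(text, encoding="utf-8")
        processed.append(image_path.name)
    return processed, failed
```

With pytesseract and Pillow installed, the callable could be something like `lambda p: pytesseract.image_to_string(Image.open(p))`. Note that in this sketch two images sharing a stem (for example `a.png` and `a.jpg`) would write to the same text file.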
Choose the appropriate language for better OCR accuracy:
- eng - English
- chi_sim - Chinese Simplified
- chi_tra - Chinese Traditional
- spa - Spanish
- fra - French
- deu - German
- jpn - Japanese
- kor - Korean
Select the page segmentation mode based on your image type:
- PSM 3 (Auto): Fully automatic page segmentation. Best for general documents when you're unsure.
- PSM 6 (Block): Single uniform block of text. Ideal for clean documents, books, single-column text.
- PSM 11 (Sparse): Sparse text with no particular order. Good for screenshots, forms, receipts, or scattered text.
- PSM 12 (Sparse + OSD): Sparse text with orientation and script detection. Same as PSM 11 but handles rotated text.
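For readers calling pytesseract directly, the language and page segmentation settings map onto the `lang` and `config` arguments of `pytesseract.image_to_string`. The sketch below assumes that mapping; `build_config` and `extract_text` are hypothetical helper names and may not match how img2txt.py is organised.

```python
# The four modes offered in the GUI, keyed by Tesseract PSM number.
PSM_MODES = {3: "Auto", 6: "Block", 11: "Sparse", 12: "Sparse + OSD"}


def build_config(psm=3):
    """Return the Tesseract config string for one of the supported PSMs."""
    if psm not in PSM_MODES:
        raise ValueError(f"unsupported PSM: {psm}")
    return f"--psm {psm}"


def extract_text(image_path, lang="eng", psm=3):
    """Run OCR on one image with the given language code and PSM."""
    # Imported lazily so build_config can be used without these packages.
    import pytesseract
    from PIL import Image

    with Image.open(image_path) as image:
        return pytesseract.image_to_string(
            image, lang=lang, config=build_config(psm)
        )
```

For example, `extract_text("receipt.png", lang="eng", psm=11)` would request sparse-text segmentation for an English receipt; the file name here is only a placeholder.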
- PNG
- JPG/JPEG
- TIFF
- BMP
- GIF
img2txt/
├── img2txt.py # Main application file
├── README.md # This file
└── extracted_texts/ # Created automatically for batch processing
- Use high-resolution images: Better image quality = better text recognition
- Ensure good contrast: Black text on white background works best
- Avoid skewed images: Straighten images for better accuracy
- Choose correct language: Select the language that matches your image text
- Adjust PSM mode: Experiment with different segmentation modes for your specific use case
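The contrast tip above can often be approximated in code before OCR. The following is a minimal Pillow sketch, not something img2txt.py is known to do: `preprocess` is a hypothetical helper, and the choice of grayscale, autocontrast, and a 3x3 median filter is just one reasonable starting point.

```python
def preprocess(image):
    """Return a grayscale, contrast-stretched, lightly denoised copy.

    image is a PIL.Image.Image; the input is left unmodified.
    """
    # Imported lazily so this module loads even where Pillow is absent.
    from PIL import ImageFilter, ImageOps

    gray = ImageOps.grayscale(image)          # drop colour information
    stretched = ImageOps.autocontrast(gray)   # spread the histogram
    return stretched.filter(ImageFilter.MedianFilter(size=3))  # remove specks
```

The result can be passed to OCR in place of the original image. Whether it helps depends on the input, so it is worth comparing output with and without this step.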
- Ensure Tesseract OCR is installed
- Add Tesseract to your system PATH
- Or specify the path in the code
- Check image quality and resolution
- Try different page segmentation modes
- Ensure correct language is selected
- Pre-process images (enhance contrast, remove noise)
- Install additional Tesseract language packs
- Verify language data files are in Tesseract's tessdata folder
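One way to check which language packs Tesseract can actually see is to ask it from Python. This sketch assumes a pytesseract version that provides `get_languages`; `missing_languages` and `check_installed` are hypothetical helper names.

```python
def missing_languages(required, installed):
    """Return the required language codes not present in installed."""
    available = set(installed)
    return [code for code in required if code not in available]


def check_installed(required):
    """Compare required codes against what the local Tesseract reports."""
    import pytesseract  # imported lazily; needs Tesseract to be installed

    return missing_languages(required, pytesseract.get_languages(config=""))
```

For instance, `check_installed(["eng", "jpn", "kor"])` would return the subset of those codes whose data files still need to be installed.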
This project is licensed under the MIT License. See the LICENSE file for details.
Version 1.0
Created by Paradorn Katananon