A simple Python tool to extract text from screenshots using Tesseract OCR. Includes a small web GUI for uploading images.

Mahmoud-Emad/snaptext

SnapText

An OCR tool that extracts text from images, with image preprocessing to improve accuracy, a web interface, and a command line interface.

Features

  • Enhanced OCR Accuracy: Multiple preprocessing techniques for better text recognition
  • Modern Web Interface: Google Material Design-inspired UI with image preview
  • Command Line Interface: Full-featured CLI with confidence scoring
  • Image Preview: See your uploaded images before processing
  • OCR Quality Metrics: Real-time confidence scores and quality assessment
  • Copy to Clipboard: One-click text copying
  • Responsive Design: Works on desktop and mobile devices
  • Cross-platform: Supports Linux, macOS, and Windows

OCR Improvements

SnapText preprocesses each image before recognition to improve text extraction accuracy:

  • Multiple OCR Methods: Tries different approaches and selects the best result
  • Image Enhancement: Contrast, sharpness, and noise reduction
  • Adaptive Thresholding: Handles varying lighting conditions
  • Scale Optimization: Automatically scales images for better recognition
  • Confidence Scoring: Provides quality metrics for extracted text
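The approach listed above (several preprocessing variants, each scored, best one kept) can be sketched roughly as follows. This is a minimal illustration, not SnapText's actual code: the function names `mean_confidence`, `pick_best`, and `ocr_variants`, the choice of variants, and the threshold parameters are all assumptions. It relies on the `pytesseract` wrapper and OpenCV; whether SnapText uses `pytesseract` specifically is not stated in this README.

```python
from typing import Dict, List, Tuple


def mean_confidence(data: Dict[str, list]) -> float:
    """Average word confidence from a pytesseract image_to_data dict.

    Entries with conf == -1 (layout boxes, not words) or empty text are skipped.
    """
    scores: List[float] = []
    for text, conf in zip(data["text"], data["conf"]):
        value = float(conf)  # conf may arrive as str, int, or float
        if value >= 0 and str(text).strip():
            scores.append(value)
    return sum(scores) / len(scores) if scores else 0.0


def pick_best(candidates: List[Tuple[str, str, float]]) -> Tuple[str, str, float]:
    """Return the (method, text, confidence) tuple with the highest confidence."""
    if not candidates:
        raise ValueError("no OCR candidates to choose from")
    return max(candidates, key=lambda c: c[2])


def ocr_variants(image_path: str) -> Tuple[str, str, float]:
    """Run OCR on several preprocessed variants and keep the most confident one."""
    # Imported lazily so the helpers above work without OpenCV or Tesseract.
    import cv2
    import pytesseract

    image = cv2.imread(image_path)
    if image is None:
        raise FileNotFoundError(f"could not read image: {image_path}")

    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    variants = {
        "grayscale": gray,
        # Scale optimization: upscale 2x so small glyphs have more pixels.
        "upscaled": cv2.resize(gray, None, fx=2.0, fy=2.0,
                               interpolation=cv2.INTER_CUBIC),
        # Adaptive thresholding: a per-neighbourhood threshold for uneven lighting.
        # Block size 31 and offset 10 are illustrative values, not tuned ones.
        "adaptive": cv2.adaptiveThreshold(
            gray, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 31, 10
        ),
    }

    candidates = []
    for name, variant in variants.items():
        data = pytesseract.image_to_data(variant, output_type=pytesseract.Output.DICT)
        text = " ".join(word for word in data["text"] if word.strip())
        candidates.append((name, text, mean_confidence(data)))
    return pick_best(candidates)
```

Calling `ocr_variants("image.png")` would return the name of the winning variant, its text, and its mean confidence on Tesseract's 0 to 100 scale. Averaging per-word confidence is only one possible quality metric; the project may weight or combine scores differently.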

Requirements

  • Python 3.11+
  • Tesseract OCR
  • OpenCV (automatically installed)
  • NumPy (automatically installed)
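To confirm these requirements on a given machine, a small standard-library check along the following lines may help. It is a hypothetical sketch (the function name `check_requirements` is an assumption) and is not necessarily what `make check-deps` runs.

```python
import importlib.util
import shutil
import sys
from typing import Dict


def check_requirements() -> Dict[str, bool]:
    """Report which of the listed requirements appear to be available."""
    return {
        # The README asks for Python 3.11 or newer.
        "python>=3.11": sys.version_info >= (3, 11),
        # Tesseract is an external binary, so look for it on PATH.
        "tesseract": shutil.which("tesseract") is not None,
        # OpenCV's Python package is imported as "cv2".
        "opencv": importlib.util.find_spec("cv2") is not None,
        "numpy": importlib.util.find_spec("numpy") is not None,
    }
```

Printing the returned dictionary shows at a glance which pieces are missing. A `False` for `tesseract` only means the binary is not on `PATH`; it may still be installed elsewhere.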

Quick Start

git clone https://github.com/Mahmoud-Emad/snaptext.git
cd snaptext
make install    # Installs Python, Poetry, Tesseract, and dependencies
make runserver  # Start the web interface

Installation

Automatic Installation (Recommended)

make install

This will automatically install:

  • Python 3.11+ (if not present)
  • Poetry (if not present)
  • Tesseract OCR
  • All project dependencies

Manual Installation

# Install Tesseract OCR first
# On macOS: brew install tesseract
# On Ubuntu: sudo apt-get install tesseract-ocr

# Install project dependencies
poetry install

Usage

Web Interface

make runserver
# or
poetry run python server/server.py

# For development with debug mode (not recommended for production)
FLASK_DEBUG=true poetry run python server/server.py

Open http://127.0.0.1:5000 in your browser.

Security Note: Debug mode is disabled by default. Enable it only during development, by setting the FLASK_DEBUG=true environment variable.
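One way an opt-in debug switch like this might be read is sketched below. The helper name `debug_enabled` and the exact parsing (case-insensitive match on "true") are assumptions; `server/server.py` may interpret the variable differently.

```python
import os
from typing import Mapping, Optional


def debug_enabled(env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True only when FLASK_DEBUG is explicitly set to "true"."""
    env = os.environ if env is None else env
    # Anything other than "true" (including an unset variable) keeps debug off.
    return env.get("FLASK_DEBUG", "").strip().lower() == "true"
```

A Flask entry point could then pass the result to `app.run(debug=debug_enabled())`, so that debug mode stays off unless someone deliberately turns it on.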

Features:

  • Drag and drop image upload
  • Image preview with metadata
  • Real-time OCR quality assessment
  • One-click text copying
  • Responsive design

Command Line Interface

# Basic usage
make runcli -- image.png

# With confidence information
make runcli -- image.png --confidence

# Save to file with verbose output
make runcli -- image.png --output extracted.txt --verbose

# Direct poetry commands
poetry run python cli/cli.py image.png --help
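For readers curious how such a command line could be declared, here is a minimal `argparse` sketch that accepts the flags used in the examples above. The flag names come from those examples; their types, the help text, and the function names `build_parser` and `parse_args` are assumptions, and the real `cli/cli.py` may define more options.

```python
import argparse
from typing import List, Optional


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for: IMAGE [--confidence] [--output FILE] [--verbose]."""
    parser = argparse.ArgumentParser(
        prog="cli.py", description="Extract text from an image."
    )
    parser.add_argument("image", help="path to the image file")
    # Assumed to be a simple on/off switch.
    parser.add_argument("--confidence", action="store_true",
                        help="also report OCR confidence")
    parser.add_argument("--output", metavar="FILE",
                        help="write extracted text to FILE instead of stdout")
    parser.add_argument("--verbose", action="store_true",
                        help="print extra progress information")
    return parser


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse argv (defaults to sys.argv[1:] when None)."""
    return build_parser().parse_args(argv)
```

For instance, `parse_args(["image.png", "--output", "extracted.txt", "--verbose"])` yields a namespace whose `output` is `"extracted.txt"` and whose `verbose` is `True`.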

Available Make Commands

make help        # Show all available commands
make install     # Install all dependencies
make runserver   # Start web server
make runcli      # Run CLI tool
make check-deps  # Check dependency status
make clean       # Clean temporary files

# Testing commands
make test              # Run all tests
make test-unit         # Run fast unit tests
make test-integration  # Run integration tests
make test-performance  # Run performance tests (slow)
make test-coverage     # Run tests with coverage report
make test-core         # Test core OCR functionality
make test-server       # Test Flask server
make test-cli          # Test CLI interface

# Code quality commands
make lint              # Run code linting
make format            # Format code with black and isort
make install-dev       # Install development dependencies
