LLaVA-Shot: Zero-Shot Sentinel-2 Classification

Zero-shot land use classification for Sentinel-2 satellite imagery using LLaVA vision-language models. Initial benchmark: 38% accuracy on a 10-class EuroSAT test set (50 samples, 5 per class) using task-focused prompting.

📊 See RESULTS.md for detailed results and methodology.

Overview

This project evaluates LLaVA's capability for zero-shot classification of Sentinel-2 satellite imagery without any satellite-specific training. Using task-focused prompts with the full 10-class EuroSAT taxonomy and True Color RGB composites, we find that the model performs well on a few specific land use types (notably River and Industrial) while struggling with most vegetation classes.

Key Results (Initial Test: 50 samples across 10 classes)

  • 38% overall accuracy on 10-class EuroSAT benchmark
  • 100% precision on River and Industrial classes
  • Zero training required - no fine-tuning or training data
  • ~3 seconds per image on M4 Max with llava:13b

Note: These are preliminary results from a small test set (5 samples per class). Larger-scale validation is needed to confirm performance.
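
For reference, the accuracy and per-class precision/recall figures quoted in this README are standard classification metrics over ground-truth/prediction pairs. A minimal sketch of computing them with scikit-learn (assumed available; it is not a stated project dependency), using toy label lists in place of a real benchmark run:

from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

# Toy ground-truth and prediction lists of EuroSAT class names (one entry per patch);
# in practice these would come from a benchmark run, e.g. scripts/benchmark_eurosat.py.
y_true = ["River", "Industrial", "Residential", "Forest", "SeaLake"]
y_pred = ["River", "Industrial", "Residential", "Pasture", "River"]

print(f"Overall accuracy: {accuracy_score(y_true, y_pred):.0%}")
print(classification_report(y_true, y_pred, zero_division=0))  # per-class precision/recall/F1
print(confusion_matrix(y_true, y_pred))                        # rows = true class, columns = predicted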

Best Use Cases

✅ Strong Performance:

  • River detection (80% recall, 100% precision)
  • Industrial building identification (60% recall, 100% precision)
  • Residential area detection (100% recall, 45% precision)

❌ Challenging:

  • Fine-grained vegetation discrimination (AnnualCrop vs. PermanentCrop vs. Pasture vs. Forest)
  • Large water body detection (SeaLake: 0% recall)
  • Comprehensive 10-class land use classification

Quick Start

Prerequisites

  • Apple Silicon Mac (M3/M4 recommended) or Linux/Windows with a GPU
  • Python 3.12+
  • Ollama for local LLaVA inference
  • EuroSAT dataset (optional, for benchmarking)

Installation

# Clone repository
git clone https://github.com/ecohydro/llava_shot.git
cd llava_shot

# Install with uv (recommended)
uv pip install -e .

# Or with pip
pip install -e .

Install Ollama and LLaVA

# Install Ollama
brew install ollama  # macOS
# or download from https://ollama.ai/

# Start Ollama service
ollama serve

# Pull LLaVA model (in a new terminal)
ollama pull llava:13b  # Recommended for benchmarking
# or
ollama pull llava:7b   # Faster, lower accuracy

Run Benchmark

# Quick test (5 samples per class, 10-class taxonomy)
python scripts/benchmark_eurosat.py --n-per-class 5 --prompt-style eurosat10

# Larger test (20 samples per class)
python scripts/benchmark_eurosat.py --n-per-class 20 --prompt-style eurosat10

# Test specific classes only (using EuroSAT class names)
python scripts/benchmark_eurosat.py --n-per-class 10 --classes Industrial River Residential

Note: The EuroSAT dataset must be downloaded separately (see the EuroSAT reference below for download instructions) and placed in the eurosat/ directory.

Benchmark Dataset

We use EuroSAT for rigorous validation:

  • 27,000 labeled Sentinel-2 patches (64×64 pixels, 13 bands)
  • 5,400 test samples with expert ground truth
  • 10 land use classes from across Europe:
    • AnnualCrop, Forest, HerbaceousVegetation, Highway, Industrial
    • Pasture, PermanentCrop, Residential, River, SeaLake

This provides real, expert-labeled ground truth rather than synthetic labels.
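
For orientation, the extracted EuroSAT archive is organized as one folder per class (the RGB release holds 64×64 JPEG patches; the multispectral release holds 13-band GeoTIFFs). A minimal sketch of sampling a few patches per class from that layout, independent of the project's own EuroSATLoader and assuming the eurosat/ location shown in the project structure below:

import random
from pathlib import Path

EUROSAT_DIR = Path("eurosat")  # assumed dataset location; adjust to where the archive was extracted

def sample_patches(n_per_class: int = 5, seed: int = 0) -> list[tuple[str, Path]]:
    """Return (class_name, patch_path) pairs, n_per_class drawn from each class folder."""
    rng = random.Random(seed)
    samples = []
    for class_dir in sorted(p for p in EUROSAT_DIR.iterdir() if p.is_dir()):
        patches = sorted(class_dir.glob("*.tif")) or sorted(class_dir.glob("*.jpg"))
        for path in rng.sample(patches, min(n_per_class, len(patches))):
            samples.append((class_dir.name, path))
    return samples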

Approach: Task-Focused Prompting

Instead of explaining spectral theory, we define LLaVA's role explicitly with the full 10-class EuroSAT taxonomy:

"""You are a land cover classifier analyzing a Sentinel-2 satellite image.

**CLASSIFICATION TASK:**
Classify this image into ONE of the 10 EuroSAT land use classes below.

**EUROSAT CLASSES:**
1. **AnnualCrop** - Annual cropland
2. **Forest** - Areas with dense tree cover
3. **HerbaceousVegetation** - Natural grasslands, meadows
4. **Highway** - Major roads, highways
5. **Industrial** - Industrial buildings, factories
6. **Pasture** - Managed grassland for grazing
7. **PermanentCrop** - Orchards, vineyards
8. **Residential** - Houses, residential buildings
9. **River** - Rivers and streams
10. **SeaLake** - Seas, lakes, large water bodies

**OUTPUT FORMAT:**
CLASS: [exact class name from above]
CONFIDENCE: [high/medium/low]
"""

Key Design Principles:

  • ✅ Clear task definition
  • ✅ Explicit class taxonomy with visual cues
  • ✅ Structured output format
  • ✅ No spectral theory or band math
  • ✅ Simple True Color RGB input
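
For reference, the prompt above reaches LLaVA through Ollama. A minimal sketch of posting it, together with a base64-encoded patch, to a local Ollama server's generate endpoint, independent of the project's LLaVAClassifier wrapper (model assumed pulled as in the Quick Start):

import base64
import json
from urllib import request

def classify_patch(image_path: str, prompt: str, model: str = "llava:13b") -> str:
    """Send one image plus the task prompt to a local Ollama server and return the raw text reply."""
    with open(image_path, "rb") as f:
        image_b64 = base64.b64encode(f.read()).decode("ascii")

    payload = json.dumps({
        "model": model,
        "prompt": prompt,
        "images": [image_b64],            # Ollama accepts base64-encoded images for multimodal models
        "stream": False,                  # return one JSON object instead of a token stream
        "options": {"temperature": 0.1},  # low temperature for more deterministic class labels
    }).encode("utf-8")

    req = request.Request(
        "http://localhost:11434/api/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]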

Project Structure

llava_shot/
├── src/llava_shot/
│   ├── data/
│   │   ├── download.py           # Sentinel-2 scene download (AWS STAC)
│   │   └── eurosat_loader.py     # EuroSAT benchmark loader
│   ├── processing/
│   │   ├── bands.py              # Band reading and resampling
│   │   ├── composites.py         # RGB composite generation
│   │   └── indices.py            # NDVI, NDWI, NDBI calculation
│   ├── classification/
│   │   ├── llava_interface.py    # Ollama API wrapper
│   │   └── task_prompts.py       # Task-focused prompts
│   └── validation/
│       └── metrics.py            # Accuracy, confusion matrix
├── scripts/
│   ├── benchmark_eurosat.py      # Main benchmark script
│   ├── quickstart.py             # Download and visualize scenes
│   └── demo_classification.py    # Interactive classification demos
├── data/
│   ├── raw/                      # Downloaded Sentinel-2 scenes
│   ├── processed/                # Generated composites
│   └── validation_eurosat/       # Benchmark results
├── eurosat/                      # EuroSAT dataset (not in git)
├── RESULTS.md                    # Detailed results and analysis
└── README.md                     # This file

Usage Examples

Benchmark on EuroSAT

from llava_shot.data.eurosat_loader import EuroSATLoader
from llava_shot.classification.llava_interface import LLaVAClassifier
from llava_shot.classification.task_prompts import get_task_prompt

# Load EuroSAT dataset
loader = EuroSATLoader()
samples = loader.sample_dataset(n_per_class=5)

# Initialize classifier
classifier = LLaVAClassifier(model="llava:13b")
prompt = get_task_prompt("classifier")

# Classify samples
for eurosat_class, filename, simple_class in samples:
    bands, metadata = loader.load_patch(eurosat_class, filename)
    rgb = loader.create_rgb_composite(bands, "true_color")

    response = classifier.classify(rgb, prompt, temperature=0.1)
    print(f"Ground truth: {simple_class}, Prediction: {response}")

Download and Classify Sentinel-2 Scene

from llava_shot.data.download import download_sentinel2_scene
from llava_shot.processing.composites import CompositeGenerator
from llava_shot.classification.llava_interface import LLaVAClassifier

# Download scene
scene_dir = download_sentinel2_scene(
    bbox=[-120.5, 34.4, -120.3, 34.5],  # Santa Barbara
    date_range=("2024-09-01", "2024-09-30"),
    max_cloud_cover=20
)

# Generate True Color composite
generator = CompositeGenerator(scene_dir)
rgb = generator.generate_composite("true_color", target_resolution=10)

# Classify
classifier = LLaVAClassifier(model="llava:13b")
result = classifier.classify_land_cover(rgb)
print(result)

Performance & Limitations

Strengths

  • No training data or fine-tuning required
  • Fast inference (~3 sec/image on consumer hardware)
  • Excellent precision on River (100%) and Industrial (100%) classes
  • Works with standard True Color RGB composites

Limitations

  • 38% accuracy on 10-class EuroSAT falls well short of supervised methods (90%+)
  • Cannot distinguish fine-grained vegetation types reliably
  • Complete failures on AnnualCrop, Forest, and SeaLake (0% recall each)
  • Limited to RGB visual information; cannot leverage 13-band spectral data
  • Fine distinctions require spectral indices (NDVI, EVI) beyond LLaVA's capabilities (see the sketch below)

See RESULTS.md for detailed analysis.
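
For context on the spectral-index limitation above, an index such as NDVI is a simple ratio of bands that an RGB-only model never sees. A minimal sketch using Sentinel-2 B08 (NIR) and B04 (Red), with the band arrays assumed already read as reflectance on a common grid (for example via bands.py):

import numpy as np

def ndvi(nir: np.ndarray, red: np.ndarray, eps: float = 1e-6) -> np.ndarray:
    """NDVI = (NIR - Red) / (NIR + Red) for Sentinel-2 B08 (NIR) and B04 (Red)."""
    nir = nir.astype("float32")
    red = red.astype("float32")
    return (nir - red) / (nir + red + eps)  # eps guards against division by zero over nodata pixels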

Development Status

Current Phase: Benchmarking Complete ✅

  • Project structure and dependencies
  • Sentinel-2 data download (AWS STAC)
  • Band processing and RGB composites
  • Spectral index calculation (NDVI, NDWI, NDBI, EVI)
  • LLaVA integration via Ollama
  • Task-focused prompt engineering
  • EuroSAT benchmark dataset integration
  • Validation metrics and confusion matrix
  • Results documentation

Next Steps:

  • Few-shot learning experiments
  • Larger patch sizes (128×128, 256×256)
  • Multi-temporal analysis
  • Comparison with specialized satellite vision models

Contributing

This is a research project evaluating zero-shot satellite image classification. Contributions welcome!

Areas for improvement:

  • Prompt engineering for better grassland/forest discrimination
  • Few-shot learning implementations
  • Integration with other vision-language models
  • Comparison benchmarks

Citation

If you use this project in your research, please cite:

@software{llava_shot_2025,
  title = {LLaVA-Shot: Zero-Shot Sentinel-2 Classification},
  author = {Caylor, Kelly},
  year = {2025},
  url = {https://github.com/ecohydro/llava_shot}
}

License

TBD

References

  • EuroSAT: Helber et al., "EuroSAT: A Novel Dataset and Deep Learning Benchmark for Land Use and Land Cover Classification", IEEE JSTARS 2019
  • Sentinel-2: ESA Sentinel-2 Mission
  • LLaVA: Liu et al., "Visual Instruction Tuning", NeurIPS 2023
  • Ollama: https://ollama.ai/

Acknowledgments

Developed on M4 Max MacBook Pro (128GB RAM) with local LLaVA inference using Metal Performance Shaders. EuroSAT dataset provided by the German Research Center for Artificial Intelligence (DFKI).
