
🔬 GIANT
Gigapixel Image Agent for Navigating Tissue

Paper Dataset Python 3.11+ License: Apache-2.0


The Problem: Whole-slide pathology images contain billions of pixels—10,000× more than an LLM can see at once. Previous approaches used blurry thumbnails or random patches, severely underestimating what frontier models can do.

The Solution: GIANT lets LLMs navigate gigapixel images like pathologists do—iteratively pan, zoom, and reason across the slide until they can answer a diagnostic question.
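To make the scale gap concrete, here is a back-of-envelope check using illustrative numbers (a typical 40× whole-slide image is on the order of 100,000 × 100,000 pixels; a single rendered model view is roughly 1 megapixel — these figures are assumptions for illustration, not values from the repo):

```python
# Back-of-envelope scale check (illustrative numbers, not from the repo).
slide_pixels = 100_000 * 100_000   # ~1e10 pixels in one whole-slide image
view_pixels = 1_000_000            # ~1 MP per rendered view sent to the model
ratio = slide_pixels // view_pixels
print(ratio)  # 10000 — the "10,000×" gap cited above
```

This is why a single thumbnail or a handful of random patches throws away almost all of the available signal.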

"GPT-5 with GIANT achieves 62.5% accuracy on pathologist-authored questions, outperforming specialist pathology models such as TITAN (43.8%) and SlideChat (37.5%)." (Buckley et al., 2025)


How It Works

1. LOAD        →  Open gigapixel WSI, generate thumbnail with coordinate guides
2. OBSERVE     →  LLM sees current view + conversation history
3. REASON      →  "I see suspicious tissue at (45000, 32000). Let me zoom in..."
4. ACT         →  Crop high-resolution region OR provide final answer
5. REPEAT      →  Continue until confident diagnosis (max 20 steps)

The agent accumulates evidence across multiple zoom levels—just like a pathologist scanning a slide.
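The loop above can be sketched in a few lines of Python. This is an illustrative stand-in, not the project's actual API: `ask_model` and the action/field names (`"answer"`, `"coords"`, `"zoom"`) are hypothetical, and slide loading is elided.

```python
# Illustrative sketch of the observe-reason-act loop.
# `ask_model` and the reply fields are hypothetical stand-ins,
# not GIANT's real interface; slide I/O is elided.

MAX_STEPS = 20  # the agent must answer within 20 steps

def navigate(slide_path, question, ask_model):
    history = []
    view = {"center": None, "level": "thumbnail"}  # start fully zoomed out
    for _ in range(MAX_STEPS):
        # OBSERVE: the model sees the current view plus the conversation so far
        reply = ask_model(question, view, history)
        history.append(reply)
        if reply["action"] == "answer":
            return reply["text"]                   # ACT: final diagnosis
        # ACT: crop a high-resolution region around the requested coordinates
        view = {"center": reply["coords"], "level": reply["zoom"]}
    return history[-1].get("text", "no answer within step budget")
```

The key design point is that `history` accumulates across iterations, so evidence gathered at low magnification informs where the agent zooms next.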


Quick Start

```bash
# Install
uv sync && source .venv/bin/activate

# Configure API
export OPENAI_API_KEY=sk-...

# Run on a slide
giant run /path/to/slide.svs -q "What type of tissue is this?"

# Run benchmark (requires MultiPathQA CSV + WSI files; see docs/data/data-acquisition.md)
giant benchmark gtex --provider openai -v
```

Benchmark Results

Evaluated on MultiPathQA—934 questions across 862 unique whole-slide images.

| Benchmark | Task | Our Result | Paper (GIANT) | Paper (GIANT ×5) | Thumbnail Baseline |
|---|---|---|---|---|---|
| GTEx | Organ Classification (20-way) | 70.3%† | 53.7% ± 3.4% | 60.7% ± 3.2% | 36.5% ± 3.4% |
| ExpertVQA | Pathologist-Authored (128 Q) | 60.1% | 57.0% ± 4.5% | 62.5% ± 4.4% | 50.0% ± 4.4% |
| SlideBench | Visual QA (197 Q) | 51.8% | 58.9% ± 3.5% | 59.4% ± 3.4% | 54.8% ± 3.5% |
| TCGA | Cancer Diagnosis (30-way) | 26.2% | 32.3% ± 3.5% | 29.3% ± 3.3% | 9.2% ± 1.9% |
| PANDA | Prostate Grading (6-way) | 20.3% | 23.2% ± 2.3% | 25.4% ± 2.0% | 12.2% ± 2.2% |

†GTEx: 70.3% scores only the 185/191 items that parsed; the paper-faithful figure, with the 6 parse errors counted as incorrect, is 67.6% ± 3.1%. Both exceed the paper's result.

All 5 MultiPathQA benchmarks complete. See docs/results/benchmark-results.md for detailed analysis.

Key findings:

  • GTEx (70.3%) and ExpertVQA (60.1%) exceed the paper's single-run GIANT results
  • Agent navigation provides up to ~3× improvement over thumbnail baselines
  • Total benchmark cost: $124.64 across 934 questions

Supported Models

| Provider | Model | Status |
|---|---|---|
| OpenAI | gpt-5.2 | ✅ Default |
| Anthropic | claude-sonnet-4-5-20250929 | ✅ Supported |
| Google | gemini-3-pro-preview | 🔜 Planned |

Documentation

| Section | Description |
|---|---|
| Installation | Environment setup |
| Quickstart | First inference in 5 minutes |
| Architecture | System design and components |
| Algorithm | Navigation loop explained |
| Running Benchmarks | Reproduce paper results |
| Configuring Providers | API key setup |
| Data Acquisition | Download WSI files (~500 GiB) |
| CLI Reference | Command-line options |

Why This Matters

For Clinicians: Frontier LLMs can now reason over full pathology slides—not just patches. This opens doors for AI-assisted diagnosis, second opinions, and education.

For Researchers: A reproducible benchmark (MultiPathQA) and framework for evaluating LLM capabilities on gigapixel medical images. Proves that how you test matters as much as what you test.

For Developers: Production-ready implementation with 90% test coverage, strict typing, resumable benchmarks, cost tracking, and trajectory visualization.
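As an example of the kind of bookkeeping the cost-tracking feature implies, here is a minimal sketch. The class name, rates, and fields are assumptions for illustration, not the repo's actual implementation or real API pricing:

```python
# Minimal per-call cost tracker (illustrative; names and rates are assumed).
from dataclasses import dataclass

@dataclass
class CostTracker:
    input_rate: float    # USD per 1M input tokens (assumed pricing)
    output_rate: float   # USD per 1M output tokens (assumed pricing)
    total_usd: float = 0.0

    def record(self, input_tokens: int, output_tokens: int) -> float:
        """Add one API call's cost to the running total and return it."""
        call_cost = (input_tokens * self.input_rate
                     + output_tokens * self.output_rate) / 1_000_000
        self.total_usd += call_cost
        return call_cost

tracker = CostTracker(input_rate=1.25, output_rate=10.0)
tracker.record(50_000, 2_000)   # one navigation step
print(round(tracker.total_usd, 4))  # 0.0825
```

Summing such per-call records across all steps of all 934 questions is how a benchmark-wide cost figure like the $124.64 above can be produced.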


Citation

@article{buckley2025navigating,
  title={Navigating Gigapixel Pathology Images with Large Multimodal Models},
  author={Buckley, Thomas A. and Weihrauch, Kian R. and Latham, Katherine and
          Zhou, Andrew Z. and Manrai, Padmini A. and Manrai, Arjun K.},
  journal={arXiv preprint arXiv:2511.19652},
  year={2025}
}


Built for reproducible research in computational pathology.