mcpmydocs

A local-first semantic search engine for Markdown documentation. Indexes your docs, generates embeddings using ONNX models, stores them in DuckDB, and exposes search via CLI or MCP server for AI agents like Claude Code.

Quick Start

# Install mcpmydocs and models
curl -sSL https://raw.githubusercontent.com/mattdennewitz/mcpmydocs/main/install.sh | bash

# Install ONNX Runtime
brew install onnxruntime  # macOS
# Linux: see Installation section for instructions

# Index your documentation
mcpmydocs index ~/Documents/wiki

# Search
mcpmydocs search "how to configure authentication"

Note: A Brewfile is included for macOS users building from source—run brew bundle to install all dependencies.

Features

Semantic search - Find documents by meaning, not just keywords
Cross-encoder reranking - Two-stage retrieval for improved relevance
Local-first - All processing happens on your machine, no API calls
Fast - DuckDB with HNSW vector indexing for millisecond queries
MCP server - Integrate with Claude Code or other MCP-compatible AI tools
Incremental indexing - Only re-indexes changed files

Requirements

macOS Apple Silicon, Linux x64, or Linux ARM64 (pre-built binaries)
macOS Intel (build from source)
ONNX Runtime library
Embedding model files

Installation

Quick install (macOS Apple Silicon / Linux)

curl -sSL https://raw.githubusercontent.com/mattdennewitz/mcpmydocs/main/install.sh | bash

This will:

Download the latest release for your platform
Install the binary to ~/.local/bin
Download the embedding and reranker models to ~/.local/share/mcpmydocs/models/

Note: You still need ONNX Runtime installed (see below).

For macOS Intel, see Building from source.

Manual installation

If you have a pre-built binary, you need to install the dependencies manually:

1. Install ONNX Runtime

macOS (Homebrew):

brew install onnxruntime

Linux (Ubuntu/Debian):

# Download from https://github.com/microsoft/onnxruntime/releases
# Extract and copy libonnxruntime.so to /usr/local/lib/
sudo ldconfig

Alternatively, place the library in a lib/ directory next to the binary, or set the ONNX_LIBRARY_PATH environment variable.

2. Download the models

Download the model files and place them in an assets/models/ directory next to the binary:

mkdir -p assets/models

# Download embedding model (~90MB)
curl -L -o assets/models/embed.onnx \
  "https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2/resolve/main/onnx/model.onnx"

# Download tokenizer
curl -L -o assets/models/tokenizer.json \
  "https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2/resolve/main/tokenizer.json"

# Download reranker model (~90MB, optional but recommended)
curl -L -o assets/models/rerank.onnx \
  "https://huggingface.co/cross-encoder/ms-marco-MiniLM-L-6-v2/resolve/main/onnx/model.onnx"

The application automatically searches for models in:

~/.local/share/mcpmydocs/models/ (install script location)
assets/models/ relative to the binary
assets/models/ in the current working directory

3. Verify installation

./mcpmydocs --help

Building from source

Requires Go 1.24+ and Homebrew (macOS).

1. Clone and build

git clone https://github.com/mattdennewitz/mcpmydocs.git
cd mcpmydocs
make all

This will:

Install system dependencies (DuckDB, ONNX Runtime) via Homebrew
Download Go modules
Download the embedding model, reranker model, and tokenizer
Build the binary to dist/mcpmydocs

2. Verify installation

mcpmydocs --help

Configuration

Database location

The database file mcpmydocs.db is created in the current working directory by default. Use the --db flag to specify a custom path:

mcpmydocs index ~/Documents/wiki --db ~/data/mcpmydocs.db
mcpmydocs search "query" --db ~/data/mcpmydocs.db

Environment variables

Variable	Description
`ONNX_LIBRARY_PATH`	Path to the ONNX Runtime library (if not in standard locations)
`MCPMYDOCS_MODEL_PATH`	Path to `embed.onnx` embedding model
`MCPMYDOCS_RERANKER_PATH`	Path to `rerank.onnx` reranker model

Usage

Note: Examples below assume mcpmydocs is in your PATH. If you built from source, use mcpmydocs instead.

Index a directory

Index all Markdown files in a directory:

mcpmydocs index ~/Documents/wiki

Example output:

Indexing directory: /home/user/docs
Database: /home/user/docs/mcpmydocs.db
[247/247] guides/advanced-configuration.md
Indexing complete!
  Indexed: 247 files
  Skipped: 0 unchanged files

Re-running the command only processes changed files:

Indexing complete!
  Indexed: 2 files
  Skipped: 245 unchanged files

Search from CLI

mcpmydocs search "how to configure authentication"

Example output (with reranking enabled by default):

Found 5 results for: "how to configure authentication" (reranked)

─────────────────────────────────────────────────────────────
[1] # Authentication > ## OAuth Setup (relevance: 2.15)
    File: /home/user/docs/auth/oauth.md:15

    OAuth Setup

    To configure OAuth authentication, you'll need to:
    1. Register your application with the OAuth provider
    2. Set the callback URL to https://yourapp.com/auth/callback
    ...

─────────────────────────────────────────────────────────────
[2] # Security > ## API Keys (relevance: 1.82)
    File: /home/user/docs/security/api-keys.md:8

    API Keys

    API keys provide a simple authentication mechanism...

Options:

# Return more results
mcpmydocs search "database migrations" -n 10

# Disable reranking (faster, uses vector similarity only)
mcpmydocs search "quick lookup" --no-rerank

# Adjust candidate pool for reranking (default: 50)
mcpmydocs search "detailed query" --candidates 100

Run as MCP server

Start the MCP server for integration with AI tools:

mcpmydocs run

The server communicates via stdio using JSON-RPC 2.0 (the MCP protocol).

Integrating with Claude Code

Create a wrapper script that sets the working directory, then register it with Claude Code:

# Create the script (adjust MCPMYDOCS_DIR to your install location)
MCPMYDOCS_DIR="$HOME/.local/share/mcpmydocs"

mkdir -p ~/bin
cat > ~/bin/mcpmydocs-server << EOF
#!/bin/bash
cd $MCPMYDOCS_DIR && mcpmydocs run
EOF

chmod +x ~/bin/mcpmydocs-server

# Add to Claude Code
claude mcp add --transport stdio mcpmydocs -- ~/bin/mcpmydocs-server

The wrapper script is necessary because the MCP server needs to run from a directory containing the database and models.

Available MCP tools

Once connected, Claude Code has access to:

`search`

Semantic search with cross-encoder reranking.

Parameter	Type	Default	Description
`query`	string	(required)	The search query
`limit`	integer	5	Number of results to return (max 20)
`rerank`	boolean	true	Enable cross-encoder reranking
`candidates`	integer	50	Candidate pool size for reranking (max 100)

`list_documents`

List all indexed documents with titles and paths. No parameters.

Example usage in Claude Code

Ask Claude Code to search your indexed documentation:

Search my docs for information about database migrations

Claude will automatically use the search tool and return relevant documentation snippets.

How it works

Chunking - Markdown files are split into chunks by heading structure
Embedding - Each chunk is converted to a 384-dimensional vector using all-MiniLM-L6-v2
Storage - Vectors are stored in DuckDB with HNSW indexing via the vss extension
Search - Two-stage retrieval:
- Stage 1 (Retrieval): Query is embedded and top-N candidates are fetched using cosine similarity
- Stage 2 (Reranking): Candidates are rescored using ms-marco-MiniLM-L-6-v2 cross-encoder for improved relevance

Project structure

mcpmydocs/
├── cmd/
│   ├── config.go     # CLI configuration
│   ├── index.go      # Index command
│   ├── run.go        # MCP server command
│   └── search.go     # Search command
├── internal/
│   ├── app/          # Application initialization
│   ├── chunker/      # Markdown chunking logic
│   ├── embedder/     # ONNX embedding generation
│   ├── logger/       # Logging utilities
│   ├── paths/        # Path resolution for models
│   ├── reranker/     # Cross-encoder reranking
│   ├── search/       # Unified search service
│   └── store/        # DuckDB storage layer
├── assets/
│   └── models/       # Downloaded ONNX models
├── Makefile
├── Brewfile
└── main.go

Development

# Install dependencies only
make deps

# Build only
make build

# Clean build artifacts and models
make clean

# Run tests
go test ./...

Contributing

Contributions are welcome! Please:

Fork the repository
Create a feature branch (git checkout -b feature/my-feature)
Run tests before submitting (go test ./...)
Submit a pull request

For bugs or feature requests, please open an issue.

Troubleshooting

"database not found" error

Run the index command first:

mcpmydocs index /path/to/your/docs

Poor search results

The search uses semantic similarity with cross-encoder reranking. Try:

More descriptive queries ("how to set up OAuth" vs "oauth")
Check that documents were actually indexed (use list_documents via MCP)
Ensure the reranker model is available (check for "reranker enabled" in logs)

Reranker not loading

If search results show similarity percentages instead of relevance scores, the reranker isn't loaded:

Ensure rerank.onnx exists in assets/models/ or set MCPMYDOCS_RERANKER_PATH
Check logs for "reranker initialization failed" messages

MCP server not connecting

Ensure the wrapper script runs from a directory containing mcpmydocs.db and the models
Check Claude Code logs: ~/.claude/logs/
Test the server directly: echo '{"jsonrpc":"2.0","method":"initialize","id":1,"params":{}}' | mcpmydocs run

License

MIT

Name		Name	Last commit message	Last commit date
Latest commit History 33 Commits
.github/workflows		.github/workflows
cmd		cmd
internal		internal
scripts		scripts
.gitignore		.gitignore
Brewfile		Brewfile
LICENSE		LICENSE
Makefile		Makefile
README.md		README.md
go.mod		go.mod
go.sum		go.sum
install.sh		install.sh
integration_test.go		integration_test.go
main.go		main.go

Folders and files

Latest commit

History

Repository files navigation

mcpmydocs

Table of Contents

Quick Start

Features

Requirements

Installation

Quick install (macOS Apple Silicon / Linux)

Manual installation

1. Install ONNX Runtime

2. Download the models

3. Verify installation

Building from source

1. Clone and build

2. Verify installation

Configuration

Database location

Environment variables

Usage

Index a directory

Search from CLI

Run as MCP server

Integrating with Claude Code

Available MCP tools

search

list_documents

Example usage in Claude Code

How it works

Project structure

Development

Contributing

Troubleshooting

"database not found" error

Poor search results

Reranker not loading

MCP server not connecting

License

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases 4

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

`search`

`list_documents`

Packages