zonnx

Standalone CLI for converting ONNX and SafeTensors models to GGUF format. Ships as a single static binary — zero CGo.

Part of the Zerfoo ML ecosystem.

Features

ONNX / SafeTensors to GGUF: produce portable GGUF files compatible with zerfoo and llama.cpp
Quantization: quantize weights to Q4_0 or Q8_0 as part of the `convert` run (via `--quantize`)
HuggingFace integration: download ONNX models and tokenizer files in one step
Model inspection: inspect metadata, inputs/outputs, nodes, and tensor stats for ONNX and GGUF files
Architecture-aware mappings: tensor name and metadata mappings tuned per model family
CGo-free: single static binary, easy to distribute and run in minimal containers

Installation

go install github.com/zerfoo/zonnx/cmd/zonnx@latest

Or build from source:

go build -o zonnx ./cmd/zonnx

Requires Go 1.26+. Builds with `CGO_ENABLED=0`.

Quick Start

# Download an ONNX model from HuggingFace
zonnx download --model google/gemma-2-2b-it --output ./models

# Convert ONNX to GGUF
zonnx convert --arch gemma --output ./models/model.gguf ./models/model.onnx

# Convert SafeTensors to GGUF
zonnx convert --format safetensors --arch bert --output ./models/model.gguf ./models/bert-dir/

# Convert with quantization
zonnx convert --quantize q4_0 --output ./models/model-q4.gguf ./models/model.onnx

# Inspect a model file
zonnx inspect --pretty ./models/model.gguf

Supported Architectures

| Architecture | `--arch` | Input Formats | Notes |
|---|---|---|---|
| Llama | `llama` (default) | ONNX | Llama 3, Code Llama |
| Gemma | `gemma` | ONNX | Gemma, Gemma 2, Gemma 3 |
| BERT | `bert` | ONNX, SafeTensors | Classification, embeddings |
| RoBERTa | `roberta` | ONNX, SafeTensors | Same layer structure as BERT |

Any architecture string can be passed via `--arch`. Metadata mapping is generic; tensor name mapping currently covers decoder (Llama-style) and encoder (BERT/RoBERTa) models.

Commands

`convert`

zonnx convert [flags] <input>

| Flag | Default | Description |
|---|---|---|
| `--output` | `<input>.gguf` | Output GGUF file path |
| `--arch` | `llama` | Model architecture for metadata/tensor mapping |
| `--format` | `onnx` | Input format: `onnx` or `safetensors` |
| `--quantize` | (none) | Quantize weights: `q4_0` or `q8_0` |
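For readers unfamiliar with the formats named by `--quantize`, the following is a minimal sketch of Q8_0-style block quantization as defined by ggml/GGUF: weights are grouped into blocks of 32, and each block stores one scale plus 32 signed 8-bit values. It is an illustration only, not zonnx's implementation. The names `q8Block`, `quantizeQ8`, and `dequantizeQ8` are hypothetical, and the scale is kept as `float32` for brevity where real GGUF files store it as float16.

```go
package main

import (
	"fmt"
	"math"
)

// blockSize is the number of weights per Q8_0 block in ggml/GGUF.
const blockSize = 32

// q8Block is a simplified Q8_0 block: one scale plus 32 signed 8-bit values.
// Assumption: the scale is float32 here; GGUF stores it as float16.
type q8Block struct {
	Scale float32
	Q     [blockSize]int8
}

// quantizeQ8 (hypothetical helper) quantizes one block of 32 weights.
// The scale is chosen so the largest magnitude maps to +/-127.
func quantizeQ8(w [blockSize]float32) q8Block {
	var amax float32
	for _, v := range w {
		if a := float32(math.Abs(float64(v))); a > amax {
			amax = a
		}
	}
	var b q8Block
	b.Scale = amax / 127
	if b.Scale == 0 {
		return b // all-zero block: every quantized value stays 0
	}
	for i, v := range w {
		b.Q[i] = int8(math.Round(float64(v / b.Scale)))
	}
	return b
}

// dequantizeQ8 reconstructs approximate weights from a block.
func dequantizeQ8(b q8Block) [blockSize]float32 {
	var out [blockSize]float32
	for i, q := range b.Q {
		out[i] = float32(q) * b.Scale
	}
	return out
}

func main() {
	var w [blockSize]float32
	for i := range w {
		w[i] = float32(i-16) / 16 // sample weights in [-1, 1)
	}
	b := quantizeQ8(w)
	r := dequantizeQ8(b)
	fmt.Printf("scale=%g original=%g restored=%g\n", b.Scale, w[5], r[5])
}
```

The round-trip error per weight is bounded by about half the block scale, which is why Q8_0 is usually close to lossless while Q4_0 (4-bit values per block) trades more accuracy for a smaller file.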

`download`

zonnx download --model <huggingface-model-id> [--output <dir>] [--api-key <key>]

The `--api-key` flag takes precedence over the `HF_API_KEY` environment variable.
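The precedence rule can be sketched as follows. This is a minimal illustration under the assumption that an empty flag value means "not provided"; `resolveAPIKey` is a hypothetical helper, not a function from the zonnx codebase. The environment lookup is passed in as a parameter so the logic is easy to exercise without touching the real environment.

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

// resolveAPIKey (hypothetical) returns the flag value when it is set,
// and otherwise falls back to the HF_API_KEY environment variable.
func resolveAPIKey(flagValue string, getenv func(string) string) string {
	if flagValue != "" {
		return flagValue
	}
	return getenv("HF_API_KEY")
}

func main() {
	apiKey := flag.String("api-key", "", "HuggingFace API key")
	flag.Parse()

	key := resolveAPIKey(*apiKey, os.Getenv)
	// Avoid printing the secret itself; only report whether one was found.
	fmt.Println("API key configured:", key != "")
}
```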

`inspect`

zonnx inspect [--type onnx|gguf] [--pretty] <input-file>

When `--type` is not specified, the type is inferred from the file extension.
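A minimal sketch of extension-based inference is shown below. It assumes the extensions `.onnx` and `.gguf`, matched case-insensitively, and an error for anything else. Those details, and the name `inferType`, are assumptions for illustration and are not taken from the zonnx source.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// inferType (hypothetical) returns the explicit type when given,
// otherwise derives "onnx" or "gguf" from the file extension.
func inferType(explicit, path string) (string, error) {
	if explicit != "" {
		return explicit, nil
	}
	// Assumption: extensions are compared case-insensitively.
	switch strings.ToLower(filepath.Ext(path)) {
	case ".onnx":
		return "onnx", nil
	case ".gguf":
		return "gguf", nil
	}
	return "", fmt.Errorf("cannot infer type from %q; pass --type", path)
}

func main() {
	paths := []string{"./models/model.onnx", "./models/model.gguf", "./models/model.bin"}
	for _, p := range paths {
		t, err := inferType("", p)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s is treated as %s\n", p, t)
	}
}
```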

Metadata Mapped

These HuggingFace `config.json` fields are mapped to GGUF metadata for all architectures:

| `config.json` field | GGUF key |
|---|---|
| `hidden_size` | `{arch}.embedding_length` |
| `num_hidden_layers` | `{arch}.block_count` |
| `num_attention_heads` | `{arch}.attention.head_count` |
| `num_key_value_heads` | `{arch}.attention.head_count_kv` |
| `intermediate_size` | `{arch}.feed_forward_length` |
| `vocab_size` | `{arch}.vocab_size` |
| `max_position_embeddings` | `{arch}.context_length` |
| `rms_norm_eps` | `{arch}.attention.layer_norm_rms_epsilon` |
| `rope_theta` | `{arch}.rope.freq_base` |

BERT/RoBERTa additionally map `layer_norm_eps`, `num_labels`, and `pooler_type`.
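To make the `{arch}` placeholder concrete, here is a small sketch that expands the common mapping above for a given architecture string. The table contents come from this README. The names `commonKeys`, `ggufKey`, and `mapConfig` are hypothetical, and the sketch omits the extra BERT/RoBERTa fields because their GGUF keys are not listed here.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// commonKeys mirrors the config.json -> GGUF key table in this README.
var commonKeys = map[string]string{
	"hidden_size":             "{arch}.embedding_length",
	"num_hidden_layers":       "{arch}.block_count",
	"num_attention_heads":     "{arch}.attention.head_count",
	"num_key_value_heads":     "{arch}.attention.head_count_kv",
	"intermediate_size":       "{arch}.feed_forward_length",
	"vocab_size":              "{arch}.vocab_size",
	"max_position_embeddings": "{arch}.context_length",
	"rms_norm_eps":            "{arch}.attention.layer_norm_rms_epsilon",
	"rope_theta":              "{arch}.rope.freq_base",
}

// ggufKey (hypothetical) expands the template for one config.json field.
func ggufKey(arch, field string) (string, bool) {
	tmpl, ok := commonKeys[field]
	if !ok {
		return "", false
	}
	return strings.ReplaceAll(tmpl, "{arch}", arch), true
}

// mapConfig (hypothetical) keeps only the fields listed in the table
// and renames them to their GGUF keys.
func mapConfig(arch string, cfg map[string]any) map[string]any {
	out := make(map[string]any)
	for field, value := range cfg {
		if key, ok := ggufKey(arch, field); ok {
			out[key] = value
		}
	}
	return out
}

func main() {
	// Illustrative values only, not taken from any real model.
	cfg := map[string]any{
		"hidden_size":       1024,
		"num_hidden_layers": 12,
		"rope_theta":        10000.0,
		"model_type":        "example", // not in the table, so it is skipped
	}
	out := mapConfig("llama", cfg)

	keys := make([]string, 0, len(out))
	for k := range out {
		keys = append(keys, k)
	}
	sort.Strings(keys) // map iteration order is random; sort for stable output
	for _, k := range keys {
		fmt.Printf("%s = %v\n", k, out[k])
	}
}
```

With this sketch, `ggufKey("gemma", "rope_theta")` yields `gemma.rope.freq_base`, matching the last row of the table.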

Design Principles

GGUF-only output — emits only GGUF files, no runtime code
No zerfoo imports — strictly decoupled from the inference runtime
Explicit schema — GGUF output captures all model attributes directly

Development

make test       # go test ./...
make lint       # golangci-lint run
make format     # gofmt + goimports

License

Apache 2.0
