Changelog

All notable changes and version history

Latest: v0.1.3

v0.1.3

Latest

February 27, 2026

72B Model Benchmark Support

Verified 72B model benchmarks showing 79× speedup over bitsandbytes.

added

Verified Qwen 72B benchmark: 6.5s cold start (79× faster than bitsandbytes)

added

llama.cpp GGUF comparison: ZSE 1.6× faster on 72B models

changed

Updated website with H200 GPU benchmark results

added

New benchmark scripts for 70B+ models

v0.1.2

February 25, 2026

Documentation and packaging fixes

Improved documentation and fixed PyPI classifier issues.

fixed

Fixed invalid PyPI classifier (removed CUDA language)

changed

Updated README with verified 32B benchmarks

changed

Added honest VRAM threshold notes

added

PyPI badge in README

v0.1.1

February 24, 2026

Bug fixes and improvements

Minor bug fixes and documentation updates.

fixed

Fixed package installation with optional GGUF support

changed

Improved error messages for missing models

added

Support for custom model paths

v0.1.0

February 23, 2026

Initial Release

First public release of ZSE with core functionality.

added

zQuantize: INT4/NF4 pre-quantization with .zse format

added

zServe: OpenAI-compatible API server

added

zInfer: CLI inference tool

added

GGUF import support

added

Streaming token generation

added

Multi-model management

Want to see what's coming next?

Check out our roadmap and upcoming features on GitHub

View Roadmap GitHub Releases