Prompt Registry: Versioned Prompt Library with Evals

A lightweight Python-based prompt registry for managing versioned prompts with evaluation support, structured rendering, and CLI tooling for production AI systems.

Prompt Registry is a lightweight system for managing prompts as structured, versioned assets rather than scattered text. It provides a simple way to store, organize, render, and evaluate prompts using a consistent interface built in Python.

The project is designed for developers and teams working with AI systems who want reproducibility, traceability, and clarity in how prompts evolve over time.

GitHub - brandonhimpfen/prompt-registry: Helps you store prompts as versioned artifacts, load them from disk, compare revisions, render them with variables, and run lightweight evaluations against expected behaviors.

Why This Exists

As AI systems move from experimentation to production, prompts become part of the application layer. Without structure, they are difficult to track, compare, or improve.

Prompt Registry treats prompts as first-class artifacts. Each prompt is versioned, validated, and paired with evaluation logic so changes can be measured rather than guessed.

This shifts prompt work from trial and error toward a workflow in which each change is versioned and checked against known cases.

Core Concepts

  • Versioned Prompts: Prompts are stored as YAML files with explicit versions. This allows changes to be tracked over time and makes it easy to compare iterations.
  • Structured Rendering: Prompts support variables with validation, ensuring consistent input formatting and reducing runtime errors.
  • Registry Layer: A central loader resolves prompts by name and version, making it easy to integrate into applications without hardcoding prompt text.
  • Evaluation Support: Prompts can be paired with test cases and evaluation logic. This enables repeatable checks when prompts are updated.
  • CLI Interface: A simple command-line interface allows you to list prompts, inspect versions, compare changes, and run evaluations.
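The versioning, rendering, and registry concepts above can be sketched in a few lines of standard-library Python. This is an illustrative sketch, not the project's actual API: the `Prompt` and `Registry` classes, their methods, and the `$variable` template syntax are all assumptions made for the example, and prompts are created in memory here rather than loaded from YAML files.

```python
from dataclasses import dataclass
from string import Template


@dataclass(frozen=True)
class Prompt:
    """Hypothetical prompt record; a real one might be parsed from a YAML file."""
    name: str
    version: str               # assumed to be dotted integers, e.g. "1.2.0"
    template: str              # uses string.Template "$variable" placeholders
    variables: tuple[str, ...]

    def render(self, **values: str) -> str:
        # Validate inputs before rendering so mistakes fail early and clearly.
        missing = set(self.variables) - set(values)
        unexpected = set(values) - set(self.variables)
        if missing or unexpected:
            raise ValueError(
                f"missing={sorted(missing)} unexpected={sorted(unexpected)}"
            )
        return Template(self.template).substitute(values)


class Registry:
    """Hypothetical in-memory registry keyed by (name, version)."""

    def __init__(self) -> None:
        self._prompts: dict[tuple[str, str], Prompt] = {}

    def add(self, prompt: Prompt) -> None:
        self._prompts[(prompt.name, prompt.version)] = prompt

    def get(self, name: str, version: str) -> Prompt:
        return self._prompts[(name, version)]

    def versions(self, name: str) -> list[str]:
        # Sort numerically so that "1.10.0" comes after "1.2.0".
        found = [v for (n, v) in self._prompts if n == name]
        return sorted(found, key=lambda v: tuple(int(p) for p in v.split(".")))

    def latest(self, name: str) -> Prompt:
        return self.get(name, self.versions(name)[-1])


registry = Registry()
registry.add(Prompt("summarize", "1.0.0", "Summarize: $text", ("text",)))
registry.add(
    Prompt("summarize", "1.1.0", "Summarize in a $tone tone: $text", ("tone", "text"))
)

print(registry.versions("summarize"))
print(registry.latest("summarize").render(tone="neutral", text="Quarterly results..."))
```

Keying the store on name and version together means an older revision stays addressable after a newer one is added, which is what makes side-by-side comparison possible.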

How It Fits Into an AI Stack

Prompt Registry acts as a contract layer between your application and the model.

Instead of embedding prompts directly in code, your system references them through the registry. This creates a clear separation between application logic and prompt logic, making both easier to maintain.
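As a rough illustration of that separation, the sketch below keeps prompt text in a lookup table and has the application refer to it only by name and version. Everything here is hypothetical: the `PROMPTS` table stands in for prompt files on disk, and `resolve`, `answer`, and `fake_model` are names invented for the example rather than part of Prompt Registry.

```python
from string import Template
from typing import Callable

# Hypothetical stand-in for versioned prompt files stored on disk.
PROMPTS = {
    ("support_reply", "1.0.0"): "Reply politely to: $message",
    ("support_reply", "1.1.0"): "Reply politely and concisely to: $message",
}


def resolve(name: str, version: str) -> Template:
    """Look up prompt text by name and version instead of hardcoding it."""
    return Template(PROMPTS[(name, version)])


def answer(
    message: str,
    call_model: Callable[[str], str],
    version: str = "1.1.0",
) -> str:
    """Application logic: knows which prompt it wants, not what the prompt says."""
    prompt = resolve("support_reply", version).substitute(message=message)
    return call_model(prompt)


def fake_model(prompt: str) -> str:
    # Placeholder for a real model client; it simply echoes the prompt.
    return f"[model received] {prompt}"


print(answer("Where is my order?", fake_model))
print(answer("Where is my order?", fake_model, version="1.0.0"))
```

With this arrangement, changing the wording of a prompt means adding a new version to the store and switching the version reference, while `answer` itself stays untouched.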

It also complements tools like:

  • AI contract layers.
  • Evaluation pipelines.
  • Observability systems.

Together, these form a more reliable and explainable AI workflow.

Use Cases

Prompt Registry is useful for:

  • AI-powered applications that rely on stable prompt behavior.
  • Teams iterating on prompts and needing version control.
  • Developers building internal AI tooling or assistants.
  • Projects that require evaluation and regression testing of prompts.
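For the regression-testing use case, a minimal evaluation loop might look like the sketch below. It is written under stated assumptions and does not reflect the project's own evaluation format: `Case`, `run_eval`, the exact-match check, and the `stub_model` function (a deterministic stand-in for a real model call) are all invented for illustration.

```python
from dataclasses import dataclass
from string import Template
from typing import Callable


@dataclass
class Case:
    """Hypothetical test case: input variables plus the exact expected output."""
    variables: dict[str, str]
    expected: str


def run_eval(template: str, cases: list[Case], call_model: Callable[[str], str]) -> float:
    """Return the fraction of cases whose model output matches exactly."""
    if not cases:
        raise ValueError("at least one case is required")
    passed = 0
    for case in cases:
        prompt = Template(template).substitute(case.variables)
        if call_model(prompt).strip() == case.expected:
            passed += 1
    return passed / len(cases)


ANSWERS = {"capital of France": "Paris", "capital of Japan": "Tokyo"}


def stub_model(prompt: str) -> str:
    # Deterministic stand-in for a model: terse only when told to be.
    for key, value in ANSWERS.items():
        if key in prompt:
            return value if "one word" in prompt else f"The answer is {value}."
    return "unknown"


cases = [
    Case({"question": "What is the capital of France?"}, "Paris"),
    Case({"question": "What is the capital of Japan?"}, "Tokyo"),
]

v1 = "Q: $question"
v2 = "Answer in one word. Q: $question"

for label, template in (("v1", v1), ("v2", v2)):
    print(label, run_eval(template, cases, stub_model))
```

Running the same cases against two revisions gives a pass rate for each, so a change that breaks expected behavior shows up as a lower score rather than going unnoticed. Exact matching is the simplest possible check; real evaluations would often need looser criteria.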

Design Approach

The project is intentionally minimal. It avoids heavy dependencies and focuses on clarity over abstraction.

The goal is not to replace larger frameworks, but to provide a clean foundation that can integrate with them.