ADR-0011: Auto-Generated Pydantic Models from OpenAPI
Status
Accepted (Revised)
Date: 2025-10-30 (Original) Updated: 2025-12-08
Context
The generated attrs models from OpenAPI represent API request/response structures with
Unset sentinel values, nested complexity, and mixed concerns. While suitable for API
transport, they are suboptimal for:
- ETL and Data Processing: Unset sentinels complicate data export and transformation
- Business Logic: Methods for display formatting, search, validation belong on domain models
- Type Safety: Unset sentinels require constant checking (if not isinstance(x, Unset))
- Immutability: No built-in immutability guarantees for safer data handling
- JSON Schema Generation: attrs doesn't provide JSON schema for documentation/validation
Users need clean, business-focused models that represent "the thing itself" rather than "how to transport the thing".
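To make the ETL point concrete, here is a minimal sketch rather than the client's actual code. The Unset class below is a stand-in for the sentinel type used by the generated attrs models, and export_row is a hypothetical helper. It shows that a record holding the sentinel cannot be passed to json.dumps directly and has to be cleaned first.

```python
import json


class Unset:
    """Stand-in for the sentinel type in the generated attrs client (assumption)."""

    def __bool__(self) -> bool:
        return False


UNSET = Unset()

# A transport-style record where "name" was never set.
transport_row = {"id": 1, "name": UNSET}


def export_row(row: dict) -> str:
    """Hypothetical helper: map the sentinel to None, then serialize."""
    cleaned = {
        key: (None if isinstance(value, Unset) else value)
        for key, value in row.items()
    }
    return json.dumps(cleaned)


# Serializing the raw record fails because the sentinel is not JSON serializable.
try:
    json.dumps(transport_row)
    direct_export_failed = False
except TypeError:
    direct_export_failed = True

exported = export_row(transport_row)
```

Every export path needs this kind of cleaning step when working with the transport models, which is the overhead a separate domain layer is meant to remove.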
Decision
We will auto-generate Pydantic v2 models from the same OpenAPI spec using
datamodel-code-generator, providing a parallel model layer:
- Auto-Generation: Use datamodel-code-generator to generate Pydantic models from docs/katana-openapi.yaml
- Domain Grouping: Split generated models into domain-grouped files using AST parsing (base, common, inventory, stock, sales_orders, purchase_orders, manufacturing, contacts, webhooks, errors)
- Registry: Maintain bidirectional mapping between attrs and Pydantic classes for easy conversion
- Pydantic v2 Config: Models use frozen=True, extra="forbid", validate_assignment=True for safety
Architecture
katana_public_api_client/
├── models/ # attrs models (generated by openapi-python-client)
├── models_pydantic/
│ ├── __init__.py # Public exports
│ ├── _base.py # KatanaPydanticBase with conversion methods
│ ├── _registry.py # attrs↔pydantic class mappings
│ ├── _auto_registry.py # Auto-generated registry entries
│ └── _generated/ # Auto-generated domain files
│ ├── __init__.py # Re-exports all models
│ ├── base.py # BaseEntity hierarchy
│ ├── common.py # Shared types, enums, utilities
│ ├── inventory.py # Products, Materials, Variants
│ ├── stock.py # Batches, Stock levels
│ ├── sales_orders.py
│ ├── purchase_orders.py
│ ├── manufacturing.py
│ ├── contacts.py # Customers, Suppliers
│ ├── webhooks.py
│ └── errors.py
Generation Process
The generation script (scripts/generate_pydantic_models.py):
- Runs datamodel-codegen with config from pyproject.toml
- Parses generated code using Python AST
- Groups classes by domain using pattern matching
- Fixes datamodel-codegen issues:
  - MRO (Method Resolution Order) conflicts
  - String enum defaults (→ enum member references)
  - Invalid union_mode without discriminators
- Writes domain-grouped module files
- Generates registry mappings for attrs↔pydantic conversion
- Runs ruff format/fix
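The parse-and-group steps above can be sketched as follows. This is a minimal illustration using the standard-library ast module, not the actual scripts/generate_pydantic_models.py. DOMAIN_PATTERNS, FALLBACK_DOMAIN, and group_classes_by_domain are hypothetical names, and the real script's matching rules may differ.

```python
import ast
import re
from collections import defaultdict

# Hypothetical name patterns; the first match wins.
DOMAIN_PATTERNS = [
    ("sales_orders", re.compile(r"^SalesOrder")),
    ("purchase_orders", re.compile(r"^PurchaseOrder")),
    ("inventory", re.compile(r"^(Product|Material|Variant)")),
    ("webhooks", re.compile(r"^Webhook")),
]
FALLBACK_DOMAIN = "common"


def group_classes_by_domain(source: str) -> dict[str, list[str]]:
    """Parse generated source and bucket top-level class names by domain."""
    tree = ast.parse(source)
    groups: dict[str, list[str]] = defaultdict(list)
    for node in tree.body:
        if not isinstance(node, ast.ClassDef):
            continue  # skip imports, assignments, and functions
        domain = next(
            (name for name, pattern in DOMAIN_PATTERNS if pattern.search(node.name)),
            FALLBACK_DOMAIN,
        )
        groups[domain].append(node.name)
    return dict(groups)


# Stand-in for the single module emitted by the code generator.
SAMPLE = """
class Product: ...
class SalesOrderRow: ...
class Webhook: ...
class Address: ...
"""

groups = group_classes_by_domain(SAMPLE)
```

A full implementation would also carry each class's source text and imports into the target module. The sketch tracks only class names to keep the grouping logic visible.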
Conversion Methods
from katana_public_api_client.models_pydantic import Product as PydanticProduct
from katana_public_api_client.models import Product as AttrsProduct

# attrs_product is an existing AttrsProduct instance, e.g. parsed from an API response
# Convert attrs → pydantic
pydantic_product = PydanticProduct.from_attrs(attrs_product)
# Convert pydantic → attrs
attrs_product = pydantic_product.to_attrs()
Consequences
Positive Consequences
- Full Coverage: All 287+ models generated automatically
- Type Safety: Pydantic validation, clean Optional[T] types
- Immutability: Frozen by default, prevents accidental mutations
- JSON Schema: Automatic generation for documentation
- MCP Integration: Clean, validated data for LLM contexts
- Bidirectional: Easy conversion between attrs and Pydantic layers
Negative Consequences
- Two Model Layers: Developers must understand when to use attrs (transport) vs Pydantic (domain)
- Conversion Overhead: Small performance cost for conversion
- Generation Complexity: Custom script to fix datamodel-codegen issues
- Naming Conflicts: Some generated names are awkward (Status7, CustomField3)
Neutral Consequences
- Generated Code Unchanged: attrs models remain unmodified
- Incremental Adoption: Use Pydantic layer where beneficial
- Regeneration Required: When OpenAPI spec changes, regenerate both layers
Implementation Notes
Required Dependencies
datamodel-codegen Configuration (pyproject.toml)
[tool.datamodel-codegen]
input = "docs/katana-openapi.yaml"
input-file-type = "openapi"
output-model-type = "pydantic_v2.BaseModel"
use-annotated = true
use-standard-collections = true
field-constraints = true
use-union-operator = true
target-python-version = "3.11"
base-class = "katana_public_api_client.models_pydantic._base.KatanaPydanticBase"
Alternatives Considered
Alternative 1: Hand-Crafted Domain Models
- Description: Manually write Pydantic classes for key entities only
- Pros: Clean names, selective coverage, business methods
- Cons: High maintenance, incomplete coverage, manual sync required
- Why Rejected: Auto-generation provides complete, consistent coverage
Alternative 2: Regenerate with Pydantic Generator Only
- Description: Replace openapi-python-client with pydantic-based generator
- Pros: Single model layer
- Cons: Would require rewriting all existing code, loss of httpx patterns
- Why Rejected: Too disruptive, attrs layer works well for API transport
Alternative 3: Runtime Conversion Only
- Description: Convert attrs→dict→Pydantic at runtime
- Pros: No generation needed
- Cons: No static typing, runtime overhead, no IDE support
- Why Rejected: Static types and IDE support are essential