Schema Validator Skill
Validate consistency between Pydantic models, database schema, and LLM output contracts.
Purpose
Ensure data consistency across all layers: Pydantic models (Python), database schema (SQL), and LLM output format (prompts).
Workflow
Phase 1: Inventory Schemas
- •
Find Pydantic Models
- •Search
src/models/or similar directories - •Identify data models with fields and types
- •Note enum definitions
- •Search
- •
Find Database Schema
- •Check
src/db/schema.sql - •Look for migrations in
src/db/migrations/ - •Review
docs/architecture.mdfor schema docs
- •Check
- •
Find LLM Output Schema
- •Read
docs/prompts.mdfor classification schema - •Check prompt definitions for output format
- •Identify expected field names and types
- •Read
Phase 2: Compare Field Definitions
- •
Field Names
- •Do field names match across all layers?
- •Check for case differences (snake_case vs camelCase)
- •Identify missing fields
- •
Field Types
- •Are types compatible?
- •Pydantic
str↔ DatabaseVARCHAR - •Pydantic
int↔ DatabaseINTEGER - •Pydantic
float↔ DatabaseFLOAT - •Check enum values match exactly
- •
Required vs Optional
- •Pydantic
Optional[str]vs DatabaseNULL - •Are nullable fields aligned?
- •Are required fields enforced everywhere?
- •Pydantic
Phase 3: Check Constraints
- •
String Length Limits
- •Pydantic max_length vs Database VARCHAR(N)
- •Will database reject valid Pydantic values?
- •
Numeric Ranges
- •Min/max constraints
- •Decimal precision
- •Integer bounds
- •
Enum Value Sets
- •Do enum values match exactly?
- •Same spelling and case?
- •All values present in all layers?
- •
Nullable Fields
- •Pydantic Optional vs Database NULL
- •Consistent nullability across layers
Phase 4: Identify Mismatches
- •
Missing Fields
- •Field exists in one layer but not others
- •Which layer is missing it?
- •
Type Incompatibilities
- •Type mismatch between layers
- •Potential data loss or validation failure
- •
Naming Inconsistencies
- •snake_case in database, camelCase in models
- •Abbreviations differ
- •Spelling variations
Output Format
markdown
## Schema Validation Report ### Schemas Found - Pydantic: [file paths] - Database: [file paths] - LLM Output: docs/prompts.md ### Field Comparison | Field | Pydantic | Database | LLM Output | Status | | --------------- | ---------- | ----------- | ----------- | ------ | | issue_type | str (enum) | VARCHAR(50) | enum list | ✅ | | priority | str | VARCHAR(20) | enum list | ✅ | | sentiment_score | float | FLOAT | -1.0 to 1.0 | ✅ | ### Issues Found 1. **[MISMATCH]** Field `foo`: - Pydantic: `Optional[str]` - Database: `NOT NULL` - Fix: [recommendation] 2. **[MISSING]** Field `bar`: - Present in: Pydantic - Missing from: Database - Fix: [recommendation] ### Recommendations - [Specific fix]
Success Criteria
- • All schema locations identified
- • Field comparison table complete
- • Mismatches documented with severity
- • Recommendations provided
- • Intentional differences noted (e.g., audit fields in DB)
Constraints
- •Don't modify files - only report issues
- •Flag severity - breaking vs cosmetic
- •Consider migration implications - DB changes need migrations
- •Note intentional differences - DB may have extra audit fields
Key Files
| File | Schema Type |
|---|---|
src/db/models.py | Pydantic models |
src/db/schema.sql | Database schema |
docs/prompts.md | LLM output schema |
src/api/routers/ | API request/response models |
Common Pitfalls
- •Assuming names match: Check for snake_case vs camelCase
- •Missing enums: Enum values must match exactly across layers
- •Ignoring nullability: Optional in code, NOT NULL in database = bug
- •Not checking length: VARCHAR(50) in DB, no max_length in Pydantic
Integration with Marcus
This skill is typically invoked when:
- •Database schema changes
- •Pydantic models updated
- •LLM output format changes
- •Before migrations
If Blocked
If you cannot proceed:
- •State which schema you can't find
- •Explain where you've looked
- •Provide partial comparison if available
- •Request clarification on expected schema location