AgentSkillsCN

ring:pre-dev-data-model

第5道关口:数据结构文档——在选定数据库技术之前,先明确实体、关系与所有权归属。仅限大型项目适用。

SKILL.md
--- frontmatter
name: ring:pre-dev-data-model
description: |
  Gate 5: Data structures document - defines entities, relationships, and ownership
  before database technology selection. Large Track only.

trigger: |
  - API Design passed Gate 4 validation
  - System stores persistent data
  - Multiple entities with relationships
  - Large Track workflow (2+ day features)

skip_when: |
  - Small Track workflow → skip to Task Breakdown
  - No persistent data → skip to Dependency Map
  - API Design not validated → complete Gate 4 first

sequence:
  after: [ring:pre-dev-api-design]
  before: [ring:pre-dev-dependency-map]

Data Modeling - Defining Data Structures

Foundational Principle

Data structures, relationships, and ownership must be defined before database technology selection.

Jumping to database-specific schemas without modeling creates:

  • Inconsistent data structures across services
  • Unclear data ownership and authority
  • Schema conflicts discovered during development
  • Migration nightmares when requirements change

The Data Model answers: WHAT data exists, HOW entities relate, WHO owns what data? The Data Model never answers: WHICH database technology or HOW to implement storage.

Phase 0: Database Field Naming Strategy (MANDATORY)

Before defining schemas, determine how to name database fields.

Step 1: Check if Gate 4 API standards exist

Check if docs/pre-dev/{feature-name}/api-standards-ref.md exists:

bash
ls docs/pre-dev/{feature-name}/api-standards-ref.md

Step 2: Ask user about database field naming

Use AskUserQuestion tool:

If api-standards-ref.md EXISTS:

Question: "Gate 4 defined API field names (e.g., userId, createdAt). How should database fields be named?"

  • Header: "DB Field Naming"
  • multiSelect: false
  • Options:
    1. "Convert to snake_case (Recommended)" (description: "API: userId → DB: user_id (PostgreSQL/MySQL standard)")
    2. "Keep same as API (camelCase)" (description: "API: userId → DB: userId (MongoDB/document DB)")
    3. "Different standards - provide DB dictionary" (description: "I have a separate database naming standards document")
    4. "Define manually for this feature" (description: "No standards, I'll specify field names")

If api-standards-ref.md DOES NOT EXIST:

Question: "No API standards were defined in Gate 4. How should database fields be named?"

  • Header: "DB Field Naming"
  • multiSelect: false
  • Options:
    1. "Use snake_case (Recommended)" (description: "Standard for PostgreSQL/MySQL: user_id, created_at")
    2. "Use camelCase" (description: "Standard for MongoDB/document DBs: userId, createdAt")
    3. "Load from standards document" (description: "I have a database naming standards document (URL or file)")
    4. "Define manually" (description: "No standards, I'll specify per feature")

Step 3: Process user selection

Option 1 Selected: "Convert to snake_case"

  1. Load api-standards-ref.md from Gate 4
  2. Apply automatic conversion rules:
API Field (camelCase)DB Column (snake_case)Rule
userIduser_idSplit on capital letters, join with underscore
createdAtcreated_atSplit on capital letters, join with underscore
isActiveis_activePreserve boolean prefix
phoneNumberphone_numberSplit on capital letters within words
userIDuser_idCollapse consecutive capitals
  1. Create db-standards-ref.md with converted names + mapping table
  2. Document conversion rule: "Automatic camelCase → snake_case conversion"

Option 2 Selected: "Keep same as API"

  1. Load api-standards-ref.md from Gate 4
  2. Copy field names without modification
  3. Create db-standards-ref.md with same names
  4. Document strategy: "Database uses same field names as API (camelCase)"

Option 3 Selected: "Different standards - provide DB dictionary"

  1. Ask for URL or file path (same as Gate 4 Phase 0)
  2. Load and extract database-specific standards
  3. If conflicts with API standards: create mapping table
  4. Document source and differences

Additionally extract for database context:

Database-Specific ElementWhat to Extract
Table namingSingular vs plural (user vs users)
Column namingsnake_case, camelCase, PascalCase
Primary key namingid, {table}_id, uuid
Foreign key naming{table}_id, {table}_uuid
Timestamp columnscreated_at/updated_at vs createdAt/updatedAt
Boolean prefixesis_, has_, no prefix
Junction table naminguser_role vs users_roles vs user_roles

Option 4 Selected: "Define manually"

  1. Proceed without standards reference
  2. Document in db-standards-ref.md: "No organizational standards, fields defined per-feature"

Step 4: Generate db-standards-ref.md

Output to: docs/pre-dev/{feature-name}/db-standards-ref.md

If conversion from API standards (Option 1 or 2):

markdown
# Database Standards Reference - {Feature Name}

Source: Converted from api-standards-ref.md (Gate 4)
Conversion: {camelCase → snake_case / same as API}
Generated: {ISO 8601 timestamp}

## Field Naming Convention

**Database pattern:** {snake_case / camelCase}
**Source:** API standards (Gate 4) with automatic conversion

## Standard Fields

| DB Column | API Field | Type | Example |
|-----------|-----------|------|---------|
| user_id | userId | uuid | "550e8400-e29b-41d4-a716-446655440000" |
| email | email | varchar(254) | "user@example.com" |
| created_at | createdAt | timestamptz | "2026-01-23T10:30:00Z" |
| is_active | isActive | boolean | true |

## API to Database Mapping

| API Field | DB Column | Type Conversion |
|-----------|-----------|-----------------|
| userId (string) | user_id (uuid) | Parse UUID string → uuid type |
| createdAt (ISO 8601 string) | created_at (timestamptz) | Parse ISO → timestamptz |
| isActive (boolean) | is_active (boolean) | Direct mapping |

## Conversion Rules

- camelCase → snake_case: Insert underscore before capitals, lowercase all
- Consecutive capitals: Treat as acronym (userID → user_id, not user_i_d)
- Boolean prefixes: Preserve (isActive → is_active, hasPermission → has_permission)

If loaded from separate dictionary (Option 3):

Follow shared-patterns/standards-discovery.md workflow, including database-specific extractions.

Mandatory Workflow

PhaseActivities
0. Database Field Naming StrategyCheck if Gate 4 API standards exist; ask user: reuse with conversion (snake_case/camelCase), load separate DB dictionary, or define manually; generate db-standards-ref.md with mapping if applicable
1. Data AnalysisLoad approved API Design (Gate 4), TRD (Gate 3), Feature Map (Gate 2), PRD (Gate 1); extract entities from contracts; identify relationships
2. Data ModelingDefine entities, specify attributes, model relationships, assign ownership, define constraints, plan lifecycle, design access patterns, consider data quality; apply field naming strategy from Phase 0
3. Gate 5 ValidationVerify all checkboxes before proceeding to Dependency Map

Explicit Rules

✅ DO Include

Entity definitions (conceptual data objects), attributes with types, constraints (required, unique, ranges), relationships (1:1, 1:N, M:N), data ownership (authoritative component), primary identifiers, lifecycle rules (soft delete, archival), access patterns, data quality rules, referential integrity

❌ NEVER Include

Database products (PostgreSQL, MongoDB, Redis), table/collection names, index definitions, SQL/query language, ORM frameworks (Prisma, TypeORM), storage engines, partitioning/sharding, replication/backup, database-specific types (JSONB, BIGSERIAL)

Abstraction Rules

ElementAbstract (✅)Database-Specific (❌)
Entity"User""users table"
Attribute"emailAddress: String (email format)""email VARCHAR(255)"
Relationship"User has many Orders""foreign key user_id"
Identifier"Unique identifier""UUID primary key"
Constraint"Must be unique""UNIQUE INDEX"

Rationalization Table

ExcuseReality
"We know it's PostgreSQL, just use PG types"Database choice comes later. Model abstractly now.
"Table design is data modeling"Tables are implementation. Entities are concepts. Stay conceptual.
"We need indexes for performance"Indexes are optimization. Model data first, optimize later.
"ORMs require specific schemas"ORMs adapt to models. Don't let tooling drive design.
"Foreign keys define relationships"Relationships exist conceptually. FKs are implementation.
"SQL examples help clarity"Abstract models are clearer. SQL is implementation detail.
"NoSQL doesn't need relationships"All systems have data relationships. Model them regardless of DB type.
"This is just ERD"ERD is visualization tool. Data model is broader (ownership, lifecycle, etc).
"We can skip this for simple CRUD"Even CRUD needs clear entity design. Don't skip.
"Microservices mean no relationships"Services interact via data. Model entities per service.

Red Flags - STOP

If you catch yourself writing any of these in Data Model, STOP:

  • Database product names (Postgres, MySQL, Mongo, Redis)
  • SQL keywords (CREATE TABLE, ALTER TABLE, SELECT, JOIN)
  • Database-specific types (SERIAL, JSONB, VARCHAR, TEXT)
  • Index commands (CREATE INDEX, UNIQUE INDEX)
  • ORM code (Prisma schema, TypeORM decorators)
  • Storage details (partitioning, sharding, replication)
  • Query optimization (EXPLAIN plans, index hints)
  • Backup/recovery strategies

When you catch yourself: Replace DB detail with abstract concept. "users table" → "User entity"

Gate 5 Validation Checklist

CategoryRequirements
Entity CompletenessAll entities from PRD/Feature Map modeled; clear consistent names; defined purpose; boundaries align with TRD components
Attribute SpecificationAll types specified; required vs optional explicit; constraints documented; defaults where relevant; computed fields identified
Relationship ModelingAll relationships documented; cardinality specified (1:1, 1:N, M:N); optional vs required clear; referential integrity to be documented; circular deps resolved
Data OwnershipEach entity owned by exactly one component; read/write permissions documented; cross-component access via APIs only; no shared database anti-pattern
Data QualityValidation rules specified; normalization level appropriate; denormalization justified; consistency strategy defined
Lifecycle ManagementCreation rules; update patterns; deletion strategy (hard/soft); archival/retention policies; audit trail needs
Access PatternsPrimary patterns documented; query needs identified; write patterns documented; consistency requirements specified
Technology AgnosticNo database products; no SQL/NoSQL specifics; no table/index definitions; implementable in any DB

Gate Result: ✅ PASS (all checked) → Dependency Map | ⚠️ CONDITIONAL (remove DB specifics) | ❌ FAIL (incomplete/poor ownership)

Data Model Template Structure

Output to docs/pre-dev/{feature-name}/data-model.md with these sections:

SectionContent
OverviewAPI Design/TRD/Feature Map references, status, last updated
Data Ownership MapTable: Entity, Owning Component, Read Access, Write Access

Per-Entity Structure

FieldContent
PurposeWhat this entity represents
Owned ByComponent from TRD
Primary IdentifierUnique identifier field and format
AttributesTable: Attribute, Type, Required, Unique, Constraints, Description
Nested TypesEmbedded types (e.g., OrderItem within Order, Address value object)
RelationshipsCardinality notation: Entity (1) ──< has many >── (*) OtherEntity
ConstraintsBusiness rules, status transitions, referential integrity
LifecycleCreation (via which API), updates, deletion strategy, archival
Access PatternsLookup patterns by frequency (primary, secondary, rare)
Data QualityNormalization rules, validation

Additional Sections

SectionContent
Relationship DiagramASCII/text diagram showing entity relationships with cardinality legend
Cross-Component AccessPer scenario: data flow steps, rules (no direct DB access, API only)
Consistency StrategyStrong consistency (immediate): auth, payments, inventory; Eventual (delay OK): analytics, search
Validation RulesPer-entity and cross-entity validation
Lifecycle PoliciesRetention periods table, soft delete strategy, audit trail requirements
Privacy & CompliancePII fields table with handling, GDPR compliance, encryption needs (algorithm TBD)
Access Pattern AnalysisHigh/medium/low frequency patterns with req/sec estimates, optimization notes for later
Data Quality StandardsNormalization rules, validation approach, integrity enforcement
Migration StrategySchema evolution (additive, non-breaking, breaking), versioning approach
Gate 5 ValidationDate, validator, checklist, approval status

Common Violations

ViolationWrongCorrect
Database SchemaCREATE TABLE users (id UUID PRIMARY KEY, email VARCHAR(255) UNIQUE)Entity User with attributes table: userId (Identifier, Unique), email (EmailAddress, Unique)
ORM CodeTypeScript with @Entity(), @PrimaryGeneratedColumn('uuid'), @Column decoratorsEntity User with primary identifier, attributes list, constraints description
Technology in Relationships"Foreign key user_id references users.id; Join table user_roles""User (1:N) Order; User (M:N) Role" with cardinality descriptions

Confidence Scoring

FactorPointsCriteria
Entity Coverage0-30All entities: 30, Most: 20, Gaps: 10
Relationship Clarity0-25All documented: 25, Most clear: 15, Ambiguous: 5
Data Ownership0-25Clear boundaries: 25, Minor overlaps: 15, Unclear: 5
Constraint Completeness0-20All rules: 20, Common cases: 12, Minimal: 5

Action: 80+ autonomous generation | 50-79 present options | <50 ask clarifying questions

After Approval

  1. ✅ Lock data model - entity structure is now reference
  2. 🎯 Use model as input for Dependency Map (ring:pre-dev-dependency-map)
  3. 🚫 Never add database specifics retroactively
  4. 📋 Keep technology-agnostic until Dependency Map

The Bottom Line

If you wrote SQL schemas or ORM code, delete it and model abstractly.

Data modeling is conceptual. Period. No database products. No SQL. No ORMs.

Database technology goes in Dependency Map. That's the next phase. Wait for it.

Model the data. Stay abstract. Choose database later.