navidocs/EVIDENCE_QUALITY_STANDARDS.md
Claude 232f50f0d6
Agent 0A (S5-H0A) DEPLOYED: Evidence Quality Standards
CRITICAL for Sessions 1-4 - Read immediately before creating claims.

IF.TTT compliance framework:
- Citation schema (≥2 sources required)
- Source quality tiers (primary 8-10, secondary 5-7, tertiary 2-4)
- Multi-source verification examples
- Confidence scoring formula
- Session-specific guidance
- Quality assurance checklist

Target metrics for Guardian approval:
- >85% verified claims
- Average credibility ≥7.5/10
- Primary sources >70%
- Unverified claims <10%

Agent: S5-H0A
Status: READY for Sessions 1-4 consumption
Next: Agent 0B (continuous quality monitoring every 5 min)
2025-11-13 02:07:46 +00:00

Evidence Quality Standards (IF.TTT Compliance)

NaviDocs Cloud Sessions - Citation & Verification Requirements

Agent: S5-H0A (Evidence Quality Standards)
Session: Session 5 - Quality Assurance Partner
For: All Sessions 1-4 (Market Research, Technical, Sales, Implementation)
Version: 1.0
Generated: 2025-11-13


CRITICAL: Read This Before Creating Any Claims

ALL claims in your session outputs MUST follow these standards.

Session 5 (Guardian Council) will reject your handoff if evidence quality is below threshold.

Target: >85% verified claims, average credibility ≥7.5/10


IF.TTT Framework: Two-Source Verification

Core Principle: All claims require ≥2 independent sources

Evidence Status Ladder

VERIFIED ✅         →  ≥2 credible sources (credibility ≥5), no contradictions
PROVISIONAL ⚠️      →  1 credible source (credibility ≥8), needs 2nd confirmation
UNVERIFIED ❌       →  0 credible sources or <5 credibility, flagged for review
DISPUTED 🔴        →  Contradictory sources, requires investigation
REVOKED ⛔         →  Proven false, removed from dossier

Your goal: All claims should be VERIFIED before handoff
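
The ladder above can be read as a small decision procedure. The following is a minimal sketch of that reading, not an official implementation: the function name `evidence_status` and its two boolean flags are hypothetical, and the handling of a lone source scoring 5-7 is an assumption, because the ladder does not cover that case.

```python
from typing import Sequence


def evidence_status(
    credibilities: Sequence[int],
    contradictory: bool = False,
    proven_false: bool = False,
) -> str:
    """Map a claim's source credibility scores (0-10) to a ladder status."""
    if proven_false:
        return "revoked"
    if contradictory:
        return "disputed"

    # Only sources scoring 5 or more count as credible.
    credible = [c for c in credibilities if c >= 5]
    if len(credible) >= 2:
        return "verified"
    if len(credible) == 1 and credible[0] >= 8:
        return "provisional"

    # Assumption: a single source scoring 5-7 is not addressed by the
    # ladder, so this sketch treats it as unverified.
    return "unverified"
```

Called with the two primary sources from the example citation, `evidence_status([9, 9])` returns `"verified"`.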


Citation Schema (Required Format)

Example Citation

{
  "citation_id": "if://citation/navidocs-warranty-savings-2025-11-13",
  "claim": "NaviDocs prevents €8K-€33K warranty losses per yacht",
  "evidence_type": "market_research",
  "sources": [
    {
      "type": "file",
      "path": "/mnt/c/users/setup/downloads/NaviDocs-Medium-Articles.md",
      "line_range": "45-67",
      "git_commit": "abc123def456",
      "quality": "primary",
      "credibility": 9,
      "excerpt": "Yacht owners who track warranties save €8K-€33K per vessel..."
    },
    {
      "type": "file",
      "path": "/home/setup/navidocs/docs/debates/02-yacht-management-features.md",
      "line_range": "120-145",
      "git_commit": "def456ghi789",
      "quality": "primary",
      "credibility": 9,
      "excerpt": "Warranty expiration tracking prevents €15K-€50K forgotten value..."
    }
  ],
  "status": "verified",
  "verification_date": "2025-11-13T12:00:00Z",
  "verified_by": "if://agent/session-1/haiku-3",
  "confidence_score": 0.90,
  "dependencies": [],
  "created_by": "if://agent/session-1/haiku-3",
  "created_at": "2025-11-13T10:00:00Z",
  "updated_at": "2025-11-13T12:00:00Z",
  "tags": ["warranty-tracking", "roi", "yacht-sales"]
}

Required Fields

Every citation MUST include:

  • citation_id (unique identifier)
  • claim (the specific statement being verified)
  • sources (array of ≥2 sources for VERIFIED status)
    • Each source MUST have: type, quality, credibility (0-10)
    • File sources: path, line_range, git_commit
    • Web sources: url, accessed, hash (SHA-256)
  • status (verified/provisional/unverified/disputed/revoked)
  • confidence_score (0.0-1.0)
  • created_by (your agent ID: S1-H03, S2-H05, etc.)
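
A citation can be checked against this field list before it is written to disk. The sketch below is one possible checker under stated assumptions: the function name `citation_problems` is hypothetical, and the lowercase type value `"web"` is assumed by analogy with the `"file"` value used in the example citation.

```python
from typing import Any

REQUIRED_TOP = ("citation_id", "claim", "sources", "status",
                "confidence_score", "created_by")
REQUIRED_SOURCE = ("type", "quality", "credibility")
REQUIRED_BY_TYPE = {
    "file": ("path", "line_range", "git_commit"),
    "web": ("url", "accessed", "hash"),  # assumed type value
}
STATUSES = {"verified", "provisional", "unverified", "disputed", "revoked"}


def citation_problems(citation: dict[str, Any]) -> list[str]:
    """Return a list of schema problems; an empty list means the citation passes."""
    problems = [f"missing field: {k}" for k in REQUIRED_TOP if k not in citation]
    sources = citation.get("sources", [])

    status = citation.get("status")
    if status is not None and status not in STATUSES:
        problems.append(f"unknown status: {status}")
    if status == "verified" and len(sources) < 2:
        problems.append("verified status needs at least 2 sources")

    score = citation.get("confidence_score")
    if score is not None and not 0.0 <= score <= 1.0:
        problems.append("confidence_score outside 0.0-1.0")

    for i, src in enumerate(sources):
        # Common fields plus the extra fields for this source type.
        needed = REQUIRED_SOURCE + REQUIRED_BY_TYPE.get(src.get("type"), ())
        problems += [f"source {i}: missing {k}" for k in needed if k not in src]
        cred = src.get("credibility")
        if cred is not None and not 0 <= cred <= 10:
            problems.append(f"source {i}: credibility outside 0-10")
    return problems
```

Returning a list of problems rather than raising on the first one lets an agent fix every gap in a single pass.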

Source Quality Tiers

Primary Sources (Credibility: 8-10)

Use these whenever possible:

  1. Codebase Analysis (Credibility: 9-10)

    • File: server/db/schema.sql (line 45-67)
    • File: server/routes/boats.js (line 120-145)
    • Git commit: abc123def456
    • Why primary: Direct observation of actual code
  2. Local Documentation (Credibility: 8-9)

    • File: /mnt/c/users/setup/downloads/NaviDocs-Medium-Articles.md
    • File: /home/setup/navidocs/docs/debates/02-yacht-management-features.md
    • Why primary: Created by NaviDocs team, first-hand knowledge
  3. Official Industry Reports (Credibility: 8-9)

    • ICOMIA Global Recreational Boating Market Report 2024
    • European Boating Industry Statistics (EBI)
    • Why primary: Commissioned research, rigorous methodology
  4. Direct Interviews/Surveys (Credibility: 8-9)

    • Broker testimonials (first-hand pain points)
    • Owner interviews (actual usage patterns)
    • Why primary: Direct observation, real-world data

Secondary Sources (Credibility: 5-7)

Acceptable, but each needs a second source:

  1. Industry Association Websites (Credibility: 6-7)

    • ICOMIA, European Boating Industry
    • Yacht Brokers Association
    • Why secondary: Aggregated data, not original research
  2. Competitor Websites (Credibility: 5-7)

    • BoatVault pricing page
    • DeckDocs feature comparison
    • Why secondary: Marketing materials, may be biased
  3. Government Regulations (Credibility: 7-8)

    • Flag registration requirements (9 jurisdictions)
    • VAT/tax regulations
    • Why secondary (not primary): Legal requirements, but implementation varies
  4. Academic Papers (Credibility: 6-8)

    • Marine documentation studies
    • Yacht market analysis papers
    • Why secondary: Peer-reviewed, but may be outdated or theoretical

Tertiary Sources (Credibility: 2-4) ⚠️

Use ONLY if no primary/secondary available:

  1. Blog Posts (Credibility: 3-4)

    • Industry commentary
    • Yacht brokerage blogs
    • Why tertiary: Opinion-based, not verified
  2. Forum Discussions (Credibility: 2-4)

    • YachtWorld forums
    • The Trader Online discussions
    • Why tertiary: Anecdotal, single data points
  3. News Articles (Credibility: 3-5)

    • Yacht market trend coverage
    • Brokerage industry news
    • Why tertiary: Journalism, not original research
  4. Social Media (Credibility: 1-3)

    • LinkedIn posts from brokers
    • Twitter industry discussions
    • Why tertiary: Highly anecdotal, low verification

Unverified Claims (Credibility: 0-1)

Flag these - Guardian Council will reject:

  1. Assumptions - "We assume brokers will pay €299/month"
  2. Hypotheses - "MLS integration should reduce listing time"
  3. Projections - "Market will grow 15% annually"
  4. Guesses - "Prestige 50 boats cost around €250K"

Action required: Find 2+ sources or mark as UNVERIFIED
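
For quick triage, the headline credibility bands of the four tiers can be turned into a lookup. This is only a sketch: the helper name `quality_tier` is hypothetical, and it uses the tier headings alone (8-10, 5-7, 2-4, 0-1). Some source types listed above have ranges that cross a band edge (for example, government regulations at 7-8), so the declared `quality` field of a source remains the authority.

```python
def quality_tier(credibility: int) -> str:
    """Return the tier whose headline credibility band contains the score."""
    if not 0 <= credibility <= 10:
        raise ValueError("credibility must be between 0 and 10")
    if credibility >= 8:
        return "primary"
    if credibility >= 5:
        return "secondary"
    if credibility >= 2:
        return "tertiary"
    # Scores of 0-1 correspond to assumptions, hypotheses, projections, guesses.
    return "unverified"
```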


Multi-Source Verification Examples

Example 1: Market Size Claim (VERIFIED ✅)

Claim: "Mediterranean yacht sales market is €2.3B annually"

Source 1 (Primary):

  • Type: Industry report
  • Path: /home/setup/yacht-market-reports/2024-mediterranean-market-analysis.pdf
  • Page: 23
  • Credibility: 8
  • Excerpt: "Mediterranean yacht market valued at €2.3B in 2024"

Source 2 (Secondary):

  • Type: Web
  • URL: https://icomia.org/statistics/european-market-2024
  • Accessed: 2025-11-13T10:00:00Z
  • Hash: sha256:a3b2c1d4e5f6...
  • Credibility: 7
  • Excerpt: "Southern Europe yacht sales: €2.2-€2.4B range"

Result: VERIFIED (2 sources, credibility 8+7=15, confidence 0.75)
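
The Accessed and Hash fields of a web source such as Source 2 could be produced along these lines. This is a sketch under assumptions: the helper name `web_source_fields` is hypothetical, the hash is assumed to cover the raw response body, and the `sha256:` prefix and timestamp layout simply mirror the example values above.

```python
import hashlib
from datetime import datetime, timezone


def web_source_fields(url: str, body: bytes) -> dict[str, str]:
    """Build the url/accessed/hash fields for a web source from fetched bytes."""
    return {
        "type": "web",  # assumed type value for web sources
        "url": url,
        # UTC timestamp in the same layout as the examples, e.g. 2025-11-13T10:00:00Z
        "accessed": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        # Assumption: the digest is taken over the raw response body.
        "hash": "sha256:" + hashlib.sha256(body).hexdigest(),
    }
```

Hashing the bytes at access time lets a reviewer later detect whether the page changed after it was cited.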


Example 2: Warranty Savings Claim (VERIFIED ✅)

Claim: "Inventory tracking prevents €8K-€33K forgotten value at resale"

Source 1 (Primary):

  • Type: File
  • Path: /mnt/c/users/setup/downloads/NaviDocs-Medium-Articles.md
  • Line: 45-67
  • Credibility: 9
  • Excerpt: "Yacht owners who track warranties save €8K-€33K per vessel"

Source 2 (Primary):

  • Type: File
  • Path: /home/setup/navidocs/docs/debates/02-yacht-management-features.md
  • Line: 120-145
  • Credibility: 9
  • Excerpt: "Warranty expiration tracking prevents €15K-€50K forgotten value"

Note: Range discrepancy (€8K-€33K vs €15K-€50K) - use conservative estimate €8K-€33K

Result: VERIFIED (2 primary sources, credibility 9+9=18, confidence 0.90)


Example 3: Technical Claim (VERIFIED ✅)

Claim: "NaviDocs uses SQLite database with BullMQ job queue"

Source 1 (Primary):

  • Type: File
  • Path: server/db/schema.sql
  • Line: 1-10
  • Git commit: abc123def456
  • Credibility: 10
  • Excerpt: "-- SQLite schema for NaviDocs database"

Source 2 (Primary):

  • Type: File
  • Path: server/services/queue.service.js
  • Line: 5-20
  • Git commit: abc123def456
  • Credibility: 10
  • Excerpt: "import { Queue } from 'bullmq'; // Job queue for background tasks"

Result: VERIFIED (2 codebase sources, credibility 10+10=20, confidence 1.0)


Example 4: Pricing Claim (PROVISIONAL ⚠️)

Claim: "Brokers willing to pay €99-€299/month for NaviDocs"

Source 1 (Tertiary):

  • Type: Forum
  • URL: https://yachtworld.com/forums/thread-12345
  • Credibility: 3
  • Excerpt: "I'd pay €150/month for warranty tracking software"

Problem: Only 1 source, credibility too low (3 < 5)

Action required:

  • Find pricing survey data (primary source)
  • OR competitor pricing analysis (secondary source)
  • OR mark as PROVISIONAL ⚠️ and flag for follow-up

Result: PROVISIONAL ⚠️ (needs 2nd source before Session 5 handoff)


Example 5: Timeline Claim (UNVERIFIED ❌)

Claim: "MLS integration can be completed in 2 weeks"

Source 1: None (assumption based on developer estimate)

Problem: No evidence, pure speculation

Action required:

  • Search codebase for existing MLS integrations (time to implement)
  • Find industry benchmarks for API integration timelines
  • OR consult Session 4 sprint planning for realistic estimate
  • OR mark as UNVERIFIED and remove from critical path

Result: UNVERIFIED (remove claim or find 2 sources)


Confidence Scoring Formula

Confidence = (Source1_Credibility + Source2_Credibility) / 20

If ≥3 sources: Confidence = min(0.95, average_credibility / 10)
If 2 sources: Confidence = average_credibility / 10
If 1 source (credibility ≥8): Confidence = credibility / 15 (PROVISIONAL)
If 0 sources: Confidence = 0.0 (UNVERIFIED)

Examples:

  • 2 primary sources (9+9=18): Confidence = 0.90
  • 2 secondary sources (6+6=12): Confidence = 0.60
  • 1 primary source (9): Confidence = 0.60 (PROVISIONAL)
  • 3 primary sources (9+9+8=26): Confidence = 0.87 (average 8.67; the 0.95 cap applies only when the average exceeds 9.5)
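
The four rules of the formula translate directly into a function. The sketch below follows them as written; the function name `confidence` is hypothetical, and returning 0.0 for a lone source scoring below 8 is an assumption, since the formula defines the single-source case only for credibility ≥8.

```python
from typing import Sequence


def confidence(credibilities: Sequence[float]) -> float:
    """Compute a confidence score (0.0-1.0) from source credibility scores (0-10)."""
    n = len(credibilities)
    if n == 0:
        return 0.0  # UNVERIFIED
    if n == 1:
        # PROVISIONAL case. Assumption: a lone source below 8 scores 0.0.
        return credibilities[0] / 15 if credibilities[0] >= 8 else 0.0

    score = (sum(credibilities) / n) / 10
    # Three or more sources are capped at 0.95; exactly two are not.
    return min(0.95, score) if n >= 3 else score


print(round(confidence([9, 9]), 2))     # 0.9
print(round(confidence([9]), 2))        # 0.6
print(round(confidence([9, 9, 8]), 2))  # 0.87
```

Under these rules the cap takes effect only when three or more sources average above 9.5, for example three sources scoring 10.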

Evidence Quality Scorecard

Target metrics for Session handoff:

| Metric | Target | Guardian Rejection Threshold |
|---|---|---|
| Verified claims | >85% | <70% |
| Average credibility | ≥7.5/10 | <6.0/10 |
| Primary sources | >70% | <50% |
| Unverified claims | <10% | >20% |
| Confidence score | ≥0.75 | <0.60 |

If you miss targets: Guardian Council will ABSTAIN or REJECT your session handoff
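
A session can grade its own numbers against the scorecard before handoff. The following is a sketch only: `SessionMetrics`, `grade`, and `scorecard` are hypothetical names, and the middle label "at risk" (between the target and the rejection threshold) is an assumption, since the scorecard does not name that zone.

```python
from dataclasses import dataclass


@dataclass
class SessionMetrics:
    verified_pct: float     # share of claims with status "verified", in percent
    avg_credibility: float  # mean source credibility, 0-10
    primary_pct: float      # share of primary sources, in percent
    unverified_pct: float   # share of claims with status "unverified", in percent
    avg_confidence: float   # mean confidence score, 0.0-1.0


def grade(meets_target: bool, past_rejection: bool) -> str:
    """Classify one metric as pass, at risk, or reject."""
    if past_rejection:
        return "reject"
    return "pass" if meets_target else "at risk"


def scorecard(m: SessionMetrics) -> dict[str, str]:
    """Grade every scorecard metric against its target and rejection threshold."""
    return {
        "verified_claims": grade(m.verified_pct > 85, m.verified_pct < 70),
        "average_credibility": grade(m.avg_credibility >= 7.5, m.avg_credibility < 6.0),
        "primary_sources": grade(m.primary_pct > 70, m.primary_pct < 50),
        "unverified_claims": grade(m.unverified_pct < 10, m.unverified_pct > 20),
        "confidence_score": grade(m.avg_confidence >= 0.75, m.avg_confidence < 0.60),
    }
```

Any metric graded "reject" falls past the Guardian rejection threshold and should be fixed before the handoff is created.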


Citation File Format

File: intelligence/session-X/session-X-citations.json

{
  "session_id": "if://conversation/navidocs-session-1-2025-11-13",
  "total_citations": 47,
  "verified_citations": 42,
  "provisional_citations": 3,
  "unverified_citations": 2,
  "average_credibility": 8.2,
  "average_confidence": 0.87,
  "citations": [
    {
      "citation_id": "if://citation/warranty-savings-8k-33k",
      "claim": "NaviDocs prevents €8K-€33K warranty losses per yacht",
      "sources": [ /* full source objects */ ],
      "status": "verified",
      "confidence_score": 0.90
    },
    {
      "citation_id": "if://citation/broker-pricing-willingness",
      "claim": "Brokers willing to pay €99-€299/month",
      "sources": [ /* only 1 source */ ],
      "status": "provisional",
      "confidence_score": 0.60
    }
  ]
}
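
The summary fields at the top of the citations file can be derived from the `citations` array rather than maintained by hand. This is a sketch under assumptions: the function name `summarize` is hypothetical, and `average_credibility` is assumed to be the mean over every source of every citation, which the file format does not specify.

```python
from collections import Counter


def summarize(citations: list[dict]) -> dict:
    """Derive the header counts and averages from a list of citation objects."""
    counts = Counter(c["status"] for c in citations)
    # Assumption: credibility is averaged over all sources of all citations.
    creds = [s["credibility"] for c in citations for s in c["sources"]]
    confs = [c["confidence_score"] for c in citations]
    return {
        "total_citations": len(citations),
        "verified_citations": counts["verified"],
        "provisional_citations": counts["provisional"],
        "unverified_citations": counts["unverified"],
        "average_credibility": round(sum(creds) / len(creds), 1) if creds else 0.0,
        "average_confidence": round(sum(confs) / len(confs), 2) if confs else 0.0,
    }
```

Deriving the header this way keeps `total_citations` equal to the sum of the per-status counts whenever only those three statuses are in use.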

IF.bus Communication: Citing Sources

When sending findings to Agent 10 (synthesis), include citations:

{
  "performative": "inform",
  "sender": "if://agent/session-1/haiku-3",
  "receiver": ["if://agent/session-1/haiku-10"],
  "content": {
    "claim": "Inventory tracking prevents €15K-€50K forgotten value",
    "evidence": [
      "file:/home/setup/navidocs/docs/debates/02-yacht-management-features.md:120-145",
      "file:/mnt/c/users/setup/downloads/NaviDocs-Medium-Articles.md:45-67"
    ],
    "confidence": 0.90,
    "cost_tokens": 1247
  },
  "citation_ids": ["if://citation/inventory-pain-point-2025-11-13"],
  "timestamp": "2025-11-13T10:00:00Z"
}

Agent 10 validates:

  • Check that citation_ids reference valid citations in session-X-citations.json
  • Verify ≥2 sources (IF.TTT compliance)
  • Confirm confidence ≥0.75
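
The three checks Agent 10 performs can be sketched as a single function over an incoming message. The name `validate_message` is hypothetical, and counting sources as the number of entries in `content.evidence` is an assumption about how the ≥2-source rule maps onto the message format.

```python
def validate_message(message: dict, known_citation_ids: set[str]) -> list[str]:
    """Return the reasons a message fails validation; an empty list means it passes."""
    problems: list[str] = []
    content = message.get("content", {})
    citation_ids = message.get("citation_ids", [])

    # 1. Every citation_id must exist in session-X-citations.json.
    unknown = [cid for cid in citation_ids if cid not in known_citation_ids]
    if not citation_ids:
        problems.append("no citation_ids supplied")
    elif unknown:
        problems.append("unknown citation_ids: " + ", ".join(unknown))

    # 2. IF.TTT compliance: at least two evidence entries (assumed = sources).
    if len(content.get("evidence", [])) < 2:
        problems.append("fewer than 2 sources")

    # 3. Confidence must reach the 0.75 floor.
    if content.get("confidence", 0.0) < 0.75:
        problems.append("confidence below 0.75")
    return problems
```

The set of known IDs would be loaded from the `citation_id` values in the session's citations file.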

Quality Assurance Checklist

Before creating your session handoff, verify:

  • All claims have ≥2 sources (or marked PROVISIONAL/UNVERIFIED)
  • Citations file (session-X-citations.json) exists
  • Average credibility ≥7.5/10
  • Verified claims >85%
  • Primary sources >70%
  • Unverified claims <10%
  • All file references include: path, line_range, git_commit
  • All web references include: url, accessed date, SHA-256 hash
  • Confidence scores calculated correctly
  • Status field populated (verified/provisional/unverified/disputed/revoked)

Session 5 (Guardian Council) will review your handoff against this checklist.


ESCALATE Protocol: Evidence Conflicts

If you detect conflicting evidence (>20% variance), ESCALATE:

Example:

  • Agent 1 claims: "Prestige 50 price range €250K-€480K"
  • Agent 3 claims: "Owner has €1.5M Prestige 50 boat"
  • Variance: (1.5M - 250K) / 250K = 500% ⚠️

Action:

{
  "performative": "ESCALATE",
  "sender": "if://agent/session-1/haiku-10",
  "receiver": ["if://agent/session-1/coordinator"],
  "content": {
    "conflict_type": "Price range inconsistency",
    "agent_1_claim": "€250K-€480K (S1-H01)",
    "agent_3_claim": "€1.5M boat (S1-H03)",
    "variance": "500%",
    "requires_resolution": true,
    "recommendation": "Re-search YachtWorld for Prestige 50 ACTUAL sale prices"
  }
}

Coordinator investigates, resolves, updates citation status.
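
The variance test behind the ESCALATE trigger can be sketched as follows. The function names are hypothetical, and measuring variance relative to the smaller of the two values is an assumption taken from the worked example, which divides by €250K.

```python
def variance_pct(low: float, high: float) -> float:
    """Percentage by which `high` exceeds `low`, relative to `low`."""
    if low <= 0:
        raise ValueError("low must be positive")
    return (high - low) / low * 100


def needs_escalation(value_a: float, value_b: float,
                     threshold_pct: float = 20.0) -> bool:
    """True when two conflicting figures differ by more than the threshold."""
    low, high = sorted((value_a, value_b))
    return variance_pct(low, high) > threshold_pct


print(variance_pct(250_000, 1_500_000))      # 500.0
print(needs_escalation(1_500_000, 250_000))  # True
```

Sorting the two values first makes the check independent of which agent's claim is passed first.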


Session-Specific Guidance

Session 1 (Market Research)

Focus: Market sizing, competitive landscape, broker pain points

Critical claims to verify:

  • Mediterranean yacht sales market size (€2.3B)
  • Riviera brokerage count (120 active)
  • Warranty savings (€8K-€33K)
  • Documentation prep time (6 hours → 20 minutes)

Best sources:

  • ICOMIA reports (primary)
  • NaviDocs Medium articles (primary)
  • Competitor websites (secondary)

Session 2 (Technical Integration)

Focus: Architecture design, database migrations, API specifications

Critical claims to verify:

  • NaviDocs uses SQLite + BullMQ (codebase analysis)
  • Database schema changes (file references)
  • API endpoint specifications (OpenAPI spec)
  • Integration points (file:line citations)

Best sources:

  • Codebase files (primary, credibility 10)
  • Git commits (primary, credibility 10)
  • Technical documentation (primary, credibility 8-9)

Session 3 (Sales Enablement)

Focus: Pitch deck, ROI calculator, demo scripts

Critical claims to verify:

  • ROI calculations cite Session 1 sources
  • Pricing strategy aligns with competitor analysis
  • Demo script matches NaviDocs actual features
  • Objection handling backed by evidence

Best sources:

  • Session 1 citations (cross-reference)
  • Session 2 codebase validation (features exist)
  • Competitor pricing pages (secondary)

Session 4 (Implementation Planning)

Focus: Sprint planning, roadmap, acceptance criteria

Critical claims to verify:

  • 4-week timeline realistic (codebase complexity)
  • Dependencies correctly identified (file references)
  • Acceptance criteria testable (Given/When/Then format)
  • Migration scripts safe (rollback procedures)

Best sources:

  • Session 2 architecture (cross-reference)
  • Codebase file analysis (primary)
  • Sprint planning best practices (secondary)

Session 5 (Guardian Council) Will Check:

Empirical Soundness (0-10):

  • Evidence quality (primary vs secondary vs tertiary)
  • Source verification (all citations traceable)
  • Multi-source compliance (≥2 sources per claim)

Logical Coherence (0-10):

  • Cross-session consistency (Session 1 ↔ Session 3 alignment)
  • Contradiction detection (conflicting claims flagged)
  • Integration validation (all pieces fit together)

Practical Viability (0-10):

  • Implementation feasibility (4-week timeline backed by codebase)
  • ROI justification (€8K-€33K savings verified)
  • Technical risks (migration scripts tested)

Approval threshold: Average ≥7.0 across all 3 dimensions

If you fail: Guardian Council will ABSTAIN (5.0-6.9) or REJECT (<5.0)
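
The approval rule can be sketched as a small function over the three dimension scores. The name `guardian_decision` is hypothetical, and treating an average between 6.9 and 7.0 as ABSTAIN is an assumption, since the stated bands leave that sliver undefined.

```python
def guardian_decision(empirical: float, logical: float, practical: float) -> str:
    """Map the three 0-10 dimension scores to a Guardian Council outcome."""
    average = (empirical + logical + practical) / 3
    if average >= 7.0:
        return "APPROVE"
    if average >= 5.0:
        # Assumption: anything from 5.0 up to (but not including) 7.0 abstains.
        return "ABSTAIN"
    return "REJECT"


print(guardian_decision(8, 7, 7))  # APPROVE
print(guardian_decision(6, 6, 6))  # ABSTAIN
print(guardian_decision(4, 5, 4))  # REJECT
```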


Real-Time Quality Feedback

Agent 0B (S5-H0B) monitors your work every 5 minutes:

Check: intelligence/session-X/QUALITY_FEEDBACK.md (updated continuously)

Example feedback:

# Session 1 Quality Feedback (2025-11-13 10:15 UTC)

## ✅ Good practices:
- Market size claim has 2 primary sources (ICOMIA + EBI)
- Citation format matches IF.TTT schema
- Confidence scores calculated correctly

## ⚠️ Warnings:
- Broker pricing claim (€99-€299/month) has only 1 tertiary source
  - Action: Find pricing survey or competitor analysis
  - Deadline: Before Session 1 handoff

## ❌ Errors:
- MLS integration timeline claim has 0 sources (UNVERIFIED)
  - Action: Remove claim OR find 2 sources
  - Risk: Guardian Council will reject if not fixed

## 📊 Current metrics:
- Verified: 38/42 (90%) ✅
- Average credibility: 8.1/10 ✅
- Primary sources: 30/42 (71%) ✅
- Confidence: 0.85 ✅

**Overall:** On track for Guardian approval

Questions?

If unclear:

  1. Check QUALITY_FEEDBACK.md (Agent 0B updates every 5 min)
  2. ESCALATE to Session 5 coordinator
  3. Create intelligence/session-X/QUESTION-evidence-standards.md

Session 5 Contact:

  • Agent 0A (S5-H0A): Evidence standards
  • Agent 0B (S5-H0B): Real-time QA feedback
  • Coordinator: Final validation before Guardian vote

Document Signature:

if://doc/evidence-quality-standards-2025-11-13
Agent: S5-H0A (Evidence Quality Standards)
Version: 1.0
Status: READY - Sessions 1-4 read immediately
For Guardian Council Approval: >85% verified, credibility ≥7.5