NaviDocs Architecture Summary
Status: Design Complete ✅
Next Phase: Implementation
Created: 2025-01-19
📋 What We've Built
A future-proof, multi-vertical document management platform for boat owners, marinas, and property managers.
🎯 Core Architectural Decisions
1. Hybrid Database Strategy
- SQLite for transactional data (users, boats, documents)
- Meilisearch for search-optimized queries
- Migration path to PostgreSQL when scaling requires it
Why: Search-first architecture. Every query is a search query, not a SQL JOIN.
2. Multi-Vertical Schema
- Designed for boats (v1.0)
- Expandable to marinas, properties, HOAs (v1.1+)
- Unified hierarchy: Organization → Entity → Sub-Entity → Component → Document
Why: Patterns are identical across verticals. Build once, reuse everywhere.
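The unified hierarchy can be sketched as a single node type with parent links. This is a minimal illustration, not the actual schema: the `HierarchyNode` fields and the `pathTo` helper are assumptions, and only the five level names come from the hierarchy above.

```typescript
// Level names mirror the hierarchy in the text; field names are assumptions.
type Level = "organization" | "entity" | "sub_entity" | "component" | "document";

interface HierarchyNode {
  id: string;
  level: Level;
  name: string;
  parentId: string | null; // null only for the organization root
}

// Walk parent links upward and return the names root-first.
// The `seen` set guards against accidental cycles in bad data.
function pathTo(nodeId: string, nodes: Map<string, HierarchyNode>): string[] {
  const path: string[] = [];
  const seen = new Set<string>();
  let current = nodes.get(nodeId);
  while (current !== undefined && !seen.has(current.id)) {
    seen.add(current.id);
    path.unshift(current.name);
    current = current.parentId === null ? undefined : nodes.get(current.parentId);
  }
  return path;
}

// A marina expressed with the same levels a boat would use.
const marinaNodes = new Map<string, HierarchyNode>([
  ["org1", { id: "org1", level: "organization", name: "XYZ Corp", parentId: null }],
  ["marinaA", { id: "marinaA", level: "entity", name: "Marina A", parentId: "org1" }],
  ["dock1", { id: "dock1", level: "sub_entity", name: "Dock 1", parentId: "marinaA" }],
  ["slip42", { id: "slip42", level: "component", name: "Slip 42", parentId: "dock1" }],
]);

console.log(pathTo("slip42", marinaNodes).join(" → "));
```

Because a boat and a marina differ only in what the levels are called, the same table shape and the same traversal serve both verticals.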
3. Security-First
- Tenant tokens (NOT master keys in client)
- Background queue for CPU-intensive OCR
- File safety pipeline (qpdf + ClamAV + validation)
- Rate limiting on all endpoints
Why: Expert panel identified these as production killers if skipped.
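The cheap first stage of the file safety pipeline might look like the sketch below: reject anything that is not named `.pdf` or does not start with the PDF signature, before spending CPU on qpdf and ClamAV. The function name, the result shape, and the 50 MiB size cap are assumptions for illustration; the later stages are not shown.

```typescript
// PDF files begin with the ASCII bytes "%PDF-".
const PDF_MAGIC = [0x25, 0x50, 0x44, 0x46, 0x2d];

interface PrecheckResult {
  ok: boolean;
  reason?: "extension" | "size" | "magic-bytes";
}

// Hypothetical pre-check; maxBytes is an assumed limit, not a NaviDocs setting.
function precheckPdf(
  filename: string,
  bytes: Uint8Array,
  maxBytes: number = 50 * 1024 * 1024
): PrecheckResult {
  if (!filename.toLowerCase().endsWith(".pdf")) {
    return { ok: false, reason: "extension" };
  }
  if (bytes.length === 0 || bytes.length > maxBytes) {
    return { ok: false, reason: "size" };
  }
  // Strict check: the signature must sit at offset 0.
  const magicOk = PDF_MAGIC.every((expected, i) => bytes[i] === expected);
  if (!magicOk) {
    return { ok: false, reason: "magic-bytes" };
  }
  return { ok: true };
}
```

A renamed executable passes the extension check but fails on magic bytes, which is why both checks run before the expensive scanners.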
4. Offline-First PWA
- Service worker caches critical manuals
- Works 20 miles offshore with no cell signal
- IndexedDB for local state
Why: Boat owners need manuals when engines fail at sea.
5. Synonym-Rich Search
- 40+ boat terminology synonyms ("bilge" → "sump pump")
- Typo tolerance (Meilisearch built-in)
- Future: semantic search with embeddings
Why: Boat owners don't know technical jargon.
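Meilisearch stores synonyms as a map from a term to a list of equivalents, and an association is one-way unless both directions are listed. A small helper can expand groups of equivalent terms into a mutual map. This is a sketch: `buildMutualSynonyms` is a hypothetical helper, and only the bilge/sump group comes from this document (the second group is illustrative).

```typescript
type SynonymMap = Record<string, string[]>;

// Expand each group so that every term points at all the other terms in it.
function buildMutualSynonyms(groups: string[][]): SynonymMap {
  const map: SynonymMap = {};
  for (const group of groups) {
    // Normalize case and whitespace, drop empties and duplicates.
    const terms = Array.from(new Set(group.map((t) => t.trim().toLowerCase())))
      .filter((t) => t.length > 0);
    for (const term of terms) {
      const others = terms.filter((t) => t !== term);
      // Merge with any synonyms the term picked up from an earlier group.
      map[term] = Array.from(new Set([...(map[term] ?? []), ...others]));
    }
  }
  return map;
}

const boatSynonyms = buildMutualSynonyms([
  ["bilge", "sump", "sump pump"], // from the example in the text
  ["head", "toilet"],             // illustrative only
]);

console.log(JSON.stringify(boatSynonyms, null, 2));
```

The resulting object has the shape expected by the index's synonyms setting, so a search for "bilge" can also match pages that only say "sump".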
📊 Schema Design
SQLite Tables (13 tables; the 12 named below are grouped by role)
Core: users, organizations, user_organizations
Entities: entities, sub_entities, components
Documents: documents, document_pages, ocr_jobs
Permissions: permissions, document_shares
UX: bookmarks
Meilisearch Index
Index: navidocs-pages
Documents: One per PDF page
Searchable: title, text, systems, categories, tags
Filterable: boatId, userId, make, model, year, etc.
Synonyms: 40+ boat terminology mappings
Key Insight: Each PDF page is a separate Meilisearch document. No JOINs needed.
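The page-per-document idea can be sketched as a flattening step that copies the manual's metadata onto every page. The input shape, the `toPageDocuments` name, and the `<documentId>_p<page>` id scheme are assumptions; the searchable and filterable attribute names are the ones listed above. Meilisearch document ids may contain only alphanumerics, hyphens, and underscores, so `documentId` is assumed to already follow that rule.

```typescript
interface ManualInput {
  documentId: string;
  title: string;
  boatId: string;
  userId: string;
  make: string;
  model: string;
  year: number;
  systems: string[];
  categories: string[];
  tags: string[];
  pages: { pageNumber: number; text: string }[];
}

type PageDocument = Omit<ManualInput, "pages"> & {
  id: string;
  pageNumber: number;
  text: string;
};

// One search document per PDF page; metadata is denormalized so no JOIN is needed.
function toPageDocuments(manual: ManualInput): PageDocument[] {
  const { pages, ...metadata } = manual;
  return pages.map((page) => ({
    ...metadata,
    id: `${manual.documentId}_p${page.pageNumber}`,
    pageNumber: page.pageNumber,
    text: page.text,
  }));
}

// Settings fragment for the navidocs-pages index, using the attributes named in the text.
const pageIndexSettings = {
  searchableAttributes: ["title", "text", "systems", "categories", "tags"],
  filterableAttributes: ["boatId", "userId", "make", "model", "year"],
};
```

Denormalizing costs some index size, but a hit already carries the boat, make, and page number needed to render a result and jump to the right page.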
🚀 Technology Stack
Backend
- Node.js v20 (Express or Fastify)
- SQLite3 (better-sqlite3)
- Meilisearch v1.6.2
- BullMQ (or SQLite-based queue fallback)
- Tesseract.js (OCR)
- qpdf + ClamAV (file safety)
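Whichever queue backs it (BullMQ or the SQLite fallback), the OCR worker needs the same small state machine: claim one job at a time, retry a few times, then give up. The in-memory sketch below stands in for the `ocr_jobs` table. The status names, the retry limit, and the class itself are assumptions, not the actual implementation.

```typescript
type JobStatus = "pending" | "processing" | "done" | "failed";

interface OcrJob {
  id: number;
  documentId: string;
  status: JobStatus;
  attempts: number;
}

// In-memory stand-in for a table-backed queue; maxAttempts is an assumed policy.
class OcrJobQueue {
  private jobs: OcrJob[] = [];
  private nextId = 1;
  private maxAttempts: number;

  constructor(maxAttempts: number = 3) {
    this.maxAttempts = maxAttempts;
  }

  enqueue(documentId: string): OcrJob {
    const job: OcrJob = { id: this.nextId++, documentId, status: "pending", attempts: 0 };
    this.jobs.push(job);
    return job;
  }

  // Claim the oldest pending job, or return undefined when the queue is idle.
  claimNext(): OcrJob | undefined {
    const job = this.jobs.find((j) => j.status === "pending");
    if (job === undefined) return undefined;
    job.status = "processing";
    job.attempts += 1;
    return job;
  }

  complete(id: number): void {
    const job = this.jobs.find((j) => j.id === id && j.status === "processing");
    if (job !== undefined) job.status = "done";
  }

  // Put the job back in line until it has used up its attempts.
  fail(id: number): void {
    const job = this.jobs.find((j) => j.id === id && j.status === "processing");
    if (job !== undefined) {
      job.status = job.attempts >= this.maxAttempts ? "failed" : "pending";
    }
  }
}
```

A single worker loop calling `claimNext()` keeps OCR to one page set at a time, which is the point of moving it off the request path.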
Frontend
- Vue 3 + Vite
- Tailwind CSS
- PDF.js (document viewer)
- Meilisearch-inspired design (clean, professional, SVG icons)
- PWA (offline support)
Security
- Helmet (CSP, HSTS headers)
- express-rate-limit
- JWT auth
- Tenant tokens (Meilisearch)
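A Meilisearch tenant token is a JWT signed with a search API key (never the master key) whose payload carries `searchRules`, `apiKeyUid`, and an optional `exp`. The sketch below builds one by hand with Node's `crypto` module to show what the client actually receives. The function name, the per-user filter, and the id validation rule are assumptions; a real server would more likely use a maintained JWT or client library.

```typescript
import { createHmac } from "node:crypto";

// Hypothetical helper: scopes searches on navidocs-pages to one user's pages.
function createTenantToken(
  apiKey: string,      // a search-only API key, kept on the server
  apiKeyUid: string,   // the uid of that same key
  userId: string,
  nowSeconds: number,
  ttlSeconds: number = 3600 // 1-hour TTL, as in the hardening checklist
): string {
  // Assumed id format; rejecting other characters keeps the filter injection-safe.
  if (!/^[A-Za-z0-9_-]+$/.test(userId)) {
    throw new Error("userId contains unsupported characters");
  }
  const header = { alg: "HS256", typ: "JWT" };
  const payload = {
    searchRules: { "navidocs-pages": { filter: `userId = "${userId}"` } },
    apiKeyUid,
    exp: nowSeconds + ttlSeconds,
  };
  const encode = (value: object): string =>
    Buffer.from(JSON.stringify(value)).toString("base64url");
  const signingInput = `${encode(header)}.${encode(payload)}`;
  const signature = createHmac("sha256", apiKey).update(signingInput).digest("base64url");
  return `${signingInput}.${signature}`;
}
```

The browser only ever sees this short-lived token, so a leaked token exposes one user's pages for at most an hour rather than the whole index.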
🎨 Design Philosophy
Inspired by: https://www.meilisearch.com/
Visual Language:
- Clean, spacious layouts
- Professional SVG icons (no emojis)
- Muted color palette (grays, blues, whites)
- Typography: SF Pro / Inter / Roboto
- Expensive, grown-up aesthetic
NOT: Playful, colorful, emoji-heavy consumer apps
📈 Scaling Strategy
Day 1 (MVP)
- SQLite (< 100k documents)
- Single Meilisearch instance
- Single-tenant (one user, multiple boats)
Month 6 (Growth)
- Still SQLite (works up to 1M documents)
- Meilisearch cluster (if > 10k searches/day)
- Multi-tenant (organizations)
Year 1 (Scale)
- Migrate to PostgreSQL
- Add pgvector for semantic search
- Cloudflare CDN for PDFs
- Separate OCR worker VPS
🔒 Security Hardening Checklist
- Never expose Meilisearch master key to client
- Use tenant tokens (1-hour TTL)
- Background queue for OCR (prevent CPU spikes)
- File safety: extension + magic byte + qpdf + ClamAV
- Rate limiting: 10 uploads/hour, 30 searches/minute
- Helmet security headers (CSP, HSTS)
- HTTPS only (no HTTP)
- Rotate API keys monthly
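The two rate limits in the checklist can be expressed as a sliding-window counter per client key. The class below is an in-memory illustration of the policy, not the express-rate-limit middleware named in the stack; the class name and the `LIMITS` table are assumptions, while the numbers come from the checklist.

```typescript
interface Limit {
  max: number;
  windowMs: number;
}

// Limits from the checklist: 10 uploads/hour, 30 searches/minute.
const LIMITS: Record<"upload" | "search", Limit> = {
  upload: { max: 10, windowMs: 60 * 60 * 1000 },
  search: { max: 30, windowMs: 60 * 1000 },
};

class SlidingWindowLimiter {
  private hits = new Map<string, number[]>();
  private limit: Limit;

  constructor(limit: Limit) {
    this.limit = limit;
  }

  // Returns true and records the hit if the key is under its limit at nowMs.
  allow(key: string, nowMs: number): boolean {
    const cutoff = nowMs - this.limit.windowMs;
    // Keep only the hits that still fall inside the window.
    const recent = (this.hits.get(key) ?? []).filter((t) => t > cutoff);
    const allowed = recent.length < this.limit.max;
    if (allowed) recent.push(nowMs);
    this.hits.set(key, recent);
    return allowed;
  }
}

const searchLimiter = new SlidingWindowLimiter(LIMITS.search);
console.log(searchLimiter.allow("user_42", Date.now()));
```

Passing the clock in as `nowMs` keeps the limiter deterministic in unit tests. A multi-process deployment would need shared state instead of a local `Map`.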
🧪 Testing Strategy
Unit Tests (Jest/Vitest)
- Database models
- Search service
- OCR pipeline
- File validation
Integration Tests
- Upload → OCR → Index → Search
- User auth flow
- Permission checks
E2E Tests (Playwright)
- Upload PDF
- Search and view results
- Offline mode
- Mobile responsive
📦 Expert Panel Consensus
47 minutes of debate:
- Database Architect: "Future-proof for Postgres migration"
- Search Engineer: "Search-first, not relational-first"
- DevOps: "Append-only schema, no breaking changes"
- Data Scientist: "Embedding field from day 1 (even if null)"
- Backend Lead: "Hybrid approach wins"
Result: SQLite + Meilisearch hybrid, designed for Postgres migration.
38 minutes with boating experts:
- Marine Surveyor: "Emergency scenarios = offline required"
- Marina Manager: "Shared component library (10 boats, same Volvo engine)"
- Yacht Broker: "Resale value = complete documentation history"
Result: Offline PWA, shared manuals, service tracking.
29 minutes with property/marina experts:
- Multi-entity hierarchy (XYZ Corp → Marina A → Dock 1 → Slip 42)
- Compliance tracking (inspections, certifications)
- Geo-search for physical assets
Result: Schema supports vertical expansion.
📂 Repository Structure
navidocs/
├── docs/
│   ├── debates/
│   │   └── 01-schema-and-vertical-analysis.md
│   ├── architecture/
│   │   ├── database-schema.sql
│   │   ├── meilisearch-config.json
│   │   └── hardened-production-guide.md
│   └── roadmap/
│       ├── v1.0-mvp.md
│       └── 2-week-launch-plan.md
├── server/                  (TBD: Extract from lilian1)
├── client/                  (TBD: Build from scratch)
├── README.md
└── ARCHITECTURE-SUMMARY.md  (this file)
🎯 Success Criteria (MVP Launch)
Technical:
- Upload PDF → searchable in < 5 minutes
- Search latency < 100ms
- Synonym search works ("bilge" finds "sump")
- All fields display correctly
- Offline mode functional
Security:
- Zero master keys in client code
- Tenant tokens expire after 1 hour
- All PDFs sanitized
- Rate limits prevent abuse
User Experience:
- Upload success rate > 95%
- Search relevance 4/5+ rating
- Mobile usable without zooming
🚦 Next Steps
1. Analyze lilian1 - Extract clean code, identify Frank-AI junk
2. Bootstrap NaviDocs - Create server/ and client/ structure
3. Implement core features - Upload, OCR, Search
4. Playwright tests - E2E coverage
5. Local deployment - Test with real boat manuals
6. Beta launch - 5-10 boat owners
The war council has spoken. Time to build.