navidocs/ARCHITECTURE_INTEGRATION_ANALYSIS.md
Danny Stocker 58b344aa31 FINAL: P0 blockers fixed + Joe Trader + ignore binaries
2025-11-13 01:29:59 +01:00


# NaviDocs Architecture Analysis & Integration Points
**Analysis Date:** 2025-11-13
**Project:** NaviDocs - Multi-vertical Document Management System
**Scope:** Yacht Sales & Marine Asset Documentation
---
## EXECUTIVE SUMMARY
NaviDocs is a **production-ready document management platform** designed for multi-entity scenarios (boats, marinas, properties). The architecture supports:
- **Multi-tenancy** with organization/entity hierarchies
- **Background processing** for OCR and indexing
- **Search-first design** using Meilisearch
- **Offline-capable** PWA client (Vue 3); designed, but only partially implemented (see Section 6)
- **Granular access control** with role-based permissions
- **Extensible metadata** for custom integrations
**Gap Analysis:** No event-driven external integrations (Home Assistant, MQTT, webhooks) exist yet, but the foundation for adding them is in place.
---
## 1. DATABASE SCHEMA ANALYSIS
### Location
- `/home/setup/navidocs/server/db/schema.sql` (primary)
- Migrations: `/home/setup/navidocs/server/db/migrations/`
### Core Tables (13 total)
#### A. Tenant Structure
| Table | Purpose | Key Fields |
|-------|---------|-----------|
| `users` | User authentication | id (UUID), email, password_hash, created_at |
| `organizations` | Multi-entity container | id, name, type (personal/commercial/hoa), metadata (JSON) |
| `user_organizations` | Membership + roles | user_id, organization_id, role (admin/manager/member/viewer) |
#### B. Asset Hierarchy
| Table | Purpose | Key Fields |
|-------|---------|-----------|
| `entities` | Boats, marinas, condos | id, organization_id, entity_type, name, **boat-specific** (hull_id, vessel_type, length_feet, make, model, year) |
| `sub_entities` | Systems, docks, units | id, entity_id, name, type, metadata |
| `components` | Engines, panels, appliances | id, sub_entity_id/entity_id, name, manufacturer, model_number, serial_number, install_date, warranty_expires |
**YACHT SALES RELEVANCE:**
- Vessel specs: hull_id (HIN), vessel_type, length, make, model, year
- Component tracking: engines, electrical systems, HVAC by serial number
- Perfect for "as-built" documentation transfer at closing
#### C. Document Management
| Table | Purpose | Key Fields |
|-------|---------|-----------|
| `documents` | File metadata | id, organization_id, entity_id (boat link!), sub_entity_id, component_id, title, **document_type** (owner-manual, component-manual, service-record, etc), status (processing/indexed/failed), file_hash (SHA256 for dedup) |
| `document_pages` | OCR results per page | id, document_id, page_number, ocr_text, ocr_confidence, ocr_language, meilisearch_id, metadata |
| `ocr_jobs` | Background processing queue | id, document_id, status (pending/processing/completed/failed), progress (0-100), error, timestamps |
#### D. Access Control
| Table | Purpose | Key Fields |
|-------|---------|-----------|
| `permissions` | Granular resource access | resource_type (document/entity/organization), resource_id, user_id, permission (read/write/share/delete/admin), granted_at, expires_at |
| `document_shares` | Simplified sharing | document_id, shared_by, shared_with, permission (read/write) |
#### E. User Experience
| Table | Purpose | Key Fields |
|-------|---------|-----------|
| `bookmarks` | Quick access pins | user_id, document_id, page_id, label, quick_access (bool) |
### Schema Design Strengths for Yacht Sales
1. **Deduplication by Hash**: A SHA256 file hash prevents storing the same owner manual twice when several boats of the same model share it
2. **Metadata Extensibility**: JSON fields on entities, components, documents for custom data (broker notes, sale status, etc)
3. **Temporal Tracking**: `created_at`, `updated_at`, `warranty_expires`, `install_date` for compliance/history
4. **Soft Deletes**: `status` field + `replaced_by` support version history without losing data
### Migration History
- `002_add_document_toc.sql` - Table of Contents support
- `008_add_organizations_metadata.sql` - Custom metadata column
- `009_permission_templates_and_invitations.sql` - Invite workflow
---
## 2. API ENDPOINTS & CAPABILITIES
### Location: `/home/setup/navidocs/server/routes/`
#### A. Authentication & Multi-Tenancy
**Route:** `/api/auth`
- `POST /register` - User signup with email verification
- `POST /login` - JWT + refresh token generation
- `POST /refresh` - Refresh access token
- `POST /logout` - Session revocation
- `GET /me` - Current user profile
- `POST /password/reset-request` - Email-based reset
- `POST /password/reset` - Reset with token
- `POST /email/verify` - Email verification
**Auth Service:** `/server/services/auth.service.js`
- bcrypt password hashing
- JWT token management (default: 7-day expiry)
- Audit logging on all auth events
- Device tracking (user-agent, IP, login timestamps)
#### B. Organization Management
**Route:** `/api/organizations`
- `POST /` - Create organization (with owner as member)
- `GET /` - List user's organizations
- `GET /:id` - Organization details (members, stats)
- `PUT /:id` - Update organization (name, metadata)
- `DELETE /:id` - Delete org (soft delete with audit trail)
- `GET /:id/members` - List organization members
- `POST /:id/members` - Invite user to org
- `DELETE /:id/members/:userId` - Remove user
- `GET /:id/stats` - Document count, storage usage
**Authorization Checks:**
- Organization admin role required for member management
- User membership verified before access
- Organization metadata supports custom fields
#### C. Document Management
**Route:** `/api/documents`
- `GET /:id` - Document metadata (with ownership check)
- `GET ?organizationId=X&limit=50` - List documents with pagination
- `DELETE /:id` - Soft delete with file cleanup
**Ownership Verification:**
```sql
-- Access is granted if any of the following holds:
--   1. User is in the document's organization, OR
--   2. User uploaded the document, OR
--   3. Document was shared with the user
```
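The three access rules above could be expressed as a single predicate. This is a minimal sketch rather than the project's actual check: the function name `canAccessDocument` and the in-memory shapes of `user`, `doc`, and `shares` are assumptions for illustration, and the real route presumably runs equivalent queries against `user_organizations`, `documents`, and `document_shares`.

```javascript
// Hypothetical helper mirroring the three access rules listed above.
// user.organizationIds, doc.uploaded_by and the shares array are assumed shapes.
function canAccessDocument(user, doc, shares) {
  // 1. User is a member of the document's organization
  if (user.organizationIds.includes(doc.organization_id)) return true;
  // 2. User uploaded the document
  if (doc.uploaded_by === user.id) return true;
  // 3. Document was explicitly shared with the user
  return shares.some(
    (share) => share.document_id === doc.id && share.shared_with === user.id
  );
}

// Example data: a broker-owned document shared with one buyer
const exampleDoc = { id: 'd1', organization_id: 'org1', uploaded_by: 'u1' };
const exampleShares = [{ document_id: 'd1', shared_by: 'u1', shared_with: 'u3' }];
```

Checking the cheapest rule first (organization membership) keeps the common case fast; the share lookup only runs for users outside the organization.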
#### D. File Upload & OCR Pipeline
**Route:** `/api/upload`
- `POST /` - Upload PDF (multipart/form-data)
  - Parameters: file, title, documentType, organizationId, entityId (optional), componentId (optional)
  - Response: { jobId, documentId, message }
  - File validation: .pdf only, magic bytes check, max 50MB
  - File safety: sanitized filename, SHA256 hash, no null bytes
**Quick OCR Route:** `/api/upload/quick-ocr`
- Fast OCR for preview/validation
**Deduplication:** SHA256 hash checks prevent uploading same file twice
#### E. Background Jobs
**Route:** `/api/jobs`
- `GET /:id` - Job status and progress (0-100%)
- `GET ?status=completed&limit=50` - List jobs with filtering
- Response includes: documentId, status (pending/processing/completed/failed), progress, error message
**Queue System:** BullMQ with Redis backend
- 3 retry attempts with exponential backoff
- Job persistence for 24 hours (completed) / 7 days (failed)
- Progress updates via WebSocket-ready design
#### F. Search & Indexing
**Route:** `/api/search`
- `POST /token` - Generate Meilisearch tenant token (1-hour TTL, user-scoped)
- `POST /` - Server-side search (optional, for SSR)
  - Filters: documentType, entityId, language, custom fields
  - Response: hits, estimatedTotalHits, processingTimeMs
- `GET /health` - Meilisearch connectivity check
**Security:**
- Tenant tokens scoped to user + their organizations
- Row-level filtering injected at token generation
- Master key never exposed to client
- Fallback to search-only API key
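To make the tenant-token idea concrete: a Meilisearch tenant token is a JWT signed with a search API key, carrying `searchRules`, `apiKeyUid`, and `exp` claims. The sketch below builds one with Node's `crypto` module only. The helper name `buildTenantToken` and the exact filter expression are assumptions; the project's `search.js` may use the Meilisearch SDK helper and a richer filter (for example, one that also covers the user's organizations).

```javascript
const crypto = require('node:crypto');

// Base64url-encode a JSON-serializable value (JWT segment encoding).
const b64url = (value) =>
  Buffer.from(JSON.stringify(value)).toString('base64url');

// Hypothetical helper: builds an HS256-signed tenant token scoped to one user.
// The filter expression is an assumption based on the filterable attributes
// listed in this document.
function buildTenantToken({ apiKey, apiKeyUid, userId, ttlSeconds = 3600, now = Date.now() }) {
  // Reject ids that could break out of the quoted filter value
  if (!/^[\w-]+$/.test(userId)) throw new Error('invalid userId');
  const header = { alg: 'HS256', typ: 'JWT' };
  const payload = {
    apiKeyUid,
    // Row-level filter injected at token generation time
    searchRules: { 'navidocs-pages': { filter: `userId = "${userId}"` } },
    exp: Math.floor(now / 1000) + ttlSeconds, // 1-hour TTL by default
  };
  const signingInput = `${b64url(header)}.${b64url(payload)}`;
  const signature = crypto
    .createHmac('sha256', apiKey)
    .update(signingInput)
    .digest('base64url');
  return `${signingInput}.${signature}`;
}
```

Because the filter is baked into the signed payload, the client cannot widen its own search scope without invalidating the signature.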
#### G. Permissions Management
**Route:** `/api/permissions`
- Grant/revoke read/write/share/admin access
- Resource-level granularity (document, entity, organization)
- Time-bound permissions with expiration
- Audit trail of who granted what when
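A time-bound permission check over rows shaped like the `permissions` table might look like the following. This is an illustrative sketch: `hasPermission` is a hypothetical name, `expires_at` is assumed to use the same time unit as `now`, and treating `admin` as implying every other permission is an assumption the source does not confirm.

```javascript
// Hypothetical check against rows shaped like the `permissions` table.
function hasPermission(rows, { userId, resourceType, resourceId, permission }, now = Date.now()) {
  return rows.some(
    (row) =>
      row.user_id === userId &&
      row.resource_type === resourceType &&
      row.resource_id === resourceId &&
      // Assumption: an 'admin' grant covers read/write/share/delete as well
      (row.permission === permission || row.permission === 'admin') &&
      // A null/undefined expires_at means the grant never expires
      (row.expires_at == null || row.expires_at > now)
  );
}
```

For a yacht sale, a buyer could be given a `read` grant on an entity with `expires_at` set to the end of the due-diligence window, after which the same call returns `false` without any cleanup job.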
#### H. System Settings
**Route:** `/api/admin/settings` (admin-only)
- `GET /public/app` - Public app name (no auth)
- Settings management: OCR language, email config, feature flags
- Categories: app, email, ocr, security, integrations
#### I. Table of Contents
**Route:** `/api/documents/:id/toc`
- Extract and display document structure
- PDF heading hierarchy
- Section-based navigation
#### J. Images & Media
**Route:** `/api/images`
- Extract images from PDF pages
- Image search within documents
- Figure/diagram zoom capability
#### K. Statistics
**Route:** `/api/stats`
- Organization document count
- Storage usage
- OCR processing metrics
- User activity trends
---
## 3. FRONTEND ARCHITECTURE
### Location: `/home/setup/navidocs/client/src/`
#### A. Core Views
| View | Route | Purpose |
|------|-------|---------|
| **HomeView** | `/` | Dashboard, recent docs, quick access |
| **LibraryView** | `/library` | Document browser by entity/type |
| **SearchView** | `/search` | Full-text search with filters |
| **DocumentView** | `/document/:id` | PDF viewer with OCR results |
| **AuthView** | `/auth/login`, `/register` | Login/signup forms |
| **AccountView** | `/account` | User profile, organizations |
| **JobsView** | `/jobs` | Upload progress, OCR status |
#### B. Reusable Components
| Component | Purpose |
|-----------|---------|
| **UploadModal** | Drag-drop file upload interface |
| **TocSidebar** | Document TOC navigation |
| **TocEntry** | Individual TOC item renderer |
| **DocumentView** | PDF.js viewer with search overlay |
| **ImageOverlay** | Full-screen image viewer |
| **FigureZoom** | Figure/diagram magnifier |
| **ToastContainer** | Notification system |
| **ConfirmDialog** | Action confirmation UI |
| **CompactNav** | Mobile-friendly navigation |
| **LanguageSwitcher** | UI language selection |
#### C. Framework & Libraries
- **Vue 3** with Composition API
- **Vue Router** for SPA navigation
- **Tailwind CSS** for styling (Meilisearch-inspired design)
- **PDF.js** for document rendering
- **Meilisearch JS** for client-side search
- **IndexedDB** for offline storage (PWA)
#### D. PWA Capabilities
- Service worker for offline mode
- Offline-first architecture (works 20+ miles offshore per design docs)
- Cached critical manuals
- IndexedDB for local document metadata
---
## 4. BACKGROUND WORKERS & SERVICES
### Location: `/home/setup/navidocs/server/workers/` and `/server/services/`
#### A. OCR Worker
**File:** `ocr-worker.js`
**Function:** Background processing of document uploads
- BullMQ job processor (listens to Redis queue)
- PDF text extraction via Tesseract.js or Google Vision
- Page-by-page processing with progress updates (0-100%)
- Results saved to `document_pages` table
- Automatic indexing in Meilisearch upon completion
- Error handling: 3 retries, then marks job as failed
**Flow:**
```
1. User uploads PDF → Document stored, OCR job created
2. Worker picks up job from queue
3. Extract text per page (calls ocr-hybrid.js)
4. Save OCR results to document_pages
5. Index each page in Meilisearch (searchable_text)
6. Update document status: processing → indexed
```
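The flow above can be sketched as one function with its side effects injected. Everything here is illustrative: `processOcrJob` and the `deps` callbacks are hypothetical stand-ins, not the real service APIs, and the steps are synchronous for brevity where a real BullMQ processor would be async and await each one. Retry handling is left to the queue and omitted.

```javascript
// Hypothetical, simplified version of steps 3-6 of the worker flow.
// `deps` holds injected stand-ins for the OCR, database and search services.
function processOcrJob(job, deps) {
  const { documentId, pageCount } = job;
  try {
    for (let pageNumber = 1; pageNumber <= pageCount; pageNumber++) {
      // Step 3: extract text for one page
      const { text, confidence } = deps.extractPage(documentId, pageNumber);
      // Step 4: persist the OCR result (document_pages)
      deps.savePage({ documentId, pageNumber, text, confidence });
      // Step 5: index the page so it becomes searchable
      deps.indexPage({ documentId, pageNumber, text });
      // Progress as a 0-100 percentage
      deps.reportProgress(Math.round((pageNumber / pageCount) * 100));
    }
    // Step 6: processing -> indexed
    deps.setDocumentStatus(documentId, 'indexed');
    return 'completed';
  } catch (err) {
    // Simplification: the queue's retry logic is not modelled here
    deps.setDocumentStatus(documentId, 'failed');
    return 'failed';
  }
}
```

Injecting the dependencies keeps the ordering of the steps testable without Redis, Tesseract, or Meilisearch running.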
#### B. Image Extractor
**File:** `image-extractor.js`
**Function:** Extract images from PDF pages
- Called during OCR processing
- Stores images separately for search/zoom
- Supports figure detection and metadata
#### C. OCR Service
**File:** `ocr.js`, `ocr-hybrid.js`
**Options:**
- Local: Tesseract.js (CPU-intensive, slow)
- Cloud: Google Vision API (fast, accurate, $$$)
- Hybrid: Local fallback if API fails
- Configuration via `OCR_LANGUAGE`, `OCR_CONFIDENCE_THRESHOLD`
#### D. File Safety Service
**File:** `file-safety.js`
**Validation:**
1. Extension check (.pdf only)
2. MIME type via magic bytes
3. File size limit (50MB)
4. Filename sanitization (no path traversal, null bytes, special chars)
5. Hash calculation for deduplication
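The five validation steps could be combined roughly as follows. This is a sketch under stated assumptions, not the contents of `file-safety.js`: `validateUpload` is a hypothetical name, the sanitization character set is a guess, and the checks run on an in-memory buffer.

```javascript
const crypto = require('node:crypto');
const path = require('node:path');

const MAX_SIZE_BYTES = 50 * 1024 * 1024; // 50MB limit from the list above

// Hypothetical validator following the five steps above. Returns the sanitized
// filename and SHA-256 hash, or throws with the reason for rejection.
function validateUpload(originalName, buffer) {
  if (originalName.includes('\0')) throw new Error('null byte in filename');
  // basename() drops directory components, which blocks path traversal
  const base = path.basename(originalName.replace(/\\/g, '/'));
  if (path.extname(base).toLowerCase() !== '.pdf') throw new Error('not a .pdf file');
  // PDF magic bytes: the file must start with "%PDF-"
  if (buffer.subarray(0, 5).toString('latin1') !== '%PDF-') {
    throw new Error('magic bytes do not match PDF');
  }
  if (buffer.length > MAX_SIZE_BYTES) throw new Error('file exceeds 50MB');
  // Assumed policy: replace anything outside a conservative character set
  const safeName = base.replace(/[^A-Za-z0-9._-]/g, '_');
  const hash = crypto.createHash('sha256').update(buffer).digest('hex');
  return { safeName, hash };
}
```

The returned `hash` is what a deduplication lookup against `documents.file_hash` would use before the file is written to disk.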
#### E. Search Service
**File:** `search.js`
**Features:**
- Meilisearch index creation and configuration
- Tenant token generation with user scoping
- Row-level security filter injection
- Synonym mapping (boat terminology)
- Page-level indexing (each PDF page = searchable document)
**Meilisearch Index Schema:**
```json
{
"indexName": "navidocs-pages",
"primaryKey": "id",
"searchableAttributes": ["title", "text", "systems", "categories", "tags"],
"filterableAttributes": ["boatId", "userId", "make", "model", "year", "documentType"],
"sortableAttributes": ["createdAt", "pageNumber"],
"synonyms": {
"bilge": ["sump pump", "bilge pump"],
"engine": ["motor", "powerplant"],
...40+ boat terms...
}
}
```
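Since each PDF page becomes its own searchable record, the indexing step needs a mapping from a document row plus one OCR page to the index fields above. A minimal sketch, assuming the `id` format and that `boatId` and `userId` come from `entity_id` and `uploaded_by` (neither mapping is confirmed by the source):

```javascript
// Hypothetical mapping from a document + one OCR page to a search record.
// Field names follow the index schema above; the id format is an assumption.
function toSearchDocument(doc, page) {
  return {
    id: `page_${doc.id}_${page.page_number}`, // unique per page
    title: doc.title,
    text: page.ocr_text,
    documentType: doc.document_type,
    boatId: doc.entity_id,
    userId: doc.uploaded_by,
    pageNumber: page.page_number,
    createdAt: doc.created_at,
  };
}
```

Keeping `pageNumber` on every record is what lets a search hit link straight to the matching page in the PDF viewer.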
#### F. Section Extractor
**File:** `section-extractor.js`
**Purpose:** Extract document structure (chapters, headings, sections)
#### G. Authorization Service
**File:** `authorization.service.js`
**Provides:**
- User organization list
- Entity-level permission checks
- Role validation (admin/manager/member/viewer)
- Hierarchical permission resolution
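Role validation across the four roles can be reduced to an ordered comparison. The ranking below (viewer < member < manager < admin) is an assumption: the source names the roles but does not spell out their order, and `roleAtLeast` is a hypothetical helper.

```javascript
// Assumed ordering, lowest to highest privilege.
const ROLE_ORDER = ['viewer', 'member', 'manager', 'admin'];

// Returns true when `actualRole` is at least as privileged as `requiredRole`.
function roleAtLeast(actualRole, requiredRole) {
  const actual = ROLE_ORDER.indexOf(actualRole);
  const required = ROLE_ORDER.indexOf(requiredRole);
  if (required === -1) throw new Error(`unknown role: ${requiredRole}`);
  // Unknown or missing actual roles are denied rather than treated as errors
  return actual !== -1 && actual >= required;
}
```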
#### H. Queue Service
**File:** `queue.js`
**Implementation:** BullMQ with Redis
```javascript
// Job options:
// - 3 retry attempts
// - Exponential backoff (2s, 4s, 8s)
// - Completed jobs kept for 24 hours
// - Failed jobs kept for 7 days
// - Job priority support
```
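The backoff schedule and retention windows listed above follow from two small formulas. The sketch below only reproduces that arithmetic; `backoffDelayMs` and `RETENTION_SECONDS` are illustrative names, not identifiers from `queue.js`.

```javascript
// Delay before retry number `retry` (1-based) under exponential backoff
// with a 2-second base: 2s, 4s, 8s.
function backoffDelayMs(retry, baseMs = 2000) {
  return baseMs * 2 ** (retry - 1);
}

// Retention windows from the list above, expressed in seconds.
const RETENTION_SECONDS = {
  completed: 24 * 60 * 60,  // 24 hours
  failed: 7 * 24 * 60 * 60, // 7 days
};
```

With three retries, the worst case adds 14 seconds of waiting on top of the processing time before a job is finally marked as failed.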
#### I. Audit Service
**File:** `audit.service.js`
**Tracks:**
- User login/logout
- Document uploads
- Permission changes
- Organization modifications
- Failed access attempts
- Data exports
#### J. Organization Service
**File:** `organization.service.js`
**Features:**
- Organization CRUD
- Member invitation workflows
- Permission template application
- Org statistics (doc count, storage)
#### K. Settings Service
**File:** `settings.service.js`
**Manages:**
- System configuration (app name, email settings, OCR options)
- Feature flags
- Integration credentials (for webhooks, etc.)
- Environment-specific overrides
---
## 5. INTEGRATION POINTS IDENTIFIED
### Current State: NO External Integrations
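The system is **architecturally ready** for outbound integrations (webhooks, MQTT, CRM sync), but none are implemented; the only external services in use today are the infrastructure dependencies listed under "Current Integration Status" below.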
### A. Existing Hooks & Extension Points
#### 1. Metadata Fields (JSON)
```sql
-- Organizations
metadata TEXT -- Custom org-level data (e.g., {"broker_id": "123", "region": "SE"})
-- Entities (boats)
metadata TEXT -- E.g., {"hull_cert_date": "2020-01-15", "survey_status": "valid"}
-- Components
metadata TEXT -- E.g., {"last_service": "2024-06", "service_interval_months": 12}
-- Documents
metadata TEXT -- E.g., {"sale_list_price": "450000", "condition_notes": "excellent"}
-- Document Pages
metadata TEXT -- Bounding boxes, OCR confidence per region
```
**Use for yacht sales:** Store listing price, condition report status, broker notes, survey dates.
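Because `metadata` is stored as TEXT, every update has to parse, merge, and re-serialize the JSON. A minimal sketch of such a helper, assuming a shallow merge is sufficient and that malformed existing values should be replaced rather than block the update (`mergeMetadata` is a hypothetical name):

```javascript
// Hypothetical helper for TEXT columns that hold JSON metadata.
function mergeMetadata(existingText, patch) {
  let current = {};
  if (existingText) {
    try {
      const parsed = JSON.parse(existingText);
      // Only plain objects are treated as metadata; anything else is discarded
      if (parsed && typeof parsed === 'object' && !Array.isArray(parsed)) {
        current = parsed;
      }
    } catch {
      // Malformed JSON: start from an empty object instead of failing
    }
  }
  // Shallow merge: keys in `patch` overwrite existing keys
  return JSON.stringify({ ...current, ...patch });
}
```

For example, adding a list price to an entity that already carries survey data leaves the survey keys untouched and appends `sale_list_price`.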
#### 2. Settings Table
```
system_settings (key TEXT, value TEXT, category, encrypted)
```
**Extensible for:** API keys, webhook URLs, integrations config
**Currently used for:** OCR language, email settings, feature flags
#### 3. Status Enum Fields
```sql
documents.status -- 'processing' | 'indexed' | 'failed' | 'archived' | 'deleted'
ocr_jobs.status -- 'pending' | 'processing' | 'completed' | 'failed'
```
**Extensible with:** 'sold', 'transferred', 'archived_due_to_sale', 'under_inspection'
#### 4. Background Job System
**Existing:** BullMQ queue for OCR
**Extensible for:** Webhook delivery, MQTT publishing, external API calls
### B. Potential Integration Points for Yacht Sales
#### 1. **Home Assistant Integration**
**Where:** `/api/webhooks/home-assistant` (needs creation)
**Purpose:**
- Publish boat documentation availability to yacht's systems
- Trigger documentation reminders based on automation rules
- Log documentation access to home automation timeline
**Database Changes Needed:**
```sql
-- New table
CREATE TABLE webhooks (
  id TEXT PRIMARY KEY,
  organization_id TEXT,
  type TEXT CHECK (type IN ('homeassistant', 'mqtt', 'webhook_generic')),
  endpoint_url TEXT,
  auth_token TEXT,  -- stored encrypted
  events TEXT,      -- JSON array, e.g. ["document.uploaded", "document.indexed", "ocr.completed"]
  active BOOLEAN,
  created_at INTEGER
);
-- Add to metadata
-- documents.metadata: {"ha_entity_id": "sensor.boat_name_docs_updated"}
-- entities.metadata: {"ha_boat_id": "yacht_123", "automations": [...]}
```
**Webhook Events:**
```json
{
"event": "document.indexed",
"timestamp": 1699868400,
"document": {
"id": "...",
"title": "Engine Manual",
"documentType": "component-manual",
"component": { "name": "Volvo Penta D6-400", "serialNumber": "..." }
},
"entity": {
"id": "...",
"name": "35ft Yacht",
"hull_id": "..."
}
}
```
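A delivery worker would need to serialize this event and, ideally, let the receiver verify its origin. The sketch below builds the body in the shape shown above and adds an HMAC-SHA256 signature. Signing is an assumption that goes beyond the source, and both `buildWebhookDelivery` and the `X-NaviDocs-Signature` header name are hypothetical.

```javascript
const crypto = require('node:crypto');

// Hypothetical: builds the event body shown above plus a signature header.
// The signing scheme and header name are assumptions, not part of the source.
function buildWebhookDelivery(event, doc, entity, secret, now = Date.now()) {
  const body = JSON.stringify({
    event,
    timestamp: Math.floor(now / 1000), // Unix seconds, as in the example event
    document: { id: doc.id, title: doc.title, documentType: doc.document_type },
    entity: { id: entity.id, name: entity.name, hull_id: entity.hull_id },
  });
  // HMAC over the exact bytes that will be sent
  const signature = crypto.createHmac('sha256', secret).update(body).digest('hex');
  return {
    body,
    headers: {
      'Content-Type': 'application/json',
      'X-NaviDocs-Signature': signature,
    },
  };
}
```

The receiver recomputes the HMAC over the raw request body with the shared secret and rejects the event if the values differ.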
#### 2. **MQTT Broker Integration**
**Where:** New worker `/server/workers/mqtt-publisher.js`
**Purpose:**
- Publish document metadata to boat's IoT network
- Subscribe to maintenance triggers
- Real-time documentation sync to onboard displays
**Topic Schema:**
```
navidocs/organizations/{org_id}/entities/{boat_id}/documents/{type}/{component}
navidocs/organizations/{org_id}/entities/{boat_id}/maintenance/triggers
```
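Topic strings following this schema should be built from sanitized segments, since `/`, `+`, and `#` have special meaning in MQTT topics. A minimal sketch with hypothetical helper names; replacing reserved characters with `_` is an assumed policy:

```javascript
// MQTT topic levels must not contain '/', the wildcards '+' and '#', or NUL.
function topicLevel(value) {
  return String(value).replace(/[\/+#\u0000]/g, '_');
}

// Builds a document topic following the first line of the schema above.
function documentTopic({ orgId, boatId, type, component }) {
  return ['navidocs', 'organizations', orgId, 'entities', boatId, 'documents', type, component]
    .map(topicLevel)
    .join('/');
}

// Builds the maintenance trigger topic from the second line of the schema.
function maintenanceTopic({ orgId, boatId }) {
  return ['navidocs', 'organizations', orgId, 'entities', boatId, 'maintenance', 'triggers']
    .map(topicLevel)
    .join('/');
}
```

An onboard display could then subscribe with a wildcard such as `navidocs/organizations/{org_id}/entities/{boat_id}/documents/#` to receive every document update for one boat.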
#### 3. **Broker CRM Integration** (e.g., Zillow, MLS)
**Where:** `/api/integrations/broker-crm`
**Purpose:**
- Auto-tag documents by boat listing
- Sync sale status (pending, sold, withdrawn)
- Pull boat specs from MLS into system
**Database Additions:**
```sql
ALTER TABLE entities ADD COLUMN mls_id TEXT;
ALTER TABLE documents ADD COLUMN crm_external_id TEXT;
-- Use metadata for broker-specific fields
-- entities.metadata: {"mls_id": "...", "listing_agent": "...", "sale_price": "..."}
```
#### 4. **Document Storage Integration** (S3, Backblaze)
**Where:** File path resolution in `/routes/documents.js`
**Current:** Files stored in `./uploads` directory
**Extensible to:** Cloud storage with signed URLs
**Implementation Pattern:**
```javascript
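// Current
const filePath = path.join(UPLOAD_DIR, document.file_path);
fs.readFileSync(filePath);
// Future (with integration); AWS SDK v2 style, S3_BUCKET is an assumed config value:
if (document.storage_type === 's3') {
  // getObject takes a params object; Body holds the file contents as a Buffer
  const object = await s3.getObject({ Bucket: S3_BUCKET, Key: document.file_path }).promise();
  return object.Body;
} else if (document.storage_type === 'local') {
  return fs.readFileSync(filePath);
}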
```
#### 5. **Survey & Inspection APIs**
**Where:** `/api/integrations/survey-provider`
**Purpose:**
- Link boat surveys from HagsEye, DNV, etc.
- Sync inspection status to document metadata
- Auto-generate compliance documents
**Data Model:**
```sql
-- Survey linking
ALTER TABLE entities ADD COLUMN survey_provider_id TEXT;
-- Status tracking
-- documents.status: could add 'survey-required', 'survey-pending', 'survey-complete'
-- metadata: {"survey_date": "2024-06-15", "surveyor": "...", "next_survey": "2025-06-15"}
```
#### 6. **Email/Notification Integration**
**Existing:** Partial (email reset links)
**Extensible to:**
- Document ready notifications (OCR complete)
- Access granted notifications
- Document expiration warnings (warranty dates, inspection due)
**Table Already Exists:**
```sql
-- system_settings can store email provider config
-- documents.metadata: {"notify_on_expiry": true, "expiry_date": "2025-01-01"}
```
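Expiration warnings could be driven by the `notify_on_expiry` and `expiry_date` metadata keys shown above. The following is a sketch only: `documentsExpiringSoon` is a hypothetical name, dates are assumed to be ISO `YYYY-MM-DD` strings, and a real job would query the database rather than filter an array.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Hypothetical: select documents that opted into expiry notices and whose
// expiry_date falls within the next `withinDays` days.
function documentsExpiringSoon(documents, withinDays, now = Date.now()) {
  return documents.filter((doc) => {
    let meta;
    try {
      meta = JSON.parse(doc.metadata || '{}');
    } catch {
      return false; // malformed metadata: skip instead of failing the whole run
    }
    if (!meta || !meta.notify_on_expiry || !meta.expiry_date) return false;
    const expiresAt = Date.parse(meta.expiry_date); // date-only ISO strings parse as UTC
    if (Number.isNaN(expiresAt)) return false;
    return expiresAt >= now && expiresAt - now <= withinDays * DAY_MS;
  });
}
```

Running this once a day from the existing job queue would be enough to feed warranty and survey reminders into the email channel.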
#### 7. **Audit & Compliance Export**
**Where:** `/api/integrations/compliance-export`
**Purpose:** Export document chain-of-custody for yacht sale closing
**Existing Foundation:**
```sql
-- audit_logs table tracks all document access
-- permissions table tracks who granted what access
-- documents track uploaded_by + creation date
```
#### 8. **Sync to Cloud Directory**
**Where:** `/api/integrations/cloud-directory`
**Purpose:** Mirror documents to client's cloud account (Google Drive, OneDrive)
**Services:**
- Google Drive API (already referenced in code: `ocr-google-drive.js`)
- OneDrive API
- Dropbox API
**Implementation Path:**
```javascript
// New worker: /server/workers/cloud-sync-worker.js
// On document indexed:
// 1. Convert to Google Drive folder structure
// 2. Upload PDF + OCR text file
// 3. Store sync metadata in documents table
```
### C. Current Integration Status
**Implemented:**
- ✅ Google Vision API (optional OCR)
- ✅ Google Drive API (referenced in services)
- ✅ Redis (BullMQ job queue)
- ✅ Meilisearch (search engine)
**Partially Implemented:**
- ⚠️ JWT auth (core system, but no 3rd-party OAuth)
- ⚠️ Email (password reset, not general notifications)
**Not Implemented (Gaps for Yacht Sales):**
- ❌ Home Assistant webhook
- ❌ MQTT publisher
- ❌ Broker CRM sync
- ❌ Cloud storage (S3, Backblaze)
- ❌ Survey provider APIs
- ❌ OAuth (Google, Microsoft, Apple sign-in)
- ❌ Notification system (beyond email)
- ❌ Document expiration alerts
---
## 6. OFFLINE-FIRST PWA CAPABILITIES
### Existing Infrastructure
**Design Document:** Architecture summary mentions offline-first PWA
- Service worker caching of critical manuals
- Works 20+ miles offshore (per design spec)
- IndexedDB for local state
### Not Yet Fully Implemented
- Service worker registration code not found in client/src/
- Manifest.json not created yet
- Offline mode for boat emergencies needs completion
### Enhancement Opportunity for Yacht Sales
**Scenario:** Buyer viewing yacht specs on boat during sea trial
```
1. Download critical manuals before leaving shore
2. Access offline on iPad while viewing systems
3. Sync back to server when WiFi available
4. Signature capture for inspection sign-off
```
---
## 7. SECURITY ARCHITECTURE
### Implemented
- **JWT Auth** (7-day expiry, refresh tokens)
- **Tenant Scoping** (Meilisearch tenant tokens, 1-hour TTL)
- **File Validation** (extension, magic bytes, size)
- **Role-Based Access Control** (admin/manager/member/viewer)
- **Permission Granularity** (document/entity/organization level)
- **Helmet Security Headers** (CSP, HSTS, etc.)
- **Rate Limiting** (100 req/15min default)
- **Password Hashing** (bcrypt)
- **Audit Logging** (all auth events, document access)
- **Prepared Statements** (SQL injection prevention)
### Not Yet Fully Implemented
- ⚠️ Email verification workflow (scaffolding exists)
- ⚠️ 2FA/MFA
- ⚠️ IP whitelisting for organizations
- ⚠️ API keys for machine-to-machine auth
---
## 8. CURRENT CAPABILITIES VS. YACHT SALES USE CASE
### Strengths
1. **Multi-entity Management**
- Multiple boats per broker/agency
- Components tracked (engines, systems)
- Hierarchical organization
2. **Document Search**
- Full-text OCR + Meilisearch
- Synonym mapping (boat terms)
- Metadata filtering (vessel type, make, model)
3. **Access Control**
- Share docs with buyers
- Broker/agent team collaboration
- Granular permissions
4. **Offline Access** ✅ (Design complete, implementation partial)
- Critical manuals cached
- Offline PWA mode
5. **Compliance Tracking**
- Document creation date + uploader
- Warranty date tracking (components)
- Survey/inspection metadata via JSON
6. **Multi-tenancy**
- Brokers manage multiple boats
- Team member roles
- Organization-level statistics
### Gaps for Yacht Sales
1. **No Listing Integration**
- Not linked to MLS/YachtWorld data
- Broker CRM sync not implemented
2. **No Sale Workflow**
- No "as-built" package generation
- No closing checklist
- No transfer-of-ownership flow
3. **No Signature Capture**
- Buyers can't sign off on receipt
- No e-signature integration
4. **Limited Notifications**
- No alerts for expiring surveys/certificates
- No "buyer accessed doc" notifications
- No OCR completion alerts to user
5. **No Media Support (Video)**
- System built for documents
- No walkthrough video links
- No marina tour videos
6. **No Real-Time Collaboration**
- No commenting on specific pages
- No annotation tools
- No comment notifications
---
## 9. FILE STRUCTURE REFERENCE
```
/home/setup/navidocs/
├── server/
│ ├── db/
│ │ ├── schema.sql ← Database schema (13 tables)
│ │ ├── migrations/ ← Schema evolution
│ │ └── db.js ← Connection singleton
│ │
│ ├── routes/ ← API endpoints (12 route files)
│ │ ├── auth.routes.js ← Login, register, password reset
│ │ ├── organization.routes.js ← Multi-tenancy + org member management
│ │ ├── permission.routes.js ← Access control
│ │ ├── settings.routes.js ← System configuration
│ │ ├── documents.js ← Document CRUD + ownership check
│ │ ├── upload.js ← File upload + OCR queue
│ │ ├── jobs.js ← OCR job status
│ │ ├── search.js ← Meilisearch tokens + search
│ │ ├── toc.js ← Table of contents
│ │ ├── images.js ← Image extraction from PDFs
│ │ ├── stats.js ← Usage statistics
│ │ └── quick-ocr.js ← Fast OCR endpoint
│ │
│ ├── services/
│ │ ├── auth.service.js ← User registration, login, token refresh
│ │ ├── authorization.service.js ← Permission checking
│ │ ├── organization.service.js ← Org CRUD operations
│ │ ├── audit.service.js ← Event logging
│ │ ├── settings.service.js ← Config management
│ │ ├── queue.js ← BullMQ job queue
│ │ ├── search.js ← Meilisearch indexing
│ │ ├── file-safety.js ← File validation
│ │ ├── ocr.js ← Tesseract OCR client
│ │ ├── ocr-google-vision.js ← Google Vision API
│ │ ├── ocr-hybrid.js ← Fallback OCR strategy
│ │ ├── section-extractor.js ← Document structure extraction
│ │ └── toc-extractor.js ← TOC generation
│ │
│ ├── workers/
│ │ ├── ocr-worker.js ← Background OCR processor (Redis queue)
│ │ └── image-extractor.js ← Image extraction from PDFs
│ │
│ ├── middleware/
│ │ └── auth.middleware.js ← JWT validation + org checks
│ │
│ ├── config/
│ │ ├── db.js ← SQLite config
│ │ ├── meilisearch.js ← Search engine config
│ │ └── redis.js ← Job queue config
│ │
│ ├── index.js ← Express server + route mounting
│ └── package.json
├── client/
│ ├── src/
│ │ ├── views/ ← Page components
│ │ │ ├── HomeView.vue ← Dashboard
│ │ │ ├── LibraryView.vue ← Document browser
│ │ │ ├── SearchView.vue ← Full-text search
│ │ │ ├── DocumentView.vue ← PDF viewer
│ │ │ ├── AuthView.vue ← Login/signup
│ │ │ ├── AccountView.vue ← User profile
│ │ │ └── JobsView.vue ← Upload progress
│ │ │
│ │ ├── components/ ← Reusable UI components
│ │ │ ├── UploadModal.vue ← Drag-drop upload
│ │ │ ├── DocumentView.vue ← PDF.js viewer
│ │ │ ├── TocSidebar.vue ← TOC navigation
│ │ │ ├── ImageOverlay.vue ← Full-screen images
│ │ │ ├── ToastContainer.vue ← Notifications
│ │ │ └── ... (8 more components)
│ │ │
│ │ ├── router.js ← Vue Router configuration
│ │ ├── App.vue ← Root component
│ │ └── main.js ← Entry point
│ │
│ └── package.json
└── docs/
├── ARCHITECTURE-SUMMARY.md ← Design overview
├── DESIGN_AUTH_MULTITENANCY.md ← Auth system spec
├── API_SUMMARY.md ← API documentation
└── [39 other analysis/design docs]
```
---
## 10. RECOMMENDED INTEGRATION ROADMAP
### Phase 1: Quick Wins (Week 1-2)
1. **Settings UI for Webhook Configuration**
- Store Home Assistant webhook URL in system_settings
- Enable/disable per organization
- Test webhook delivery
2. **Metadata Templates for Yacht Sales**
- Add "vessel_specs" template (HIN, survey date, condition notes)
- Add "sale_info" template (list price, broker ID, showings)
- Add "buyer_info" template (name, contact, purchase contingencies)
3. **Status Enhancements**
- Extend `documents.status` to include 'sale-package', 'transferred'
- Add `documents.expiry_date` for survey/inspection tracking
### Phase 2: Core Integrations (Week 3-4)
1. **Home Assistant Webhook Handler**
- `/api/webhooks/home-assistant` endpoint
- Publish document indexed/uploaded events
- Subscribe to boat status changes
2. **Notification System**
- Email notifications when docs ready
- Expiration warnings (surveys, warranties)
- Access notifications to seller/broker
3. **Cloud Storage Option**
- S3/Backblaze as alternative to local disk
- Signed URL generation for downloads
### Phase 3: Automation (Week 5-6)
1. **As-Built Package Generator**
- Collect all boat documents
- Generate PDF with index
- Auto-upload to Google Drive for buyer
2. **Broker CRM Sync** (if connecting to external CRM)
- Sync boat data from MLS
- Tag documents by listing
- Update sale status in NaviDocs
3. **MQTT Integration**
- Publish doc metadata to boat's IoT network
- Real-time sync for onboard displays
---
## 11. DATABASE MIGRATION PLAN
To support integrations without breaking changes:
```sql
-- New table for external integrations
CREATE TABLE integrations (
id TEXT PRIMARY KEY,
organization_id TEXT NOT NULL,
type TEXT NOT NULL, -- 'webhook', 'mqtt', 'crm', 'cloud_storage'
name TEXT,
config TEXT NOT NULL, -- JSON: {url, auth, events_enabled, etc}
active BOOLEAN DEFAULT 1,
created_at INTEGER,
updated_at INTEGER,
FOREIGN KEY (organization_id) REFERENCES organizations(id) ON DELETE CASCADE
);
-- Track integration events
CREATE TABLE integration_events (
id TEXT PRIMARY KEY,
integration_id TEXT NOT NULL,
event_type TEXT, -- 'document.uploaded', 'document.indexed', etc
document_id TEXT,
payload TEXT, -- JSON: full event data
status TEXT, -- 'pending', 'delivered', 'failed'
retry_count INTEGER DEFAULT 0,
created_at INTEGER,
FOREIGN KEY (integration_id) REFERENCES integrations(id) ON DELETE CASCADE,
FOREIGN KEY (document_id) REFERENCES documents(id) ON DELETE SET NULL
);
-- Extend documents for sale workflow
ALTER TABLE documents ADD COLUMN sale_package_id TEXT; -- Links related docs
ALTER TABLE documents ADD COLUMN requires_signature BOOLEAN DEFAULT 0;
ALTER TABLE documents ADD COLUMN signature_metadata TEXT; -- {signer, timestamp, ip}
```
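The `status` and `retry_count` columns of `integration_events` imply a small state machine for delivery attempts. A sketch of one transition, where `MAX_RETRIES` is an assumed policy (chosen to match the OCR queue's three attempts) and `afterDeliveryAttempt` is a hypothetical name:

```javascript
// Assumed policy: give up after three failed delivery attempts.
const MAX_RETRIES = 3;

// Hypothetical transition for one integration_events row after one attempt.
// Returns a new row object; the caller would persist it.
function afterDeliveryAttempt(eventRow, delivered) {
  if (delivered) return { ...eventRow, status: 'delivered' };
  const retryCount = eventRow.retry_count + 1;
  return {
    ...eventRow,
    retry_count: retryCount,
    // Stay 'pending' until the retry budget is used up, then mark 'failed'
    status: retryCount >= MAX_RETRIES ? 'failed' : 'pending',
  };
}
```

Keeping failed rows with their payload makes it possible to replay events once a misconfigured endpoint has been fixed.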
---
## 12. SUMMARY: READY FOR INTEGRATION
### ✅ Foundation Complete
- Multi-tenancy architecture solid
- API design extensible
- Service layer pattern established
- Background job system in place
- Metadata fields support custom data
### ⚠️ Needs Implementation
- Integration management UI
- Webhook delivery system
- Home Assistant/MQTT publishers
- Notification queue
- E-signature support
- Sale workflow automation
### 🎯 Yacht Sales Specific
**The schema fits boat documentation well because:**
1. Entities = individual boats with full specs
2. Components = engines, systems, equipment (trackable by serial #)
3. Document types support all manuals, service records, surveys
4. Metadata allows custom broker fields (price, condition, showings)
5. Multi-tenancy supports broker teams + multiple boats
6. Offline PWA supports sea trial scenarios (once the offline implementation described in Section 6 is completed)
7. Access control handles buyer access management
**Next Steps:**
1. Add Home Assistant integration (highest ROI for smart yacht ecosystem)
2. Build yacht sales-specific metadata templates
3. Create "as-built package" generator for closing
4. Add notification system for time-sensitive docs (surveys, warranties)