Session 2: Complete technical architecture from 11 Haiku agents

All 11 agents (S2-H01 through S2-H09 + S2-H03A + S2-H07A) have completed
their technical specifications:

- S2-H01: NaviDocs codebase architecture analysis
- S2-H02: Inventory tracking system (€15K-€50K value recovery)
- S2-H03: Maintenance log & reminder system
- S2-H04: Camera & Home Assistant integration
- S2-H05: Contact management system
- S2-H06: Accounting module & receipt OCR integration
- S2-H07: Impeccable search UX (Meilisearch facets)
- S2-H08: WhatsApp Business API + AI agent integration
- S2-H09: Document versioning with IF.TTT compliance
- S2-H03A: VAT/tax jurisdiction tracking & compliance
- S2-H07A: Multi-calendar system (4 calendar types)

Total: ~15,600 lines of technical specifications
Status: Ready for S2-H10 synthesis (awaiting Session 1 completion)
IF.bus: All inter-agent communications documented

2025-11-13 01:57:25 +00:00


NaviDocs Codebase Architecture Map

Analysis Date: 2025-11-13
Agent: S2-H01
Status: Complete

1. Database Schema Summary

Core Entities

The NaviDocs database uses SQLite (v3) with a schema designed for future PostgreSQL migration. All timestamps use Unix epoch (seconds).
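
Since JavaScript's Date.getTime() returns milliseconds, code that writes these INTEGER columns has to convert to seconds first. A minimal sketch of that conversion follows; the helper names are hypothetical and not taken from the NaviDocs codebase.

```javascript
// Hypothetical helpers (illustrative names, not from the NaviDocs codebase).

// Convert a Date (default: now) to Unix epoch seconds for INTEGER timestamp columns.
function toEpochSeconds(date = new Date()) {
  return Math.floor(date.getTime() / 1000);
}

// Convert a stored epoch-seconds value back into a Date.
function fromEpochSeconds(seconds) {
  return new Date(seconds * 1000);
}
```

Flooring rather than rounding keeps a stored timestamp from landing in the future relative to the moment it was taken.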

User Management

- users (id: TEXT PRIMARY KEY)
  - id: UUID
  - email: TEXT UNIQUE
  - password_hash: TEXT (bcrypt)
  - name: TEXT
  - status: TEXT (active, suspended, deleted)
  - email_verified: BOOLEAN
  - created_at, updated_at: INTEGER
  - last_login_at: INTEGER
  - failed_login_attempts, locked_until: Security fields

Organization Structure (Multi-tenant)

- organizations (id: TEXT PRIMARY KEY)
  - id: UUID
  - name: TEXT
  - type: TEXT (personal, commercial, hoa)
  - created_at, updated_at: INTEGER

- user_organizations (user_id + organization_id PRIMARY KEY)
  - role: TEXT (admin, manager, member, viewer)
  - joined_at: INTEGER

Entity Management (Boats, Marinas, Properties)

- entities (id: TEXT PRIMARY KEY)
  - id: UUID
  - organization_id: FK
  - user_id: FK (primary owner)
  - entity_type: TEXT (boat, marina, condo, yacht-club)
  - name: TEXT

  Boat-specific:
  - make, model, year: TEXT/INTEGER
  - hull_id: TEXT
  - vessel_type: TEXT (powerboat, sailboat, catamaran, trawler)
  - length_feet: INTEGER

  Property-specific:
  - property_type: TEXT
  - address: TEXT
  - gps_lat, gps_lon: REAL

  - metadata: TEXT (JSON)
  - created_at, updated_at: INTEGER

Hierarchical Component Structure

- sub_entities (id: TEXT PRIMARY KEY)
  - id: UUID
  - entity_id: FK
  - name: TEXT (system, dock, unit, facility)
  - type: TEXT
  - metadata: TEXT (JSON)

- components (id: TEXT PRIMARY KEY)
  - id: UUID
  - sub_entity_id: FK (optional)
  - entity_id: FK (direct link)
  - name, manufacturer, model_number, serial_number: TEXT
  - install_date, warranty_expires: INTEGER
  - metadata: TEXT (JSON)

Document Management

- documents (id: TEXT PRIMARY KEY)
  - id: UUID
  - organization_id: FK
  - entity_id, sub_entity_id, component_id: FK (hierarchical linking)
  - uploaded_by: FK (user)
  - title, document_type: TEXT
  - file_path, file_name, file_size: TEXT/INTEGER
  - file_hash: TEXT (SHA256 for deduplication)
  - mime_type: TEXT (default: application/pdf)
  - page_count: INTEGER
  - language: TEXT (default: en)
  - status: TEXT (processing, indexed, failed, archived, deleted)
  - replaced_by: TEXT (document supersession)
  - is_shared: BOOLEAN
  - shared_component_id: TEXT (for shared manual library)
  - metadata: TEXT (JSON)
  - created_at, updated_at: INTEGER

- document_pages (id: TEXT PRIMARY KEY)
  - id: UUID (page_<doc_id>_<page_num>)
  - document_id: FK
  - page_number: INTEGER
  - ocr_text: TEXT
  - ocr_confidence: REAL (0-1)
  - ocr_language: TEXT (default: en)
  - ocr_completed_at: INTEGER
  - search_indexed_at: INTEGER
  - meilisearch_id: TEXT
  - section: TEXT (TOC section name)
  - section_key: TEXT (normalized key)
  - section_order: INTEGER
  - metadata: TEXT (JSON - bounding boxes, etc)

- document_images (extracted from PDFs)
  - id: UUID
  - documentId: FK
  - pageNumber: INTEGER
  - imageIndex: INTEGER
  - imagePath: TEXT
  - imageFormat: TEXT (png, jpeg)
  - width, height: INTEGER
  - position: TEXT (JSON)
  - extractedText: TEXT
  - textConfidence: REAL
  - anchorTextBefore, anchorTextAfter: TEXT

Background Jobs

- ocr_jobs (id: TEXT PRIMARY KEY)
  - id: UUID
  - document_id: FK
  - status: TEXT (pending, processing, completed, failed)
  - progress: INTEGER (0-100%)
  - error: TEXT
  - started_at, completed_at: INTEGER
  - created_at: INTEGER

Permissions and Sharing

- permissions (granular access control)
  - id: UUID
  - resource_type: TEXT (document, entity, organization)
  - resource_id: FK
  - user_id: FK
  - permission: TEXT (read, write, share, delete, admin)
  - granted_by, granted_at: FK + INTEGER
  - expires_at: INTEGER (optional)

- entity_permissions (entity-level access)
  - id: UUID
  - user_id, entity_id: FK
  - permission_level: TEXT (viewer, editor, manager, admin)
  - granted_by, granted_at: FK + INTEGER
  - expires_at: INTEGER

- document_shares (simplified document sharing)
  - id: UUID
  - document_id, shared_by, shared_with: FK
  - permission: TEXT (read, write)
  - created_at: INTEGER

Authentication Tokens

- refresh_tokens (JWT session management)
  - id: UUID
  - user_id: FK
  - token_hash: TEXT (SHA256)
  - device_info, ip_address: TEXT
  - expires_at: INTEGER
  - revoked: BOOLEAN
  - created_at, revoked_at: INTEGER

- password_reset_tokens
  - id: UUID
  - user_id: FK
  - token_hash: TEXT (SHA256)
  - expires_at: INTEGER
  - used: BOOLEAN
  - ip_address: TEXT
  - used_at: INTEGER

User Preferences

- bookmarks (quick access)
  - id: UUID
  - user_id, document_id: FK
  - page_id: FK (optional - specific page)
  - label: TEXT
  - quick_access: BOOLEAN (pin to homepage)
  - created_at: INTEGER

Audit Trail (Optional)

- audit_events (not shown in schema but referenced in code)
  - Logs all significant operations for compliance
  - user_id, event_type, resource_type, resource_id
  - status, ip_address, user_agent, metadata

Settings/Configuration

- settings (key-value store)
  - key: TEXT PRIMARY KEY
  - value: TEXT (JSON)
  - description: TEXT
  - category: TEXT

Key Indexes

idx_entities_org, idx_entities_user, idx_entities_type
idx_documents_org, idx_documents_entity, idx_documents_status, idx_documents_hash, idx_documents_shared
idx_pages_document, idx_pages_indexed
idx_jobs_status, idx_jobs_document
idx_permissions_user, idx_permissions_resource
idx_bookmarks_user

2. API Endpoints (Grouped by Feature)

Authentication Endpoints (`/api/auth`)

File: server/routes/auth.routes.js

POST /api/auth/register
  - Input: email, password, name
  - Output: userId, email, verificationToken
  - Logging: audit.service logs user.register

POST /api/auth/login
  - Input: email, password, deviceInfo, ipAddress
  - Output: accessToken (JWT), refreshToken, user object
  - Auth: None (initial login)
  - Side Effects: Updates failed_login_attempts, triggers account lock after 5 failures

POST /api/auth/refresh
  - Input: refreshToken
  - Output: new accessToken, user object
  - Auth: None (token-based)

POST /api/auth/logout
  - Input: refreshToken
  - Output: success message
  - Side Effects: Revokes refresh token

POST /api/auth/logout-all
  - Input: None (uses JWT)
  - Output: success message
  - Side Effects: Revokes all user tokens
  - Auth: JWT required

POST /api/auth/password/reset-request
  - Input: email
  - Output: generic success (doesn't reveal email exists)
  - Side Effects: Creates password_reset_tokens entry

POST /api/auth/password/reset
  - Input: token, newPassword
  - Output: success message
  - Side Effects: Updates password, revokes all refresh tokens
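
The reset flow stores only a hash of the token (token_hash, SHA256 per the schema). The sketch below shows one way such a token could be issued and checked with Node's built-in crypto module. The function names, the one-hour lifetime, and the row shape are assumptions for illustration, not the code in auth.service.js.

```javascript
const crypto = require('node:crypto');

// SHA-256 hex digest of a token; this is the value that would be persisted.
function hashToken(token) {
  return crypto.createHash('sha256').update(token).digest('hex');
}

// Issue a random token for the user plus the hash and expiry to store.
// ttlSeconds = 3600 is an assumed lifetime, not a documented NaviDocs value.
function createResetToken(ttlSeconds = 3600, nowSeconds = Math.floor(Date.now() / 1000)) {
  const token = crypto.randomBytes(32).toString('hex');
  return { token, tokenHash: hashToken(token), expiresAt: nowSeconds + ttlSeconds };
}

// A presented token is accepted only for an unused, unexpired row whose hash matches.
// `row` mirrors the password_reset_tokens columns (used, expires_at, token_hash).
function isResetTokenValid(row, presentedToken, nowSeconds) {
  return !row.used && row.expires_at > nowSeconds && row.token_hash === hashToken(presentedToken);
}
```

Because only the hash is stored, a leaked database row cannot be replayed as a reset link.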

POST /api/auth/email/verify
  - Input: token
  - Output: email, success message
  - Side Effects: Sets email_verified = 1

GET /api/auth/me
  - Input: None (JWT)
  - Output: user object (id, email, name, status, emailVerified, createdAt, lastLoginAt)
  - Auth: JWT required

Organization Management (`/api/organizations`)

File: server/routes/organization.routes.js

POST /api/organizations
  - Input: name, type (optional), metadata (optional)
  - Output: organization object
  - Auth: JWT required

GET /api/organizations
  - Input: None
  - Output: Array of user's organizations with role
  - Auth: JWT required

GET /api/organizations/:organizationId
  - Input: organizationId in params
  - Output: organization details with userRole
  - Auth: JWT + requireOrganizationMember

PUT /api/organizations/:organizationId
  - Input: name, type, metadata
  - Output: updated organization
  - Auth: JWT + requireOrganizationRole('manager')

DELETE /api/organizations/:organizationId
  - Input: organizationId
  - Output: success message with deleted count
  - Auth: JWT + requireOrganizationRole('admin')

GET /api/organizations/:organizationId/members
  - Input: organizationId
  - Output: Array of members with roles
  - Auth: JWT + requireOrganizationMember

POST /api/organizations/:organizationId/members
  - Input: userId, role (optional)
  - Output: success message
  - Auth: JWT + requireOrganizationRole('manager')
  - Side Effects: Adds or updates user role

DELETE /api/organizations/:organizationId/members/:userId
  - Input: organizationId, userId
  - Output: success message with removed role
  - Auth: JWT + requireOrganizationRole('manager')

GET /api/organizations/:organizationId/stats
  - Input: organizationId
  - Output: organization statistics (document count, member count, etc)
  - Auth: JWT + requireOrganizationMember

Permission Management (`/api/permissions`)

File: server/routes/permission.routes.js (referenced but not fully reviewed)

Expected endpoints:
- POST /api/permissions/grant (grant permission to user)
- DELETE /api/permissions/revoke (revoke permission)
- GET /api/permissions/check (check permission)

Document Management (`/api/documents`)

File: server/routes/documents.js

POST /api/upload
  - Input: file (PDF), title, documentType, organizationId, entityId (optional), componentId (optional), subEntityId (optional)
  - Output: jobId, documentId, message
  - Auth: None (TODO: should be JWT)
  - Side Effects:
    * Validates file safety (file-safety.service)
    * Generates SHA256 hash for deduplication
    * Creates documents and ocr_jobs records
    * Adds OCR job to BullMQ queue
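
The hash-for-deduplication step can be sketched as follows. It assumes the upload is available as a Buffer (consistent with Multer memory storage) and that duplicates are looked up per organization. The findByHash callback is a hypothetical stand-in for the real database query.

```javascript
const crypto = require('node:crypto');

// Hex SHA-256 digest of an uploaded file buffer (the value kept in file_hash).
function sha256Hex(buffer) {
  return crypto.createHash('sha256').update(buffer).digest('hex');
}

// Hypothetical lookup: `findByHash` stands in for a query such as
//   SELECT id FROM documents WHERE organization_id = ? AND file_hash = ?
// Scoping the lookup to one organization is an assumption.
function findDuplicate(buffer, organizationId, findByHash) {
  const fileHash = sha256Hex(buffer);
  const existing = findByHash(organizationId, fileHash);
  return { fileHash, duplicateOf: existing ? existing.id : null };
}
```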

GET /api/documents
  - Input: organizationId, entityId, documentType, status, limit, offset (query params)
  - Output: { documents: [], pagination: { total, limit, offset, hasMore } }
  - Auth: None (TODO: should verify organization membership)
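
The pagination object in the response above can be derived from the query parameters and a total row count. This is a sketch only: the default limit of 50 is borrowed from the jobs endpoint, and the cap of 100 is an assumption.

```javascript
// Parse limit/offset query params, falling back to defaults on bad input.
// defaultLimit and maxLimit are assumed values, not confirmed for /api/documents.
function parsePaging(query, defaultLimit = 50, maxLimit = 100) {
  const limit = Number.parseInt(query.limit, 10);
  const offset = Number.parseInt(query.offset, 10);
  return {
    limit: Number.isInteger(limit) && limit > 0 ? Math.min(limit, maxLimit) : defaultLimit,
    offset: Number.isInteger(offset) && offset >= 0 ? offset : 0,
  };
}

// Build the { total, limit, offset, hasMore } object returned alongside documents.
function buildPagination(total, limit, offset) {
  return { total, limit, offset, hasMore: offset + limit < total };
}
```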

GET /api/documents/:id
  - Input: documentId in params
  - Output: Full document metadata + pages array + entity + component info
  - Auth: Checks organization membership, document ownership, or share access
  - Side Effects: Parses metadata JSON

GET /api/documents/:id/pdf
  - Input: documentId
  - Output: PDF file stream (inline)
  - Auth: Same as GET /api/documents/:id
  - Security: Path traversal protection

DELETE /api/documents/:id
  - Input: documentId
  - Output: success message with document title
  - Auth: None (TODO: should verify ownership)
  - Side Effects:
    * Deletes from Meilisearch index
    * Deletes from database (CASCADE deletes document_pages, ocr_jobs)
    * Deletes file from filesystem

Upload Routes (`/api/upload`)

File: server/routes/upload.js

POST /api/upload (same as above but dedicated file)
  - Multer configuration: 50MB limit, memory storage
  - Creates document in processing state
  - Queues OCR job via queue.service

Quick OCR Route (`/api/upload/quick-ocr`)

File: server/routes/quick-ocr.js (referenced but not fully reviewed)

Expected endpoint:
- POST /api/upload/quick-ocr (rapid OCR without document creation)

Job Management (`/api/jobs`)

File: server/routes/jobs.js

GET /api/jobs/:id
  - Input: jobId
  - Output: { jobId, documentId, status, progress, error, startedAt, completedAt, createdAt, document? }
  - Auth: None (TODO)
  - Status values: pending, processing, completed, failed
  - Document info included only if status === completed

GET /api/jobs
  - Input: status (optional), limit (default 50), offset (default 0)
  - Output: { jobs: [], pagination: { limit, offset } }
  - Auth: Filters to current user's jobs
  - Status filtering: Only allows pending|processing|completed|failed

Search (`/api/search`)

File: server/routes/search.js

POST /api/search/token
  - Input: expiresIn (seconds, default 3600, max 86400)
  - Output: { token, expiresAt, indexName, searchUrl, mode }
  - Auth: JWT (gets user's organizations)
  - Modes: 'tenant' (preferred) or 'search-key' (fallback)
  - Side Effects: Generates Meilisearch tenant token with organization filters

POST /api/search
  - Input: q (query string), filters? (documentType, entityId, language), limit, offset
  - Output: { hits, estimatedTotalHits, query, processingTimeMs, limit, offset }
  - Auth: JWT
  - Meilisearch filters: userId or organizationId membership
  - Additional filters: documentType, entityId, language
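
The access and optional filters described above could be combined into a single Meilisearch filter expression along these lines. The helper names are hypothetical, and the exact expression the route builds is not shown in the source, so treat this as one plausible shape.

```javascript
// Quote a value for use inside a Meilisearch filter expression.
function quote(value) {
  return '"' + String(value).replace(/\\/g, '\\\\').replace(/"/g, '\\"') + '"';
}

// Access clause: the user's own pages OR pages in any of their organizations,
// ANDed with whichever optional filters were supplied.
function buildSearchFilter(userId, organizationIds, filters = {}) {
  const access = [`userId = ${quote(userId)}`];
  if (organizationIds.length > 0) {
    access.push(`organizationId IN [${organizationIds.map(quote).join(', ')}]`);
  }
  const clauses = [`(${access.join(' OR ')})`];
  for (const key of ['documentType', 'entityId', 'language']) {
    if (filters[key]) clauses.push(`${key} = ${quote(filters[key])}`);
  }
  return clauses.join(' AND ');
}
```

Wrapping the access clause in parentheses keeps a caller-supplied filter from widening the set of pages a user can see.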

GET /api/search/health
  - Input: None
  - Output: { status, meilisearch: <health_response> }
  - Auth: None

Image Management (`/api/images`)

File: server/routes/images.js

GET /api/documents/:id/images
  - Input: documentId
  - Output: { documentId, imageCount, images: [{ id, pageNumber, imageIndex, format, width, height, position, extractedText, confidence, imageUrl }] }
  - Auth: Verifies document access
  - Side Effects: Parses position JSON

GET /api/documents/:id/pages/:pageNum/images
  - Input: documentId, pageNumber
  - Output: { documentId, pageNumber, imageCount, images: [] }
  - Auth: Verifies document and page exist
  - Validation: pageNumber must be >= 1

GET /api/images/:imageId
  - Input: imageId (img_<uuid>_p<page>_<index>_<timestamp> or UUID)
  - Output: Image file stream (PNG or JPEG)
  - Auth: Verifies document access
  - Rate Limiting: 200 requests per minute (more permissive than API)
  - Security: Path traversal prevention (normalizes path, checks within /uploads)
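
A path-traversal guard of the kind described (normalize, then check the result stays within the uploads directory) might look like the sketch below. The function name is hypothetical, and the real check in images.js may differ in detail.

```javascript
const path = require('node:path');

// Resolve a stored relative path against the upload root and reject anything
// that would land outside it (e.g. "../secret" or an absolute path).
function resolveInsideUploads(uploadDir, relativePath) {
  const root = path.resolve(uploadDir);
  const target = path.resolve(root, relativePath);
  const rel = path.relative(root, target);
  const escapes = rel === '' || rel === '..' || rel.startsWith('..' + path.sep) || path.isAbsolute(rel);
  if (escapes) {
    throw new Error('Path escapes upload directory');
  }
  return target;
}
```

Comparing via path.relative avoids the prefix pitfall where "/uploads-evil" would pass a naive startsWith('/uploads') test.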

Table of Contents (`/api/documents/:documentId/toc`)

File: server/routes/toc.js

GET /api/documents/:documentId/toc
  - Input: documentId, format? (flat|tree, default flat)
  - Output: { entries: [], format, count }
  - Auth: None (TODO)
  - Caching: LRU cache (200 max, 30 min TTL)
  - Side Effects: Builds tree structure if format=tree

POST /api/documents/:documentId/toc/extract
  - Input: documentId
  - Output: { success, entriesCount, tocPages: [], message }
  - Auth: None (TODO)
  - Side Effects:
    * Calls extractTocFromDocument (section-extractor.service)
    * Invalidates LRU cache entries

Statistics (`/api/stats`)

File: server/routes/stats.js (referenced but not fully reviewed)

Expected endpoints:
- GET /api/stats/organization/:organizationId
- GET /api/stats/documents
- GET /api/stats/search

Settings (`/api/admin/settings`)

File: server/routes/settings.routes.js (referenced but not fully reviewed)

Expected endpoints:
- GET /api/admin/settings (get all settings)
- PUT /api/admin/settings/:key (update setting)
- GET /api/settings/public/app (public app settings - no auth)

Health Check

GET /health
  - Output: { status, timestamp, uptime }
  - Auth: None

3. Service Layer Architecture

Authentication Service

File: server/services/auth.service.js

Key Functions:

register(email, password, name) - User registration with bcrypt hashing (12 rounds)
login(email, password, deviceInfo, ipAddress) - JWT + refresh token generation
refreshAccessToken(refreshToken) - Generate new JWT from refresh token
revokeRefreshToken(refreshToken) - Revoke single token (logout)
revokeAllUserTokens(userId) - Logout all devices
requestPasswordReset(email, ipAddress) - Generate reset token
resetPassword(token, newPassword) - Validate token and update password
verifyEmail(token) - Mark email as verified
getUserById(userId) - Fetch user details
verifyAccessToken(token) - Validate JWT

Token Management:

JWT Access Token: expiresIn from env (default 15m)
Refresh Token: 7 days in seconds (604800)
Refresh tokens are stored hashed rather than in plaintext (the refresh_tokens schema lists token_hash as SHA256)
JWT Secret: process.env.JWT_SECRET (must change in production)

Security Features:

Password minimum 8 characters
Account lockout after 5 failed login attempts (15 min lock)
Refresh token revocation on password reset
Email verification token support
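
The lockout rule (5 failed attempts, 15 minute lock) can be expressed as pure functions over the users columns failed_login_attempts and locked_until. This is a sketch with hypothetical function names. Resetting the counter on a successful login is an assumption.

```javascript
const MAX_FAILED_ATTEMPTS = 5;          // from the security features listed above
const LOCK_DURATION_SECONDS = 15 * 60;  // 15 minute lock

// True while the account's lock has not yet expired.
function isLocked(user, nowSeconds) {
  return user.locked_until != null && user.locked_until > nowSeconds;
}

// Column updates to apply after a failed password check.
function recordFailedLogin(user, nowSeconds) {
  const attempts = user.failed_login_attempts + 1;
  return {
    failed_login_attempts: attempts,
    locked_until: attempts >= MAX_FAILED_ATTEMPTS ? nowSeconds + LOCK_DURATION_SECONDS : user.locked_until,
  };
}

// Column updates to apply after a successful login (assumed reset behaviour).
function recordSuccessfulLogin(nowSeconds) {
  return { failed_login_attempts: 0, locked_until: null, last_login_at: nowSeconds };
}
```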

Authorization Service

File: server/services/authorization.service.js

Key Functions:

grantEntityPermission(userId, entityId, permissionLevel, grantedBy, expiresAt) - Grant entity access
revokeEntityPermission(userId, entityId, revokedBy) - Revoke entity access
checkEntityPermission(userId, entityId, minimumPermission) - Check if user has permission
getUserEntityPermissions(userId, options) - Get all user's entity permissions
getEntityPermissions(entityId, options) - Get all entity's permissions
addOrganizationMember(userId, organizationId, role, addedBy) - Add to organization
removeOrganizationMember(userId, organizationId, removedBy) - Remove from organization
checkOrganizationMembership(userId, organizationId, minimumRole) - Check membership
getOrganizationMembers(organizationId) - List org members
getUserOrganizations(userId) - Get user's organizations
cleanupExpiredPermissions() - Cleanup task

Permission Hierarchy:

Entity Permissions: viewer (0) < editor (1) < manager (2) < admin (3)
Organization Roles: viewer (0) < member (1) < manager (2) < admin (3)
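
Both hierarchies reduce to a rank comparison. Here is a minimal sketch of a "minimum level" check, using the numeric ranks listed above; the function name is hypothetical.

```javascript
// Rank tables copied from the permission hierarchy above.
const ENTITY_LEVELS = { viewer: 0, editor: 1, manager: 2, admin: 3 };
const ORG_ROLES = { viewer: 0, member: 1, manager: 2, admin: 3 };

// True when `actual` ranks at or above `minimum`; unknown names never pass.
function meetsMinimum(ranks, actual, minimum) {
  if (!Object.hasOwn(ranks, actual) || !Object.hasOwn(ranks, minimum)) return false;
  return ranks[actual] >= ranks[minimum];
}
```

For example, meetsMinimum(ORG_ROLES, 'member', 'manager') is false, which is the check a guard like requireOrganizationRole('manager') needs to make.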

Audit Integration:

All permission grants/revokes logged via logAuditEvent()

Organization Service

File: server/services/organization.service.js (referenced but not fully reviewed)

Expected Functions:

createOrganization(name, type, metadata, createdBy)
updateOrganization(organizationId, name, type, metadata, updatedBy)
deleteOrganization(organizationId, deletedBy)
getOrganizationById(organizationId)
getOrganizationStats(organizationId)

Search Service (Meilisearch Integration)

File: server/services/search.js

Key Functions:

indexDocumentPage(pageId, documentId, pageNumber, text, confidence) - Index page in Meilisearch
generateTenantToken(userId, organizationIds, expiresIn) - Generate tenant-scoped token

Meilisearch Index:

Index name: navidocs-pages (env configurable)
Searchable attributes: ocr text, metadata
Filtering: organizationId, userId, documentType, entityId, language

Document structure:

{
  id: string (unique page ID),
  docId: string (document UUID),
  pageNumber: integer,
  organizationId: string,
  userId: string,
  documentType: string,
  text: string (OCR content),
  language: string,
  ocrConfidence: number,
  createdAt: integer,
  updatedAt: integer
}

Tenant Token Support:

Scoped search to user's organizations
Expiration support (max 24 hours)
Fallback to search API key if tenant token fails

Queue Service (BullMQ)

File: server/services/queue.js

Key Functions:

getOcrQueue() - Get singleton queue instance
addOcrJob(documentId, jobId, data) - Add OCR job to queue
getJobStatus(jobId) - Get BullMQ job status
closeQueue() - Graceful shutdown

Queue Configuration:

Redis connection: REDIS_HOST (default 127.0.0.1), REDIS_PORT (default 6379)
Queue name: ocr-processing
Job retry: 3 attempts with exponential backoff (2s base)
Cleanup: Complete jobs kept 24h, failed jobs kept 7 days
Job options: priority support
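
Expressed as a BullMQ-style job options object, the configuration above might look like the sketch below. The helper that lists retry delays assumes the exponential strategy doubles the base delay on each retry. That matches BullMQ's built-in behaviour as far as I know, but it is worth confirming against the installed version.

```javascript
// Job options mirroring the listed queue configuration (shape follows BullMQ's JobsOptions).
const defaultJobOptions = {
  attempts: 3,                                    // total attempts, including the first
  backoff: { type: 'exponential', delay: 2000 },  // 2s base
  removeOnComplete: { age: 24 * 60 * 60 },        // keep completed jobs 24h (seconds)
  removeOnFail: { age: 7 * 24 * 60 * 60 },        // keep failed jobs 7 days (seconds)
};

// Delay before each retry, assuming delay * 2^(n-1) for the n-th retry.
function retryDelays({ attempts, backoff }) {
  const delays = [];
  for (let retry = 1; retry < attempts; retry++) {
    delays.push(backoff.delay * 2 ** (retry - 1));
  }
  return delays;
}
```

With three attempts there are two retries, so under this assumption a failing job waits 2s and then 4s before being marked failed.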

Job Data Structure:

{
  documentId: string,
  jobId: string,
  filePath: string,
  fileName: string,
  organizationId: string,
  userId: string,
  priority: number (optional)
}

OCR Service

File: server/services/ocr.js (referenced)

Expected Functions:

extractTextFromImage(imagePath, language) - Tesseract.js OCR on images
cleanOCRText(text) - Clean and normalize OCR output

OCR Hybrid Service

File: server/services/ocr-hybrid.js (referenced)

Expected Functions:

extractTextFromPDF(filePath, options) - Extract text from PDF with progress callback
Returns: [{ pageNumber, text, confidence, error }]

OCR Google Vision Service

File: server/services/ocr-google-vision.js (referenced)

Expected Functions:

Alternative OCR provider (Google Cloud Vision)

OCR Client Service

File: server/services/ocr-client.js (referenced)

Expected Functions:

Client-side OCR coordination

Section Extractor Service

File: server/services/section-extractor.js (referenced)

Expected Functions:

extractSections(filePath, ocrResults) - Extract document sections/headings
mapPagesToSections(sections, totalPages) - Map pages to TOC sections

TOC Extractor Service

File: server/services/toc-extractor.js (referenced)

Expected Functions:

getDocumentToc(documentId) - Fetch TOC from database
buildTocTree(entries) - Build hierarchical tree from flat list
extractTocFromDocument(documentId) - Extract TOC from PDF
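
Building a tree from a flat TOC list is commonly done with an ancestor stack. The sketch below assumes each entry carries a numeric level (1 = top level) and arrives in document order. That entry shape is a guess, since toc-extractor.js was not reviewed, and the function is named differently from buildTocTree to make clear it is not that implementation.

```javascript
// Sketch: nest flat TOC entries by their `level` field (assumed shape).
function buildTocTreeSketch(entries) {
  const roots = [];
  const stack = []; // chain of ancestors for the entry being placed
  for (const entry of entries) {
    const node = { ...entry, children: [] };
    // Drop ancestors that are at the same or a deeper level than this entry.
    while (stack.length > 0 && stack[stack.length - 1].level >= node.level) {
      stack.pop();
    }
    if (stack.length === 0) roots.push(node);
    else stack[stack.length - 1].children.push(node);
    stack.push(node);
  }
  return roots;
}
```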

Audit Service

File: server/services/audit.service.js (referenced)

Expected Functions:

logAuditEvent(userId, eventType, status, ipAddress, userAgent, metadata, resourceType, resourceId)
Logs all security-relevant actions

Settings Service

File: server/services/settings.service.js (referenced)

Expected Functions:

getSetting(key) - Get setting by key
setSetting(key, value) - Set/update setting
getAllSettings() - Get all settings

File Safety Service

File: server/services/file-safety.js

Expected Functions:

validateFile(file) - Validate file type, size, etc.
sanitizeFilename(filename) - Remove dangerous characters

4. Background Job Patterns (BullMQ Usage)

OCR Worker

File: server/workers/ocr-worker.js

Job Processing Pipeline:

1. Job Initialization
   - Receives { documentId, jobId, filePath, fileName, organizationId, userId, priority }
   - Updates ocr_jobs: status = 'processing', progress = 0, started_at = now
2. PDF Text Extraction (60-70% of job)
   - Calls extractTextFromPDF() with progress callback
   - Returns: [{ pageNumber, text, confidence, error }]
   - Concurrency: 2 documents at a time (env: OCR_CONCURRENCY)
   - Limiter: 5 jobs per minute (prevents Tesseract overload)
3. Page Processing (per page)
   - Clean OCR text via cleanOCRText()
   - Insert/update document_pages
   - Index in Meilisearch via indexDocumentPage()
   - Store confidence scores and language
4. Image Extraction (per page)
   - Extract images via extractImagesFromPage()
   - Run Tesseract on each image
   - Store in document_images table
   - Index image text in Meilisearch with documentType: 'image'
5. Section/TOC Extraction (post-processing)
   - Call extractSections() and mapPagesToSections()
   - Update document_pages with section metadata (section, section_key, section_order)
   - Call extractTocFromDocument() for TOC entries
6. Completion
   - Update documents: status = 'indexed', imagesExtracted = 1
   - Update ocr_jobs: status = 'completed', progress = 100, completed_at = now
   - Return: { success: true, documentId, pagesProcessed }
7. Error Handling
   - On failure: status = 'failed', error = error.message
   - Continues processing other pages on individual page failures
   - Re-throws to mark BullMQ job as failed
   - Retries up to 3 times with exponential backoff
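
The "continue on individual page failures" behaviour can be sketched as a loop that records per-page errors instead of aborting. The handlers object is a hypothetical stand-in for the real services (cleanOCRText, the document_pages upsert, indexDocumentPage). The actual worker code may be organized differently.

```javascript
// Hypothetical page loop; `handlers` injects the real side effects.
async function processPages(pages, handlers) {
  let pagesProcessed = 0;
  const failures = [];
  for (const page of pages) {
    try {
      // extractTextFromPDF reports per-page errors in the `error` field.
      if (page.error) throw new Error(page.error);
      const text = handlers.cleanText(page.text);
      await handlers.savePage(page.pageNumber, text, page.confidence);
      await handlers.indexPage(page.pageNumber, text, page.confidence);
      pagesProcessed++;
    } catch (err) {
      // A single bad page is recorded and skipped; the remaining pages still run.
      failures.push({ pageNumber: page.pageNumber, error: err.message });
    }
  }
  return { pagesProcessed, failures };
}
```

Errors thrown outside this loop (for example, the PDF failing to open) would still propagate, which is what lets BullMQ mark the job failed and schedule a retry.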

Event Handlers:

worker.on('completed', (job, result) => { /* log */ })
worker.on('failed', (job, error) => { /* log error */ })
worker.on('error', (error) => { /* worker crash */ })
worker.on('ready', () => { /* worker ready */ })

Graceful Shutdown:

SIGTERM / SIGINT handlers
Calls worker.close() and connection.quit()

Image Extractor Worker

File: server/workers/image-extractor.js

Expected Functionality:

extractImagesFromPage(filePath, pageNumber, documentId) - Extract images from PDF page
Returns: [{ id, path, format, width, height, imageIndex, position }]

5. Integration Points for New Features

Inventory Management Feature

Integration Points:

Database Schema:

Extend components table with inventory fields:

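-- SQLite adds one column per ALTER TABLE statement
ALTER TABLE components ADD COLUMN quantity_available INTEGER DEFAULT 0;
ALTER TABLE components ADD COLUMN reorder_level INTEGER;
ALTER TABLE components ADD COLUMN supplier_info TEXT;  -- JSON with supplier contacts
ALTER TABLE components ADD COLUMN last_purchased_date INTEGER;
ALTER TABLE components ADD COLUMN purchase_cost REAL;
ALTER TABLE components ADD COLUMN location_storage TEXT;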

Create inventory_transactions table for audit trail

API Endpoints:
- POST /api/inventory/items - Create inventory item (link to component)
- GET /api/inventory/items - List inventory with filters
- PUT /api/inventory/items/:id - Update quantity/location
- POST /api/inventory/items/:id/transactions - Record transaction (purchase, use, transfer)
- GET /api/inventory/alerts - Get low-stock alerts
Service Layer:
- Create server/services/inventory.service.js:
  - createInventoryItem(componentId, quantity, reorderLevel, supplier)
  - updateInventoryQuantity(itemId, change, reason, userId)
  - getInventoryAlerts(organizationId)
  - calculateReorderPoints()
Route File:
- Create server/routes/inventory.routes.js
- Add to server/index.js: app.use('/api/inventory', inventoryRoutes);
BullMQ Job (Optional):
- Create background job for inventory replenishment alerts
- Queue in server/workers/inventory-alerts.js
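
A low-stock check over the proposed inventory columns could look like this sketch. Treating "at or below reorder_level" as the alert condition is an assumption, as is the output shape. Rows without a reorder_level are skipped.

```javascript
// Sketch of the alert rule: flag components whose stock is at or below the reorder level.
// `items` are rows from components with the proposed inventory columns.
function getLowStockAlerts(items) {
  return items
    .filter((item) => item.reorder_level != null && item.quantity_available <= item.reorder_level)
    .map((item) => ({
      componentId: item.id,
      name: item.name,
      quantityAvailable: item.quantity_available,
      reorderLevel: item.reorder_level,
      shortfall: item.reorder_level - item.quantity_available,
    }))
    .sort((a, b) => b.shortfall - a.shortfall); // largest shortfall first
}
```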

Maintenance Tracking Feature

Integration Points:

Database Schema:

Extend components table:

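-- SQLite adds one column per ALTER TABLE statement; dates are Unix epoch seconds
ALTER TABLE components ADD COLUMN maintenance_interval_days INTEGER;
ALTER TABLE components ADD COLUMN last_maintenance_date INTEGER;
ALTER TABLE components ADD COLUMN next_maintenance_date INTEGER;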

Create maintenance_logs table:

CREATE TABLE maintenance_logs (
  id TEXT PRIMARY KEY,
  component_id TEXT REFERENCES components(id),
  entity_id TEXT REFERENCES entities(id),
  performed_by TEXT REFERENCES users(id),
  maintenance_type TEXT CHECK (maintenance_type IN ('inspection', 'service', 'repair', 'replacement')),
  description TEXT,
  cost REAL,
  duration_hours REAL,
  next_scheduled_date INTEGER,
  document_id TEXT REFERENCES documents(id),  -- reference manual
  created_at INTEGER
);

API Endpoints:
- POST /api/maintenance/logs - Log maintenance event
- GET /api/maintenance/logs - List maintenance history
- GET /api/maintenance/schedule - Get upcoming maintenance
- PUT /api/maintenance/logs/:id - Update log
- DELETE /api/maintenance/logs/:id - Remove log
Service Layer:
- Create server/services/maintenance.service.js:
  - logMaintenance(componentId, type, description, performedBy)
  - getMaintenanceHistory(componentId, limit)
  - getUpcomingMaintenance(organizationId)
  - calculateNextMaintenanceDate(componentId)
Route File:
- Create server/routes/maintenance.routes.js
- Add to server/index.js: app.use('/api/maintenance', maintenanceRoutes);
Background Job:
- Create server/workers/maintenance-reminders.js
- BullMQ cron job to check and send alerts
Search Integration:
- Index maintenance logs in Meilisearch for searchability
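
With dates stored as epoch seconds, the next-due calculation is simple arithmetic on the proposed columns. The sketch below uses hypothetical function names. The fallback to install_date and the 30-day look-ahead window are assumptions.

```javascript
const SECONDS_PER_DAY = 86400;

// Next due date = last maintenance (or install date if never serviced) + interval.
// Returns null when no interval or base date is known.
function nextMaintenanceDate(component) {
  const { last_maintenance_date, install_date, maintenance_interval_days } = component;
  if (!maintenance_interval_days) return null;
  const base = last_maintenance_date ?? install_date;
  if (base == null) return null;
  return base + maintenance_interval_days * SECONDS_PER_DAY;
}

// Components due within `windowDays` of now (overdue ones included), soonest first.
function getUpcoming(components, nowSeconds, windowDays = 30) {
  const horizon = nowSeconds + windowDays * SECONDS_PER_DAY;
  return components
    .map((c) => ({ id: c.id, due: nextMaintenanceDate(c) }))
    .filter((c) => c.due != null && c.due <= horizon)
    .sort((a, b) => a.due - b.due);
}
```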

Camera/Document Capture Feature

Integration Points:

Database Schema:

Extend documents table:

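-- SQLite adds one column per ALTER TABLE statement
ALTER TABLE documents ADD COLUMN capture_method TEXT;  -- upload, camera, screenshot, scan
ALTER TABLE documents ADD COLUMN camera_device_info TEXT;  -- JSON with device metadata
ALTER TABLE documents ADD COLUMN capture_timestamp INTEGER;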

Create camera_sessions table:

CREATE TABLE camera_sessions (
  id TEXT PRIMARY KEY,
  user_id TEXT REFERENCES users(id),
  organization_id TEXT REFERENCES organizations(id),
  device_info TEXT,  -- JSON
  started_at INTEGER,
  ended_at INTEGER,
  capture_count INTEGER
);

API Endpoints:
- POST /api/capture/camera-session - Start camera session
- POST /api/capture/upload-frame - Upload single camera frame
- GET /api/capture/sessions - List capture sessions
- POST /api/capture/batch-process - Process batch of frames as single document
Service Layer:
- Create server/services/capture.service.js:
  - createCameraSession(userId, organizationId, deviceInfo)
  - uploadCaptureFrame(sessionId, imageBuffer, frameNumber)
  - processCaptureSession(sessionId) - Convert frames to PDF
  - getSessionCaptures(sessionId)
Route File:
- Create server/routes/capture.routes.js
- Add to server/index.js: app.use('/api/capture', captureRoutes);
Background Job:
- Extend OCR worker to handle batch-captured images
- Create server/workers/batch-processor.js for frame-to-PDF conversion
Client Integration:
- Camera API integration in Vue 3 frontend
- WebRTC support for real-time preview

New Feature Route Registration Pattern

Standard Integration Checklist:

// 1. Create service file: server/services/[feature].service.js
// 2. Create route file: server/routes/[feature].routes.js
// 3. Add to server/index.js:
import [feature]Routes from './routes/[feature].routes.js';
app.use('/api/[feature]', [feature]Routes);

// 4. If background job needed:
// - Create server/workers/[feature]-worker.js
// - Extend queue.service.js with get[Feature]Queue()

// 5. If search needed:
// - Index documents via Meilisearch client in service layer

// 6. Database schema changes:
// - Add migration file or update schema.sql comments
// - Test with db/init.js

6. Tech Stack Validation

Backend Stack

Technology	Version	Purpose	Status
Node.js	18+	Runtime	Running
Express.js	^5.0.0	Web framework	Active
SQLite (better-sqlite3)	^11.0.0	Database	Active
PostgreSQL	-	Planned migration target	Not yet
Redis (ioredis)	^5.0.0	Queue backend	Required
BullMQ	^5.0.0	Job queue	Active
JWT (jsonwebtoken)	^9.0.2	Authentication	Active
Bcryptjs	^3.0.2	Password hashing	Active
Meilisearch	^0.41.0	Full-text search	Active
Tesseract.js	^5.0.0	OCR engine	Active
PDF processing	-	-	-
├─ pdf-parse	^1.1.1	PDF parsing	Active
├─ pdf-img-convert	^2.0.0	PDF to image	Active
├─ pdfjs-dist	^4.0.0	PDF viewer lib	Client
Image processing	-	-	-
├─ sharp	^0.34.4	Image optimization	Active
Multer	^1.4.5-lts.1	File upload	Active
file-type	^19.0.0	File validation	Active
Helmet	^7.0.0	Security headers	Active
CORS	^2.8.5	Cross-origin	Active
Rate-limit	^7.0.0	Request limiting	Active
LRU-Cache	^11.2.2	TOC caching	Active
UUID	^10.0.0	ID generation	Active
dotenv	^16.0.0	Config management	Active

Frontend Stack

Technology	Version	Purpose	Status
Vue.js	^3.5.0	UI framework	Active
Vue Router	^4.4.0	Client routing	Active
Pinia	^2.2.0	State management	Active
Vue i18n	^9.14.5	Internationalization	Active
Vite	^5.0.0	Build tool	Active
Tailwind CSS	^3.4.0	Styling	Active
PostCSS	^8.4.0	CSS processing	Active
Meilisearch SDK	^0.41.0	Client search	Active
PDF.js	^4.0.0	PDF viewer	Active
Playwright	^1.40.0	Testing	Dev

Infrastructure Requirements

Service	Configuration	Purpose
Database	SQLite file (or PostgreSQL)	Primary data store
Redis	`REDIS_HOST` (default 127.0.0.1:6379)	BullMQ backend
Meilisearch	`MEILISEARCH_HOST` (default http://127.0.0.1:7700)	Search service
File Storage	`/uploads` directory	PDF and image storage

Environment Variables (Key)

# Server
PORT=3001
NODE_ENV=development
ALLOWED_ORIGINS=http://localhost:5173

# Database
DATABASE_PATH=./navidocs.db

# Redis
REDIS_HOST=127.0.0.1
REDIS_PORT=6379

# Meilisearch
MEILISEARCH_HOST=http://127.0.0.1:7700
MEILISEARCH_MASTER_KEY=<key>
MEILISEARCH_SEARCH_KEY=<key>
MEILISEARCH_INDEX_NAME=navidocs-pages

# JWT
JWT_SECRET=your-secret-key-change-in-production
JWT_EXPIRES_IN=15m

# File Upload
UPLOAD_DIR=./uploads
MAX_FILE_SIZE=52428800  # 50MB

# OCR
OCR_CONCURRENCY=2

# Rate Limiting
RATE_LIMIT_WINDOW_MS=900000  # 15 minutes
RATE_LIMIT_MAX_REQUESTS=100
IMAGE_RATE_LIMIT_MAX_REQUESTS=200
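
One way to read these variables with the defaults shown, and to refuse the placeholder JWT secret in production, is sketched below. The loadConfig name, the returned object shape, and the comma-separated ALLOWED_ORIGINS format are assumptions for illustration.

```javascript
// Placeholder value from the sample environment file above.
const PLACEHOLDER_SECRET = 'your-secret-key-change-in-production';

// Build a typed config object from an environment map, applying the documented defaults.
function loadConfig(env = process.env) {
  const int = (value, fallback) => {
    const parsed = Number.parseInt(value, 10);
    return Number.isNaN(parsed) ? fallback : parsed;
  };
  if (env.NODE_ENV === 'production' && (!env.JWT_SECRET || env.JWT_SECRET === PLACEHOLDER_SECRET)) {
    throw new Error('JWT_SECRET must be set to a non-default value in production');
  }
  return {
    port: int(env.PORT, 3001),
    redis: { host: env.REDIS_HOST || '127.0.0.1', port: int(env.REDIS_PORT, 6379) },
    meilisearch: {
      host: env.MEILISEARCH_HOST || 'http://127.0.0.1:7700',
      indexName: env.MEILISEARCH_INDEX_NAME || 'navidocs-pages',
    },
    jwtExpiresIn: env.JWT_EXPIRES_IN || '15m',
    maxFileSize: int(env.MAX_FILE_SIZE, 52428800), // 50MB
    ocrConcurrency: int(env.OCR_CONCURRENCY, 2),
    // Assumes a comma-separated list of origins.
    allowedOrigins: (env.ALLOWED_ORIGINS || 'http://localhost:5173').split(',').map((o) => o.trim()),
  };
}
```

Failing fast at startup is cheaper than discovering in production that tokens were signed with a published placeholder.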

Validation Summary

Confirmed Technologies:

Vue 3: ✓ Installed (^3.5.0)
Express.js: ✓ Installed (^5.0.0)
SQLite: ✓ Installed via better-sqlite3 (^11.0.0)
Redis: ✓ Installed via ioredis (^5.0.0)
Meilisearch: ✓ Installed (^0.41.0)
Tesseract: ✓ Installed via tesseract.js (^5.0.0)

Status: All core tech stack components present and correctly configured.

7. Architecture Diagram (Text-based)

┌─────────────────────────────────────────────────────────────────┐
│                     CLIENT LAYER (Vue 3)                        │
├─────────────────────────────────────────────────────────────────┤
│ • Vue Router (SPA navigation)                                    │
│ • Pinia (state management)                                       │
│ • Meilisearch Client SDK (full-text search UI)                  │
│ • PDF.js (document viewer)                                       │
│ • Tailwind CSS (styling)                                         │
└─────────────────────────────────────────────────────────────────┘
                              ↓ HTTP/REST
┌─────────────────────────────────────────────────────────────────┐
│                    EXPRESS.JS API LAYER                          │
├─────────────────────────────────────────────────────────────────┤
│ Routes: /api/auth, /api/documents, /api/search, /api/upload,    │
│         /api/organizations, /api/jobs, /api/maintenance, etc     │
│                                                                   │
│ Middleware: Authentication (JWT), Authorization, Rate Limiting   │
│             Request Logging, Security Headers (Helmet)           │
│                                                                   │
│ Response: JSON (documents, images, search results)               │
└─────────────────────────────────────────────────────────────────┘
                    ↓              ↓              ↓
        ┌─────────────────────────────────────────────────┐
        │   SERVICE LAYER (Business Logic)                │
        ├─────────────────────────────────────────────────┤
        │ • auth.service.js - JWT, password hashing       │
        │ • authorization.service.js - Permission checks  │
        │ • search.js - Meilisearch indexing              │
        │ • queue.js - BullMQ job management              │
        │ • ocr-hybrid.js - PDF text extraction           │
        │ • inventory.service.js - (new feature)          │
        │ • maintenance.service.js - (new feature)        │
        │ • capture.service.js - (new feature)            │
        └─────────────────────────────────────────────────┘
                    ↓              ↓              ↓
        ┌────────────────────┐  ┌──────────────────────┐  ┌─────────────────┐
        │   SQLite DB        │  │   Redis Queue        │  │  Meilisearch    │
        ├────────────────────┤  ├──────────────────────┤  ├─────────────────┤
        │ • users            │  │ ocr-processing queue │  │ Full-text index │
        │ • organizations    │  │ job data + status    │  │ Page documents  │
        │ • documents        │  │ (in-memory)          │  │ Image text      │
        │ • entities         │  │                      │  │                 │
        │ • components       │  │                      │  │                 │
        │ • permissions      │  │                      │  │                 │
        │ • maintenance_logs │  │                      │  │                 │
        │ • inventory_items  │  │                      │  │                 │
        └────────────────────┘  └──────────────────────┘  └─────────────────┘
                    ↓
        ┌──────────────────────────┐
        │  Background Workers      │
        ├──────────────────────────┤
        │ • ocr-worker.js          │
        │   - PDF → text           │
        │   - Tesseract.js OCR     │
        │   - Index to Meilisearch │
        │   - Extract images       │
        │   - Extract TOC          │
        │                          │
        │ • inventory-alerts       │
        │ • maintenance-reminders  │
        │ • batch-processor        │
        └──────────────────────────┘
                    ↓
        ┌──────────────────────┐
        │  File System         │
        ├──────────────────────┤
        │ /uploads/            │
        │ • PDF documents      │
        │ • Extracted images   │
        │ • Temporary files    │
        └──────────────────────┘

8. Data Flow Examples

Document Upload & OCR Processing Flow

1. User uploads PDF via POST /api/upload
   ├─ Multer stores file in memory
   ├─ File validation (size, type)
   ├─ SHA256 hash for deduplication
   ├─ File saved to disk (/uploads/:docId.pdf)
   ├─ Document record created (status: processing)
   ├─ ocr_job record created (status: pending)
   └─ Response: { jobId, documentId }

2. API queues OCR job via queue.service.addOcrJob()
   └─ BullMQ adds to Redis 'ocr-processing' queue

3. OCR Worker picks up job
   ├─ extractTextFromPDF() using pdf-parse + Tesseract.js
   ├─ Per page:
   │  ├─ cleanOCRText()
   │  ├─ Insert document_page record
   │  ├─ Index in Meilisearch
   │  ├─ extractImagesFromPage()
   │  │  ├─ Convert page to image
   │  │  ├─ Extract embedded images
   │  │  └─ Run OCR on each image
   │  └─ Store image metadata
   ├─ extractSections() for TOC
   ├─ Update document status: indexed
   └─ Update ocr_job: completed

4. User polls GET /api/jobs/:jobId
   ├─ Checks database ocr_jobs record
   └─ Response: { status, progress, documentId }

5. Document now searchable
   ├─ GET /api/search/token → Meilisearch auth
   ├─ POST /api/search → Full-text search results
   └─ GET /api/documents/:id → Page list with OCR
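
The SHA256 deduplication step in the upload flow can be sketched as follows. This is an illustration under stated assumptions: `sha256Hex` and `findDuplicate` are hypothetical names, and the `Map` stands in for a lookup against the stored document hashes.

```javascript
const crypto = require('node:crypto');

// Hex SHA-256 digest of an uploaded file held in memory as a Buffer.
function sha256Hex(buffer) {
  return crypto.createHash('sha256').update(buffer).digest('hex');
}

// Hypothetical dedup check: knownHashes maps hash -> existing document id.
// Returns the hash plus the id of an existing document, or null if new.
function findDuplicate(knownHashes, buffer) {
  const hash = sha256Hex(buffer);
  return { hash, duplicateOf: knownHashes.get(hash) ?? null };
}

// Usage: the second upload of identical bytes is reported as a duplicate.
const known = new Map();
const first = findDuplicate(known, Buffer.from('manual contents'));
known.set(first.hash, 'doc-1');
const second = findDuplicate(known, Buffer.from('manual contents'));
console.log(second.duplicateOf); // doc-1
```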

Search & Document Retrieval Flow

1. User requests search token
   POST /api/search/token
   ├─ Verifies user's organizations
   ├─ Generates Meilisearch tenant token (org-scoped)
   └─ Response: { token, expiresAt, searchUrl }

2. Client calls Meilisearch directly with token
   ├─ Client library: meilisearch.index().search(q)
   └─ Results filtered by organization

3. User clicks document result
   GET /api/documents/:id
   ├─ Verify ownership/access
   ├─ Fetch document + pages + entity/component
   └─ Response: Full metadata + page list

4. User views PDF
   GET /api/documents/:id/pdf
   ├─ Verify access
   ├─ Stream file from /uploads/:id.pdf
   └─ Response: PDF stream

5. User views document images
   GET /api/documents/:id/images
   ├─ Query document_images table
   └─ Response: Image metadata + URLs

6. Client fetches image
   GET /api/images/:imageId
   ├─ Verify access
   ├─ Rate limit (200/min)
   ├─ Path traversal check
   └─ Stream: /uploads/:docId/image_*.png
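
The path traversal check in step 6 could look roughly like this. `resolveInsideUploads` is a hypothetical name, and the sketch only shows the normalization idea, not the project's actual handler.

```javascript
const path = require('node:path');

// Resolve a requested relative path under the upload root and return null
// when the normalized result would fall outside that root.
function resolveInsideUploads(uploadRoot, requestedPath) {
  const root = path.resolve(uploadRoot);
  const target = path.resolve(root, requestedPath);
  const relative = path.relative(root, target);
  const escapes =
    relative === '' ||                      // the root itself, not a file
    relative === '..' ||
    relative.startsWith('..' + path.sep) || // climbs above the root
    path.isAbsolute(relative);              // e.g. a different drive on Windows
  return escapes ? null : target;
}
```

Comparing the relative path after resolution, rather than searching the raw string for "..", also covers inputs such as `doc1/../../etc/passwd` and absolute paths.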

Permission & Sharing Flow

1. Document Owner Shares Document
   POST /api/documents/:id/share
   ├─ Create document_shares record
   ├─ Audit log: document.share event
   └─ Response: { success, sharedWith }

2. Recipient Accesses Document
   GET /api/documents/:id
   ├─ Check access via:
   │  ├─ user_organizations (org membership)
   │  ├─ documents.uploaded_by (owner)
   │  └─ document_shares (shared with)
   ├─ Grant read/write permission
   └─ Return document + pages

3. Manager Grants Entity Permission
   POST /api/permissions/grant
   ├─ Create entity_permissions record
   ├─ Set permission_level (viewer|editor|manager|admin)
   ├─ Optional expiration
   ├─ Audit log
   └─ Response: Permission ID

4. Check Permission
   checkEntityPermission(userId, entityId, minimumLevel)
   ├─ Query entity_permissions table
   ├─ Verify expiration
   ├─ Check permission hierarchy
   └─ Return: { hasPermission, level }
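
The permission check in step 4 can be sketched as a pure function. This is an assumption-laden illustration: `checkEntityPermissionSketch` takes an in-memory `grants` array in place of the entity_permissions query, and the row field names (`userId`, `entityId`, `permissionLevel`, `expiresAt`) are hypothetical.

```javascript
// Permission levels from lowest to highest, as listed in the flow above.
const LEVELS = ['viewer', 'editor', 'manager', 'admin'];

// grants: hypothetical stand-in for entity_permissions rows, with expiresAt
// in Unix seconds or null when the grant never expires.
function checkEntityPermissionSketch(grants, userId, entityId, minimumLevel,
                                     nowSec = Math.floor(Date.now() / 1000)) {
  const required = LEVELS.indexOf(minimumLevel);
  if (required === -1) {
    throw new Error(`Unknown permission level: ${minimumLevel}`);
  }

  let best = -1; // index of the highest non-expired level found
  for (const grant of grants) {
    if (grant.userId !== userId || grant.entityId !== entityId) continue;
    if (grant.expiresAt !== null && grant.expiresAt <= nowSec) continue;
    best = Math.max(best, LEVELS.indexOf(grant.permissionLevel));
  }

  return {
    hasPermission: best >= required,
    level: best === -1 ? null : LEVELS[best],
  };
}
```

Treating the hierarchy as an ordered list means a higher level implies every lower one, which matches the viewer|editor|manager|admin ordering described in step 3.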

9. Security Implementation

Authentication & Authorization

JWT Strategy:

Access Token: 15 minutes (short-lived)
Refresh Token: 7 days (stored in DB with hash)
Tokens revoked on password reset
Account lockout: 15 min after 5 failed attempts
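
The lockout rule can be illustrated with the security fields from the users table (failed_login_attempts, locked_until, last_login_at). The function names, and the choice to reset the counter once the lock is applied, are assumptions made for this sketch.

```javascript
const MAX_FAILED_ATTEMPTS = 5;
const LOCKOUT_SECONDS = 15 * 60;

// True while the account is still inside its lockout window.
function isLocked(user, nowSec) {
  return user.locked_until !== null && user.locked_until > nowSec;
}

// Returns an updated copy of the user after a failed login attempt.
function recordFailedLogin(user, nowSec) {
  const attempts = user.failed_login_attempts + 1;
  if (attempts >= MAX_FAILED_ATTEMPTS) {
    // Assumption: the counter restarts once the lock is applied.
    return { ...user, failed_login_attempts: 0, locked_until: nowSec + LOCKOUT_SECONDS };
  }
  return { ...user, failed_login_attempts: attempts };
}

// Returns an updated copy of the user after a successful login.
function recordSuccessfulLogin(user, nowSec) {
  return { ...user, failed_login_attempts: 0, locked_until: null, last_login_at: nowSec };
}
```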

Password Security:

Bcrypt with 12 rounds
Minimum 8 characters
Hashing on register and reset

Session Management:

Refresh tokens tracked in database
Device info and IP logging
Logout-all support

Role-Based Access Control (RBAC):

Organization Roles:
  • viewer: Read-only access
  • member: Can upload documents
  • manager: Can add members, update org
  • admin: Full org control + deletion

Entity Permissions:
  • viewer: Read-only
  • editor: Can modify/share
  • manager: All + member management
  • admin: Full control

Default Flow:
  User → Organization (role) → Entities (permissions)

API Security

Middleware Stack:

Helmet: Security headers (CSP, X-Frame-Options, etc)
CORS: Whitelisted origins (production)
Rate Limiting: 100 req/15min per IP (configurable)
Authentication: JWT verification on protected routes
Authorization: Role/permission checks in handlers
Input Validation: UUID format, file type, size limits
Path Traversal Prevention: Normalized path checks for file serving
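
To make the "100 req/15min per IP" rule concrete, here is a plain fixed-window counter. The document does not say how the project's limiter is implemented, so `createRateLimiter` is a hypothetical, in-memory illustration of the policy only.

```javascript
// Fixed-window limiter: each key (for example an IP address) may make at
// most maxRequests calls per windowMs. State lives in process memory.
function createRateLimiter({ windowMs = 900000, maxRequests = 100 } = {}) {
  const hits = new Map(); // key -> { windowStart, count }

  return function allow(key, nowMs = Date.now()) {
    const entry = hits.get(key);
    if (!entry || nowMs - entry.windowStart >= windowMs) {
      hits.set(key, { windowStart: nowMs, count: 1 }); // start a new window
      return true;
    }
    entry.count += 1;
    return entry.count <= maxRequests;
  };
}
```

An in-memory counter like this only works for a single process; with several API instances the counts would need a shared store such as the Redis instance already in the stack.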

File Upload Security:

Multer memory storage (prevents direct disk write)
File type validation via file-type library
Size limit: 50MB (configurable)
SHA256 hash for deduplication
Filename sanitization (remove dangerous chars)
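
Filename sanitization could be approached as below. The exact character set the project strips is not stated, so the rules in `sanitizeFilename` (a hypothetical name) are assumptions chosen for illustration.

```javascript
const path = require('node:path');

// Reduce a client-supplied name to a safe base name: drop directory parts,
// control characters, leading dots, and characters that are reserved on
// common filesystems.
function sanitizeFilename(name, maxLength = 255) {
  const withSlashes = String(name).replace(/\\/g, '/'); // treat "\" as a separator
  const base = path.posix.basename(withSlashes);
  const cleaned = base
    .replace(/[\u0000-\u001f\u007f]/g, '') // control characters
    .replace(/[<>:"/|?*]/g, '_')           // reserved characters
    .replace(/^\.+/, '')                   // leading dots (hidden files, "..")
    .trim();
  return (cleaned || 'file').slice(0, maxLength);
}
```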

Data Protection

In Transit:

HTTPS enforced (production)
TLS/SSL certificates
Secure cookies for JWT

At Rest:

SQLite encryption (optional setup)
Bcrypt password hashing
No plaintext credentials in code

Audit Trail:

All permission changes logged
User actions tracked (audit_events)
Login/logout recorded

10. Performance Considerations

Database Optimization

Indexes on common query columns (org, entity, status, hash)
Prepared statements via better-sqlite3
No connection pooling yet (single connection in current setup)

Search Optimization

Meilisearch for full-text indexing (not SQLite FTS)
Async indexing in OCR worker
Tenant tokens for client-side search
30-min LRU cache for TOC queries
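
A 30-minute LRU cache of the kind mentioned for TOC queries can be sketched with a `Map`, whose insertion order doubles as recency order. `TtlLruCache` is a hypothetical class; the project may well use a library instead.

```javascript
// Small LRU cache with a per-entry time-to-live.
class TtlLruCache {
  constructor({ maxEntries = 100, ttlMs = 30 * 60 * 1000 } = {}) {
    this.maxEntries = maxEntries;
    this.ttlMs = ttlMs;
    this.entries = new Map(); // oldest entry first
  }

  get(key, nowMs = Date.now()) {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (nowMs >= entry.expiresAt) {
      this.entries.delete(key); // expired
      return undefined;
    }
    // Re-insert so this key becomes the most recently used.
    this.entries.delete(key);
    this.entries.set(key, entry);
    return entry.value;
  }

  set(key, value, nowMs = Date.now()) {
    this.entries.delete(key);
    this.entries.set(key, { value, expiresAt: nowMs + this.ttlMs });
    if (this.entries.size > this.maxEntries) {
      // Evict the least recently used entry (first key in the Map).
      this.entries.delete(this.entries.keys().next().value);
    }
  }
}
```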

OCR Processing

Concurrency: 2 documents (configurable via OCR_CONCURRENCY)
Limiter: 5 jobs/minute (prevents Tesseract overload)
Progress tracking (0-100%)
Batch image processing

Memory Management

Streaming responses for large PDFs
Image compression via sharp
LRU cache cleanup (30 min TTL)
Job cleanup: completed jobs removed after 24h, failed jobs after 7 days
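
The OCR concurrency, limiter, and job retention figures in this section map naturally onto BullMQ options. The sketch below only builds plain option objects in the shape BullMQ accepts (`concurrency`, `limiter.max`/`limiter.duration`, and `removeOnComplete`/`removeOnFail` with `age` in seconds); `buildOcrQueueSettings` is a hypothetical helper, and how NaviDocs actually wires these values is not shown in this document.

```javascript
// Builds option objects only; no queue or worker is created here.
function buildOcrQueueSettings(env = process.env) {
  const concurrency = Number.parseInt(env.OCR_CONCURRENCY ?? '2', 10);
  return {
    // Would be passed as the options argument of a BullMQ Worker.
    workerOptions: {
      concurrency,                             // documents processed in parallel
      limiter: { max: 5, duration: 60 * 1000 } // at most 5 jobs per minute
    },
    // Would be passed as defaultJobOptions of a BullMQ Queue.
    defaultJobOptions: {
      removeOnComplete: { age: 24 * 60 * 60 },  // keep completed jobs 24 hours
      removeOnFail: { age: 7 * 24 * 60 * 60 }   // keep failed jobs 7 days
    }
  };
}
```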

Scalability Bottlenecks

Single SQLite connection: Switch to PostgreSQL for concurrent writes
Local file storage: Switch to S3/cloud storage
Tesseract CPU usage: Distribute workers across machines
Meilisearch scale: Deploy cluster for high traffic

11. Known Issues & TODOs

Authentication

Authentication middleware incomplete (req.user often hardcoded as 'test-user-id')
Email verification not sent (template needed)
Password reset email not sent (template needed)

Authorization

Some endpoints missing auth checks
Entity-level permissions not fully integrated
Document-level permissions incomplete

Database

Password reset tokens table missing from schema
Refresh tokens table missing from schema
Audit events table not defined
Document images table not in schema.sql
Document metadata handling inconsistent

OCR Worker

Image extraction may fail silently
Section extraction error handling needs improvement
TOC extraction is currently treated as optional because of where it runs in the job; it should be made robust

Frontend

Client-side image upload/capture not implemented
Multilingual search needs testing
Rate limiting feedback incomplete

12. Integration Roadmap for New Features

Phase 1: Inventory Management

Dependencies:

Components schema (exists)
Basic CRUD API patterns (exist)
Database migrations (setup required)

Estimated effort: 3-4 days
New files: 3 (service, routes, worker)
Database changes: +2 tables

Phase 2: Maintenance Tracking

Dependencies:

Inventory feature (Phase 1)
Meilisearch indexing (exists)
Audit logging (partial)

Estimated effort: 2-3 days
New files: 3 (service, routes, worker)
Database changes: +1 table

Phase 3: Camera/Capture Feature

Dependencies:

Upload API (exists)
PDF processing (exists)
WebRTC/Camera API (client)

Estimated effort: 4-5 days
New files: 4 (service, routes, worker, batch-processor)
Database changes: +2 tables

Phase 4: Enhanced Search & Analytics

Dependencies:

Meilisearch integration (exists)
Audit trail (Phase 2+)
Statistics API (exists)

Estimated effort: 2-3 days
New files: 2 (service, routes)

Conclusion

The NaviDocs codebase is well-structured with clear separation of concerns:

Database: Comprehensive schema supporting multi-entity, multi-tenant architecture
API: RESTful endpoints organized by feature with consistent patterns
Services: Business logic isolated from routes with dependency injection
Workers: Background OCR processing via BullMQ + Redis
Frontend: Vue 3 SPA with Meilisearch client-side search

Ready for integration of:

Inventory management
Maintenance tracking
Camera/document capture
Enhanced analytics

All integration points identified and documented above.
