All 11 agents (S2-H01 through S2-H09, plus S2-H03A and S2-H07A) have completed their technical specifications:
- S2-H01: NaviDocs codebase architecture analysis
- S2-H02: Inventory tracking system (€15K-€50K value recovery)
- S2-H03: Maintenance log & reminder system
- S2-H04: Camera & Home Assistant integration
- S2-H05: Contact management system
- S2-H06: Accounting module & receipt OCR integration
- S2-H07: Impeccable search UX (Meilisearch facets)
- S2-H08: WhatsApp Business API + AI agent integration
- S2-H09: Document versioning with IF.TTT compliance
- S2-H03A: VAT/tax jurisdiction tracking & compliance
- S2-H07A: Multi-calendar system (4 calendar types)
Total: ~15,600 lines of technical specifications
Status: Ready for S2-H10 synthesis (awaiting Session 1 completion)
IF.bus: All inter-agent communications documented
NaviDocs Codebase Architecture Map
Analysis Date: 2025-11-13
Agent: S2-H01
Status: Complete
1. Database Schema Summary
Core Entities
The NaviDocs database uses SQLite (v3) with a schema designed for future PostgreSQL migration. All timestamps use Unix epoch (seconds).
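Because JavaScript's Date works in milliseconds while the schema stores seconds, conversions between the two are an easy place to slip. A minimal sketch of the two helpers this convention implies follows; the names nowSeconds and secondsToDate are hypothetical and not taken from the codebase.

```javascript
// Hypothetical helpers; the names are illustrative, not from the NaviDocs codebase.

// Current time as a Unix epoch in whole seconds (the unit the schema stores).
function nowSeconds() {
  return Math.floor(Date.now() / 1000);
}

// Convert a stored epoch-seconds value back into a JavaScript Date.
function secondsToDate(epochSeconds) {
  return new Date(epochSeconds * 1000);
}
```

A value such as created_at would be written with nowSeconds() and read back through secondsToDate() before display.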
User Management
- users (id: TEXT PRIMARY KEY)
  - id: UUID
  - email: TEXT UNIQUE
  - password_hash: TEXT (bcrypt)
  - name: TEXT
  - status: TEXT (active, suspended, deleted)
  - email_verified: BOOLEAN
  - created_at, updated_at: INTEGER
  - last_login_at: INTEGER
  - failed_login_attempts, locked_until: Security fields
Organization Structure (Multi-tenant)
- organizations (id: TEXT PRIMARY KEY)
  - id: UUID
  - name: TEXT
  - type: TEXT (personal, commercial, hoa)
  - created_at, updated_at: INTEGER
- user_organizations (user_id + organization_id PRIMARY KEY)
  - role: TEXT (admin, manager, member, viewer)
  - joined_at: INTEGER
Entity Management (Boats, Marinas, Properties)
- entities (id: TEXT PRIMARY KEY)
  - id: UUID
  - organization_id: FK
  - user_id: FK (primary owner)
  - entity_type: TEXT (boat, marina, condo, yacht-club)
  - name: TEXT
  - Boat-specific:
    - make, model, year: TEXT/INTEGER
    - hull_id: TEXT
    - vessel_type: TEXT (powerboat, sailboat, catamaran, trawler)
    - length_feet: INTEGER
  - Property-specific:
    - property_type: TEXT
    - address: TEXT
    - gps_lat, gps_lon: REAL
  - metadata: TEXT (JSON)
  - created_at, updated_at: INTEGER
Hierarchical Component Structure
- sub_entities (id: TEXT PRIMARY KEY)
  - id: UUID
  - entity_id: FK
  - name: TEXT (system, dock, unit, facility)
  - type: TEXT
  - metadata: TEXT (JSON)
- components (id: TEXT PRIMARY KEY)
  - id: UUID
  - sub_entity_id: FK (optional)
  - entity_id: FK (direct link)
  - name, manufacturer, model_number, serial_number: TEXT
  - install_date, warranty_expires: INTEGER
  - metadata: TEXT (JSON)
Document Management
- documents (id: TEXT PRIMARY KEY)
  - id: UUID
  - organization_id: FK
  - entity_id, sub_entity_id, component_id: FK (hierarchical linking)
  - uploaded_by: FK (user)
  - title, document_type: TEXT
  - file_path, file_name, file_size: TEXT/INTEGER
  - file_hash: TEXT (SHA256 for deduplication)
  - mime_type: TEXT (default: application/pdf)
  - page_count: INTEGER
  - language: TEXT (default: en)
  - status: TEXT (processing, indexed, failed, archived, deleted)
  - replaced_by: TEXT (document supersession)
  - is_shared: BOOLEAN
  - shared_component_id: TEXT (for shared manual library)
  - metadata: TEXT (JSON)
  - created_at, updated_at: INTEGER
- document_pages (id: TEXT PRIMARY KEY)
  - id: UUID (page_<doc_id>_<page_num>)
  - document_id: FK
  - page_number: INTEGER
  - ocr_text: TEXT
  - ocr_confidence: REAL (0-1)
  - ocr_language: TEXT (default: en)
  - ocr_completed_at: INTEGER
  - search_indexed_at: INTEGER
  - meilisearch_id: TEXT
  - section: TEXT (TOC section name)
  - section_key: TEXT (normalized key)
  - section_order: INTEGER
  - metadata: TEXT (JSON - bounding boxes, etc)
- document_images (extracted from PDFs)
  - id: UUID
  - documentId: FK
  - pageNumber: INTEGER
  - imageIndex: INTEGER
  - imagePath: TEXT
  - imageFormat: TEXT (png, jpeg)
  - width, height: INTEGER
  - position: TEXT (JSON)
  - extractedText: TEXT
  - textConfidence: REAL
  - anchorTextBefore, anchorTextAfter: TEXT
Background Jobs
- ocr_jobs (id: TEXT PRIMARY KEY)
  - id: UUID
  - document_id: FK
  - status: TEXT (pending, processing, completed, failed)
  - progress: INTEGER (0-100%)
  - error: TEXT
  - started_at, completed_at: INTEGER
  - created_at: INTEGER
Permissions & Sharing
- permissions (granular access control)
  - id: UUID
  - resource_type: TEXT (document, entity, organization)
  - resource_id: FK
  - user_id: FK
  - permission: TEXT (read, write, share, delete, admin)
  - granted_by, granted_at: FK + INTEGER
  - expires_at: INTEGER (optional)
- entity_permissions (entity-level access)
  - id: UUID
  - user_id, entity_id: FK
  - permission_level: TEXT (viewer, editor, manager, admin)
  - granted_by, granted_at: FK + INTEGER
  - expires_at: INTEGER
- document_shares (simplified document sharing)
  - id: UUID
  - document_id, shared_by, shared_with: FK
  - permission: TEXT (read, write)
  - created_at: INTEGER
- refresh_tokens (JWT session management)
  - id: UUID
  - user_id: FK
  - token_hash: TEXT (SHA256)
  - device_info, ip_address: TEXT
  - expires_at: INTEGER
  - revoked: BOOLEAN
  - created_at, revoked_at: INTEGER
- password_reset_tokens
  - id: UUID
  - user_id: FK
  - token_hash: TEXT (SHA256)
  - expires_at: INTEGER
  - used: BOOLEAN
  - ip_address: TEXT
  - used_at: INTEGER
User Preferences
- bookmarks (quick access)
  - id: UUID
  - user_id, document_id: FK
  - page_id: FK (optional - specific page)
  - label: TEXT
  - quick_access: BOOLEAN (pin to homepage)
  - created_at: INTEGER
Audit Trail (Optional)
- audit_events (not shown in schema but referenced in code)
  - Logs all significant operations for compliance
  - user_id, event_type, resource_type, resource_id
  - status, ip_address, user_agent, metadata
Settings/Configuration
- settings (key-value store)
  - key: TEXT PRIMARY KEY
  - value: TEXT (JSON)
  - description: TEXT
  - category: TEXT
Key Indexes
- idx_entities_org, idx_entities_user, idx_entities_type
- idx_documents_org, idx_documents_entity, idx_documents_status, idx_documents_hash, idx_documents_shared
- idx_pages_document, idx_pages_indexed
- idx_jobs_status, idx_jobs_document
- idx_permissions_user, idx_permissions_resource
- idx_bookmarks_user
2. API Endpoints (Grouped by Feature)
Authentication Endpoints (/api/auth)
File: server/routes/auth.routes.js
POST /api/auth/register
- Input: email, password, name
- Output: userId, email, verificationToken
- Logging: audit.service logs user.register
POST /api/auth/login
- Input: email, password, deviceInfo, ipAddress
- Output: accessToken (JWT), refreshToken, user object
- Auth: None (initial login)
- Side Effects: Updates failed_login_attempts, triggers account lock after 5 failures
POST /api/auth/refresh
- Input: refreshToken
- Output: new accessToken, user object
- Auth: None (token-based)
POST /api/auth/logout
- Input: refreshToken
- Output: success message
- Side Effects: Revokes refresh token
POST /api/auth/logout-all
- Input: None (uses JWT)
- Output: success message
- Side Effects: Revokes all user tokens
- Auth: JWT required
POST /api/auth/password/reset-request
- Input: email
- Output: generic success (doesn't reveal email exists)
- Side Effects: Creates password_reset_tokens entry
POST /api/auth/password/reset
- Input: token, newPassword
- Output: success message
- Side Effects: Updates password, revokes all refresh tokens
POST /api/auth/email/verify
- Input: token
- Output: email, success message
- Side Effects: Sets email_verified = 1
GET /api/auth/me
- Input: None (JWT)
- Output: user object (id, email, name, status, emailVerified, createdAt, lastLoginAt)
- Auth: JWT required
Organization Management (/api/organizations)
File: server/routes/organization.routes.js
POST /api/organizations
- Input: name, type (optional), metadata (optional)
- Output: organization object
- Auth: JWT required
GET /api/organizations
- Input: None
- Output: Array of user's organizations with role
- Auth: JWT required
GET /api/organizations/:organizationId
- Input: organizationId in params
- Output: organization details with userRole
- Auth: JWT + requireOrganizationMember
PUT /api/organizations/:organizationId
- Input: name, type, metadata
- Output: updated organization
- Auth: JWT + requireOrganizationRole('manager')
DELETE /api/organizations/:organizationId
- Input: organizationId
- Output: success message with deleted count
- Auth: JWT + requireOrganizationRole('admin')
GET /api/organizations/:organizationId/members
- Input: organizationId
- Output: Array of members with roles
- Auth: JWT + requireOrganizationMember
POST /api/organizations/:organizationId/members
- Input: userId, role (optional)
- Output: success message
- Auth: JWT + requireOrganizationRole('manager')
- Side Effects: Adds or updates user role
DELETE /api/organizations/:organizationId/members/:userId
- Input: organizationId, userId
- Output: success message with removed role
- Auth: JWT + requireOrganizationRole('manager')
GET /api/organizations/:organizationId/stats
- Input: organizationId
- Output: organization statistics (document count, member count, etc)
- Auth: JWT + requireOrganizationMember
Permission Management (/api/permissions)
File: server/routes/permission.routes.js (referenced but not fully reviewed)
Expected endpoints:
- POST /api/permissions/grant (grant permission to user)
- DELETE /api/permissions/revoke (revoke permission)
- GET /api/permissions/check (check permission)
Document Management (/api/documents)
File: server/routes/documents.js
POST /api/upload
- Input: file (PDF), title, documentType, organizationId, entityId (optional), componentId (optional), subEntityId (optional)
- Output: jobId, documentId, message
- Auth: None (TODO: should be JWT)
- Side Effects:
* Validates file safety (file-safety.service)
* Generates SHA256 hash for deduplication
* Creates documents and ocr_jobs records
* Adds OCR job to BullMQ queue
GET /api/documents
- Input: organizationId, entityId, documentType, status, limit, offset (query params)
- Output: { documents: [], pagination: { total, limit, offset, hasMore } }
- Auth: None (TODO: should verify organization membership)
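The analysis does not record how hasMore is derived for the GET /api/documents pagination envelope. One common approach, sketched here with a hypothetical helper name, compares the rows returned so far against the total:

```javascript
// Hypothetical helper: builds the { total, limit, offset, hasMore } envelope.
// The hasMore rule is an assumption, not confirmed from the route code.
function buildPagination(total, limit, offset, returnedCount) {
  return {
    total,
    limit,
    offset,
    // More rows exist if the rows seen so far do not reach the total.
    hasMore: offset + returnedCount < total,
  };
}
```

With total = 120 and limit = 50, the third page (offset 100) returns 20 rows and reports hasMore as false.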
GET /api/documents/:id
- Input: documentId in params
- Output: Full document metadata + pages array + entity + component info
- Auth: Checks organization membership, document ownership, or share access
- Side Effects: Parses metadata JSON
GET /api/documents/:id/pdf
- Input: documentId
- Output: PDF file stream (inline)
- Auth: Same as GET /api/documents/:id
- Security: Path traversal protection
DELETE /api/documents/:id
- Input: documentId
- Output: success message with document title
- Auth: None (TODO: should verify ownership)
- Side Effects:
* Deletes from Meilisearch index
* Deletes from database (CASCADE deletes document_pages, ocr_jobs)
* Deletes file from filesystem
Upload Routes (/api/upload)
File: server/routes/upload.js
POST /api/upload (same as above but dedicated file)
- Multer configuration: 50MB limit, memory storage
- Creates document in processing state
- Queues OCR job via queue.service
Quick OCR Route (/api/upload/quick-ocr)
File: server/routes/quick-ocr.js (referenced but not fully reviewed)
Expected endpoint:
- POST /api/upload/quick-ocr (rapid OCR without document creation)
Job Management (/api/jobs)
File: server/routes/jobs.js
GET /api/jobs/:id
- Input: jobId
- Output: { jobId, documentId, status, progress, error, startedAt, completedAt, createdAt, document? }
- Auth: None (TODO)
- Status values: pending, processing, completed, failed
- Document info included only if status === completed
GET /api/jobs
- Input: status (optional), limit (default 50), offset (default 0)
- Output: { jobs: [], pagination: { limit, offset } }
- Auth: Filters to current user's jobs
- Status filtering: Only allows pending|processing|completed|failed
Search (/api/search)
File: server/routes/search.js
POST /api/search/token
- Input: expiresIn (seconds, default 3600, max 86400)
- Output: { token, expiresAt, indexName, searchUrl, mode }
- Auth: JWT (gets user's organizations)
- Modes: 'tenant' (preferred) or 'search-key' (fallback)
- Side Effects: Generates Meilisearch tenant token with organization filters
POST /api/search
- Input: q (query string), filters? (documentType, entityId, language), limit, offset
- Output: { hits, estimatedTotalHits, query, processingTimeMs, limit, offset }
- Auth: JWT
- Meilisearch filters: userId or organizationId membership
- Additional filters: documentType, entityId, language
GET /api/search/health
- Input: None
- Output: { status, meilisearch: <health_response> }
- Auth: None
Image Management (/api/images)
File: server/routes/images.js
GET /api/documents/:id/images
- Input: documentId
- Output: { documentId, imageCount, images: [{ id, pageNumber, imageIndex, format, width, height, position, extractedText, confidence, imageUrl }] }
- Auth: Verifies document access
- Side Effects: Parses position JSON
GET /api/documents/:id/pages/:pageNum/images
- Input: documentId, pageNumber
- Output: { documentId, pageNumber, imageCount, images: [] }
- Auth: Verifies document and page exist
- Validation: pageNumber must be >= 1
GET /api/images/:imageId
- Input: imageId (img_<uuid>_p<page>_<index>_<timestamp> or UUID)
- Output: Image file stream (PNG or JPEG)
- Auth: Verifies document access
- Rate Limiting: 200 requests per minute (more permissive than API)
- Security: Path traversal prevention (normalizes path, checks within /uploads)
Table of Contents (/api/documents/:documentId/toc)
File: server/routes/toc.js
GET /api/documents/:documentId/toc
- Input: documentId, format? (flat|tree, default flat)
- Output: { entries: [], format, count }
- Auth: None (TODO)
- Caching: LRU cache (200 max, 30 min TTL)
- Side Effects: Builds tree structure if format=tree
POST /api/documents/:documentId/toc/extract
- Input: documentId
- Output: { success, entriesCount, tocPages: [], message }
- Auth: None (TODO)
- Side Effects:
* Calls extractTocFromDocument (section-extractor.service)
* Invalidates LRU cache entries
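The caching behavior described for the TOC routes (200 entries maximum, 30 minute TTL, invalidation after re-extraction) can be illustrated without the lru-cache dependency. The class below is a stand-in sketch only; its name, the injectable clock, and the "documentId:format" key layout are assumptions made for illustration.

```javascript
// Minimal stand-in for an LRU cache with TTL. Not the lru-cache package API.
class TtlLruCache {
  constructor({ max = 200, ttlMs = 30 * 60 * 1000, now = Date.now } = {}) {
    this.max = max;
    this.ttlMs = ttlMs;
    this.now = now;           // injectable clock, useful for tests
    this.entries = new Map(); // Map keeps insertion order: oldest first
  }

  get(key) {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key); // expired entry
      return undefined;
    }
    // Re-insert so this key becomes the most recently used.
    this.entries.delete(key);
    this.entries.set(key, entry);
    return entry.value;
  }

  set(key, value) {
    this.entries.delete(key);
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
    if (this.entries.size > this.max) {
      // Evict the least recently used entry (first key in the Map).
      const oldestKey = this.entries.keys().next().value;
      this.entries.delete(oldestKey);
    }
  }

  // Drop every cached format for one document, e.g. after TOC re-extraction.
  // Assumes keys look like "<documentId>:<format>".
  invalidateDocument(documentId) {
    for (const key of [...this.entries.keys()]) {
      if (key.startsWith(`${documentId}:`)) this.entries.delete(key);
    }
  }
}
```

Under this sketch, the extract route would call invalidateDocument(documentId) so that both the flat and tree variants are rebuilt on the next read.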
Statistics (/api/stats)
File: server/routes/stats.js (referenced but not fully reviewed)
Expected endpoints:
- GET /api/stats/organization/:organizationId
- GET /api/stats/documents
- GET /api/stats/search
Settings (/api/admin/settings)
File: server/routes/settings.routes.js (referenced but not fully reviewed)
Expected endpoints:
- GET /api/admin/settings (get all settings)
- PUT /api/admin/settings/:key (update setting)
- GET /api/settings/public/app (public app settings - no auth)
Health Check
GET /health
- Output: { status, timestamp, uptime }
- Auth: None
3. Service Layer Architecture
Authentication Service
File: server/services/auth.service.js
Key Functions:
- register(email, password, name) - User registration with bcrypt hashing (12 rounds)
- login(email, password, deviceInfo, ipAddress) - JWT + refresh token generation
- refreshAccessToken(refreshToken) - Generate new JWT from refresh token
- revokeRefreshToken(refreshToken) - Revoke single token (logout)
- revokeAllUserTokens(userId) - Logout all devices
- requestPasswordReset(email, ipAddress) - Generate reset token
- resetPassword(token, newPassword) - Validate token and update password
- verifyEmail(token) - Mark email as verified
- getUserById(userId) - Fetch user details
- verifyAccessToken(token) - Validate JWT
Token Management:
- JWT Access Token: expiresIn from env (default 15m)
- Refresh Token: 7 days in seconds (604800)
- Refresh tokens are stored as hashes rather than plaintext (bcrypt per this service; note that the refresh_tokens schema labels token_hash as SHA256)
- JWT Secret: process.env.JWT_SECRET (must change in production)
Security Features:
- Password minimum 8 characters
- Account lockout after 5 failed login attempts (15 min lock)
- Refresh token revocation on password reset
- Email verification token support
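The lockout rule above (5 failed attempts, 15 minute lock) can be sketched as two pure functions over the users security fields. The function names, and the choice to apply the lock on the attempt that reaches the threshold, are assumptions; the actual auth.service.js logic was not reproduced in this analysis.

```javascript
const MAX_FAILED_ATTEMPTS = 5;          // threshold from the rule above
const LOCK_DURATION_SECONDS = 15 * 60;  // 15 minute lock

// True while the account is still inside its lock window.
function isLocked(user, nowSeconds) {
  return user.locked_until != null && nowSeconds < user.locked_until;
}

// Returns the updated security fields after one failed login.
// Assumption: the lock starts on the attempt that reaches the threshold.
function recordFailedLogin(user, nowSeconds) {
  const attempts = (user.failed_login_attempts || 0) + 1;
  const shouldLock = attempts >= MAX_FAILED_ATTEMPTS;
  return {
    failed_login_attempts: attempts,
    locked_until: shouldLock
      ? nowSeconds + LOCK_DURATION_SECONDS
      : (user.locked_until ?? null),
  };
}
```

Both functions take the current time as an argument (epoch seconds, matching the schema), which keeps them easy to test.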
Authorization Service
File: server/services/authorization.service.js
Key Functions:
- grantEntityPermission(userId, entityId, permissionLevel, grantedBy, expiresAt) - Grant entity access
- revokeEntityPermission(userId, entityId, revokedBy) - Revoke entity access
- checkEntityPermission(userId, entityId, minimumPermission) - Check if user has permission
- getUserEntityPermissions(userId, options) - Get all user's entity permissions
- getEntityPermissions(entityId, options) - Get all entity's permissions
- addOrganizationMember(userId, organizationId, role, addedBy) - Add to organization
- removeOrganizationMember(userId, organizationId, removedBy) - Remove from organization
- checkOrganizationMembership(userId, organizationId, minimumRole) - Check membership
- getOrganizationMembers(organizationId) - List org members
- getUserOrganizations(userId) - Get user's organizations
- cleanupExpiredPermissions() - Cleanup task
Permission Hierarchy:
Entity Permissions: viewer (0) < editor (1) < manager (2) < admin (3)
Organization Roles: viewer (0) < member (1) < manager (2) < admin (3)
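A minimum-level check over these two hierarchies reduces to comparing ranks. The sketch below uses the rank values listed above; the helper name meetsMinimum is hypothetical and stands in for whatever comparison checkEntityPermission and checkOrganizationMembership perform internally.

```javascript
// Rank tables copied from the hierarchy above.
const ENTITY_LEVELS = { viewer: 0, editor: 1, manager: 2, admin: 3 };
const ORGANIZATION_ROLES = { viewer: 0, member: 1, manager: 2, admin: 3 };

// Hypothetical helper: true when `actual` ranks at or above `minimum`.
function meetsMinimum(levels, actual, minimum) {
  const actualRank = levels[actual];
  const minimumRank = levels[minimum];
  // Unknown level names never pass the check.
  if (!Number.isInteger(actualRank) || !Number.isInteger(minimumRank)) return false;
  return actualRank >= minimumRank;
}
```

For example, meetsMinimum(ORGANIZATION_ROLES, 'member', 'manager') is false, which is why a plain member cannot reach routes guarded by requireOrganizationRole('manager').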
Audit Integration:
- All permission grants/revokes logged via logAuditEvent()
Organization Service
File: server/services/organization.service.js (referenced but not fully reviewed)
Expected Functions:
- createOrganization(name, type, metadata, createdBy)
- updateOrganization(organizationId, name, type, metadata, updatedBy)
- deleteOrganization(organizationId, deletedBy)
- getOrganizationById(organizationId)
- getOrganizationStats(organizationId)
Search Service (Meilisearch Integration)
File: server/services/search.js
Key Functions:
- indexDocumentPage(pageId, documentId, pageNumber, text, confidence) - Index page in Meilisearch
- generateTenantToken(userId, organizationIds, expiresIn) - Generate tenant-scoped token
Meilisearch Index:
- Index name: navidocs-pages (env configurable)
- Searchable attributes: ocr text, metadata
- Filtering: organizationId, userId, documentType, entityId, language
- Document structure:
{
  id: string (unique page ID),
  docId: string (document UUID),
  pageNumber: integer,
  organizationId: string,
  userId: string,
  documentType: string,
  text: string (OCR content),
  language: string,
  ocrConfidence: number,
  createdAt: integer,
  updatedAt: integer
}
Tenant Token Support:
- Scoped search to user's organizations
- Expiration support (max 24 hours)
- Fallback to search API key if tenant token fails
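Tenant scoping of this kind is usually expressed as a Meilisearch search-rules object carrying a filter on organizationId, combined with a bounded expiry. The sketch below only builds those two inputs and does not call the Meilisearch SDK. The function names are hypothetical, and the exact filter used by search.js is an assumption.

```javascript
const DEFAULT_EXPIRES_IN = 3600; // seconds (default noted for POST /api/search/token)
const MAX_EXPIRES_IN = 86400;    // seconds (24 hour maximum)

// Hypothetical helper: fall back to the default, then cap at the maximum.
function clampExpiresIn(requested) {
  const value = Number.isFinite(requested) && requested >= 1 ? requested : DEFAULT_EXPIRES_IN;
  return Math.min(Math.floor(value), MAX_EXPIRES_IN);
}

// Hypothetical helper: search rules limiting one index to the user's organizations.
function buildSearchRules(indexName, organizationIds) {
  if (organizationIds.length === 0) {
    // An empty list would otherwise produce a filter that matches nothing useful.
    throw new Error('Refusing to build a tenant token without organizations');
  }
  // JSON.stringify quotes each ID and escapes any embedded quotes.
  const quoted = organizationIds.map((id) => JSON.stringify(String(id)));
  return { [indexName]: { filter: `organizationId IN [${quoted.join(', ')}]` } };
}
```

The resulting rules object and expiry would then be handed to whichever token-generation call the installed Meilisearch SDK version provides.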
Queue Service (BullMQ)
File: server/services/queue.js
Key Functions:
- getOcrQueue() - Get singleton queue instance
- addOcrJob(documentId, jobId, data) - Add OCR job to queue
- getJobStatus(jobId) - Get BullMQ job status
- closeQueue() - Graceful shutdown
Queue Configuration:
- Redis connection: REDIS_HOST (default 127.0.0.1), REDIS_PORT (default 6379)
- Queue name: ocr-processing
- Job retry: 3 attempts with exponential backoff (2s base)
- Cleanup: Complete jobs kept 24h, failed jobs kept 7 days
- Job options: priority support
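The retry and cleanup settings above map naturally onto BullMQ job options. The object below is a sketch of how they might be written; attempts, backoff, removeOnComplete and removeOnFail are BullMQ option names, but whether queue.js sets them exactly this way is an assumption. The helper retryDelays is hypothetical and simply spells out the doubling schedule (base delay times 2 to the power of retry minus 1) for a 2 second base.

```javascript
// Assumed shape of the default job options for the ocr-processing queue.
const defaultJobOptions = {
  attempts: 3,                                   // first try plus two retries
  backoff: { type: 'exponential', delay: 2000 }, // 2s base, in milliseconds
  removeOnComplete: { age: 24 * 60 * 60 },       // keep completed jobs 24h (seconds)
  removeOnFail: { age: 7 * 24 * 60 * 60 },       // keep failed jobs 7 days (seconds)
};

// Hypothetical helper: delay in ms before each retry under a doubling schedule.
function retryDelays({ attempts, backoff }) {
  const delays = [];
  for (let retry = 1; retry < attempts; retry += 1) {
    delays.push(backoff.delay * 2 ** (retry - 1));
  }
  return delays;
}
```

With these values a failing job waits 2 seconds before its second attempt and 4 seconds before its third, after which it is marked failed.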
Job Data Structure:
{
documentId: string,
jobId: string,
filePath: string,
fileName: string,
organizationId: string,
userId: string,
priority: number (optional)
}
OCR Service
File: server/services/ocr.js (referenced)
Expected Functions:
- extractTextFromImage(imagePath, language) - Tesseract.js OCR on images
- cleanOCRText(text) - Clean and normalize OCR output
OCR Hybrid Service
File: server/services/ocr-hybrid.js (referenced)
Expected Functions:
- extractTextFromPDF(filePath, options) - Extract text from PDF with progress callback
- Returns: [{ pageNumber, text, confidence, error }]
OCR Google Vision Service
File: server/services/ocr-google-vision.js (referenced)
Expected Functions:
- Alternative OCR provider (Google Cloud Vision)
OCR Client Service
File: server/services/ocr-client.js (referenced)
Expected Functions:
- Client-side OCR coordination
Section Extractor Service
File: server/services/section-extractor.js (referenced)
Expected Functions:
- extractSections(filePath, ocrResults) - Extract document sections/headings
- mapPagesToSections(sections, totalPages) - Map pages to TOC sections
TOC Extractor Service
File: server/services/toc-extractor.js (referenced)
Expected Functions:
- getDocumentToc(documentId) - Fetch TOC from database
- buildTocTree(entries) - Build hierarchical tree from flat list
- extractTocFromDocument(documentId) - Extract TOC from PDF
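Building a tree from a flat TOC list is typically done with a stack of open ancestors. The sketch below shows that technique under stated assumptions: entries arrive in document order and each carries a numeric level (1 for top-level headings). The entry shape and the function name buildTocTreeSketch are illustrative; the real buildTocTree was not reviewed.

```javascript
// Hypothetical sketch: nest flat TOC entries by their `level` field.
function buildTocTreeSketch(entries) {
  const roots = [];
  const stack = []; // currently open ancestors, deepest last

  for (const entry of entries) {
    const node = { ...entry, children: [] };
    // Close ancestors that sit at the same or a deeper level than this node.
    while (stack.length > 0 && stack[stack.length - 1].level >= node.level) {
      stack.pop();
    }
    if (stack.length === 0) {
      roots.push(node);                               // top-level heading
    } else {
      stack[stack.length - 1].children.push(node);    // child of nearest ancestor
    }
    stack.push(node);
  }
  return roots;
}
```

This corresponds to the format=tree response of GET /api/documents/:documentId/toc, where the flat format would return the input list unchanged.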
Audit Service
File: server/services/audit.service.js (referenced)
Expected Functions:
- logAuditEvent(userId, eventType, status, ipAddress, userAgent, metadata, resourceType, resourceId) - Logs all security-relevant actions
Settings Service
File: server/services/settings.service.js (referenced)
Expected Functions:
- getSetting(key) - Get setting by key
- setSetting(key, value) - Set/update setting
- getAllSettings() - Get all settings
File Safety Service
File: server/services/file-safety.js
Expected Functions:
- validateFile(file) - Validate file type, size, etc.
- sanitizeFilename(filename) - Remove dangerous characters
4. Background Job Patterns (BullMQ Usage)
OCR Worker
File: server/workers/ocr-worker.js
Job Processing Pipeline:
- Job Initialization
  - Receives { documentId, jobId, filePath, fileName, organizationId, userId, priority }
  - Updates ocr_jobs: status = 'processing', progress = 0, started_at = now
- PDF Text Extraction (60-70% of job)
  - Calls extractTextFromPDF() with progress callback
  - Returns: [{ pageNumber, text, confidence, error }]
  - Concurrency: 2 documents at a time (env: OCR_CONCURRENCY)
  - Limiter: 5 jobs per minute (prevents Tesseract overload)
- Page Processing (per page)
  - Clean OCR text via cleanOCRText()
  - Insert/update document_pages
  - Index in Meilisearch via indexDocumentPage()
  - Store confidence scores and language
- Image Extraction (per page)
  - Extract images via extractImagesFromPage()
  - Run Tesseract on each image
  - Store in document_images table
  - Index image text in Meilisearch with documentType: 'image'
- Section/TOC Extraction (post-processing)
  - Call extractSections() and mapPagesToSections()
  - Update document_pages with section metadata (section, section_key, section_order)
  - Call extractTocFromDocument() for TOC entries
- Completion
  - Update documents: status = 'indexed', imagesExtracted = 1
  - Update ocr_jobs: status = 'completed', progress = 100, completed_at = now
  - Return: { success: true, documentId, pagesProcessed }
- Error Handling
  - On failure: status = 'failed', error = error.message
  - Continues processing other pages on individual page failures
  - Re-throws to mark BullMQ job as failed
  - Retries up to 3 times with exponential backoff
Event Handlers:
worker.on('completed', (job, result) => { /* log */ })
worker.on('failed', (job, error) => { /* log error */ })
worker.on('error', (error) => { /* worker crash */ })
worker.on('ready', () => { /* worker ready */ })
Graceful Shutdown:
- SIGTERM/SIGINT handlers
- Calls worker.close() and connection.quit()
Image Extractor Worker
File: server/workers/image-extractor.js
Expected Functionality:
- extractImagesFromPage(filePath, pageNumber, documentId) - Extract images from PDF page
- Returns: [{ id, path, format, width, height, imageIndex, position }]
5. Integration Points for New Features
Inventory Management Feature
Integration Points:
- Database Schema:
  - Extend components table with inventory fields (SQLite adds one column per ALTER TABLE statement):
    ALTER TABLE components ADD COLUMN quantity_available INTEGER DEFAULT 0;
    ALTER TABLE components ADD COLUMN reorder_level INTEGER;
    ALTER TABLE components ADD COLUMN supplier_info TEXT; -- JSON with supplier contacts
    ALTER TABLE components ADD COLUMN last_purchased_date INTEGER;
    ALTER TABLE components ADD COLUMN purchase_cost REAL;
    ALTER TABLE components ADD COLUMN location_storage TEXT;
  - Create inventory_transactions table for audit trail
- API Endpoints:
  - POST /api/inventory/items - Create inventory item (link to component)
  - GET /api/inventory/items - List inventory with filters
  - PUT /api/inventory/items/:id - Update quantity/location
  - POST /api/inventory/items/:id/transactions - Record transaction (purchase, use, transfer)
  - GET /api/inventory/alerts - Get low-stock alerts
- Service Layer:
  - Create server/services/inventory.service.js:
    - createInventoryItem(componentId, quantity, reorderLevel, supplier)
    - updateInventoryQuantity(itemId, change, reason, userId)
    - getInventoryAlerts(organizationId)
    - calculateReorderPoints()
- Route File:
  - Create server/routes/inventory.routes.js
  - Add to server/index.js: app.use('/api/inventory', inventoryRoutes);
- BullMQ Job (Optional):
  - Create background job for inventory replenishment alerts
  - Queue in server/workers/inventory-alerts.js
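The low-stock alert behind GET /api/inventory/alerts can be sketched as a filter over component rows carrying the proposed inventory columns. The rule used here (alert when quantity_available is at or below reorder_level) and the function name findLowStock are assumptions, since the specification does not define the threshold semantics.

```javascript
// Hypothetical helper: rows are component records with the proposed columns.
// Assumed rule: alert when quantity_available <= reorder_level.
function findLowStock(items) {
  return items
    .filter((item) => item.reorder_level != null
      && item.quantity_available <= item.reorder_level)
    .map((item) => ({
      id: item.id,
      name: item.name,
      // How many units short of the reorder level the item currently is.
      shortfall: item.reorder_level - item.quantity_available,
    }));
}
```

Items without a reorder_level are skipped, so components that are not tracked as inventory never raise alerts.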
Maintenance Tracking Feature
Integration Points:
- Database Schema:
  - Extend components table (SQLite adds one column per ALTER TABLE statement):
    ALTER TABLE components ADD COLUMN maintenance_interval_days INTEGER;
    ALTER TABLE components ADD COLUMN last_maintenance_date INTEGER;
    ALTER TABLE components ADD COLUMN next_maintenance_date INTEGER;
  - Create maintenance_logs table:
    CREATE TABLE maintenance_logs (
      id TEXT PRIMARY KEY,
      component_id TEXT REFERENCES components(id),
      entity_id TEXT REFERENCES entities(id),
      performed_by TEXT REFERENCES users(id),
      maintenance_type TEXT CHECK (maintenance_type IN ('inspection', 'service', 'repair', 'replacement')),
      description TEXT,
      cost REAL,
      duration_hours REAL,
      next_scheduled_date INTEGER,
      document_id TEXT REFERENCES documents(id), -- reference manual
      created_at INTEGER
    );
- API Endpoints:
  - POST /api/maintenance/logs - Log maintenance event
  - GET /api/maintenance/logs - List maintenance history
  - GET /api/maintenance/schedule - Get upcoming maintenance
  - PUT /api/maintenance/logs/:id - Update log
  - DELETE /api/maintenance/logs/:id - Remove log
- Service Layer:
  - Create server/services/maintenance.service.js:
    - logMaintenance(componentId, type, description, performedBy)
    - getMaintenanceHistory(componentId, limit)
    - getUpcomingMaintenance(organizationId)
    - calculateNextMaintenanceDate(componentId)
- Route File:
  - Create server/routes/maintenance.routes.js
  - Add to server/index.js: app.use('/api/maintenance', maintenanceRoutes);
- Background Job:
  - Create server/workers/maintenance-reminders.js
  - BullMQ cron job to check and send alerts
- Search Integration:
  - Index maintenance logs in Meilisearch for searchability
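The scheduling arithmetic implied by the proposed maintenance columns is small but worth pinning down, since the timestamps are epoch seconds while the interval is in days. The sketch below takes a component row directly instead of a componentId, and its function names are hypothetical.

```javascript
const SECONDS_PER_DAY = 86400;

// Hypothetical helper: next due date in epoch seconds, or null when the
// component has no interval or has never been serviced.
function nextMaintenanceDate(component) {
  const last = component.last_maintenance_date;
  const intervalDays = component.maintenance_interval_days;
  if (last == null || intervalDays == null) return null;
  return last + intervalDays * SECONDS_PER_DAY;
}

// Hypothetical helper: true when maintenance is due at or before `nowSeconds`.
function isMaintenanceDue(component, nowSeconds) {
  const next = nextMaintenanceDate(component);
  return next !== null && next <= nowSeconds;
}
```

A reminder worker could run isMaintenanceDue over each component in an organization and queue alerts for the ones that return true.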
Camera/Document Capture Feature
Integration Points:
- Database Schema:
  - Extend documents table (SQLite adds one column per ALTER TABLE statement):
    ALTER TABLE documents ADD COLUMN capture_method TEXT CHECK (capture_method IN ('upload', 'camera', 'screenshot', 'scan'));
    ALTER TABLE documents ADD COLUMN camera_device_info TEXT; -- JSON with device metadata
    ALTER TABLE documents ADD COLUMN capture_timestamp INTEGER;
  - Create camera_sessions table:
    CREATE TABLE camera_sessions (
      id TEXT PRIMARY KEY,
      user_id TEXT REFERENCES users(id),
      organization_id TEXT REFERENCES organizations(id),
      device_info TEXT, -- JSON
      started_at INTEGER,
      ended_at INTEGER,
      capture_count INTEGER
    );
- API Endpoints:
  - POST /api/capture/camera-session - Start camera session
  - POST /api/capture/upload-frame - Upload single camera frame
  - GET /api/capture/sessions - List capture sessions
  - POST /api/capture/batch-process - Process batch of frames as single document
- Service Layer:
  - Create server/services/capture.service.js:
    - createCameraSession(userId, organizationId, deviceInfo)
    - uploadCaptureFrame(sessionId, imageBuffer, frameNumber)
    - processCaptureSession(sessionId) - Convert frames to PDF
    - getSessionCaptures(sessionId)
- Route File:
  - Create server/routes/capture.routes.js
  - Add to server/index.js: app.use('/api/capture', captureRoutes);
- Background Job:
  - Extend OCR worker to handle batch-captured images
  - Create server/workers/batch-processor.js for frame-to-PDF conversion
- Client Integration:
  - Camera API integration in Vue 3 frontend
  - WebRTC support for real-time preview
New Feature Route Registration Pattern
Standard Integration Checklist:
// 1. Create service file: server/services/[feature].service.js
// 2. Create route file: server/routes/[feature].routes.js
// 3. Add to server/index.js:
import [feature]Routes from './routes/[feature].routes.js';
app.use('/api/[feature]', [feature]Routes);
// 4. If background job needed:
// - Create server/workers/[feature]-worker.js
// - Extend queue.service.js with get[Feature]Queue()
// 5. If search needed:
// - Index documents via Meilisearch client in service layer
// 6. Database schema changes:
// - Add migration file or update schema.sql comments
// - Test with db/init.js
6. Tech Stack Validation
Backend Stack
| Technology | Version | Purpose | Status |
|---|---|---|---|
| Node.js | 18+ | Runtime | Running |
| Express.js | ^5.0.0 | Web framework | Active |
| SQLite (better-sqlite3) | ^11.0.0 | Database | Active |
| PostgreSQL | - | Planned migration target | Not yet |
| Redis (ioredis) | ^5.0.0 | Queue backend | Required |
| BullMQ | ^5.0.0 | Job queue | Active |
| JWT (jsonwebtoken) | ^9.0.2 | Authentication | Active |
| Bcryptjs | ^3.0.2 | Password hashing | Active |
| Meilisearch | ^0.41.0 | Full-text search | Active |
| Tesseract.js | ^5.0.0 | OCR engine | Active |
| PDF processing | - | - | - |
| ├─ pdf-parse | ^1.1.1 | PDF parsing | Active |
| ├─ pdf-img-convert | ^2.0.0 | PDF to image | Active |
| ├─ pdfjs-dist | ^4.0.0 | PDF viewer lib | Client |
| Image processing | - | - | - |
| ├─ sharp | ^0.34.4 | Image optimization | Active |
| Multer | ^1.4.5-lts.1 | File upload | Active |
| file-type | ^19.0.0 | File validation | Active |
| Helmet | ^7.0.0 | Security headers | Active |
| CORS | ^2.8.5 | Cross-origin | Active |
| Rate-limit | ^7.0.0 | Request limiting | Active |
| LRU-Cache | ^11.2.2 | TOC caching | Active |
| UUID | ^10.0.0 | ID generation | Active |
| dotenv | ^16.0.0 | Config management | Active |
Frontend Stack
| Technology | Version | Purpose | Status |
|---|---|---|---|
| Vue.js | ^3.5.0 | UI framework | Active |
| Vue Router | ^4.4.0 | Client routing | Active |
| Pinia | ^2.2.0 | State management | Active |
| Vue i18n | ^9.14.5 | Internationalization | Active |
| Vite | ^5.0.0 | Build tool | Active |
| Tailwind CSS | ^3.4.0 | Styling | Active |
| PostCSS | ^8.4.0 | CSS processing | Active |
| Meilisearch SDK | ^0.41.0 | Client search | Active |
| PDF.js | ^4.0.0 | PDF viewer | Active |
| Playwright | ^1.40.0 | Testing | Dev |
Infrastructure Requirements
| Service | Configuration | Purpose |
|---|---|---|
| Database | SQLite file (or PostgreSQL) | Primary data store |
| Redis | REDIS_HOST (default 127.0.0.1:6379) | BullMQ backend |
| Meilisearch | MEILISEARCH_HOST (default http://127.0.0.1:7700) | Search service |
| File Storage | /uploads directory | PDF and image storage |
Environment Variables (Key)
# Server
PORT=3001
NODE_ENV=development
ALLOWED_ORIGINS=http://localhost:5173
# Database
DATABASE_PATH=./navidocs.db
# Redis
REDIS_HOST=127.0.0.1
REDIS_PORT=6379
# Meilisearch
MEILISEARCH_HOST=http://127.0.0.1:7700
MEILISEARCH_MASTER_KEY=<key>
MEILISEARCH_SEARCH_KEY=<key>
MEILISEARCH_INDEX_NAME=navidocs-pages
# JWT
JWT_SECRET=your-secret-key-change-in-production
JWT_EXPIRES_IN=15m
# File Upload
UPLOAD_DIR=./uploads
MAX_FILE_SIZE=52428800 # 50MB
# OCR
OCR_CONCURRENCY=2
# Rate Limiting
RATE_LIMIT_WINDOW_MS=900000 # 15 minutes
RATE_LIMIT_MAX_REQUESTS=100
IMAGE_RATE_LIMIT_MAX_REQUESTS=200
Validation Summary
Confirmed Technologies:
- Vue 3: ✓ Installed (^3.5.0)
- Express.js: ✓ Installed (^5.0.0)
- SQLite: ✓ Installed via better-sqlite3 (^11.0.0)
- Redis: ✓ Installed via ioredis (^5.0.0)
- Meilisearch: ✓ Installed (^0.41.0)
- Tesseract: ✓ Installed via tesseract.js (^5.0.0)
Status: All core tech stack components present and correctly configured.
7. Architecture Diagram (Text-based)
┌─────────────────────────────────────────────────────────────────┐
│ CLIENT LAYER (Vue 3) │
├─────────────────────────────────────────────────────────────────┤
│ • Vue Router (SPA navigation) │
│ • Pinia (state management) │
│ • Meilisearch Client SDK (full-text search UI) │
│ • PDF.js (document viewer) │
│ • Tailwind CSS (styling) │
└─────────────────────────────────────────────────────────────────┘
↓ HTTP/REST
┌─────────────────────────────────────────────────────────────────┐
│ EXPRESS.JS API LAYER │
├─────────────────────────────────────────────────────────────────┤
│ Routes: /api/auth, /api/documents, /api/search, /api/upload, │
│ /api/organizations, /api/jobs, /api/maintenance, etc │
│ │
│ Middleware: Authentication (JWT), Authorization, Rate Limiting │
│ Request Logging, Security Headers (Helmet) │
│ │
│ Response: JSON (documents, images, search results) │
└─────────────────────────────────────────────────────────────────┘
↓ ↓ ↓
┌─────────────────────────────────────────────────┐
│ SERVICE LAYER (Business Logic) │
├─────────────────────────────────────────────────┤
│ • auth.service.js - JWT, password hashing │
│ • authorization.service.js - Permission checks │
│ • search.js - Meilisearch indexing │
│ • queue.js - BullMQ job management │
│ • ocr-hybrid.js - PDF text extraction │
│ • inventory.service.js - (new feature) │
│ • maintenance.service.js - (new feature) │
│ • capture.service.js - (new feature) │
└─────────────────────────────────────────────────┘
↓ ↓ ↓
┌────────────────────┐ ┌──────────────────────┐ ┌─────────────────┐
│ SQLite DB │ │ Redis Queue │ │ Meilisearch │
├────────────────────┤ ├──────────────────────┤ ├─────────────────┤
│ • users │ │ ocr-processing queue │ │ Full-text index │
│ • organizations │ │ job data + status │ │ Page documents │
│ • documents │ │ (in-memory) │ │ Image text │
│ • entities │ │ │ │ │
│ • components │ │ │ │ │
│ • permissions │ │ │ │ │
│ • maintenance_logs │ │ │ │ │
│ • inventory_items │ │ │ │ │
└────────────────────┘ └──────────────────────┘ └─────────────────┘
↓
┌──────────────────────┐
│ Background Workers │
├──────────────────────┤
│ • ocr-worker.js │
│ - PDF → text │
│ - Tesseract.js OCR │
│ - Index in Meilisearch │
│ - Extract images │
│ - Extract TOC │
│ │
│ • inventory-alerts │
│ • maintenance-reminders │
│ • batch-processor │
└──────────────────────┘
↓
┌──────────────────────┐
│ File System │
├──────────────────────┤
│ /uploads/ │
│ • PDF documents │
│ • Extracted images │
│ • Temporary files │
└──────────────────────┘
8. Data Flow Examples
Document Upload & OCR Processing Flow
1. User uploads PDF via POST /api/upload
├─ Multer stores file in memory
├─ File validation (size, type)
├─ SHA256 hash for deduplication
├─ File saved to disk (/uploads/:docId.pdf)
├─ Document record created (status: processing)
├─ ocr_job record created (status: pending)
└─ Response: { jobId, documentId }
2. API queues OCR job via queue.service.addOcrJob()
└─ BullMQ adds to Redis 'ocr-processing' queue
3. OCR Worker picks up job
├─ extractTextFromPDF() using pdf-parse + Tesseract.js
├─ Per page:
│ ├─ cleanOCRText()
│ ├─ Insert document_page record
│ ├─ Index in Meilisearch
│ ├─ extractImagesFromPage()
│ │ ├─ Convert page to image
│ │ ├─ Extract embedded images
│ │ └─ Run OCR on each image
│ └─ Store image metadata
├─ extractSections() for TOC
├─ Update document status: indexed
└─ Update ocr_job: completed
4. User polls GET /api/jobs/:jobId
├─ Checks database ocr_jobs record
└─ Response: { status, progress, documentId }
5. Document now searchable
├─ POST /api/search/token → Meilisearch auth
├─ POST /api/search → Full-text search results
└─ GET /api/documents/:id → Page list with OCR
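The deduplication step in the upload flow can be sketched as follows. This is a minimal illustration, assuming the uploaded file is available as an in-memory Buffer (as with Multer memory storage); `findDuplicate` and the `knownHashes` Map are hypothetical stand-ins for a lookup against the documents table.

```javascript
const nodeCrypto = require('node:crypto');

// Hex SHA-256 digest of an uploaded file held in memory as a Buffer.
function sha256Hex(buffer) {
  return nodeCrypto.createHash('sha256').update(buffer).digest('hex');
}

// knownHashes (hash -> documentId) stands in for a database lookup (hypothetical).
function findDuplicate(buffer, knownHashes) {
  const hash = sha256Hex(buffer);
  return { hash, duplicateOf: knownHashes.get(hash) ?? null };
}
```

A caller would compute the hash before writing the file to disk and skip the OCR job when `duplicateOf` is not null.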
Search & Document Retrieval Flow
1. User requests search token
POST /api/search/token
├─ Verifies user's organizations
├─ Generates Meilisearch tenant token (org-scoped)
└─ Response: { token, expiresAt, searchUrl }
2. Client calls Meilisearch directly with token
├─ Client library: meilisearch.index().search(q)
└─ Results filtered by organization
3. User clicks document result
GET /api/documents/:id
├─ Verify ownership/access
├─ Fetch document + pages + entity/component
└─ Response: Full metadata + page list
4. User views PDF
GET /api/documents/:id/pdf
├─ Verify access
├─ Stream file from /uploads/:id.pdf
└─ Response: PDF stream
5. User views document images
GET /api/documents/:id/images
├─ Query document_images table
└─ Response: Image metadata + URLs
6. Client fetches image
GET /api/images/:imageId
├─ Verify access
├─ Rate limit (200/min)
├─ Path traversal check
└─ Stream: /uploads/:docId/image_*.png
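Step 1 of this flow issues an org-scoped tenant token. Meilisearch tenant tokens are JWTs signed with an API key and carrying `searchRules`, `apiKeyUid`, and an optional `exp`, so one can be sketched with Node's built-in crypto module. The function name `createTenantToken`, the `organizationId` filter attribute, and the one-hour default lifetime are assumptions, not confirmed details of the NaviDocs implementation.

```javascript
const nodeCrypto = require('node:crypto');

// JSON -> base64url, as used for JWT header and payload segments.
const b64url = (value) => Buffer.from(JSON.stringify(value)).toString('base64url');

// Hypothetical helper: builds an HS256-signed tenant token restricted to
// the given organizations. 'organizationId' is an assumed filterable attribute.
function createTenantToken({
  apiKey,
  apiKeyUid,
  organizationIds,
  ttlSeconds = 3600,
  nowSeconds = Math.floor(Date.now() / 1000),
}) {
  const quoted = organizationIds.map((id) => JSON.stringify(id)).join(', ');
  const payload = {
    searchRules: { '*': { filter: `organizationId IN [${quoted}]` } },
    apiKeyUid,
    exp: nowSeconds + ttlSeconds,
  };
  const signingInput = `${b64url({ alg: 'HS256', typ: 'JWT' })}.${b64url(payload)}`;
  const signature = nodeCrypto
    .createHmac('sha256', apiKey)
    .update(signingInput)
    .digest('base64url');
  return { token: `${signingInput}.${signature}`, expiresAt: payload.exp };
}
```

Because the filter is embedded in the signed token, the client can query Meilisearch directly without being able to widen its own organization scope.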
Permission & Sharing Flow
1. Document Owner Shares Document
POST /api/documents/:id/share
├─ Create document_shares record
├─ Audit log: document.share event
└─ Response: { success, sharedWith }
2. Recipient Accesses Document
GET /api/documents/:id
├─ Check access via:
│ ├─ user_organizations (org membership)
│ ├─ documents.uploaded_by (owner)
│ └─ document_shares (shared with)
├─ Grant read/write permission
└─ Return document + pages
3. Manager Grants Entity Permission
POST /api/permissions/grant
├─ Create entity_permissions record
├─ Set permission_level (viewer|editor|manager|admin)
├─ Optional expiration
├─ Audit log
└─ Response: Permission ID
4. Check Permission
checkEntityPermission(userId, entityId, minimumLevel)
├─ Query entity_permissions table
├─ Verify expiration
├─ Check permission hierarchy
└─ Return: { hasPermission, level }
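The permission check in step 4 can be sketched as a pure function over rows already fetched from entity_permissions. This is an illustration only: `evaluateEntityPermission` is a hypothetical name, and the column names `user_id`, `entity_id`, and `expires_at` are assumptions (only `permission_level` appears above). Timestamps are Unix epoch seconds, as elsewhere in the schema.

```javascript
// Permission hierarchy from lowest to highest, as listed in step 3.
const LEVELS = ['viewer', 'editor', 'manager', 'admin'];

// rows stands in for entity_permissions query results (hypothetical shape).
function evaluateEntityPermission(
  rows,
  userId,
  entityId,
  minimumLevel,
  nowSeconds = Math.floor(Date.now() / 1000)
) {
  const required = LEVELS.indexOf(minimumLevel);
  if (required === -1) throw new Error(`Unknown permission level: ${minimumLevel}`);

  let best = -1;
  for (const row of rows) {
    if (row.user_id !== userId || row.entity_id !== entityId) continue;
    // Skip grants whose optional expiration has passed.
    if (row.expires_at != null && row.expires_at <= nowSeconds) continue;
    best = Math.max(best, LEVELS.indexOf(row.permission_level));
  }
  return { hasPermission: best >= required, level: best === -1 ? null : LEVELS[best] };
}
```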
9. Security Implementation
Authentication & Authorization
JWT Strategy:
- Access Token: 15 minutes (short-lived)
- Refresh Token: 7 days (stored in DB with hash)
- Tokens revoked on password reset
- Account lockout: 15 min after 5 failed attempts
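The lockout rule can be sketched against the `failed_login_attempts` and `locked_until` fields on the users table. The helper names below are hypothetical, and resetting the counter when the lock is applied is an assumption rather than confirmed behavior. Timestamps are Unix epoch seconds.

```javascript
const MAX_FAILED_ATTEMPTS = 5;
const LOCKOUT_SECONDS = 15 * 60;

// True while the lockout window is still in the future.
function isLocked(user, nowSeconds) {
  return user.locked_until != null && user.locked_until > nowSeconds;
}

// Returns an updated copy of the user row after a failed password check.
function recordFailedLogin(user, nowSeconds) {
  const attempts = user.failed_login_attempts + 1;
  if (attempts >= MAX_FAILED_ATTEMPTS) {
    // Assumption: the counter resets once the account is locked.
    return { ...user, failed_login_attempts: 0, locked_until: nowSeconds + LOCKOUT_SECONDS };
  }
  return { ...user, failed_login_attempts: attempts };
}
```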
Password Security:
- Bcrypt with 12 rounds
- Minimum 8 characters
- Hashing on register and reset
Session Management:
- Refresh tokens tracked in database
- Device info and IP logging
- Logout-all support
Role-Based Access Control (RBAC):
Organization Roles:
• viewer: Read-only access
• member: Can upload documents
• manager: Can add members, update org
• admin: Full org control + deletion
Entity Permissions:
• viewer: Read-only
• editor: Can modify/share
• manager: All + member management
• admin: Full control
Default Flow:
User → Organization (role) → Entities (permissions)
API Security
Middleware Stack:
- Helmet: Security headers (CSP, X-Frame-Options, etc.)
- CORS: Whitelisted origins (production)
- Rate Limiting: 100 req/15min per IP (configurable)
- Authentication: JWT verification on protected routes
- Authorization: Role/permission checks in handlers
- Input Validation: UUID format, file type, size limits
- Path Traversal Prevention: Normalized path checks for file serving
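A normalized path check of the kind listed above might look like the following sketch. `resolveInside` is a hypothetical helper, not the function used in the codebase; it resolves the requested path and rejects anything that lands outside the base directory.

```javascript
const nodePath = require('node:path');

// Resolves relativePath under baseDir and throws if the result escapes it.
function resolveInside(baseDir, relativePath) {
  const base = nodePath.resolve(baseDir);
  const target = nodePath.resolve(base, relativePath);
  // Appending the separator avoids accepting siblings like "/uploads-evil".
  if (target !== base && !target.startsWith(base + nodePath.sep)) {
    throw new Error('Path escapes upload directory');
  }
  return target;
}
```

Comparing resolved absolute paths, rather than searching the input for `..`, also covers absolute inputs and mixed separators.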
File Upload Security:
- Multer memory storage (prevents direct disk write)
- File type validation via file-type library
- Size limit: 50MB (configurable)
- SHA256 hash for deduplication
- Filename sanitization (remove dangerous chars)
Data Protection
In Transit:
- HTTPS enforced (production)
- TLS/SSL certificates
- Secure cookies for JWT
At Rest:
- SQLite encryption (optional setup)
- Bcrypt password hashing
- No plaintext credentials in code
Audit Trail:
- All permission changes logged
- User actions tracked (audit_events)
- Login/logout recorded
10. Performance Considerations
Database Optimization
- Indexes on common query columns (org, entity, status, hash)
- Prepared statements via better-sqlite3
- No connection pooling: the current setup uses a single better-sqlite3 connection
Search Optimization
- Meilisearch for full-text indexing (not SQLite FTS)
- Async indexing in OCR worker
- Tenant tokens for client-side search
- 30-min LRU cache for TOC queries
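The TOC cache can be pictured as an LRU map whose entries also expire after 30 minutes. The class below is a minimal sketch under that assumption; `TtlLruCache`, its `maxEntries` default, and the injectable clock are illustrative and do not reflect which cache library, if any, the codebase uses.

```javascript
class TtlLruCache {
  constructor({ maxEntries = 100, ttlMs = 30 * 60 * 1000, now = () => Date.now() } = {}) {
    this.maxEntries = maxEntries;
    this.ttlMs = ttlMs;
    this.now = now;
    this.entries = new Map(); // insertion order doubles as recency order
  }

  get(key) {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    this.entries.delete(key);
    if (this.now() - entry.storedAt >= this.ttlMs) return undefined; // expired
    this.entries.set(key, entry); // re-insert to mark as most recently used
    return entry.value;
  }

  set(key, value) {
    this.entries.delete(key);
    this.entries.set(key, { value, storedAt: this.now() });
    if (this.entries.size > this.maxEntries) {
      // The first key in iteration order is the least recently used.
      const oldestKey = this.entries.keys().next().value;
      this.entries.delete(oldestKey);
    }
  }
}
```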
OCR Processing
- Concurrency: 2 documents (configurable via OCR_CONCURRENCY)
- Limiter: 5 jobs/minute (prevents Tesseract overload)
- Progress tracking (0-100%)
- Batch image processing
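The concurrency and limiter settings above map onto BullMQ's Worker options (`concurrency` and `limiter: { max, duration }`). The sketch below only builds that options object so it can run without Redis; `buildOcrWorkerOptions` and `processOcrJob` are hypothetical names, and the wiring in the trailing comment is an assumption about how the worker is constructed.

```javascript
// Builds the options object for a BullMQ Worker on the 'ocr-processing' queue.
function buildOcrWorkerOptions(env = process.env) {
  const concurrency = Number.parseInt(env.OCR_CONCURRENCY ?? '2', 10);
  return {
    concurrency, // documents processed in parallel
    limiter: { max: 5, duration: 60_000 }, // at most 5 jobs per 60 s
  };
}

// Assumed wiring (requires the bullmq package and a Redis connection):
// const { Worker } = require('bullmq');
// new Worker('ocr-processing', processOcrJob, { connection, ...buildOcrWorkerOptions() });
```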
Memory Management
- Streaming responses for large PDFs
- Image compression via sharp
- LRU cache cleanup (30 min TTL)
- Job cleanup: completed jobs removed after 24h, failed jobs after 7 days
Scalability Bottlenecks
- Single SQLite connection: Switch to PostgreSQL for concurrent writes
- Local file storage: Switch to S3/cloud storage
- Tesseract CPU usage: Distribute workers across machines
- Meilisearch scale: Deploy cluster for high traffic
11. Known Issues & TODOs
Authentication
- Authentication middleware incomplete (req.user often hardcoded as 'test-user-id')
- Email verification not sent (template needed)
- Password reset email not sent (template needed)
Authorization
- Some endpoints missing auth checks
- Entity-level permissions not fully integrated
- Document-level permissions incomplete
Database
- Password reset tokens table missing from schema
- Refresh tokens table missing from schema
- Audit events table not defined
- Document images table not in schema.sql
- Document metadata handling inconsistent
OCR Worker
- Image extraction may fail silently
- Section extraction error handling needs improvement
- TOC extraction is currently treated as optional because of timing issues (should be made robust)
Frontend
- Client-side image upload/capture not implemented
- Multilingual search needs testing
- Rate limiting feedback incomplete
12. Integration Roadmap for New Features
Phase 1: Inventory Management
Dependencies:
- Components schema (exists)
- Basic CRUD API patterns (exist)
- Database migrations (setup required)
Estimated effort: 3-4 days
New files: 3 (service, routes, worker)
Database changes: +2 tables
Phase 2: Maintenance Tracking
Dependencies:
- Inventory feature (Phase 1)
- Meilisearch indexing (exists)
- Audit logging (partial)
Estimated effort: 2-3 days
New files: 3 (service, routes, worker)
Database changes: +1 table
Phase 3: Camera/Capture Feature
Dependencies:
- Upload API (exists)
- PDF processing (exists)
- WebRTC/Camera API (client)
Estimated effort: 4-5 days
New files: 4 (service, routes, worker, batch-processor)
Database changes: +2 tables
Phase 4: Enhanced Search & Analytics
Dependencies:
- Meilisearch integration (exists)
- Audit trail (Phase 2+)
- Statistics API (exists)
Estimated effort: 2-3 days
New files: 2 (service, routes)
Conclusion
The NaviDocs codebase is well-structured with clear separation of concerns:
- Database: Comprehensive schema supporting multi-entity, multi-tenant architecture
- API: RESTful endpoints organized by feature with consistent patterns
- Services: Business logic isolated from routes with dependency injection
- Workers: Background OCR processing via BullMQ + Redis
- Frontend: Vue 3 SPA with Meilisearch client-side search
Ready for integration of:
- Inventory management
- Maintenance tracking
- Camera/document capture
- Enhanced analytics
All integration points identified and documented above.