navidocs/docs/analysis/lilian1-extraction-plan.md
ggq-admin 155a8c0305 feat: NaviDocs MVP - Complete codebase extraction from lilian1
## Backend (server/)
- Express 5 API with security middleware (helmet, rate limiting)
- SQLite database with WAL mode (schema from docs/architecture/)
- Meilisearch integration with tenant tokens
- BullMQ + Redis background job queue
- OCR pipeline with Tesseract.js
- File safety validation (extension, MIME, size)
- 4 API route modules: upload, jobs, search, documents

## Frontend (client/)
- Vue 3 with Composition API (<script setup>)
- Vite 5 build system with HMR
- Tailwind CSS (Meilisearch-inspired design)
- UploadModal with drag-and-drop
- FigureZoom component (ported from lilian1)
- Meilisearch search integration with tenant tokens
- Job polling composable
- Clean SVG icons (no emojis)

## Code Extraction
- manuals.js → UploadModal.vue, useJobPolling.js
- figure-zoom.js → FigureZoom.vue
- service-worker.js → client/public/service-worker.js (TODO)
- glossary.json → Merged into Meilisearch synonyms
- Discarded: quiz.js, persona.js, gamification.js (Frank-AI junk)

## Documentation
- Complete extraction plan in docs/analysis/
- README with quick start guide
- Architecture summary in docs/architecture/

## Build Status
- Server dependencies: Installed (234 packages)
- Client dependencies: Installed (160 packages)
- Client build: Successful (2.63s)

🤖 Generated with Claude Code

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-19 01:55:44 +02:00


lilian1 (FRANK-AI) Code Extraction Plan

Date: 2025-10-19
Purpose: Extract clean, production-ready code from the lilian1 prototype; discard experimental Frank-AI features
Target: NaviDocs MVP with Meilisearch-inspired design


Executive Summary

lilian1 is a working boat manual assistant prototype called "FRANK-AI" with:

  • Total size: 2794 lines of JavaScript (7 files)
  • Clean code: ~940 lines worth extracting
  • Frank-AI junk: ~1850 lines to discard
  • Documentation: 56+ experimental markdown files to discard

Key Decision: What to Extract vs Discard

| Category | Extract | Discard | Reason |
| --- | --- | --- | --- |
| Manual management | Yes | | Core upload/job polling logic is solid |
| Figure zoom | Yes | | Excellent UX, accessibility-first, production-ready |
| Service worker | Yes | | PWA pattern is valuable for offline boat manuals |
| Quiz system | | Yes | Gamification - not in NaviDocs MVP scope |
| Persona system | | Yes | AI personality - not needed |
| Gamification | | Yes | Points/achievements - not in MVP scope |
| Debug overlay | | Yes | Development tool - replace with proper logging |

Files to Extract

1. app/js/manuals.js (451 lines)

What it does:

  • Upload PDF to backend
  • Poll job status with progress tracking
  • Catalog loading (manuals list)
  • Modal controls for upload UI
  • Toast notifications

Clean patterns to port to Vue:

// Job polling pattern (lines 288-322)
async function startPolling(jobId) {
  pollInterval = setInterval(async () => {
    const response = await fetch(`${apiBase}/api/manuals/jobs/${jobId}`);
    const data = await response.json();
    updateJobStatus(data);
    if (data.status === 'completed' || data.status === 'failed') {
      clearInterval(pollInterval);
    }
  }, 2000);
}

Port to NaviDocs as:

  • client/src/components/UploadModal.vue - Upload UI
  • client/src/composables/useJobPolling.js - Polling logic
  • client/src/composables/useManualsCatalog.js - Catalog state

Discard:

  • Line 184: ingestFromUrl() - Claude CLI integration (not in MVP)
  • Line 134: findManuals() - Claude search (replace with Meilisearch)

2. app/js/figure-zoom.js (299 lines)

What it does:

  • Pan/zoom for PDF page images
  • Mouse wheel, drag, touch pinch controls
  • Keyboard shortcuts (+, -, 0)
  • Accessibility (aria-labels, prefers-reduced-motion)
  • Premium UX (spring easing)

This is EXCELLENT code - port as-is to Vue:

  • client/src/components/FigureZoom.vue - Wrap in Vue component
  • Keep all logic: updateTransform, bindMouseEvents, bindTouchEvents
  • Keep accessibility features

Why it's good:

  • Respects prefers-reduced-motion
  • Proper event cleanup
  • Touch support for mobile
  • Smooth animations with cubic-bezier easing

3. app/service-worker.js (192 lines)

What it does:

  • PWA offline caching
  • Precache critical files (index.html, CSS, JS, data files)
  • Cache-first strategy for data, network-first for HTML
  • Background sync hooks (future)
  • Push notification hooks (future)

Port to NaviDocs as:

  • client/public/service-worker.js - Adapt for Vue/Vite build
  • Update PRECACHE_URLS to match Vite build output
  • Keep cache-first strategy for manuals (important for boats with poor connectivity)

Changes needed:

// OLD: FRANK-AI hardcoded paths
const PRECACHE_URLS = ['/index.html', '/css/app.css', ...];

// NEW: Vite build output (generated from manifest)
const PRECACHE_URLS = [
  '/',
  '/assets/index-[hash].js',
  '/assets/index-[hash].css',
  '/data/manuals.json'
];
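
Because the hashed file names change on every build, the list cannot be hard-coded. A minimal sketch of deriving it from Vite's build manifest follows, assuming `build.manifest: true` is set in `vite.config.js`. The helper name `precacheUrlsFromManifest` and the sample hashes are illustrative, not part of lilian1 or NaviDocs.

```javascript
// Hypothetical helper: derive precache URLs from a Vite build manifest.
// The manifest maps each source file to { file, css, isEntry, ... }.
function precacheUrlsFromManifest(manifest, extraUrls = []) {
  const urls = new Set(['/']);
  for (const chunk of Object.values(manifest)) {
    if (!chunk.isEntry) continue;              // only precache entry chunks
    urls.add('/' + chunk.file);
    for (const css of chunk.css ?? []) urls.add('/' + css);
  }
  for (const url of extraUrls) urls.add(url);
  return [...urls];
}

// Example manifest shaped like Vite's output (hashes are made up).
const sampleManifest = {
  'index.html': {
    file: 'assets/index-4f2a9c.js',
    src: 'index.html',
    isEntry: true,
    css: ['assets/index-9b1d77.css']
  }
};

const PRECACHE_URLS = precacheUrlsFromManifest(sampleManifest, ['/data/manuals.json']);
console.log(PRECACHE_URLS);
```

In a real build this would run as a post-build step that writes the resulting list into the service worker file.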

4. data/glossary.json (184 lines)

What it is:

  • Boat manual terminology index
  • Maps terms to page numbers
  • Examples: "Bilge", "Blackwater", "Windlass", "Galley", "Seacock"

How to use:

  • Extract unique terms
  • Add to Meilisearch synonyms config (we already have 40+, this adds more)
  • Use for autocomplete suggestions in search bar

Example extraction:

// Candidate synonym entries, checked against meilisearch-config.json:
"seacock": ["through-hull", "thru-hull"],  // Already have
"demister": ["defroster", "windscreen demister"],  // Add
"reboarding": ["ladder", "swim platform"],  // Add
"mooring": ["docking", "tie-up"],  // Add

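The merge step could be sketched as below. Meilisearch stores synonyms as an object mapping a term to an array of synonyms, so the merge is a per-key set union. The function name `mergeSynonyms` and the candidate data are assumptions for illustration; the real structure of glossary.json is not shown in this plan.

```javascript
// Merge candidate synonym groups into an existing Meilisearch-style
// synonyms object ({ term: [synonym, ...] }) without duplicating entries.
function mergeSynonyms(existing, candidates) {
  const merged = {};
  for (const [term, list] of Object.entries(existing)) {
    merged[term.toLowerCase()] = [...new Set(list.map((s) => s.toLowerCase()))];
  }
  const added = [];
  for (const [term, list] of Object.entries(candidates)) {
    const key = term.toLowerCase();
    if (!(key in merged)) added.push(key);     // report genuinely new terms
    const current = new Set(merged[key] ?? []);
    for (const s of list) current.add(s.toLowerCase());
    merged[key] = [...current];
  }
  return { merged, added };
}

const existing = { seacock: ['through-hull', 'thru-hull'] };
const fromGlossary = {
  seacock: ['through-hull'],
  demister: ['defroster', 'windscreen demister'],
  mooring: ['docking', 'tie-up']
};
const { merged, added } = mergeSynonyms(existing, fromGlossary);
console.log(added);
```

Note that a Meilisearch entry like `"seacock": ["through-hull"]` only applies in one direction; if "through-hull" should also find "seacock", the reverse entry has to be added as well.
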
Files to Discard

Gamification / AI Persona (Frank-AI Experiments)

| File | Lines | Reason to Discard |
| --- | --- | --- |
| app/js/quiz.js | 209 | Quiz game - not in MVP scope |
| app/js/persona.js | 209 | AI personality system - not needed |
| app/js/gamification.js | 304 | Points/badges/achievements - not in MVP |
| app/js/debug-overlay.js | ~100 | Dev tool - replace with proper logging |

Total discarded across these four files: ~820 lines


Documentation Files (56+ files to discard)

All files starting with:

  • CLAUDE_SUPERPROMPT_*.md (8 files) - AI experiment prompts
  • FRANK_AI_*.md (3 files) - Frank-AI specific docs
  • FIGURE_*.md (6 files) - Figure implementation docs (interesting but not needed)
  • TEST_*.md (8 files) - Test reports (good to read, but don't copy)
  • *_REPORT.md (12 files) - Sprint reports
  • *_SUMMARY.md (10 files) - Session summaries
  • SECURITY-*.md (3 files) - Security audits (good insights, already captured in hardened-production-guide.md)
  • UX-*.md (3 files) - UX reviews

Keep for reference (read but don't copy):

  • README.md - Understand the project
  • CHANGES.md - What was changed over time
  • DEMO_ACCESS.txt - How to run lilian1

Total: ~1200 lines of markdown to discard
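
To apply the lists above consistently during migration, the patterns can be encoded once and run over the lilian1 file listing. This is only a sketch: `classifyDoc` and the `review` fallback are hypothetical, and the sample file names in the usage lines are invented.

```javascript
// Patterns taken from the discard list above; the keep list wins on conflict.
const DISCARD_PATTERNS = [
  /^CLAUDE_SUPERPROMPT_.*\.md$/,
  /^FRANK_AI_.*\.md$/,
  /^FIGURE_.*\.md$/,
  /^TEST_.*\.md$/,
  /_REPORT\.md$/,
  /_SUMMARY\.md$/,
  /^SECURITY-.*\.md$/,
  /^UX-.*\.md$/
];
const KEEP_FOR_REFERENCE = new Set(['README.md', 'CHANGES.md', 'DEMO_ACCESS.txt']);

function classifyDoc(filename) {
  if (KEEP_FOR_REFERENCE.has(filename)) return 'reference';
  if (DISCARD_PATTERNS.some((re) => re.test(filename))) return 'discard';
  return 'review'; // not covered by the plan: decide manually
}

console.log(classifyDoc('TEST_UPLOAD.md'));
console.log(classifyDoc('README.md'));
```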


Migration Strategy

Phase 1: Bootstrap NaviDocs Structure

cd ~/navidocs

# Create directories
mkdir -p server/{routes,services,workers,db,config}
mkdir -p client/{src/{components,composables,views,stores,assets},public}

# Initialize package.json files

server/package.json:

{
  "name": "navidocs-server",
  "version": "1.0.0",
  "type": "module",
  "dependencies": {
    "express": "^5.0.0",
    "better-sqlite3": "^11.0.0",
    "meilisearch": "^0.41.0",
    "bullmq": "^5.0.0",
    "helmet": "^7.0.0",
    "express-rate-limit": "^7.0.0",
    "tesseract.js": "^5.0.0",
    "uuid": "^10.0.0",
    "bcrypt": "^5.1.0",
    "jsonwebtoken": "^9.0.0"
  }
}

client/package.json:

{
  "name": "navidocs-client",
  "version": "1.0.0",
  "type": "module",
  "scripts": {
    "dev": "vite",
    "build": "vite build",
    "preview": "vite preview"
  },
  "dependencies": {
    "vue": "^3.5.0",
    "vue-router": "^4.4.0",
    "pinia": "^2.2.0",
    "pdfjs-dist": "^4.0.0"
  },
  "devDependencies": {
    "@vitejs/plugin-vue": "^5.0.0",
    "vite": "^5.0.0",
    "tailwindcss": "^3.4.0",
    "autoprefixer": "^10.4.0",
    "postcss": "^8.4.0"
  }
}

Phase 2: Port Clean Code

Step 1: Figure Zoom Component

From: lilian1/app/js/figure-zoom.js
To: navidocs/client/src/components/FigureZoom.vue

Changes:

  • Wrap in Vue component
  • Use Vue refs for state (scale, translateX, translateY)
  • Use Vue lifecycle hooks (onMounted, onUnmounted)
  • Keep all UX logic identical

Implementation:

<template>
  <div class="figure-lightbox" v-if="isOpen">
    <img
      ref="imageRef"
      :src="imageSrc"
      @wheel="handleWheel"
      @mousedown="handleMouseDown"
    />
    <div class="zoom-controls">
      <button @click="zoomIn" aria-label="Zoom in">+</button>
      <button @click="zoomOut" aria-label="Zoom out">-</button>
      <button @click="reset" aria-label="Reset zoom">Reset</button>
      <span>{{ Math.round(scale * 100) }}%</span>
    </div>
  </div>
</template>

<script setup>
import { ref, onMounted, onUnmounted } from 'vue';

const imageRef = ref(null);
const scale = ref(1);
const translateX = ref(0);
const translateY = ref(0);

// Copy all logic from figure-zoom.js
// ...
</script>
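
The part of the zoom logic most worth keeping framework-free is the transform math, since it can be unit-tested without a DOM. Below is a minimal sketch of zooming around the cursor. The name `zoomAt` and the scale limits are assumptions; the actual values in figure-zoom.js are not quoted in this plan.

```javascript
const MIN_SCALE = 1; // assumed limits, not taken from figure-zoom.js
const MAX_SCALE = 5;

// Zoom by `factor` while keeping the image point under (cx, cy) fixed.
// Assumes CSS `transform: translate(x, y) scale(s)` with transform-origin 0 0,
// so a local point p appears on screen at x + s * p.
function zoomAt(state, factor, cx, cy) {
  const scale = Math.min(MAX_SCALE, Math.max(MIN_SCALE, state.scale * factor));
  const ratio = scale / state.scale; // effective factor after clamping
  return {
    scale,
    x: cx - ratio * (cx - state.x),
    y: cy - ratio * (cy - state.y)
  };
}

const start = { scale: 1, x: 0, y: 0 };
const zoomed = zoomAt(start, 2, 100, 50);
console.log(zoomed); // { scale: 2, x: -100, y: -50 }
```

Inside the component, the result would be assigned to `scale.value`, `translateX.value` and `translateY.value` from the wheel handler.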

Step 2: Upload Modal Component

From: lilian1/app/js/manuals.js (lines 228-263)
To: navidocs/client/src/components/UploadModal.vue

Changes:

  • Replace vanilla DOM manipulation with Vue reactivity
  • Use <script setup> syntax
  • Replace FormData upload with Meilisearch-safe approach

Step 3: Job Polling Composable

From: lilian1/app/js/manuals.js (lines 288-322)
To: navidocs/client/src/composables/useJobPolling.js

Pattern:

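import { ref, onUnmounted } from 'vue';

export function useJobPolling(apiBase) {
  const jobId = ref(null);
  const progress = ref(0);
  const status = ref('pending');
  const error = ref(null);
  let pollInterval = null;

  // Stop the timer; safe to call when no poll is running.
  function stopPolling() {
    if (pollInterval) clearInterval(pollInterval);
    pollInterval = null;
  }

  function startPolling(id) {
    stopPolling(); // never run two timers for the same component
    jobId.value = id;
    error.value = null;

    pollInterval = setInterval(async () => {
      try {
        const response = await fetch(`${apiBase}/api/jobs/${id}`);
        if (!response.ok) throw new Error(`Job status request failed: ${response.status}`);
        const data = await response.json();

        progress.value = data.progress;
        status.value = data.status;

        if (data.status === 'completed' || data.status === 'failed') {
          stopPolling();
        }
      } catch (err) {
        // Keep polling on transient errors, but surface the latest one.
        error.value = err;
      }
    }, 2000);
  }

  onUnmounted(stopPolling);

  return { jobId, progress, status, error, startPolling, stopPolling };
}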
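
The polling loop itself can also be exercised without Vue or a network by injecting the status fetcher. The sketch below is one possible shape, not code from lilian1: `pollJob`, `fetchStatus`, `onUpdate`, `maxAttempts` and the `processing` status are illustrative names. It awaits each request before scheduling the next, so slow responses cannot overlap, and 150 attempts at 2 s matches the 5-minute OCR budget used elsewhere in this plan.

```javascript
const TERMINAL_STATES = new Set(['completed', 'failed']);

// Poll `fetchStatus` until the job reaches a terminal state or attempts run out.
async function pollJob(fetchStatus, { intervalMs = 2000, maxAttempts = 150, onUpdate = () => {} } = {}) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const data = await fetchStatus();
    onUpdate(data);
    if (TERMINAL_STATES.has(data.status)) return data;
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`Job still running after ${maxAttempts} polls`);
}

// Usage with a fake job that completes on the third poll.
const fakeResponses = [
  { status: 'pending', progress: 0 },
  { status: 'processing', progress: 50 },
  { status: 'completed', progress: 100 }
];
const seen = [];
const pollResult = pollJob(async () => fakeResponses.shift(), {
  intervalMs: 1,
  onUpdate: (data) => seen.push(data.progress)
});
pollResult.then((job) => console.log(job.status, seen));
```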

Step 4: Service Worker

From: lilian1/app/service-worker.js
To: navidocs/client/public/service-worker.js

Changes:

  • Update CACHE_NAME to navidocs-v1
  • Update PRECACHE_URLS to match Vite build output
  • Keep cache strategy identical (cache-first for data, network-first for HTML)
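
The strategy split above can be isolated in a small pure function that the `fetch` handler consults. This is a sketch under assumptions: `pickStrategy` is a hypothetical name, treating `/assets/` as cache-first and everything else as network-only is a guess at sensible defaults, and the lilian1 service worker may route requests differently.

```javascript
// Decide which caching strategy a request should get.
// `request` only needs `mode` and `url`, like a Fetch API Request.
function pickStrategy(request) {
  const { pathname } = new URL(request.url);
  if (request.mode === 'navigate' || pathname.endsWith('.html')) {
    return 'network-first'; // HTML: prefer fresh, fall back to cache offline
  }
  if (pathname.startsWith('/data/') || pathname.startsWith('/assets/')) {
    return 'cache-first';   // manuals data and hashed build assets
  }
  return 'network-only';    // e.g. API calls should not be served stale
}

console.log(pickStrategy({ mode: 'navigate', url: 'https://example.test/' }));
console.log(pickStrategy({ mode: 'cors', url: 'https://example.test/data/manuals.json' }));
```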

Phase 3: Backend API Structure

New files (not in lilian1):

server/
├── index.js              # Express app entry point
├── config/
│   ├── db.js             # SQLite connection
│   └── meilisearch.js    # Meilisearch client
├── routes/
│   ├── upload.js         # POST /api/upload
│   ├── jobs.js           # GET /api/jobs/:id
│   ├── search.js         # POST /api/search (with tenant tokens)
│   └── documents.js      # GET /api/documents/:id
├── services/
│   ├── file-safety.js    # 4-layer validation pipeline
│   ├── ocr.js            # Tesseract.js wrapper
│   └── search.js         # Meilisearch service
├── workers/
│   └── ocr-worker.js     # BullMQ worker for OCR jobs
└── db/
    ├── schema.sql        # (Already created in docs/architecture/)
    └── migrations/       # Future schema changes

lilian1 had: api/server.js (custom search logic)
NaviDocs will use: Meilisearch (< 10ms vs ~100ms, typo tolerance, synonyms)
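
As an illustration of what services/file-safety.js might contain, here is a minimal sketch of a layered check. Extension, MIME type and size are the layers named in this plan. The fourth layer is not specified, so the PDF signature check below is an assumption, as are the 50 MB limit and the `validateUpload` name.

```javascript
const MAX_BYTES = 50 * 1024 * 1024;               // assumed upload limit
const PDF_MAGIC = [0x25, 0x50, 0x44, 0x46, 0x2d]; // "%PDF-"

// `head` holds the first bytes of the uploaded file (Uint8Array or Buffer).
function validateUpload({ filename, mimeType, size, head }) {
  const errors = [];
  if (!/\.pdf$/i.test(filename)) errors.push('extension');
  if (mimeType !== 'application/pdf') errors.push('mime');
  if (!(size > 0 && size <= MAX_BYTES)) errors.push('size');
  if (!PDF_MAGIC.every((byte, i) => head[i] === byte)) errors.push('signature');
  return { ok: errors.length === 0, errors };
}

const goodHead = new TextEncoder().encode('%PDF-1.7');
const good = validateUpload({ filename: 'manual.pdf', mimeType: 'application/pdf', size: 1024, head: goodHead });
const bad = validateUpload({ filename: 'manual.pdf.exe', mimeType: 'application/pdf', size: 1024, head: new Uint8Array([0x4d, 0x5a]) });
console.log(good, bad);
```

Collecting every failed layer instead of stopping at the first makes rejected uploads easier to debug from the logs.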


Phase 4: Frontend Structure

New Vue 3 app (not in lilian1):

client/
├── index.html
├── vite.config.js
├── tailwind.config.js
├── src/
│   ├── main.js
│   ├── App.vue
│   ├── router.js
│   ├── components/
│   │   ├── UploadModal.vue      # ← From manuals.js
│   │   ├── FigureZoom.vue       # ← From figure-zoom.js
│   │   ├── SearchBar.vue        # ← New
│   │   ├── DocumentViewer.vue   # ← New (PDF.js)
│   │   └── JobProgress.vue      # ← From manuals.js
│   ├── composables/
│   │   ├── useJobPolling.js     # ← From manuals.js
│   │   ├── useManualsCatalog.js # ← From manuals.js
│   │   └── useSearch.js         # ← New (Meilisearch)
│   ├── views/
│   │   ├── HomeView.vue
│   │   ├── SearchView.vue
│   │   └── DocumentView.vue
│   ├── stores/
│   │   └── manuals.js           # Pinia store
│   └── assets/
│       └── icons/               # Clean SVG icons (Meilisearch-inspired)
└── public/
    └── service-worker.js         # ← From lilian1

Design System: Meilisearch-Inspired

User directive: "use as much of the https://www.meilisearch.com/ look and feel as possible, grab it all, no emojis, clean svg sybold for an expensive grown up look and feel"

Visual Analysis of Meilisearch.com

Colors:

  • Primary: #FF5CAA (Pink)
  • Secondary: #6C5CE7 (Purple)
  • Accent: #00D4FF (Cyan)
  • Neutral: #1E1E2F (Dark), #F5F5FA (Light)

Typography:

  • Headings: Bold, sans-serif (likely Inter or similar)
  • Body: Medium weight, generous line-height
  • Code: Monospace (Fira Code or similar)

Icons:

  • Clean SVG line icons
  • 24px base size
  • 2px stroke weight
  • Rounded corners (not sharp)

Components:

  • Generous padding (24px, 32px)
  • Subtle shadows: box-shadow: 0 4px 24px rgba(0,0,0,0.08)
  • Rounded corners: border-radius: 12px
  • Search bar: Large (56px height), prominent, centered

NaviDocs adaptation:

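// tailwind.config.js
/** @type {import('tailwindcss').Config} */
export default {
  // Content globs are an assumption based on the client/ layout in this plan
  content: ['./index.html', './src/**/*.{vue,js}'],
  theme: {
    extend: {
      colors: {
        primary: '#0EA5E9',      // Sky blue (boat theme)
        secondary: '#6366F1',    // Indigo
        accent: '#10B981',       // Green (success)
        dark: '#1E293B',
        light: '#F8FAFC'
      },
      fontFamily: {
        sans: ['Inter', 'system-ui', 'sans-serif'],
        mono: ['Fira Code', 'monospace']
      },
      borderRadius: {
        DEFAULT: '12px',
        lg: '16px'
      }
    }
  }
};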

Icon System

NO emojis - Use clean SVG icons from:

Icons needed:

  • Upload (cloud-arrow-up)
  • Search (magnifying-glass)
  • Document (document-text)
  • Boat (custom or use sailboat icon)
  • Settings (cog)
  • User (user-circle)
  • Close (x-mark)
  • Zoom in/out (magnifying-glass-plus/minus)

Data Structure Insights

lilian1 data/pages.json structure:

{
  "manual": "boat",
  "slug": "boat",
  "vendor": "Prestige",
  "model": "F4.9",
  "pages": [
    {
      "p": 1,
      "headings": ["Owner Manual", "Technical Information"],
      "text": "Full OCR text here...",
      "figures": ["f1-p42-electrical-overview"]
    }
  ]
}

NaviDocs Meilisearch document structure:

{
  "id": "page_doc_abc123_p7",
  "vertical": "boating",

  "organizationId": "org_xyz789",
  "entityId": "boat_prestige_f49_001",
  "entityName": "Sea Breeze",

  "docId": "doc_abc123",
  "userId": "user_456",

  "documentType": "owner-manual",
  "title": "Owner Manual - Page 7",
  "pageNumber": 7,
  "text": "Full OCR text here...",

  "boatMake": "Prestige",
  "boatModel": "F4.9",
  "boatYear": 2024,

  "language": "en",
  "ocrConfidence": 0.94,

  "createdAt": 1740234567,
  "updatedAt": 1740234567
}

Key difference: like lilian1, NaviDocs indexes one document per page, but each page document carries richer metadata (organization, entity, document type, OCR confidence) to support multiple verticals.
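
The mapping between the two structures could look roughly like this. It is a sketch only: `toSearchDocuments` and the `ctx` object are hypothetical, and fields that lilian1 never recorded (entityName, userId, boatYear, ocrConfidence) are left out because they would have to come from the upload record or the OCR step.

```javascript
// Map one lilian1 pages.json manual into per-page search documents.
// `ctx` carries ids and tenant info that lilian1 never had.
function toSearchDocuments(manual, ctx, now = Math.floor(Date.now() / 1000)) {
  return manual.pages.map((page) => ({
    id: `page_${ctx.docId}_p${page.p}`,
    vertical: 'boating',
    organizationId: ctx.organizationId,
    entityId: ctx.entityId,
    docId: ctx.docId,
    documentType: ctx.documentType,
    title: `${page.headings?.[0] ?? manual.manual} - Page ${page.p}`,
    pageNumber: page.p,
    text: page.text,
    boatMake: manual.vendor,
    boatModel: manual.model,
    language: ctx.language ?? 'en',
    createdAt: now,
    updatedAt: now
  }));
}

const lilianManual = {
  manual: 'boat', slug: 'boat', vendor: 'Prestige', model: 'F4.9',
  pages: [{ p: 7, headings: ['Owner Manual'], text: 'Full OCR text here...', figures: [] }]
};
const docs = toSearchDocuments(lilianManual, {
  docId: 'doc_abc123',
  organizationId: 'org_xyz789',
  entityId: 'boat_prestige_f49_001',
  documentType: 'owner-manual'
}, 1740234567);
console.log(docs[0].id, docs[0].title);
```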


Testing Strategy

lilian1 had:

  • Playwright E2E tests (tests/e2e/app.spec.js)
  • Multi-manual ingestion tests
  • Engagement pack tests

NaviDocs will have:

Playwright tests:

tests/
├── upload.spec.js        # Upload PDF → job completes → searchable
├── search.spec.js        # Search with synonyms
├── document.spec.js      # View PDF, zoom figures
└── offline.spec.js       # PWA offline mode

Test cases:

  1. Upload PDF → OCR completes in < 5min → search finds text
  2. Search "bilge" → finds "sump pump" (synonym test)
  3. Search "electrical" → highlights matches in results
  4. Open document → zoom in/out → pan around
  5. Go offline → app still loads → cached manuals work

Success Criteria

Before declaring NaviDocs MVP ready:

  • All clean code extracted from lilian1
  • No Frank-AI junk (quiz, persona, gamification) in codebase
  • Meilisearch-inspired design applied (no emojis, clean SVG icons)
  • Upload PDF → OCR → searchable in < 5min
  • Search latency < 100ms
  • Synonym search works ("bilge" finds "sump pump")
  • Figure zoom component works (pan, zoom, keyboard shortcuts)
  • PWA offline mode caches manuals
  • Playwright tests pass (4+ E2E scenarios)
  • All fields display correctly in UI
  • No console errors in production build
  • Proof of working system (screenshots, demo video)

Timeline Estimate

| Phase | Tasks | Time |
| --- | --- | --- |
| Bootstrap | Create directory structure, package.json files | 1 hour |
| Backend API | SQLite schema, Meilisearch setup, upload endpoint | 4 hours |
| OCR Pipeline | Tesseract.js integration, BullMQ queue | 3 hours |
| Frontend Core | Vue 3 + Vite + Tailwind setup, routing | 2 hours |
| Components | Upload modal, search bar, document viewer | 4 hours |
| Figure Zoom | Port from lilian1, adapt to Vue | 2 hours |
| Service Worker | Port PWA offline support | 1 hour |
| Testing | Playwright E2E tests | 3 hours |
| Polish | Debug, validate fields, UI refinement | 4 hours |
| Total | | 24 hours |

With multi-agent approach: Can parallelize backend + frontend work → ~12-16 hours


Next Steps

  1. Complete this extraction plan document
  2. ⏭️ Bootstrap NaviDocs directory structure
  3. ⏭️ Set up Vue 3 + Vite + Tailwind
  4. ⏭️ Implement backend API (Express, SQLite, Meilisearch)
  5. ⏭️ Port figure-zoom component
  6. ⏭️ Implement upload & OCR pipeline
  7. ⏭️ Add Playwright tests
  8. ⏭️ Debug and validate
  9. ⏭️ Proof of working system

User directive: "develop, debug, deploy and repeat; multi agent the max out of this"

Let's ship it.