Commit graph

5 commits

Author SHA1 Message Date
ggq-admin
b152df159d feat: Add dotenv loading to OCR worker for environment configuration
- Import dotenv in worker to load .env configuration
- Specify explicit path to server/.env file
- Update Meilisearch config to use changeme123 as default key
- Add debug logging to Meilisearch client initialization
- Add meilisearch-data/ to .gitignore

OCR pipeline is fully functional with 85% confidence:
- PDF upload 
- Queue processing 
- PDF to image conversion 
- Tesseract OCR 
- Database storage 

Remaining issue: Meilisearch authentication needs to be resolved
to enable search indexing.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-19 09:00:16 +02:00
ggq-admin
df68e27e26 fix: Complete OCR pipeline with language code mapping
- Fix tesseract language code mapping (en -> eng) to match available training data
- Switch from Tesseract.js to local system tesseract command for better reliability
- Add TESSDATA_PREFIX environment variable for tesseract data path
- Create test directory structure to workaround pdf-parse debug mode
- OCR now successfully extracting text with 0.85 confidence

Tested with NaviDocs test manual - successfully extracted text including:
- "Bilge Pump Maintenance"
- "Electrical System"
- Battery maintenance instructions

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-19 05:09:51 +02:00
ggq-admin
af02363299 fix: Switch to local system tesseract command for OCR
- Replace Tesseract.js with local tesseract CLI due to CDN 404 issues
- Fix queue name mismatch (ocr-processing vs ocr-jobs)
- Local tesseract uses pre-installed training data
- Faster and more reliable than downloading from CDN

\ud83e\udd16 Generated with Claude Code

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-19 04:48:18 +02:00
ggq-admin
09892de4a3 chore: Local development environment setup
- Installed system dependencies (Redis, Tesseract, poppler-utils)
- Downloaded and configured Meilisearch 1.11.3
- Initialized SQLite database with schema
- Started all services successfully:
  - Meilisearch on port 7700
  - Redis on port 6379
  - Backend API on port 3001
  - OCR Worker (BullMQ)
  - Frontend dev server on port 5174

All health checks passing. Ready for testing.

\ud83e\udd16 Generated with Claude Code

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-19 04:42:55 +02:00
ggq-admin
155a8c0305 feat: NaviDocs MVP - Complete codebase extraction from lilian1
## Backend (server/)
- Express 5 API with security middleware (helmet, rate limiting)
- SQLite database with WAL mode (schema from docs/architecture/)
- Meilisearch integration with tenant tokens
- BullMQ + Redis background job queue
- OCR pipeline with Tesseract.js
- File safety validation (extension, MIME, size)
- 4 API route modules: upload, jobs, search, documents

## Frontend (client/)
- Vue 3 with Composition API (<script setup>)
- Vite 5 build system with HMR
- Tailwind CSS (Meilisearch-inspired design)
- UploadModal with drag-and-drop
- FigureZoom component (ported from lilian1)
- Meilisearch search integration with tenant tokens
- Job polling composable
- Clean SVG icons (no emojis)

## Code Extraction
-  manuals.js → UploadModal.vue, useJobPolling.js
-  figure-zoom.js → FigureZoom.vue
-  service-worker.js → client/public/service-worker.js (TODO)
-  glossary.json → Merged into Meilisearch synonyms
-  Discarded: quiz.js, persona.js, gamification.js (Frank-AI junk)

## Documentation
- Complete extraction plan in docs/analysis/
- README with quick start guide
- Architecture summary in docs/architecture/

## Build Status
- Server dependencies:  Installed (234 packages)
- Client dependencies:  Installed (160 packages)
- Client build:  Successful (2.63s)

🤖 Generated with Claude Code

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-19 01:55:44 +02:00