navidocs/server
Claude f0096a6bd6
Feature: Multi-format upload support (JPG, PNG, DOCX, XLSX, TXT, MD)
Implements multi-format document upload, expanding support beyond PDFs.

Changes:
- server/package.json: Add mammoth (DOCX) and xlsx (Excel) dependencies
- server/services/file-safety.js: Expand allowed file types and MIME types
  - Added getFileCategory() function to classify file types
  - Support for images, Office docs, and text files
  - Flexible MIME validation for text files
- server/services/document-processor.js: NEW routing service
  - processImageFile(): Tesseract OCR for JPG/PNG/WebP
  - processWordDocument(): Mammoth for DOCX text extraction
  - processExcelDocument(): XLSX for spreadsheet data extraction
  - processTextFile(): Native reading for TXT/MD files
  - Unified interface with processDocument() router
- server/workers/ocr-worker.js: Switch from extractTextFromPDF to processDocument
  - Now handles all file types through unified processor
- client/src/components/UploadModal.vue: Update UI for multi-format
  - File input accepts all new file types
  - Updated help text to show supported formats

Supported formats: PDF, JPG, PNG, WebP, DOCX, XLSX, TXT, MD
Text extraction methods: Native (Office/text), Tesseract OCR (images), PDF.js (PDFs)
Search indexing: All file types processed and indexed in Meilisearch
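
The routing described in the change list can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: the extension table, the category names, and the `processors` parameter are assumptions made for the example, and the real `getFileCategory()` and `processDocument()` may have different signatures and may also check MIME types.

```javascript
const path = require('node:path');

// Hypothetical extension-to-category table; the real file-safety.js may differ.
const CATEGORY_BY_EXTENSION = {
  '.pdf': 'pdf',
  '.jpg': 'image',
  '.jpeg': 'image',
  '.png': 'image',
  '.webp': 'image',
  '.docx': 'word',
  '.xlsx': 'excel',
  '.txt': 'text',
  '.md': 'text',
};

// Classify a file by its extension (case-insensitive).
// Returns null when the extension is not in the table.
function getFileCategory(filename) {
  const ext = path.extname(filename).toLowerCase();
  return CATEGORY_BY_EXTENSION[ext] ?? null;
}

// Route a file to the processor registered for its category.
// `processors` is an assumed map of category -> async function(filePath),
// e.g. { image: processImageFile, word: processWordDocument, ... }.
async function processDocument(filePath, processors) {
  const category = getFileCategory(filePath);
  if (category === null) {
    throw new Error(`Unsupported file type: ${filePath}`);
  }
  const handler = processors[category];
  if (typeof handler !== 'function') {
    throw new Error(`No processor registered for category: ${category}`);
  }
  return handler(filePath);
}
```

Passing the processors in as a map keeps the sketch free of the Tesseract, Mammoth, and XLSX dependencies; a worker would call something like `await processDocument(filePath, { image: processImageFile, word: processWordDocument, excel: processExcelDocument, text: processTextFile, pdf: extractTextFromPDF })`.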

Session: Cloud Session 2 - Multi-Format Upload Support
Branch: feature/multiformat
Status: Complete - Ready for testing
2025-11-13 12:54:44 +00:00
config chore(debug): log tenant token parent uid for troubleshooting 2025-10-19 17:11:05 +02:00
db FINAL: P0 blockers fixed + Joe Trader + ignore binaries 2025-11-13 01:29:59 +01:00
docs FINAL: P0 blockers fixed + Joe Trader + ignore binaries 2025-11-13 01:29:59 +01:00
examples feat: NaviDocs MVP - Complete codebase extraction from lilian1 2025-10-19 01:55:44 +02:00
middleware FINAL: P0 blockers fixed + Joe Trader + ignore binaries 2025-11-13 01:29:59 +01:00
migrations feat: Phase 3 - Admin settings system with encryption 2025-10-21 10:12:10 +02:00
routes FINAL: P0 blockers fixed + Joe Trader + ignore binaries 2025-11-13 01:29:59 +01:00
scripts [DEMO READY] Working NaviDocs v0.5 - Feature specs + Launch system 2025-11-13 12:57:41 +01:00
services Feature: Multi-format upload support (JPG, PNG, DOCX, XLSX, TXT, MD) 2025-11-13 12:54:44 +00:00
test/data chore: Local development environment setup 2025-10-19 04:42:55 +02:00
utils FINAL: P0 blockers fixed + Joe Trader + ignore binaries 2025-11-13 01:29:59 +01:00
workers Feature: Multi-format upload support (JPG, PNG, DOCX, XLSX, TXT, MD) 2025-11-13 12:54:44 +00:00
.env.example feat: Phase 3 - Admin settings system with encryption 2025-10-21 10:12:10 +02:00
API_SUMMARY.md feat: NaviDocs MVP - Complete codebase extraction from lilian1 2025-10-19 01:55:44 +02:00
ARCHITECTURE_DIAGRAM.md FINAL: P0 blockers fixed + Joe Trader + ignore binaries 2025-11-13 01:29:59 +01:00
AUTH_QUICK_START.md docs: Comprehensive implementation documentation 2025-10-21 10:12:25 +02:00
AUTH_SYSTEM_SUMMARY.md docs: Comprehensive implementation documentation 2025-10-21 10:12:25 +02:00
check-doc-status.js Fix search, add PDF text selection, clean duplicates, implement auto-fill 2025-10-20 01:35:06 +02:00
check-documents.js Fix router path - change /documents/ to /document/ in HomeView 2025-10-20 01:43:15 +02:00
CODEX_REVIEW_COMPLETE.md docs: Comprehensive implementation documentation 2025-10-21 10:12:25 +02:00
DESIGN_AUTH_MULTITENANCY.md FINAL: P0 blockers fixed + Joe Trader + ignore binaries 2025-11-13 01:29:59 +01:00
fix-user-org.js Fix search, add PDF text selection, clean duplicates, implement auto-fill 2025-10-20 01:35:06 +02:00
IMPLEMENTATION_COMPLETE.md docs: Comprehensive implementation documentation 2025-10-21 10:12:25 +02:00
IMPLEMENTATION_TASKS.md FINAL: P0 blockers fixed + Joe Trader + ignore binaries 2025-11-13 01:29:59 +01:00
index.js FINAL: P0 blockers fixed + Joe Trader + ignore binaries 2025-11-13 01:29:59 +01:00
package.json Feature: Multi-format upload support (JPG, PNG, DOCX, XLSX, TXT, MD) 2025-11-13 12:54:44 +00:00
PHASE_1_COMPLETE.md docs: Comprehensive implementation documentation 2025-10-21 10:12:25 +02:00
README_AUTH.md FINAL: P0 blockers fixed + Joe Trader + ignore binaries 2025-11-13 01:29:59 +01:00
run-migration.js feat: Add image extraction design, database schema, and migration 2025-10-19 19:47:30 +02:00
test-full-pipeline.js Implement PDF image extraction with OCR in OCR worker 2025-10-19 19:54:25 +02:00
test-image-extraction.js Implement PDF image extraction with OCR in OCR worker 2025-10-19 19:54:25 +02:00
test-image-system-e2e.js Fix search, add PDF text selection, clean duplicates, implement auto-fill 2025-10-20 01:35:06 +02:00
test-routes.js feat: NaviDocs MVP - Complete codebase extraction from lilian1 2025-10-19 01:55:44 +02:00
UX-RECOMMENDATIONS-SUMMARY.md FINAL: P0 blockers fixed + Joe Trader + ignore binaries 2025-11-13 01:29:59 +01:00
UX-REVIEW.md FINAL: P0 blockers fixed + Joe Trader + ignore binaries 2025-11-13 01:29:59 +01:00