navidocs/docs/analysis/LILIANE1_ARCHIVE_ANALYSIS.md

Liliane1 Archive Analysis

Date: 2025-10-23
Source: Liliane1-20251023T150408Z-1-001.zip
Total Files: 29 files
Total Size: 18.7 MB
Boat Name: LILIAN I (LLC registered)


Executive Summary

This archive contains real-world yacht documentation for the vessel "LILIAN I", demonstrating the exact document types, organizational challenges, and role-based access needs that NaviDocs must solve.

Key Findings:

  • 9 distinct document categories spanning legal, operational, financial, and technical domains
  • Multiple file formats (PDF, JPEG, XLSX) requiring unified search
  • Role-specific information needs clearly evident (owner needs financials, captain needs manuals, crew needs operational docs)
  • Chronological organization challenges (invoices, delivery logs spanning months)
  • Critical documents (insurance, registration) requiring quick access

Document Inventory by Category

1. Equipment Manuals (1 file, 6.7 MB)

| File | Size | Pages | Description |
|------|------|-------|-------------|
| OWNER_S MANUAL.pdf | 6.7 MB | 17 | Primary vessel owner's manual |

Importance: CRITICAL
Access Frequency: Low (reference only during issues)
Primary Users: Captain, Crew, Service Technicians
Search Keywords: Engine, electrical, plumbing, troubleshooting, maintenance


2. Registration & Licensing (7 files, 2.7 MB)

| File | Size | Description |
|------|------|-------------|
| Lilian 1 Registration.pdf | 554 KB | Final vessel registration |
| Lilian I Provisional Registration.pdf | 1.0 MB | Provisional registration (3 pages) |
| Licence Document Ship LILIAN LLC 2025-07-31.pdf | 83 KB | Ship license (2 pages) |
| LILIAN APP TO REGISTER.pdf | 143 KB | Registration application |
| LILIAN DECLARATION OF ELIGIBILITY RT.pdf | 172 KB | Eligibility declaration |
| LILIAN DECLARATION OUTSIDE.pdf | 52 KB | Outside registration declaration |
| ADMISSSION TEMPORAIRE LILIAN I.pdf | 105 KB | Temporary admission (customs) |

Importance: CRITICAL
Access Frequency: Low (quarterly inspections, port authorities)
Primary Users: Owner, Captain, Harbor Master, Customs
Search Keywords: Registration, license, customs, temporary admission, eligibility
Regulatory Requirement: Must be aboard vessel at all times


3. Insurance (1 file, 227 KB)

| File | Size | Description |
|------|------|-------------|
| Lilian Insurance 2025.pdf | 227 KB | Annual insurance policy 2025 |

Importance: CRITICAL
Access Frequency: Low (annual renewal, claims)
Primary Users: Owner, Insurance Agent, Marina
Expiration Alert: Required for v1.2 (Insurance Documentation Vault)
Search Keywords: Insurance, coverage, liability, policy number


4. Financial Records - Invoices (9 files, 2.3 MB)

HomeBox Telecom Service

| File | Size | Description |
|------|------|-------------|
| HomeBox.jpeg | 453 KB | HomeBox invoice scan |
| HomeBox 2.jpeg | 354 KB | Second HomeBox invoice |
| Home box March 2025.pdf | 27 KB | March 2025 HomeBox invoice |

General Invoices (French "Factures")

| File | Size | Invoice # | Date |
|------|------|-----------|------|
| Facture n° F1820005790.pdf | 27 KB | F1820005790 | 2025-03-03 |
| Facture n° F1820006506.pdf | 27 KB | F1820006506 | 2025-05-02 |
| Facture n° F1820006824.pdf | 27 KB | F1820006824 | 2025-06-02 |
| Facture n° F1820007010.pdf | 27 KB | F1820007010 | 2025-07-01 |
| Facture n° F1820008157.pdf | 27 KB | F1820008157 | 2025-10-01 |

Yacht Management Invoices

| File | Size | Description |
|------|------|-------------|
| Lilian 1 Invoice.jpeg | 679 KB | Invoice #1 (Nov 2024) |
| Lilian 2 Invoice.pdf | 1.0 MB | Invoice #2 (Nov 2024) |
| Lilian 3 Invoice.pdf | 1.1 MB | Invoice #3 (Nov 2024) |
| Lilian 4 Invoice.pdf | 1.0 MB | Invoice #4 (Nov 2024) |

Importance: HIGH (financial record-keeping, tax reporting)
Access Frequency: Medium (monthly reconciliation, annual tax filing)
Primary Users: Owner, Accountant, Management Company
Search Keywords: Invoice, facture, payment, HomeBox, telecom
Feature Requirement: v1.4 Tax-Ready Reporting needs to parse and categorize these

Observation: Some invoices stored as JPEG (scanned), others as PDF (digital). NaviDocs OCR must handle both.


5. Operational Logs - Delivery Schedules (2 files, 23 KB)

| File | Size | Description |
|------|------|-------------|
| Lilian. delivery Olbia Cannes 2025.xlsx | 11 KB | Delivery expense tracking (10/10-10/14) |
| Lilian. delivery Olbia.xlsx | 12 KB | Same delivery, credit/cash breakdown |

Contents Analysis:

Delivery: Olbia (Italy) → Cannes (France)
Dates: October 10-14, 2025
Crew:
  - Jean Michele: €600 (5 days)
  - Frank Stocker: €1,750 (5 days captain)

Expense Breakdown:
  Food:       €608.94
  Travel:     €322.00
  Diesel:     €2,506.28 (685 liters total)
  Port Fees:  €262.00 (Bonifacio 2 nights, Calvi)
  Crew:       €2,350.00

Total:      €6,049.22

Payment Methods:
  Credit Card: €3,124.29
  Cash:        €2,924.50

Importance: HIGH (operational expenses, crew payments, reimbursements)
Access Frequency: High during deliveries, medium for historical review
Primary Users: Captain, Owner, Management Company, Accountant
Search Keywords: Delivery, Olbia, Cannes, diesel, crew, expenses, Jean Michele, Frank Stocker
Feature Requirement: Directly maps to v1.1 Time Tracking & Automated Invoicing

Critical Insight: This is EXACTLY what NaviDocs v1.1 needs to capture:

  • Crew time tracking (Jean Michele: 5 days, Frank Stocker: 5 days)
  • GPS verification (Olbia → Bonifacio → Calvi → Cannes)
  • Expense categorization (food, travel, diesel, ports, crew)
  • Photo proof of work (fuel receipts, port receipts)
  • Automated invoice generation (€6,049.22 total → bill owner)
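
The figures in the Contents Analysis above can be cross-checked mechanically. The sketch below is illustrative only (the dictionary keys and the `reconcile` helper are assumptions, not NaviDocs code). It confirms that the five expense categories sum to the stated €6,049.22, and it shows that the two payment-method figures as transcribed add up to €6,048.79, leaving €0.43 unexplained. An automated import could flag that kind of gap for review.

```python
from decimal import Decimal

# Figures transcribed from the delivery spreadsheet summary above.
expenses = {
    "food": Decimal("608.94"),
    "travel": Decimal("322.00"),
    "diesel": Decimal("2506.28"),
    "port_fees": Decimal("262.00"),
    "crew": Decimal("2350.00"),
}
payments = {
    "credit_card": Decimal("3124.29"),
    "cash": Decimal("2924.50"),
}


def reconcile(expenses, payments):
    """Return (expense_total, payment_total, difference) as Decimals."""
    expense_total = sum(expenses.values(), Decimal("0"))
    payment_total = sum(payments.values(), Decimal("0"))
    return expense_total, payment_total, expense_total - payment_total


expense_total, payment_total, difference = reconcile(expenses, payments)
print(expense_total, payment_total, difference)
```

`Decimal` is used rather than `float` so that cent-level differences are exact rather than subject to binary rounding.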

6. Purchase Orders / Contracts (1 file, 65 KB)

| File | Size | Description |
|------|------|-------------|
| BDC LILIAN LLC.pdf | 65 KB | "Bon de Commande" (purchase order), 2 pages |

Importance: MEDIUM
Access Frequency: Low (initial purchase, legal reference)
Primary Users: Owner, Legal, Vendor
Search Keywords: Purchase order, BDC, contract, LILIAN LLC


7. Photos (3 files, 1.2 MB)

| File | Size | Description |
|------|------|-------------|
| 7EE4A803-3FA9-407C-A337-6D5847CF3897.jpeg | 382 KB | Vessel photo (1536x2048) |
| HomeBox.jpeg | 453 KB | HomeBox equipment scan |
| HomeBox 2.jpeg | 354 KB | Second HomeBox scan |

Importance: MEDIUM (visual reference, proof of condition)
Access Frequency: Low (marketing, insurance claims, equipment reference)
Primary Users: Owner, Captain, Insurance Adjuster
Feature Requirement: v1.1 Photo-Based Proof of Work

Observation: iPhone quality photos (JFIF standard), GPS metadata may be embedded


8. Folder Structure (Archive Organization)

Liliane1/
├── [Root files - 25 files]
└── Lilian invoice Nov 2024/
    ├── Lilian 1 Invoice.jpeg
    ├── Lilian 2 Invoice.pdf
    ├── Lilian 3 Invoice.pdf
    └── Lilian 4 Invoice.pdf

Observation: Minimal folder structure. Owner organized by date ("Nov 2024") but most files remain in root directory. This demonstrates the disorganization problem NaviDocs solves.


Document Format Analysis

By File Type

| Format | Count | Total Size | OCR Required? |
|--------|-------|------------|---------------|
| PDF | 22 | 16.5 MB | Yes (scanned) |
| JPEG | 5 | 2.1 MB | Yes (photos, scans) |
| XLSX | 2 | 23 KB | No (structured data) |
| Total | 29 | 18.7 MB | 27 files need OCR |

OCR Confidence Expectations

| Document Type | Expected Confidence | Recommended Engine |
|---------------|---------------------|--------------------|
| French invoices (Facture) | 90%+ | Google Cloud Vision (multi-language) |
| Registration docs (typed) | 95%+ | Tesseract or Google Drive |
| Owner's Manual (printed) | 85%+ | Tesseract (sufficient) |
| Scanned invoices (JPEG) | 75-85% | Google Cloud Vision (handwriting support) |
| Excel delivery logs | N/A | Direct Excel parsing (xlsx2csv) |
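
The split between OCR and direct parsing can be decided from the file extension alone, at least as a first pass. The following is a minimal sketch under that assumption. The pipeline names (`"ocr"`, `"spreadsheet"`, `"unsupported"`) and the `route` function are hypothetical. A real pipeline would probably also check whether a PDF already carries a text layer before sending it to OCR.

```python
from pathlib import PurePath

# Hypothetical pipeline names; the extension sets mirror the format table.
OCR_EXTENSIONS = {".pdf", ".jpeg", ".jpg"}
SPREADSHEET_EXTENSIONS = {".xlsx"}


def route(filename: str) -> str:
    """Pick a processing pipeline from the file extension alone."""
    ext = PurePath(filename).suffix.lower()
    if ext in OCR_EXTENSIONS:
        return "ocr"
    if ext in SPREADSHEET_EXTENSIONS:
        return "spreadsheet"
    return "unsupported"


sample = [
    "OWNER_S MANUAL.pdf",
    "HomeBox.jpeg",
    "Lilian. delivery Olbia.xlsx",
]
routes = {name: route(name) for name in sample}
print(routes)
```

Note that `PurePath.suffix` takes only the text after the last dot, so the stray period in "Lilian. delivery Olbia.xlsx" does not confuse the routing.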

Role-Based Access Needs

Based on the documents in this archive, here's what each role needs to access frequently:

Owner (Boat Owner / Management Company)

Primary Concerns: Financials, legal compliance, asset value
Frequent Access:

  • Invoices (all types) - Monthly review
  • Delivery expense logs - After each trip
  • Insurance policy - Annual renewal
  • Registration documents - Quarterly/annual inspections

Key Questions:

  • "How much did the Olbia delivery cost?"
  • "When does insurance expire?"
  • "Show me all invoices from Q3 2025"
  • "What's the total maintenance spend this year?"

Captain (Professional Boat Manager)

Primary Concerns: Operations, safety, compliance, crew coordination
Frequent Access:

  • Owner's Manual - During troubleshooting
  • Registration & License - Required aboard vessel
  • Delivery schedules - Planning and execution
  • Crew payment tracking - After each delivery

Key Questions:

  • "How do I troubleshoot the engine alarm?"
  • "What's the registration number?"
  • "Who was crew on the last Cannes delivery?"
  • "Where are the port fees receipts?"

Crew (Day Workers, Cleaners, Deckhands)

Primary Concerns: Task assignments, work logs, payment verification
Frequent Access:

  • Delivery schedules - To verify hours worked
  • Photos of completed work - Proof of service
  • Task assignments - Daily checklist

Key Questions:

  • "How many hours did I log last week?"
  • "Did the captain approve my time?"
  • "Where are my before/after cleaning photos?"

Service Technician (Third-Party Maintenance)

Primary Concerns: Equipment specs, service history, warranty status
Frequent Access:

  • Owner's Manual - Technical specifications
  • Service records (not in archive, but would be)
  • Warranty documents (not in archive, but would be)

Key Questions:

  • "What's the engine model and serial number?"
  • "When was the last oil change?"
  • "Is this equipment still under warranty?"

Information Architecture Challenges

1. Chronological vs. Categorical Organization

Problem: Are invoices organized by:

  • Date? (March 2025, June 2025, Oct 2025)
  • Vendor? (HomeBox, Facture supplier, Yacht Management)
  • Type? (Telecom, Fuel, Crew, Port Fees)

Current State: Flat folder with dates in filenames (poor discoverability)

NaviDocs Solution: ALL THREE via search + filters

  • Search: "invoice 2025" → Find all
  • Filter by date: March 2025 → 2 invoices (Facture F1820005790, Home box March 2025)
  • Filter by category: "Telecom" → HomeBox invoices
  • Filter by vendor: "HomeBox" → 3 invoices
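
To make "all three via search + filters" concrete, here is a small in-memory sketch. It is not the NaviDocs data model: the `Doc` record, the `find` helper, and the vendor and category labels ("Facture supplier", "General", "Telecom") are assumptions for illustration. Documents with no known date carry `period=None`.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Doc:
    filename: str
    vendor: str
    category: str
    period: Optional[Tuple[int, int]]  # (year, month), None if unknown


# A few records from the invoice tables; labels are illustrative.
DOCS = [
    Doc("Facture n° F1820005790.pdf", "Facture supplier", "General", (2025, 3)),
    Doc("Facture n° F1820008157.pdf", "Facture supplier", "General", (2025, 10)),
    Doc("Home box March 2025.pdf", "HomeBox", "Telecom", (2025, 3)),
    Doc("HomeBox.jpeg", "HomeBox", "Telecom", None),
    Doc("HomeBox 2.jpeg", "HomeBox", "Telecom", None),
]


def find(docs, vendor=None, category=None, period=None) -> List[Doc]:
    """Return documents matching every filter that is not None."""
    return [
        d for d in docs
        if (vendor is None or d.vendor == vendor)
        and (category is None or d.category == category)
        and (period is None or d.period == period)
    ]


homebox = find(DOCS, vendor="HomeBox")
homebox_march = find(DOCS, vendor="HomeBox", period=(2025, 3))
print(len(homebox), [d.filename for d in homebox_march])
```

Because each filter is optional and they combine with AND, the same function serves the date, vendor, and category views without separate folder hierarchies.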

2. Critical vs. Reference Documents

Problem: Some docs are accessed weekly (delivery logs), others once a year (registration)

Current State: All files equal priority (no visual hierarchy)

NaviDocs Solution:

  • Pinned Documents: Insurance, Registration (always accessible)
  • Recent Activity: Delivery logs, latest invoices
  • Archive: Historical invoices, old registrations

3. Multi-Format Consistency

Problem: Same information in different formats

  • Lilian 1 Invoice.jpeg vs. Lilian 2 Invoice.pdf
  • Two versions of same delivery schedule (one for expenses, one for payment method)

Current State: User must manually open each to compare

NaviDocs Solution:

  • OCR both JPEG and PDF → Same searchable text
  • Version detection: "2 versions of Olbia delivery schedule"
  • Unified search: "Frank Stocker" finds him in both Excel files

4. Multi-Language Content

Problem: French invoices ("Facture"), English manuals, mixed terminology

Current State: User must remember "Facture" = "Invoice"

NaviDocs Solution:

  • Synonym detection: "invoice" finds "Facture" files
  • Multi-language OCR: Google Cloud Vision detects French automatically
  • Search translation: Owner searches "invoice", finds "Facture n° F1820006824.pdf"
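
Synonym detection can be approximated with query expansion before matching. The sketch below assumes a hand-maintained synonym table and plain substring matching on filenames. The `SYNONYMS` table and both function names are hypothetical, and a production search engine would normally handle this through its own synonym configuration.

```python
# Hypothetical bilingual synonym table; a real one would be much larger.
SYNONYMS = {
    "invoice": {"facture"},
    "facture": {"invoice"},
}


def expand(query: str) -> set:
    """Return the query terms plus any configured synonyms, lowercased."""
    terms = set(query.lower().split())
    for term in list(terms):
        terms |= SYNONYMS.get(term, set())
    return terms


def search(query: str, filenames: list) -> list:
    """Return filenames containing any expanded term (case-insensitive)."""
    terms = expand(query)
    return [f for f in filenames if any(t in f.lower() for t in terms)]


files = [
    "Facture n° F1820006824.pdf",
    "Lilian 2 Invoice.pdf",
    "Lilian Insurance 2025.pdf",
]
print(search("invoice", files))
```

Searching for "invoice" returns both the French "Facture" file and the English "Invoice" file, while the insurance policy is left out.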

Integration Recommendations for NaviDocs

Immediate MVP Integration (v1.0)

Use this archive as the DEMO DATASET:

  1. Upload all 29 files to NaviDocs
  2. Run OCR on all PDFs and JPEGs (27 files)
  3. Parse Excel files (2 delivery schedules)
  4. Create demo organization: "Zen Yacht Management"
  5. Create demo entity: "LILIAN I" (Boat)
  6. Demonstrate search:
    • "Frank Stocker" → Finds delivery schedules
    • "insurance" → Finds Lilian Insurance 2025.pdf
    • "registration" → Finds all 7 registration documents
    • "invoice October" → Finds Facture F1820008157.pdf

Why: Real-world data demonstrates NaviDocs value immediately. Users see their own document chaos solved.


v1.1 Feature Validation

Time Tracking & Automated Invoicing:

This archive contains the PERFECT validation dataset:

  • Crew: Jean Michele (€600), Frank Stocker (€1,750)
  • Duration: 5 days each (10/10-10/14)
  • Expenses: Food, travel, diesel, port fees (all categorized)
  • Invoice Total: €6,049.22

Demo Scenario:

  1. Captain logs delivery: "Olbia → Cannes, 10/10-10/14"
  2. Jean Michele clocks in/out each day via mobile app (GPS verified)
  3. Frank Stocker clocks in/out each day (captain rate: €350/day)
  4. Expenses logged with photos:
    • Diesel receipt (Bonifacio: €1,115.28)
    • Port fees receipt (Bonifacio: €197)
    • Fuel receipt (Calvi: €1,391)
    • Food receipts (€608.94 total)
  5. NaviDocs auto-generates invoice: €6,049.22
  6. Owner approves and pays

Result: Same data as Excel spreadsheet, but automated and audit-ready.
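
As a rough illustration of step 5, the invoice total can be rebuilt from day rates and category totals. This is a sketch, not the planned implementation. The captain rate (€350/day) comes from the scenario, while the €120/day rate for Jean Michele is derived here as €600 / 5 days and is an assumption. `build_invoice` is a hypothetical helper.

```python
from decimal import Decimal

# (name, day rate in EUR, days worked). The 120/day rate is derived, not sourced.
crew_days = [
    ("Frank Stocker", Decimal("350"), 5),
    ("Jean Michele", Decimal("120"), 5),
]

# Category totals from the delivery spreadsheet.
expenses = {
    "diesel": Decimal("2506.28"),
    "port fees": Decimal("262.00"),
    "food": Decimal("608.94"),
    "travel": Decimal("322.00"),
}


def build_invoice(crew_days, expenses):
    """Return (line_items, total) where line_items is a list of (label, amount)."""
    lines = [
        (f"Crew: {name} ({days} days)", rate * days)
        for name, rate, days in crew_days
    ]
    lines += [(f"Expense: {category}", amount) for category, amount in expenses.items()]
    total = sum((amount for _, amount in lines), Decimal("0"))
    return lines, total


lines, total = build_invoice(crew_days, expenses)
for label, amount in lines:
    print(f"{label:<32} {amount:>10}")
print(f"{'Total':<32} {total:>10}")
```

The computed total matches the €6,049.22 recorded in the spreadsheet, which is the property the demo is meant to show.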


v1.2 Feature Validation

Warranty Management:

Missing Data in Archive: No warranty documents found
But: OWNER_S MANUAL.pdf likely contains equipment serial numbers

Demo Scenario:

  1. Upload OWNER_S MANUAL.pdf
  2. OCR extracts: "Engine: Volvo D4, Serial: ABC123456"
  3. Captain photographs the warranty receipt → OCR extracts: "Warranty expires 2027-03-15"
  4. NaviDocs creates warranty record:
    • Equipment: Volvo D4 Engine
    • Serial: ABC123456
    • Expires: 2027-03-15
    • Days remaining: 508 (as of 2025-10-23)
  5. Alert sent 30 days before expiration (2027-02-13)
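
The date arithmetic in this scenario is simple enough to sketch with the standard library. The `warranty_status` function and its field names are hypothetical. The expiry date and the 30-day lead time are taken from the scenario, and the `as_of` date is a parameter so the result does not depend on when the code runs.

```python
from datetime import date, timedelta


def warranty_status(expires: date, as_of: date, alert_days: int = 30) -> dict:
    """Compute days remaining and the date on which the alert should fire."""
    alert_on = expires - timedelta(days=alert_days)
    return {
        "days_remaining": (expires - as_of).days,
        "alert_on": alert_on,
        "alert_due": as_of >= alert_on,
    }


status = warranty_status(expires=date(2027, 3, 15), as_of=date(2025, 10, 23))
print(status["alert_on"])   # alert date for the example warranty
print(status["alert_due"])  # whether the alert should already have fired
```

For the example warranty, the alert date works out to 2027-02-13, matching step 5. The same function would apply to the insurance expiration alert.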

v1.4 Feature Validation

Tax-Ready Reporting:

Data Available:

  • 9 invoices (5 Factures + 4 Lilian invoices)
  • 2 delivery expense logs
  • All categorized by expense type

Demo Scenario:

  1. Accountant opens NaviDocs
  2. Selects "Tax Report: Q1-Q4 2025"
  3. NaviDocs generates:
    • Telecom: €X (HomeBox invoices)
    • Crew Labor: €2,350 (2 crew × 5 days)
    • Fuel: €2,506.28
    • Port Fees: €262
    • Food: €608.94
    • Travel: €322
  4. Export as CSV → QuickBooks
  5. Attach all invoice PDFs as proof

IRS Audit: Owner provides NaviDocs link → All receipts with GPS timestamps
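
The report in step 3 is a group-by over categorized entries, and step 4 is a CSV export. The sketch below uses only figures from the Olbia → Cannes delivery (telecom is omitted because the HomeBox amounts are not known). The category names, the CSV column names, and both helper functions are assumptions, and the column layout QuickBooks expects is not addressed here.

```python
import csv
import io
from collections import defaultdict
from decimal import Decimal

# (category, description, amount in EUR) from the delivery data.
entries = [
    ("Crew Labor", "Frank Stocker, 5 days", Decimal("1750.00")),
    ("Crew Labor", "Jean Michele, 5 days", Decimal("600.00")),
    ("Fuel", "Diesel, Bonifacio", Decimal("1115.28")),
    ("Fuel", "Diesel, Calvi", Decimal("1391.00")),
    ("Port Fees", "Bonifacio and Calvi", Decimal("262.00")),
    ("Food", "Provisions", Decimal("608.94")),
    ("Travel", "Crew travel", Decimal("322.00")),
]


def totals_by_category(entries):
    """Sum amounts per category."""
    totals = defaultdict(lambda: Decimal("0"))
    for category, _description, amount in entries:
        totals[category] += amount
    return dict(totals)


def to_csv(totals) -> str:
    """Render category totals as CSV text, sorted by category name."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["category", "total_eur"])
    for category in sorted(totals):
        writer.writerow([category, totals[category]])
    return buffer.getvalue()


totals = totals_by_category(entries)
report = to_csv(totals)
print(report)
```

The crew labor line comes to €2,350.00 and fuel to €2,506.28, in line with the delivery spreadsheet totals.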


Document Library Navigation Design

Based on this real-world archive, here's how NaviDocs should organize the library:

Top-Level Navigation (All Roles)

┌─────────────────────────────────────────────────────┐
│  🔍 Search: [Find documents, manuals, invoices...]  │
└─────────────────────────────────────────────────────┘

📌 Pinned Documents (Quick Access)
├─ 📄 LILIAN I Registration
├─ 🛡️ Insurance Policy 2025 (expires 2025-12-31)
└─ 📘 Owner's Manual

📁 Browse by Category
├─ 📋 Legal & Compliance (8)
│   ├─ Registration (7)
│   └─ Customs (1)
├─ 💰 Financial (10)
│   ├─ Invoices (9)
│   └─ Purchase Orders (1)
├─ 🚢 Operations (2)
│   └─ Delivery Logs (2)
├─ 🛡️ Insurance (1)
├─ 📘 Manuals (1)
└─ 📸 Photos (3)

📅 Recent Activity
├─ Uploaded today: Facture F1820008157.pdf
├─ Viewed yesterday: Lilian delivery Olbia schedule
└─ Shared last week: Insurance Policy 2025

🗂️ By Date
├─ 2025-10 (2 documents)
├─ 2025-07 (3 documents)
├─ 2025-06 (1 document)
└─ [View all dates...]

Role-Specific Views

Owner Dashboard

💰 Financial Summary
├─ Total Expenses 2025: €X,XXX
├─ Latest Invoice: Facture F1820008157 (Oct 2025)
└─ Pending Payments: 0

📊 Reports
├─ Quarterly Expense Report
├─ Crew Payment Summary
└─ Tax-Ready Documentation

⚠️ Expiration Alerts
└─ Insurance expires in 69 days (2025-12-31)

Captain Dashboard

📋 Today's Tasks
├─ Log delivery: Cannes → Nice
└─ Approve crew hours: Jean Michele (8 hrs)

📘 Quick Reference
├─ Owner's Manual (LILIAN I)
├─ Registration Number: [XXX]
└─ Emergency Contacts

🚢 Recent Deliveries
└─ Olbia → Cannes (Oct 10-14) - €6,049.22

Crew Dashboard

⏰ Time Clock
├─ Clock In (GPS: Cannes Marina)
└─ My Hours This Week: 16.5 hrs

💵 Payment Status
├─ Last Payment: Oct 14 (€600)
└─ Pending Approval: 8 hrs (Oct 20-21)

📸 My Work Photos
└─ Oct 14: Cleaning photos (before/after)

Search Query Examples

Based on the Liliane1 archive, here are realistic search queries and expected results:

| Query | Expected Results | Why |
|-------|------------------|-----|
| "Frank Stocker" | 2 delivery Excel files | Named in crew section |
| "insurance" | Lilian Insurance 2025.pdf | Filename + OCR content |
| "registration" | 7 registration PDFs | Filename + document category |
| "October 2025" | 3 files (1 invoice, 2 delivery logs) | Date metadata |
| "Olbia" | 2 Excel files | City name in delivery route |
| "Cannes" | 2 Excel files | Destination city |
| "diesel" | 2 Excel files | Fuel expense category |
| "€6,049" | 2 Excel files | Total expense amount |
| "HomeBox" | 3 files (2 JPEG, 1 PDF) | Telecom vendor |
| "Facture" | 5 French invoices | French keyword |
| "invoice" | 9 files (5 Factures + 4 Lilian invoices) | Synonym detection |
| "owner manual" | OWNER_S MANUAL.pdf | Fuzzy match (underscore ignored) |
| "temporary admission" | ADMISSSION TEMPORAIRE LILIAN I.pdf | Fuzzy match across languages, with typo tolerance for the filename's "ADMISSSION" spelling |
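
The last two rows rely on fuzzy matching. One way to sketch it with the standard library is word-level similarity via `difflib.SequenceMatcher`. The 0.8 threshold and both helper names are assumptions, and a dedicated search engine would use its own typo-tolerance rules instead.

```python
import re
from difflib import SequenceMatcher


def normalize(text: str) -> str:
    """Lowercase and turn every run of non-alphanumerics into one space."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def fuzzy_contains(query: str, filename: str, threshold: float = 0.8) -> bool:
    """True if every query word closely matches some word in the filename."""
    words = normalize(filename).split()
    return all(
        any(SequenceMatcher(None, q, w).ratio() >= threshold for w in words)
        for q in normalize(query).split()
    )


print(fuzzy_contains("owner manual", "OWNER_S MANUAL.pdf"))
print(fuzzy_contains("temporary admission", "ADMISSSION TEMPORAIRE LILIAN I.pdf"))
print(fuzzy_contains("insurance", "OWNER_S MANUAL.pdf"))
```

Normalization handles the underscore in "OWNER_S", and the similarity ratio tolerates both the extra "S" in "ADMISSSION" and the near-cognate "temporary"/"temporaire". Unrelated queries such as "insurance" fall well below the threshold.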

File Naming Conventions Analysis

Current Naming Patterns

Good Practices:

  • Descriptive names: "Lilian Insurance 2025.pdf"
  • Date suffixes: "Licence Document Ship LILIAN LLC 2025-07-31.pdf"
  • Sequential numbering: "Lilian 1 Invoice.jpeg", "Lilian 2 Invoice.pdf"

Bad Practices:

  • UUID filenames: "7EE4A803-3FA9-407C-A337-6D5847CF3897.jpeg" (no human meaning)
  • Inconsistent spacing: "Lilian. delivery Olbia Cannes 2025.xlsx" (period after "Lilian")
  • Mixed languages: "Facture n° F1820006824.pdf" (French + English context)
  • Abbreviations: "BDC LILIAN LLC.pdf" (BDC = "Bon de Commande" not universally known)

NaviDocs Solution:

  • Accept any filename (user shouldn't be forced to rename)
  • Extract metadata (OCR finds "Insurance Policy" even if filename is UUID)
  • Suggest better name (AI renames "7EE4A803...jpeg" → "Vessel Photo Oct 2025.jpeg")

Metadata Extraction Opportunities

From this archive, NaviDocs OCR should extract:

From Registration Documents

  • Vessel Name: LILIAN I
  • Registration Number: [Extract from PDFs]
  • Owner/LLC: LILIAN LLC
  • Registration Date: [Extract from "Lilian 1 Registration.pdf"]
  • Expiration Date: [Extract if present]

From Insurance

  • Policy Number: [Extract from "Lilian Insurance 2025.pdf"]
  • Coverage Amount: [Extract]
  • Effective Date: 2025-01-01 (assumed from filename)
  • Expiration Date: 2025-12-31 (assumed from filename)
  • Insurer: [Extract company name]

From Delivery Logs (Excel)

  • Route: Olbia → Bonifacio → Calvi → Cannes
  • Dates: October 10-14, 2025
  • Crew: Jean Michele (€600), Frank Stocker (€1,750)
  • Total Cost: €6,049.22
  • Fuel Consumed: 685 liters (612L + 73L)
  • Port Stops: Bonifacio (2 nights), Calvi

From Invoices

  • Invoice Numbers: F1820005790, F1820006506, F1820006824, F1820007010, F1820008157
  • Dates: March, May, June, July, October 2025
  • Vendor: [Extract from PDF content]
  • Amounts: [Extract totals]
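
Some of this metadata is available from filenames before any OCR runs. The sketch below pulls invoice numbers and ISO dates out of filenames with regular expressions. The invoice pattern (the letter F followed by ten digits) is inferred from the five Facture filenames in this archive and is an assumption about the numbering scheme. `extract_from_filename` is a hypothetical helper.

```python
import re
from datetime import date

# Pattern inferred from the five Facture filenames; treat it as an assumption.
INVOICE_NO = re.compile(r"\bF\d{10}\b")
ISO_DATE = re.compile(r"\b(\d{4})-(\d{2})-(\d{2})\b")


def extract_from_filename(filename: str) -> dict:
    """Return whatever metadata the filename alone reveals."""
    meta = {}
    match = INVOICE_NO.search(filename)
    if match:
        meta["invoice_number"] = match.group(0)
    match = ISO_DATE.search(filename)
    if match:
        year, month, day = (int(part) for part in match.groups())
        meta["date"] = date(year, month, day)
    return meta


print(extract_from_filename("Facture n° F1820006824.pdf"))
print(extract_from_filename("Licence Document Ship LILIAN LLC 2025-07-31.pdf"))
print(extract_from_filename("7EE4A803-3FA9-407C-A337-6D5847CF3897.jpeg"))
```

The UUID-named photo yields an empty result, which is the case where OCR or image metadata has to supply everything.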

Conclusion & Next Steps

What We Learned

The Liliane1 archive is a perfect real-world validation dataset for NaviDocs. It demonstrates:

  1. Document diversity - Legal, financial, operational, technical
  2. Multi-format challenges - PDF, JPEG, XLSX all need search
  3. Role-based needs - Owner, captain, crew have different priorities
  4. Real operational data - Delivery schedules map directly to v1.1 features
  5. Organization chaos - Flat folder structure NaviDocs will solve

Next Steps

Immediate (v1.0 MVP):

  1. Use Liliane1 as demo dataset for MVP
  2. Upload all 29 files to NaviDocs
  3. Run OCR on 27 files (PDF + JPEG)
  4. Demonstrate search across all document types
  5. Create role-based navigation mockups

Next Session (v1.1):

  1. Build time tracking feature using delivery log as reference
  2. Implement automated invoicing (€6,049.22 total)
  3. Add crew management (Jean Michele, Frank Stocker)
  4. GPS verification for clock-in/out

Future (v1.2-v1.4):

  1. 📅 Warranty management (extract from manuals)
  2. 📅 Insurance expiration alerts (2025-12-31)
  3. 📅 Tax-ready reporting (categorize all expenses)

Document Version: 1.0
Last Updated: 2025-10-23
Cross-Reference: docs/debates/03-document-library-navigation.md (to be created)