navidocs/docs/analysis/LILIANE1_ARCHIVE_ANALYSIS.md
Danny Stocker 58b344aa31 FINAL: P0 blockers fixed + Joe Trader + ignore binaries
Fixed:
- Price: €800K-€1.5M, Sunseeker added
- Agent 1: Joe Trader persona + actual sale ads research
- Ignored meilisearch binary + data/ (too large for GitHub)
- SESSION_DEBUG_BLOCKERS.md created

Ready for Session 1 launch.

🤖 Generated with Claude Code
2025-11-13 01:29:59 +01:00

# Liliane1 Archive Analysis
**Date:** 2025-10-23
**Source:** Liliane1-20251023T150408Z-1-001.zip
**Total Files:** 29 files
**Total Size:** 18.7 MB
**Boat Name:** LILIAN I (LLC registered)
---
## Executive Summary
This archive contains **real-world yacht documentation** for the vessel "LILIAN I", demonstrating the exact document types, organizational challenges, and role-based access needs that NaviDocs must solve.
**Key Findings:**
- **9 distinct document categories** spanning legal, operational, financial, and technical domains
- **Multiple file formats** (PDF, JPEG, XLSX) requiring unified search
- **Role-specific information needs** clearly evident (owner needs financials, captain needs manuals, crew needs operational docs)
- **Chronological organization challenges** (invoices, delivery logs spanning months)
- **Critical documents** (insurance, registration) requiring quick access
---
## Document Inventory by Category
### 1. Equipment Manuals (1 file, 6.7 MB)
| File | Size | Pages | Description |
|------|------|-------|-------------|
| **OWNER_S MANUAL.pdf** | 6.7 MB | 17 pages | Primary vessel owner's manual |
**Importance:** CRITICAL
**Access Frequency:** Low (reference only during issues)
**Primary Users:** Captain, Crew, Service Technicians
**Search Keywords:** Engine, electrical, plumbing, troubleshooting, maintenance
---
### 2. Registration & Licensing (7 files, approx. 2.1 MB)
| File | Size | Description |
|------|------|-------------|
| **Lilian 1 Registration.pdf** | 554 KB | Final vessel registration |
| **Lilian I Provisional Registration.pdf** | 1.0 MB | Provisional registration (3 pages) |
| **Licence Document Ship LILIAN LLC 2025-07-31.pdf** | 83 KB | Ship license (2 pages) |
| **LILIAN APP TO REGISTER.pdf** | 143 KB | Registration application |
| **LILIAN DECLARATION OF ELIGIBILITY RT.pdf** | 172 KB | Eligibility declaration |
| **LILIAN DECLARATION OUTSIDE.pdf** | 52 KB | Outside registration declaration |
| **ADMISSSION TEMPORAIRE LILIAN I.pdf** | 105 KB | Temporary admission (customs) |
**Importance:** CRITICAL
**Access Frequency:** Low (quarterly inspections, port authorities)
**Primary Users:** Owner, Captain, Harbor Master, Customs
**Search Keywords:** Registration, license, customs, temporary admission, eligibility
**Regulatory Requirement:** Must be aboard vessel at all times
---
### 3. Insurance (1 file, 227 KB)
| File | Size | Description |
|------|------|-------------|
| **Lilian Insurance 2025.pdf** | 227 KB | Annual insurance policy 2025 |
**Importance:** CRITICAL
**Access Frequency:** Low (annual renewal, claims)
**Primary Users:** Owner, Insurance Agent, Marina
**Expiration Alert:** Required for v1.2 (Insurance Documentation Vault)
**Search Keywords:** Insurance, coverage, liability, policy number
---
### 4. Financial Records - Invoices (12 files, approx. 4.7 MB)
#### HomeBox Telecom Service
| File | Size | Description |
|------|------|-------------|
| **HomeBox.jpeg** | 453 KB | HomeBox invoice scan |
| **HomeBox 2.jpeg** | 354 KB | Second HomeBox invoice |
| **Home box March 2025.pdf** | 27 KB | March 2025 HomeBox invoice |
#### General Invoices (French "Factures")
| File | Size | Invoice # | Date |
|------|------|-----------|------|
| **Facture n° F1820005790.pdf** | 27 KB | F1820005790 | 2025-03-03 |
| **Facture n° F1820006506.pdf** | 27 KB | F1820006506 | 2025-05-02 |
| **Facture n° F1820006824.pdf** | 27 KB | F1820006824 | 2025-06-02 |
| **Facture n° F1820007010.pdf** | 27 KB | F1820007010 | 2025-07-01 |
| **Facture n° F1820008157.pdf** | 27 KB | F1820008157 | 2025-10-01 |
#### Yacht Management Invoices
| File | Size | Description |
|------|------|-------------|
| **Lilian 1 Invoice.jpeg** | 679 KB | Invoice #1 (Nov 2024) |
| **Lilian 2 Invoice.pdf** | 1.0 MB | Invoice #2 (Nov 2024) |
| **Lilian 3 Invoice.pdf** | 1.1 MB | Invoice #3 (Nov 2024) |
| **Lilian 4 Invoice.pdf** | 1.0 MB | Invoice #4 (Nov 2024) |
**Importance:** HIGH (financial record-keeping, tax reporting)
**Access Frequency:** Medium (monthly reconciliation, annual tax filing)
**Primary Users:** Owner, Accountant, Management Company
**Search Keywords:** Invoice, facture, payment, HomeBox, telecom
**Feature Requirement:** v1.4 Tax-Ready Reporting needs to parse and categorize these
**Observation:** Some invoices stored as JPEG (scanned), others as PDF (digital). NaviDocs OCR must handle both.
---
### 5. Operational Logs - Delivery Schedules (2 files, 23 KB)
| File | Size | Description |
|------|------|-------------|
| **Lilian. delivery Olbia Cannes 2025.xlsx** | 11 KB | Delivery expense tracking (10/10-10/14) |
| **Lilian. delivery Olbia.xlsx** | 12 KB | Same delivery, credit/cash breakdown |
**Contents Analysis:**
```
Delivery: Olbia (Italy) → Cannes (France)
Dates: October 10-14, 2025
Crew:
- Jean Michele: €600 (5 days)
- Frank Stocker: €1,750 (5 days captain)
Expense Breakdown:
Food: €608.94
Travel: €322.00
Diesel: €2,506.28 (685 liters total)
Port Fees: €262.00 (Bonifacio 2 nights, Calvi)
Crew: €2,350.00
Total: €6,049.22
Payment Methods:
Credit Card: €3,124.29
Cash: €2,924.50
```
**Importance:** HIGH (operational expenses, crew payments, reimbursements)
**Access Frequency:** High during deliveries, medium for historical review
**Primary Users:** Captain, Owner, Management Company, Accountant
**Search Keywords:** Delivery, Olbia, Cannes, diesel, crew, expenses, Jean Michele, Frank Stocker
**Feature Requirement:** Directly maps to v1.1 Time Tracking & Automated Invoicing
**Critical Insight:** This is EXACTLY what NaviDocs v1.1 needs to capture:
- Crew time tracking (Jean Michele: 5 days, Frank Stocker: 5 days)
- GPS verification (Olbia → Bonifacio → Calvi → Cannes)
- Expense categorization (food, travel, diesel, ports, crew)
- Photo proof of work (fuel receipts, port receipts)
- Automated invoice generation (€6,049.22 total → bill owner)
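
Before these figures feed an automated invoice, it helps to confirm that they agree with each other. The sketch below is one minimal way to do that with the numbers transcribed from the spreadsheet summary above; the function name `reconcile` and the issue labels are hypothetical. With the figures as transcribed, the expense categories sum to the stated total, while the credit card and cash amounts sum to €6,048.79, which is €0.43 below it. That difference may be rounding or a transcription slip and should be checked against the source spreadsheet.

```python
from decimal import Decimal

# Figures transcribed from the delivery spreadsheet summary above.
EXPENSES = {
    "food": Decimal("608.94"),
    "travel": Decimal("322.00"),
    "diesel": Decimal("2506.28"),
    "port_fees": Decimal("262.00"),
    "crew": Decimal("2350.00"),
}
CREW_PAY = {"Jean Michele": Decimal("600"), "Frank Stocker": Decimal("1750")}
PAYMENTS = {"credit_card": Decimal("3124.29"), "cash": Decimal("2924.50")}
STATED_TOTAL = Decimal("6049.22")


def reconcile(expenses, crew_pay, payments, stated_total):
    """Return named discrepancies; an empty dict means every figure agrees."""
    issues = {}
    category_sum = sum(expenses.values(), Decimal("0"))
    if category_sum != stated_total:
        issues["categories_vs_total"] = category_sum - stated_total
    crew_sum = sum(crew_pay.values(), Decimal("0"))
    if crew_sum != expenses["crew"]:
        issues["crew_vs_category"] = crew_sum - expenses["crew"]
    payment_sum = sum(payments.values(), Decimal("0"))
    if payment_sum != stated_total:
        issues["payments_vs_total"] = payment_sum - stated_total
    return issues


if __name__ == "__main__":
    for name, delta in reconcile(EXPENSES, CREW_PAY, PAYMENTS, STATED_TOTAL).items():
        print(f"{name}: off by €{delta}")
```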
---
### 6. Purchase Orders / Contracts (1 file, 65 KB)
| File | Size | Description |
|------|------|-------------|
| **BDC LILIAN LLC.pdf** | 65 KB | "Bon de Commande" (purchase order) 2 pages |
**Importance:** MEDIUM
**Access Frequency:** Low (initial purchase, legal reference)
**Primary Users:** Owner, Legal, Vendor
**Search Keywords:** Purchase order, BDC, contract, LILIAN LLC
---
### 7. Photos (3 files, 1.2 MB)
| File | Size | Description |
|------|------|-------------|
| **7EE4A803-3FA9-407C-A337-6D5847CF3897.jpeg** | 382 KB | Vessel photo (1536x2048) |
| **HomeBox.jpeg** | 453 KB | HomeBox invoice scan (also listed under Financial Records) |
| **HomeBox 2.jpeg** | 354 KB | Second HomeBox invoice scan (also listed under Financial Records) |
**Importance:** MEDIUM (visual reference, proof of condition)
**Access Frequency:** Low (marketing, insurance claims, equipment reference)
**Primary Users:** Owner, Captain, Insurance Adjuster
**Feature Requirement:** v1.1 Photo-Based Proof of Work
**Observation:** iPhone-quality photos in JPEG/JFIF format. GPS coordinates, if present, would be stored in EXIF metadata.
---
### 8. Folder Structure (Archive Organization)
```
Liliane1/
├── [Root files - 25 files]
└── Lilian invoice Nov 2024/
    ├── Lilian 1 Invoice.jpeg
    ├── Lilian 2 Invoice.pdf
    ├── Lilian 3 Invoice.pdf
    └── Lilian 4 Invoice.pdf
```
**Observation:** Minimal folder structure. Owner organized by date ("Nov 2024") but most files remain in root directory. This demonstrates the **disorganization problem NaviDocs solves**.
---
## Document Format Analysis
### By File Type
| Format | Count | Total Size | OCR Required? |
|--------|-------|------------|---------------|
| **PDF** | 22 | 16.5 MB | Yes for scanned PDFs; digital PDFs may already have a text layer |
| **JPEG** | 5 | 2.1 MB | Yes (photos, scans) |
| **XLSX** | 2 | 23 KB | No (structured data) |
| **Total** | 29 | 18.7 MB | Up to 27 files need OCR |
### OCR Confidence Expectations
| Document Type | Expected Confidence | Recommended Engine |
|---------------|---------------------|-------------------|
| French invoices (Facture) | 90%+ | Google Cloud Vision (multi-language) |
| Registration docs (typed) | 95%+ | Tesseract or Google Drive |
| Owner's Manual (printed) | 85%+ | Tesseract (sufficient) |
| Scanned invoices (JPEG) | 75-85% | Google Cloud Vision (handwriting support) |
| Excel delivery logs | N/A | Direct Excel parsing (xlsx2csv) |
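
The format table implies a simple routing step at upload time: PDFs and JPEGs go to OCR, spreadsheets go to direct parsing. A minimal sketch of that decision follows. The route labels (`"ocr"`, `"spreadsheet"`, `"unsupported"`) and the function name are hypothetical, and the `.jpg` entry is an assumption added for convenience, since the archive only contains `.jpeg` files.

```python
from pathlib import Path

# Extension-to-pipeline mapping derived from the "By File Type" table.
# Route labels are illustrative, not actual NaviDocs identifiers.
ROUTES = {
    ".pdf": "ocr",
    ".jpeg": "ocr",
    ".jpg": "ocr",          # assumption: not present in the archive
    ".xlsx": "spreadsheet",
}


def processing_route(filename: str) -> str:
    """Pick a processing pipeline from the file extension (case-insensitive)."""
    return ROUTES.get(Path(filename).suffix.lower(), "unsupported")


if __name__ == "__main__":
    for name in ("OWNER_S MANUAL.pdf", "HomeBox 2.jpeg", "Lilian. delivery Olbia.xlsx"):
        print(f"{name}: {processing_route(name)}")
```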
---
## Role-Based Access Needs
Based on the documents in this archive, here's what each role needs to access frequently:
### Owner (Boat Owner / Management Company)
**Primary Concerns:** Financials, legal compliance, asset value
**Frequent Access:**
- Invoices (all types) - Monthly review
- Delivery expense logs - After each trip
- Insurance policy - Annual renewal
- Registration documents - Quarterly/annual inspections
**Key Questions:**
- "How much did the Olbia delivery cost?"
- "When does insurance expire?"
- "Show me all invoices from Q3 2025"
- "What's the total maintenance spend this year?"
---
### Captain (Professional Boat Manager)
**Primary Concerns:** Operations, safety, compliance, crew coordination
**Frequent Access:**
- Owner's Manual - During troubleshooting
- Registration & License - Required aboard vessel
- Delivery schedules - Planning and execution
- Crew payment tracking - After each delivery
**Key Questions:**
- "How do I troubleshoot the engine alarm?"
- "What's the registration number?"
- "Who was crew on the last Cannes delivery?"
- "Where are the port fees receipts?"
---
### Crew (Day Workers, Cleaners, Deckhands)
**Primary Concerns:** Task assignments, work logs, payment verification
**Frequent Access:**
- Delivery schedules - To verify hours worked
- Photos of completed work - Proof of service
- Task assignments - Daily checklist
**Key Questions:**
- "How many hours did I log last week?"
- "Did the captain approve my time?"
- "Where are my before/after cleaning photos?"
---
### Service Technician (Third-Party Maintenance)
**Primary Concerns:** Equipment specs, service history, warranty status
**Frequent Access:**
- Owner's Manual - Technical specifications
- Service records (not in archive, but would be)
- Warranty documents (not in archive, but would be)
**Key Questions:**
- "What's the engine model and serial number?"
- "When was the last oil change?"
- "Is this equipment still under warranty?"
---
## Information Architecture Challenges
### 1. **Chronological vs. Categorical Organization**
**Problem:** Are invoices organized by:
- Date? (March 2025, June 2025, Oct 2025)
- Vendor? (HomeBox, Facture supplier, Yacht Management)
- Type? (Telecom, Fuel, Crew, Port Fees)
**Current State:** Flat folder with dates in filenames (poor discoverability)
**NaviDocs Solution:** ALL THREE via search + filters
- Search: "invoice 2025" → Find all
- Filter by date: March 2025 → 2 invoices (Facture F1820005790, Home box March 2025.pdf)
- Filter by category: "Telecom" → HomeBox invoices
- Filter by vendor: "HomeBox" → 3 invoices
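
The combination of filters can be illustrated with a small in-memory sketch. It is not the NaviDocs search implementation; `DocRecord`, `ARCHIVE`, and `filter_docs` are hypothetical names, and fields are left as `None` wherever the inventory does not state a value (for example, the vendor behind the Facture invoices has not been extracted yet).

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class DocRecord:
    filename: str
    vendor: Optional[str] = None              # None until extracted from content
    category: Optional[str] = None
    period: Optional[Tuple[int, int]] = None  # (year, month) when known


# Invoice records taken from the inventory tables above.
ARCHIVE = [
    DocRecord("HomeBox.jpeg", vendor="HomeBox", category="Telecom"),
    DocRecord("HomeBox 2.jpeg", vendor="HomeBox", category="Telecom"),
    DocRecord("Home box March 2025.pdf", vendor="HomeBox", category="Telecom", period=(2025, 3)),
    DocRecord("Facture n° F1820005790.pdf", period=(2025, 3)),
    DocRecord("Facture n° F1820006506.pdf", period=(2025, 5)),
    DocRecord("Facture n° F1820006824.pdf", period=(2025, 6)),
    DocRecord("Facture n° F1820007010.pdf", period=(2025, 7)),
    DocRecord("Facture n° F1820008157.pdf", period=(2025, 10)),
]


def filter_docs(docs, *, vendor=None, category=None, period=None) -> List[DocRecord]:
    """Keep records matching every filter that was supplied (None = no filter)."""
    return [
        d for d in docs
        if (vendor is None or d.vendor == vendor)
        and (category is None or d.category == category)
        and (period is None or d.period == period)
    ]


if __name__ == "__main__":
    for doc in filter_docs(ARCHIVE, vendor="HomeBox"):
        print(doc.filename)
```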
---
### 2. **Critical vs. Reference Documents**
**Problem:** Some docs are accessed weekly (delivery logs), others once a year (registration)
**Current State:** All files equal priority (no visual hierarchy)
**NaviDocs Solution:**
- **Pinned Documents:** Insurance, Registration (always accessible)
- **Recent Activity:** Delivery logs, latest invoices
- **Archive:** Historical invoices, old registrations
---
### 3. **Multi-Format Consistency**
**Problem:** Same information in different formats
- Lilian 1 Invoice.jpeg vs. Lilian 2 Invoice.pdf
- Two versions of same delivery schedule (one for expenses, one for payment method)
**Current State:** User must manually open each to compare
**NaviDocs Solution:**
- OCR both JPEG and PDF → Same searchable text
- Version detection: "2 versions of Olbia delivery schedule"
- Unified search: "Frank Stocker" finds him in both Excel files
---
### 4. **Multi-Language Content**
**Problem:** French invoices ("Facture"), English manuals, mixed terminology
**Current State:** User must remember "Facture" = "Invoice"
**NaviDocs Solution:**
- Synonym detection: "invoice" finds "Facture" files
- Multi-language OCR: Google Cloud Vision detects French automatically
- Search translation: Owner searches "invoice", finds "Facture n° F1820006824.pdf"
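
Synonym detection can be sketched as query expansion: the query is normalized, widened to its synonym group, and then matched against the normalized filename. The groups below are an illustrative starting point built from terms that appear in this archive (invoice/facture, and BDC as "Bon de Commande"); a production system would more likely configure synonyms in the search engine itself. The names `SYNONYMS`, `expand_query`, and `matches` are hypothetical.

```python
import unicodedata

# Illustrative synonym groups drawn from terms seen in this archive.
SYNONYMS = [
    {"invoice", "facture"},
    {"purchase order", "bon de commande", "bdc"},
]


def normalize(text: str) -> str:
    """Lowercase and strip accents so comparisons ignore case and diacritics."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch)).lower()


def expand_query(query: str) -> set:
    """Return the normalized query plus every term in its synonym group."""
    term = normalize(query)
    terms = {term}
    for group in SYNONYMS:
        if term in group:
            terms |= group
    return terms


def matches(query: str, filename: str) -> bool:
    """True when the query or any of its synonyms occurs in the filename."""
    name = normalize(filename)
    return any(term in name for term in expand_query(query))


if __name__ == "__main__":
    print(matches("invoice", "Facture n° F1820006824.pdf"))
    print(matches("purchase order", "BDC LILIAN LLC.pdf"))
```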
---
## Integration Recommendations for NaviDocs
### Immediate MVP Integration (v1.0)
**Use this archive as the DEMO DATASET:**
1. **Upload all 29 files to NaviDocs**
2. **Run OCR on all PDFs and JPEGs** (27 files)
3. **Parse Excel files** (2 delivery schedules)
4. **Create demo organization:** "Zen Yacht Management"
5. **Create demo entity:** "LILIAN I" (Boat)
6. **Demonstrate search:**
- "Frank Stocker" → Finds delivery schedules
- "insurance" → Finds Lilian Insurance 2025.pdf
- "registration" → Finds all 7 registration documents
- "invoice October" → Finds Facture F1820008157.pdf
**Why:** Real-world data demonstrates NaviDocs value immediately. Users see their own document chaos solved.
---
### v1.1 Feature Validation
**Time Tracking & Automated Invoicing:**
This archive contains the PERFECT validation dataset:
- **Crew:** Jean Michele (€600), Frank Stocker (€1,750)
- **Duration:** 5 days each (10/10-10/14)
- **Expenses:** Food, travel, diesel, port fees (all categorized)
- **Invoice Total:** €6,049.22
**Demo Scenario:**
1. Captain logs delivery: "Olbia → Cannes, 10/10-10/14"
2. Jean Michele clocks in/out each day via mobile app (GPS verified)
3. Frank Stocker clocks in/out each day (captain rate: €350/day)
4. Expenses logged with photos:
- Diesel receipt (Bonifacio: €1,115.28)
- Port fees receipt (Bonifacio: €197)
- Fuel receipt (Calvi: €1,391)
- Food receipts (€608.94 total)
5. NaviDocs auto-generates invoice: **€6,049.22**
6. Owner approves and pays
**Result:** Same data as Excel spreadsheet, but automated and audit-ready.
---
### v1.2 Feature Validation
**Warranty Management:**
**Missing Data in Archive:** No warranty documents found
**But:** OWNER_S MANUAL.pdf likely contains equipment serial numbers
**Demo Scenario:**
1. Upload OWNER_S MANUAL.pdf
2. OCR extracts: "Engine: Volvo D4, Serial: ABC123456"
3. Captain photos warranty receipt → OCR extracts: "Warranty expires 2027-03-15"
4. NaviDocs creates warranty record:
- Equipment: Volvo D4 Engine
- Serial: ABC123456
- Expires: 2027-03-15
- Days remaining: 508 (as of 2025-10-23)
5. Alert sent 30 days before expiration (2027-02-13)
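
The date arithmetic in steps 4 and 5 can be sketched with the standard library. The function name and the returned keys are hypothetical, and the 30-day lead time is the one used in this scenario. Note that the remaining-day count depends on the date the check runs, so it should be computed at display time and not stored.

```python
from datetime import date, timedelta


def warranty_status(expires: date, today: date, alert_lead_days: int = 30) -> dict:
    """Compute days left on a warranty and the date its alert should fire."""
    return {
        "days_remaining": (expires - today).days,
        "alert_on": expires - timedelta(days=alert_lead_days),
        "expired": today > expires,
    }


if __name__ == "__main__":
    # Hypothetical warranty from the demo scenario, checked on the document date.
    status = warranty_status(date(2027, 3, 15), today=date(2025, 10, 23))
    print(status["alert_on"].isoformat(), status["days_remaining"])
```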
---
### v1.4 Feature Validation
**Tax-Ready Reporting:**
**Data Available:**
- 12 invoices (3 HomeBox + 5 Factures + 4 Lilian invoices)
- 2 delivery expense logs
- All categorized by expense type
**Demo Scenario:**
1. Accountant opens NaviDocs
2. Selects "Tax Report: Q1-Q4 2025"
3. NaviDocs generates:
- Telecom: €X (HomeBox invoices)
- Crew Labor: €2,350 (€600 + €1,750 for 2 crew × 5 days)
- Fuel: €2,506.28
- Port Fees: €262
- Food: €608.94
- Travel: €322
4. Export as CSV → QuickBooks
5. Attach all invoice PDFs as proof
**IRS Audit:** Owner provides NaviDocs link → All receipts with GPS timestamps
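
The CSV export step can be sketched as follows, using only the category totals from the Olbia → Cannes delivery log. Crew labor is taken from the log's crew line (€2,350.00). Telecom and other invoice amounts are omitted because they have not been extracted yet. The column names are hypothetical and do not represent a QuickBooks import format.

```python
import csv
import io
from decimal import Decimal

# Category totals from the delivery log; other invoice amounts are not yet known.
DELIVERY_EXPENSES = [
    ("Crew Labor", Decimal("2350.00")),
    ("Fuel", Decimal("2506.28")),
    ("Port Fees", Decimal("262.00")),
    ("Food", Decimal("608.94")),
    ("Travel", Decimal("322.00")),
]


def to_csv(rows) -> str:
    """Render (category, amount) pairs as CSV text with a trailing TOTAL row."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["category", "amount_eur"])
    for category, amount in rows:
        writer.writerow([category, f"{amount:.2f}"])
    total = sum((amount for _, amount in rows), Decimal("0"))
    writer.writerow(["TOTAL", f"{total:.2f}"])
    return buf.getvalue()


if __name__ == "__main__":
    print(to_csv(DELIVERY_EXPENSES), end="")
```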
---
## Document Library Navigation Design
Based on this real-world archive, here's how NaviDocs should organize the library:
### Top-Level Navigation (All Roles)
```
┌─────────────────────────────────────────────────────┐
│ 🔍 Search: [Find documents, manuals, invoices...] │
└─────────────────────────────────────────────────────┘
📌 Pinned Documents (Quick Access)
├─ 📄 LILIAN I Registration
├─ 🛡️ Insurance Policy 2025 (expires 2025-12-31)
└─ 📘 Owner's Manual
📁 Browse by Category
├─ 📋 Legal & Compliance (8)
│ ├─ Registration (7)
│ └─ Customs (1)
├─ 💰 Financial (11)
│ ├─ Invoices (9)
│ └─ Purchase Orders (1)
├─ 🚢 Operations (2)
│ └─ Delivery Logs (2)
├─ 🛡️ Insurance (1)
├─ 📘 Manuals (1)
└─ 📸 Photos (3)
📅 Recent Activity
├─ Uploaded today: Facture F1820008157.pdf
├─ Viewed yesterday: Lilian delivery Olbia schedule
└─ Shared last week: Insurance Policy 2025
🗂️ By Date
├─ 2025-10 (2 documents)
├─ 2025-07 (3 documents)
├─ 2025-06 (1 document)
└─ [View all dates...]
```
---
### Role-Specific Views
#### Owner Dashboard
```
💰 Financial Summary
├─ Total Expenses 2025: €X,XXX
├─ Latest Invoice: Facture F1820008157 (Oct 2025)
└─ Pending Payments: 0
📊 Reports
├─ Quarterly Expense Report
├─ Crew Payment Summary
└─ Tax-Ready Documentation
⚠️ Expiration Alerts
└─ Insurance expires in 69 days (2025-12-31)
```
#### Captain Dashboard
```
📋 Today's Tasks
├─ Log delivery: Cannes → Nice
└─ Approve crew hours: Jean Michele (8 hrs)
📘 Quick Reference
├─ Owner's Manual (LILIAN I)
├─ Registration Number: [XXX]
└─ Emergency Contacts
🚢 Recent Deliveries
└─ Olbia → Cannes (Oct 10-14) - €6,049.22
```
#### Crew Dashboard
```
⏰ Time Clock
├─ Clock In (GPS: Cannes Marina)
└─ My Hours This Week: 16.5 hrs
💵 Payment Status
├─ Last Payment: Oct 14 (€600)
└─ Pending Approval: 8 hrs (Oct 20-21)
📸 My Work Photos
└─ Oct 14: Cleaning photos (before/after)
```
---
## Search Query Examples
Based on the Liliane1 archive, here are realistic search queries and expected results:
| Query | Expected Results | Why |
|-------|------------------|-----|
| **"Frank Stocker"** | 2 delivery Excel files | Named in crew section |
| **"insurance"** | Lilian Insurance 2025.pdf | Filename + OCR content |
| **"registration"** | 7 registration PDFs | Filename + document category |
| **"October 2025"** | 3 files (1 invoice, 2 delivery logs) | Date metadata |
| **"Olbia"** | 2 Excel files | City name in delivery route |
| **"Cannes"** | 2 Excel files | Destination city |
| **"diesel"** | 2 Excel files | Fuel expense category |
| **"€6,049"** | 2 Excel files | Total expense amount |
| **"HomeBox"** | 3 files (2 JPEG, 1 PDF) | Telecom vendor |
| **"Facture"** | 5 French invoices | French keyword |
| **"invoice"** | 9 files (5 Factures + 4 Lilian invoices) | Synonym detection |
| **"owner manual"** | OWNER_S MANUAL.pdf | Fuzzy match (underscore ignored) |
| **"temporary admission"** | ADMISSSION TEMPORAIRE LILIAN I.pdf | Translation match (temporaire/temporary) + typo tolerance (admisssion/admission) |
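
The fuzzy matches in the last rows of the table can be approximated with token-level similarity from Python's standard `difflib` module. This is a simplified sketch, not the matching a search engine would actually perform: the 0.8 cutoff is an assumed value, the function names are hypothetical, and accented letters are treated as separators to keep the tokenizer short.

```python
import difflib
import re


def tokens(text: str) -> list:
    """Lowercase and split on anything that is not an ASCII letter or digit."""
    return [t for t in re.split(r"[^a-z0-9]+", text.lower()) if t]


def fuzzy_match(query: str, filename: str, cutoff: float = 0.8) -> bool:
    """True when every query token closely matches some filename token."""
    name_tokens = tokens(filename)
    return all(
        difflib.get_close_matches(q, name_tokens, n=1, cutoff=cutoff)
        for q in tokens(query)
    )


if __name__ == "__main__":
    # Underscore is treated as a separator, so "OWNER_S" yields "owner" and "s".
    print(fuzzy_match("owner manual", "OWNER_S MANUAL.pdf"))
    # The triple "S" typo in the filename is close enough to "admission".
    print(fuzzy_match("admission", "ADMISSSION TEMPORAIRE LILIAN I.pdf"))
```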
---
## File Naming Conventions Analysis
### Current Naming Patterns
**Good Practices:**
- Descriptive names: "Lilian Insurance 2025.pdf"
- Date suffixes: "Licence Document Ship LILIAN LLC 2025-07-31.pdf"
- Sequential numbering: "Lilian 1 Invoice.jpeg", "Lilian 2 Invoice.pdf"
**Bad Practices:**
- UUID filenames: "7EE4A803-3FA9-407C-A337-6D5847CF3897.jpeg" (no human meaning)
- Inconsistent spacing: "Lilian. delivery Olbia Cannes 2025.xlsx" (period after "Lilian")
- Mixed languages: "Facture n° F1820006824.pdf" (French + English context)
- Abbreviations: "BDC LILIAN LLC.pdf" (BDC = "Bon de Commande" not universally known)
**NaviDocs Solution:**
- **Accept any filename** (user shouldn't be forced to rename)
- **Extract metadata** (OCR finds "Insurance Policy" even if filename is UUID)
- **Suggest better name** (AI renames "7EE4A803...jpeg" → "Vessel Photo Oct 2025.jpeg")
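
Deciding which files are candidates for a suggested name starts with detecting filenames that carry no human meaning. A minimal sketch for the UUID case uses the standard `uuid` module; the function name is hypothetical, and other machine-generated patterns (such as camera sequence numbers) are out of scope here.

```python
import uuid
from pathlib import Path


def is_uuid_filename(filename: str) -> bool:
    """True when the name (without extension) is a canonical hyphenated UUID."""
    stem = Path(filename).stem
    try:
        parsed = uuid.UUID(stem)
    except ValueError:
        return False
    # uuid.UUID also accepts braces and unhyphenated hex, so compare the
    # canonical form back against the stem to require the hyphenated layout.
    return str(parsed).upper() == stem.upper()


if __name__ == "__main__":
    for name in ("7EE4A803-3FA9-407C-A337-6D5847CF3897.jpeg", "Lilian Insurance 2025.pdf"):
        print(f"{name}: {is_uuid_filename(name)}")
```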
---
## Metadata Extraction Opportunities
From this archive, NaviDocs OCR should extract:
### From Registration Documents
- **Vessel Name:** LILIAN I
- **Registration Number:** [Extract from PDFs]
- **Owner/LLC:** LILIAN LLC
- **Registration Date:** [Extract from "Lilian 1 Registration.pdf"]
- **Expiration Date:** [Extract if present]
### From Insurance
- **Policy Number:** [Extract from "Lilian Insurance 2025.pdf"]
- **Coverage Amount:** [Extract]
- **Effective Date:** 2025-01-01 (assumed from filename)
- **Expiration Date:** 2025-12-31 (assumed from filename)
- **Insurer:** [Extract company name]
### From Delivery Logs (Excel)
- **Route:** Olbia → Bonifacio → Calvi → Cannes
- **Dates:** October 10-14, 2025
- **Crew:** Jean Michele (€600), Frank Stocker (€1,750)
- **Total Cost:** €6,049.22
- **Fuel Consumed:** 685 liters (612L + 73L)
- **Port Stops:** Bonifacio (2 nights), Calvi
### From Invoices
- **Invoice Numbers:** F1820005790, F1820006506, F1820006824, F1820007010, F1820008157
- **Dates:** March, May, June, July, October 2025
- **Vendor:** [Extract from PDF content]
- **Amounts:** [Extract totals]
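
Some of this metadata is available before OCR runs. For the Facture files, the invoice number is part of the filename, so a regular expression can recover it. The sketch below assumes the pattern seen in this archive (an "F" followed by ten digits) and is deliberately tolerant about the character after "n", since the numero sign may be encoded in more than one way. The function name is hypothetical.

```python
import re
from typing import Optional

# "Facture", then "n" plus up to two non-space characters (the numero sign),
# then the invoice number: "F" followed by ten digits, as seen in this archive.
FACTURE_RE = re.compile(r"^facture\s+n\S{0,2}\s+(F\d{10})\.pdf$", re.IGNORECASE)


def invoice_number(filename: str) -> Optional[str]:
    """Return the invoice number embedded in a Facture filename, else None."""
    match = FACTURE_RE.match(filename)
    return match.group(1).upper() if match else None


if __name__ == "__main__":
    for name in ("Facture n° F1820006824.pdf", "Lilian 2 Invoice.pdf"):
        print(f"{name}: {invoice_number(name)}")
```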
---
## Conclusion & Next Steps
### What We Learned
The Liliane1 archive is a **perfect real-world validation dataset** for NaviDocs. It demonstrates:
1. **Document diversity** - Legal, financial, operational, technical
2. **Multi-format challenges** - PDF, JPEG, XLSX all need search
3. **Role-based needs** - Owner, captain, crew have different priorities
4. **Real operational data** - Delivery schedules map directly to v1.1 features
5. **Organization chaos** - Flat folder structure NaviDocs will solve
### Recommended Actions
**Immediate (v1.0 MVP):**
1. ✅ Use Liliane1 as demo dataset for MVP
2. ✅ Upload all 29 files to NaviDocs
3. ✅ Run OCR on 27 files (PDF + JPEG)
4. ✅ Demonstrate search across all document types
5. ✅ Create role-based navigation mockups
**Next Session (v1.1):**
1. ⏳ Build time tracking feature using delivery log as reference
2. ⏳ Implement automated invoicing (€6,049.22 total)
3. ⏳ Add crew management (Jean Michele, Frank Stocker)
4. ⏳ GPS verification for clock-in/out
**Future (v1.2-v1.4):**
1. 📅 Warranty management (extract from manuals)
2. 📅 Insurance expiration alerts (2025-12-31)
3. 📅 Tax-ready reporting (categorize all expenses)
---
**Document Version:** 1.0
**Last Updated:** 2025-10-23
**Cross-Reference:** docs/debates/03-document-library-navigation.md (to be created)