# Google Drive OCR Quick Start

## Why Use Google Drive OCR?

Your current Tesseract setup works great (**85% confidence**), but Google Drive offers:

| Feature | Tesseract | Google Drive |
|---------|-----------|--------------|
| **Typed text** | ✅ 85% | ✅ 98% |
| **Handwriting** | ❌ No | ✅ **YES!** |
| **Tables/Columns** | ⚠️ Struggles | ✅ Excellent |
| **Speed** | Fast (2.5s) | Medium (4.2s) |
| **Cost** | Free | Free* |
| **Offline** | Yes | No |

*1 billion requests/day free quota = effectively unlimited

## Perfect For Marine Applications

- 📝 **Handwritten logbooks** - Captain's logs, maintenance records
- ✏️ **Annotated manuals** - Notes written on equipment guides
- 📋 **Service forms** - Filled-out inspection checklists
- 🗺️ **Navigation logs** - Chart annotations

## 5-Minute Setup

### Step 1: Get Google Cloud Credentials

```bash
# 1. Go to: https://console.cloud.google.com/
# 2. Create new project: "NaviDocs"
# 3. Enable "Google Drive API"
# 4. Create Service Account with "Editor" role
# 5.
#    Download JSON credentials
```

### Step 2: Install in NaviDocs

```bash
# Copy credentials to server
cp ~/Downloads/navidocs-*.json /home/setup/navidocs/server/config/google-credentials.json

# Install Google APIs client
cd /home/setup/navidocs/server
npm install googleapis
```

### Step 3: Configure

```bash
# Add to server/.env (run from /home/setup/navidocs/server)
echo 'GOOGLE_APPLICATION_CREDENTIALS=/home/setup/navidocs/server/config/google-credentials.json' >> .env
echo 'PREFERRED_OCR_ENGINE=google-drive' >> .env
```

### Step 4: Update Worker

```bash
# Edit server/workers/ocr-worker.js
# Change line 18 from:
#   import { extractTextFromPDF, cleanOCRText } from '../services/ocr.js';
# To:
#   import { extractTextFromPDF, cleanOCRText } from '../services/ocr-hybrid.js';
```

### Step 5: Restart and Test

```bash
# Restart OCR worker
pkill -f ocr-worker
cd /home/setup/navidocs
mkdir -p logs  # make sure the log directory exists
node server/workers/ocr-worker.js > logs/worker.log 2>&1 &

# Upload a test PDF
curl -X POST http://localhost:8001/api/upload \
  -F "file=@your-handwritten-logbook.pdf" \
  -F "title=Captain's Log" \
  -F "documentType=logbook" \
  -F "organizationId=test-org-id"

# Check logs
tail -f logs/worker.log
# Should see: "[OCR Hybrid] Using google-drive engine"
```

## Testing Handwriting Recognition

Try it with:

- Handwritten notes
- Filled forms
- Annotated diagrams
- Cursive writing
- Mixed typed + handwritten pages

## Hybrid Mode (Best of Both Worlds)

The system **automatically chooses** the best engine:

```javascript
// Set in .env: PREFERRED_OCR_ENGINE=auto

// Behavior:
// - Google Drive configured? → Use it for quality
// - Document > 50 pages? → Use Tesseract to save quota
// - Network error? → Fallback to Tesseract
// - Not configured?
//   → Use Tesseract
```

## Cost Analysis

| Monthly Volume | Google Drive Cost | Recommendation |
|----------------|-------------------|----------------|
| 0-1,000 PDFs | **$0** | Use Google Drive |
| 1,000-10,000 PDFs | **$0** | Use Google Drive |
| 10,000-100,000 PDFs | **$0** | Use Google Drive |
| > 1M PDFs/month | **$0** | Still free! |

**Quota**: 1 billion requests/day = roughly 365 billion requests/year

For comparison:

- If you processed **1 PDF every second** for an entire year
- That's only 31.5 million PDFs
- Still **well under the free quota**

## Monitoring Usage

Check your Google Cloud Console:

```
https://console.cloud.google.com/apis/api/drive.googleapis.com/quotas
```

You can see:

- Requests per day
- Quota remaining
- Error rates

## Troubleshooting

### "API key invalid" error

```bash
# Check credentials path
echo "$GOOGLE_APPLICATION_CREDENTIALS"
grep GOOGLE_APPLICATION_CREDENTIALS server/.env

# Test connection
# (--input-type=module is needed so node -e accepts import and top-level await)
node --input-type=module -e "
import { testGoogleDriveConnection } from './server/services/ocr-google-drive.js';
const ok = await testGoogleDriveConnection();
console.log(ok ? '✅ Connected' : '❌ Failed');
"
```

### Worker still using Tesseract

```bash
# Verify import changed in worker
grep "ocr-hybrid" server/workers/ocr-worker.js

# Check env loaded
grep "PREFERRED_OCR_ENGINE" server/.env
```

### Want to go back to Tesseract?

```bash
# Just set in .env:
PREFERRED_OCR_ENGINE=tesseract

# Or remove the line entirely
```

## Next Steps

1. ✅ Set up Google Drive OCR (5 minutes)
2. ✅ Test with a handwritten document
3. ✅ Compare quality vs Tesseract
4. ✅ Keep hybrid mode for automatic fallback

See **docs/OCR_OPTIONS.md** for detailed comparison and advanced configuration.
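
To make the automatic fallback in step 4 more concrete, here is a minimal sketch of how the Hybrid Mode rules could be expressed in code. It is not the actual contents of `ocr-hybrid.js`: the names `chooseEngine`, `extractWithFallback`, and the `engines` object are hypothetical, and the 50-page threshold is taken from the Hybrid Mode list in this guide.

```javascript
// Hypothetical sketch of the Hybrid Mode rules; not the real ocr-hybrid.js.
const PAGE_LIMIT = 50; // "Document > 50 pages? → Use Tesseract to save quota"

// Decide which engine to try first.
function chooseEngine({ preferred = 'auto', googleConfigured = false, pageCount = 0 } = {}) {
  if (preferred === 'tesseract') return 'tesseract';   // explicit opt-out
  if (!googleConfigured) return 'tesseract';           // no credentials available
  if (preferred === 'auto' && pageCount > PAGE_LIMIT) {
    return 'tesseract';                                // large document, save quota
  }
  return 'google-drive';
}

// Run the chosen engine; if Google Drive throws (e.g. a network error),
// fall back to Tesseract. `engines` holds two async functions supplied by the caller.
async function extractWithFallback(engines, pdfPath, options) {
  const engine = chooseEngine(options);
  if (engine === 'google-drive') {
    try {
      return { engine, text: await engines.googleDrive(pdfPath) };
    } catch (err) {
      return { engine: 'tesseract', text: await engines.tesseract(pdfPath) };
    }
  }
  return { engine, text: await engines.tesseract(pdfPath) };
}
```

Keeping the decision (`chooseEngine`) separate from the execution (`extractWithFallback`) makes the rules easy to test without calling either OCR engine.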
## Real Example: Boat Logbook

**Before (Tesseract)**:

```
Cannot recognize handwriting
[Empty result or gibberish]
```

**After (Google Drive)**:

```
Captain's Log - May 15, 2024
Departed Marina Bay 08:30
Wind: 15 knots NE
Waves: 2-3 feet
Engine hours: 847.2
Fuel: 75% full
Arrived safe at 14:20
```

✅ **Perfect transcription of handwritten logbook!**
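
If you want to sanity-check the quota figures from the Cost Analysis section against your own volume, the arithmetic can be sketched as below. The daily quota is the figure quoted in this guide. `REQUESTS_PER_PDF` is an assumption (one OCR job may take several API calls, such as upload, export, and cleanup), so adjust it to match what your worker actually does.

```javascript
// Rough quota arithmetic; REQUESTS_PER_PDF is an assumed value, not a measured one.
const DAILY_QUOTA = 1_000_000_000;      // requests/day, as quoted in this guide
const SECONDS_PER_DAY = 24 * 60 * 60;   // 86,400
const REQUESTS_PER_PDF = 3;             // assumption: upload + export + delete

function quotaUsage(pdfsPerSecond, requestsPerPdf = REQUESTS_PER_PDF) {
  const pdfsPerDay = pdfsPerSecond * SECONDS_PER_DAY;
  const pdfsPerYear = pdfsPerDay * 365;
  const requestsPerDay = pdfsPerDay * requestsPerPdf;
  return {
    pdfsPerDay,
    pdfsPerYear,
    requestsPerDay,
    quotaFraction: requestsPerDay / DAILY_QUOTA, // share of the daily quota used
  };
}

const usage = quotaUsage(1); // 1 PDF every second
console.log(usage.pdfsPerYear); // 31536000, i.e. the "31.5 million PDFs" above
console.log((usage.quotaFraction * 100).toFixed(3) + '% of the daily quota');
```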