Implements a complete table of contents (TOC) feature for document navigation, with bilingual (English/French) support.
## TOC Detection & Extraction
- Pattern-based TOC detection with 3 regex patterns
- Heuristic validation (30%+ match ratio, 5+ entries, sequential pages)
- Hierarchical section key parsing (e.g., "4.1.2" → level 3, parent "4.1")
- Database schema with parent-child relationships
- Automatic extraction during OCR post-processing
- Server-side LRU caching (200 entries, 30min TTL)
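The section-key parsing and heuristic validation listed above could look roughly like the sketch below. The function names, the entry shape (`{ page }`), and the reading of "sequential pages" as "page numbers never decrease" are assumptions for illustration, not taken from toc-extractor.js.

```javascript
// Hypothetical helpers; names and shapes are illustrative only.

// Derive nesting level and parent key from a dotted numeric section key.
// "4.1.2" -> level 3, parent "4.1"; "4" -> level 1, no parent.
function parseSectionKey(sectionKey) {
  const key = String(sectionKey).trim();
  if (!/^\d+(\.\d+)*$/.test(key)) return null; // not a dotted numeric key
  const parts = key.split('.');
  return {
    key,
    level: parts.length,
    parentKey: parts.length > 1 ? parts.slice(0, -1).join('.') : null,
  };
}

// Decide whether a page looks like a TOC: enough entries, a high enough
// share of matching lines, and page numbers that never go backwards
// (an assumed interpretation of "sequential pages").
function looksLikeToc(lines, entries, { minRatio = 0.3, minEntries = 5 } = {}) {
  if (lines.length === 0 || entries.length < minEntries) return false;
  if (entries.length / lines.length < minRatio) return false;
  return entries.every((e, i) => i === 0 || e.page >= entries[i - 1].page);
}
```

The thresholds default to the values named in this commit (30% match ratio, 5 entries) but are kept as parameters so they can be tuned per document set.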
## UI Components
- TocSidebar: Collapsible sidebar (320px) with auto-open on TOC presence
- TocEntry: Recursive component for hierarchical rendering
- Flex layout: Sidebar + PDF viewer side-by-side
- Active page highlighting with real-time sync
- localStorage persistence for sidebar state
## Navigation Features
- Click TOC entry → PDF jumps to page
- Deep link support: URL hash format #p=12
- Page change events: navidocs:pagechange custom event
- URL hash updates on all navigation (next/prev/goTo/TOC)
- Hash change listener for external navigation
- Page clamping and validation
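As a rough illustration of the deep-link and clamping behaviour above, the helpers below parse a `#p=12` hash, clamp the result to the document's page range, and format a hash for URL updates. The function names and the fallback to page 1 on invalid input are assumptions; the actual logic in DocumentView.vue may differ.

```javascript
// Hypothetical helpers for the "#p=<page>" deep-link format.

// Parse "#p=12" (or "p=12") into a page number; null if absent or malformed.
function parsePageHash(hash) {
  const match = /^#?p=(\d+)$/.exec(String(hash || ''));
  return match ? Number(match[1]) : null;
}

// Clamp a requested page into [1, totalPages]; fall back to 1 for bad input.
function clampPage(page, totalPages) {
  if (!Number.isInteger(page) || !Number.isInteger(totalPages) || totalPages < 1) {
    return 1;
  }
  return Math.min(Math.max(page, 1), totalPages);
}

// Build the hash written to the URL after any navigation.
function formatPageHash(page) {
  return `#p=${page}`;
}
```

In the viewer, something like `clampPage(parsePageHash(location.hash), totalPages)` would presumably run on load and inside the `hashchange` listener before the page is rendered.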
## Search Integration
- "Jump to section" button in search results
- Shown only when the result has a section field
- Navigates to document with page number and hash
## Accessibility
- ARIA attributes: role, aria-label, aria-expanded, aria-current
- Keyboard navigation: Enter/Space on entries, Tab focus
- Screen reader support with aria-live regions
- Semantic HTML with proper list/listitem roles
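The Enter/Space activation on TOC entries can be sketched as a small keydown handler. The names `isActivationKey` and `handleEntryKeydown`, and the `onActivate` callback, are hypothetical; in TocEntry.vue this would more likely be expressed with Vue key modifiers.

```javascript
// Hypothetical keyboard helpers for a focusable TOC entry.

// True when a keydown should activate the entry (Enter or Space).
function isActivationKey(event) {
  return event.key === 'Enter' || event.key === ' ';
}

// Run onActivate for activation keys and report whether the key was handled.
function handleEntryKeydown(event, onActivate) {
  if (!isActivationKey(event)) return false;
  event.preventDefault(); // keep Space from scrolling the sidebar
  onActivate();
  return true;
}
```

Other keys, including Tab, are left untouched so normal focus traversal keeps working.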
## Internationalization (i18n)
- Vue I18n integration (vue-i18n package)
- English and French translations
- 8 TOC-specific translation keys
- Language switcher component in document viewer
- Locale persistence in localStorage
## Error Handling
- Specific error messages for each failure case
- Validation before processing (doc exists, has pages, has OCR)
- Non-blocking TOC extraction (doesn't fail OCR jobs)
- Detailed error returns: {success, error, entriesCount, pages}
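A minimal sketch of the validate-then-extract flow above, returning the `{success, error, entriesCount, pages}` shape and never throwing, so a failed extraction cannot fail the surrounding OCR job. The function name, the document shape (`pages[].ocrText`), the error strings, and the meaning of `pages` (taken here as the TOC page numbers found) are all assumptions. The extractor is called synchronously for brevity; an async extractor would need `await`.

```javascript
// Hypothetical wrapper; `extract` stands in for the real extractor function.
function extractTocSafely(doc, extract) {
  const fail = (error) => ({ success: false, error, entriesCount: 0, pages: [] });

  // Validation before processing: doc exists, has pages, has OCR text.
  if (!doc) return fail('Document not found');
  if (!Array.isArray(doc.pages) || doc.pages.length === 0) {
    return fail('Document has no pages');
  }
  if (!doc.pages.some((p) => p && p.ocrText)) {
    return fail('Document has no OCR text');
  }

  try {
    const entries = extract(doc);
    // Assumed: each entry records the TOC page it was found on as `tocPage`.
    const pages = [...new Set(entries.map((e) => e.tocPage))];
    return { success: true, error: null, entriesCount: entries.length, pages };
  } catch (err) {
    // Swallow the failure so the caller (e.g. the OCR worker) keeps going.
    return fail(`TOC extraction failed: ${err.message}`);
  }
}
```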
## API Endpoints
- GET /api/documents/:id/toc?format=flat|tree
- POST /api/documents/:id/toc/extract
- Cache invalidation on re-extraction
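The server-side cache and its invalidation on re-extraction might be structured like the sketch below: a `Map`-backed LRU with a per-entry TTL, using the 200-entry and 30-minute limits from this commit as defaults. The class name, the `documentId:format` key scheme, and the injectable clock are assumptions made for illustration.

```javascript
// Hypothetical LRU + TTL cache; a Map's insertion order doubles as recency order.
class TtlLruCache {
  constructor({ maxEntries = 200, ttlMs = 30 * 60 * 1000, now = Date.now } = {}) {
    this.maxEntries = maxEntries;
    this.ttlMs = ttlMs;
    this.now = now;
    this.map = new Map();
  }

  get(key) {
    const hit = this.map.get(key);
    if (!hit) return undefined;
    if (this.now() - hit.storedAt > this.ttlMs) {
      this.map.delete(key); // expired
      return undefined;
    }
    // Re-insert so this key becomes the most recently used.
    this.map.delete(key);
    this.map.set(key, hit);
    return hit.value;
  }

  set(key, value) {
    this.map.delete(key);
    this.map.set(key, { value, storedAt: this.now() });
    if (this.map.size > this.maxEntries) {
      // Evict the least recently used entry (first key in iteration order).
      this.map.delete(this.map.keys().next().value);
    }
  }

  // Drop every cached format for one document, e.g. after POST .../toc/extract.
  invalidateDocument(documentId) {
    for (const key of this.map.keys()) {
      if (key.startsWith(`${documentId}:`)) this.map.delete(key);
    }
  }
}
```

Keying by `documentId:format` lets the `flat` and `tree` responses be cached separately while a single `invalidateDocument` call clears both.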
## Testing
- Smoke test script: 9 comprehensive tests
- E2E testing guide with 5 manual scenarios
- Tests cover: API, caching, validation, navigation, search
## Database
- Migration 002: document_toc table
- Fields: id, document_id, title, section_key, page_start, level, parent_id, order_index, toc_page_number, created_at
- Foreign keys with CASCADE delete
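Because each row carries `parent_id` and `order_index`, a flat result set can be turned into the nested form that `format=tree` presumably returns. The sketch below is one way to do it; the function name and the choice to treat rows with a missing parent as roots are assumptions.

```javascript
// Hypothetical tree builder over rows shaped like document_toc records.
function buildTocTree(rows) {
  // Index every row by id and give it an empty children list.
  const nodes = new Map();
  for (const row of rows) nodes.set(row.id, { ...row, children: [] });

  // Attach nodes in TOC order so children arrays come out sorted.
  const sorted = [...nodes.values()].sort((a, b) => a.order_index - b.order_index);
  const roots = [];
  for (const node of sorted) {
    const parent = node.parent_id ? nodes.get(node.parent_id) : undefined;
    // Rows whose parent is absent are kept as roots rather than dropped.
    (parent ? parent.children : roots).push(node);
  }
  return roots;
}
```

Building in a single pass over the sorted rows keeps the work linear after the sort and avoids recursive queries against the database.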
## Files Changed
- New: TocSidebar.vue, TocEntry.vue, LanguageSwitcher.vue
- New: toc-extractor.js, toc.js routes, i18n setup
- Modified: DocumentView.vue (sidebar, deep links, events)
- Modified: SearchView.vue (Jump to section button)
- Modified: ocr-worker.js (TOC post-processing)
- New: toc-smoke-test.sh, TOC_E2E_TEST.md
Generated with Claude Code (https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
-- Migration: Add document_toc table for interactive table of contents
-- Date: 2025-10-20
-- Description: Store extracted TOC entries from PDF documents for navigation

CREATE TABLE IF NOT EXISTS document_toc (
  id TEXT PRIMARY KEY,
  document_id TEXT NOT NULL,

  -- TOC entry details
  title TEXT NOT NULL,            -- "Chapter 4 - Plumbing System"
  section_key TEXT,               -- "4" or "4.1.2" for hierarchical entries
  page_start INTEGER NOT NULL,    -- Target page number

  -- Hierarchy support
  level INTEGER DEFAULT 1,        -- 1 for "4", 2 for "4.1", 3 for "4.1.2"
  parent_id TEXT,                 -- Reference to parent entry for nesting

  -- Ordering
  order_index INTEGER NOT NULL,   -- Sequential order in TOC

  -- Source tracking
  toc_page_number INTEGER,        -- Which page the TOC entry was found on

  -- Metadata
  created_at INTEGER NOT NULL,

  FOREIGN KEY (document_id) REFERENCES documents(id) ON DELETE CASCADE,
  FOREIGN KEY (parent_id) REFERENCES document_toc(id) ON DELETE CASCADE
);

-- Indexes for performance
CREATE INDEX IF NOT EXISTS idx_toc_document ON document_toc(document_id);
CREATE INDEX IF NOT EXISTS idx_toc_order ON document_toc(document_id, order_index);
CREATE INDEX IF NOT EXISTS idx_toc_parent ON document_toc(parent_id);
CREATE INDEX IF NOT EXISTS idx_toc_section ON document_toc(document_id, section_key);