Key Features
Interface Features
- Archive.org-style interface - Clean, focused document browsing
- Zoom controls - Click the image to zoom, or use the +/- buttons
- Fullscreen mode - Press F11 or click the fullscreen button
- Progress tracking - Visual progress bar showing position
- Keyboard navigation - Arrow keys, Home, End for quick navigation
Image Processing
- Automatic format conversion - TIF files are converted to JPEG for browser compatibility
- High-quality thumbnails - Fast loading with optimized image sizes
- Responsive scaling - Images adapt to screen size
- Pan and zoom - Smooth image navigation
Search Types
- Filename search - Find documents by name
- OCR text search - Search within extracted text content
- Combined search - Search both filenames and content
- Advanced filtering - Filter by OCR status and file type
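As an illustration of how the three search types could map onto one query, here is a minimal sketch using Python's built-in sqlite3 module. The `documents` table, its `filename` and `ocr_text` columns, and the `search_documents` function are assumptions made for this example and are not taken from the project's actual schema.

```python
import sqlite3

# Hypothetical schema: one row per document, with OCR text stored inline.
SCHEMA = "CREATE TABLE documents (id INTEGER PRIMARY KEY, filename TEXT, ocr_text TEXT)"

# One WHERE clause per search type described above.
CLAUSES = {
    "filename": "filename LIKE ?",
    "ocr": "ocr_text LIKE ?",
    "combined": "filename LIKE ? OR ocr_text LIKE ?",
}

def search_documents(conn, term, mode="combined"):
    """Return (id, filename) rows whose filename and/or OCR text contain term."""
    where = CLAUSES[mode]
    # Note: '%' and '_' inside term act as LIKE wildcards in this sketch.
    params = [f"%{term}%"] * where.count("?")
    sql = f"SELECT id, filename FROM documents WHERE {where} ORDER BY id"
    return conn.execute(sql, params).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
conn.executemany(
    "INSERT INTO documents (filename, ocr_text) VALUES (?, ?)",
    [
        ("invoice_001.tif", "Payment due in 30 days"),
        ("letter_002.tif", "Please find the invoice attached"),
        ("memo_003.tif", None),  # OCR not yet run for this file
    ],
)
```

Rows whose OCR text is still NULL simply never match the content clause, so unprocessed documents remain findable by filename.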
Search Features
- Text excerpts - See context around search matches
- Search highlighting - Search terms are highlighted in results
- Pagination - Navigate through large result sets
- Sorting options - Sort by relevance, filename, or ID
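Text excerpts and highlighting can be sketched roughly as follows. The `make_excerpt` function, the `radius` parameter, and the `<mark>` wrapper are hypothetical choices for illustration, not the project's actual implementation.

```python
import re

def make_excerpt(text, term, radius=30, marker=("<mark>", "</mark>")):
    """Return a snippet around the first match of term, with matches wrapped in marker.

    Returns None when term does not occur in text. Matching is case-insensitive.
    """
    pattern = re.escape(term)  # treat the search term as literal text
    match = re.search(pattern, text, flags=re.IGNORECASE)
    if match is None:
        return None
    # Keep up to `radius` characters of context on each side of the match.
    start = max(0, match.start() - radius)
    end = min(len(text), match.end() + radius)
    snippet = text[start:end]
    highlighted = re.sub(
        pattern,
        lambda m: f"{marker[0]}{m.group(0)}{marker[1]}",
        snippet,
        flags=re.IGNORECASE,
    )
    # Ellipses signal that the snippet was cut from a longer text.
    prefix = "..." if start > 0 else ""
    suffix = "..." if end < len(text) else ""
    return prefix + highlighted + suffix
```

If the excerpt is rendered as HTML, the OCR text should be escaped before the markers are inserted; that step is left out here to keep the sketch short.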
Text Extraction
- Tesseract OCR - Widely used open-source text recognition engine
- Memory efficient - Lightweight processing for large datasets
- Background processing - Non-blocking OCR operations
- Progress tracking - Real-time processing status
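One way to picture non-blocking OCR with progress reporting is a small worker pool plus a thread-safe counter. This is a sketch under assumptions: `OcrProgress`, `run_ocr_in_background`, and the `ocr_func` callback are hypothetical names, and the callback stands in for whatever actually invokes Tesseract.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

class OcrProgress:
    """Thread-safe counters that a status endpoint or progress bar could poll."""

    def __init__(self, total):
        self.total = total
        self.done = 0
        self.failed = 0
        self._lock = threading.Lock()

    def record(self, ok):
        with self._lock:
            self.done += 1
            if not ok:
                self.failed += 1

    def snapshot(self):
        with self._lock:
            percent = 100.0 * self.done / self.total if self.total else 100.0
            return {"done": self.done, "failed": self.failed,
                    "total": self.total, "percent": percent}

def run_ocr_in_background(paths, ocr_func, max_workers=2):
    """Queue OCR jobs and return immediately with a progress object and futures."""
    progress = OcrProgress(len(paths))
    executor = ThreadPoolExecutor(max_workers=max_workers)

    def worker(path):
        try:
            ocr_func(path)  # assumed to raise on OCR failure
            progress.record(ok=True)
        except Exception:
            progress.record(ok=False)  # a failed file does not stop the batch

    futures = [executor.submit(worker, path) for path in paths]
    executor.shutdown(wait=False)  # queued jobs still run; the caller is not blocked
    return progress, futures
```

The caller can poll `progress.snapshot()` at any time, which is enough to drive a real-time status display.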
Text Display
- Side-by-side view - Image and text displayed together
- Searchable content - Full-text search through extracted text
- Text formatting - Preserved formatting and structure
- Error handling - Graceful handling of OCR failures
Data Management
- Idempotent operations - Safe to re-run during file transfers
- File hash tracking - Detect changed files efficiently
- SQLite database - Fast, reliable data storage
- Directory structure mapping - Complete file organization
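Idempotent indexing with file hash tracking might look like the sketch below, which uses only the standard library. The `files` table, the choice of SHA-256, and the `open_index`, `file_sha256`, and `index_file` helpers are assumptions for illustration; the project's real schema and hash function may differ.

```python
import hashlib
import sqlite3

def open_index(db_path=":memory:"):
    """Open the index database, creating the (hypothetical) files table if needed."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS files (path TEXT PRIMARY KEY, sha256 TEXT NOT NULL)"
    )
    return conn

def file_sha256(path, chunk_size=1 << 20):
    """Hash a file in chunks so large scans are never loaded into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def index_file(conn, path):
    """Insert or update one file record; return 'new', 'changed', or 'unchanged'."""
    path = str(path)
    new_hash = file_sha256(path)
    row = conn.execute("SELECT sha256 FROM files WHERE path = ?", (path,)).fetchone()
    if row is None:
        conn.execute("INSERT INTO files (path, sha256) VALUES (?, ?)", (path, new_hash))
        status = "new"
    elif row[0] != new_hash:
        conn.execute("UPDATE files SET sha256 = ? WHERE path = ?", (new_hash, path))
        status = "changed"
    else:
        status = "unchanged"  # re-running on the same file is a no-op
    conn.commit()
    return status
```

Because the path is the primary key and unchanged files are skipped, running the indexer repeatedly while files are still being transferred never creates duplicate rows.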
Performance & Reliability
- Responsive design - Works on desktop and mobile devices
- Production ready - Process management via GNU Screen sessions
- Error handling - Robust error handling and recovery
- Analytics tracking - User behavior and search analytics
REST API
- Search API - Programmatic document search
- Statistics API - System status and progress
- Image serving - Direct access to document images
- Thumbnail generation - Optimized image thumbnails
Integration
- JSON responses - Machine-readable data formats
- CORS support - Cross-origin resource sharing
- Pagination - Efficient handling of large datasets
- Error handling - Consistent error response format
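To illustrate paginated JSON responses with a consistent error format, here is a framework-agnostic sketch. The field names (`results`, `page`, `per_page`, `total`, `total_pages`, `error`) and the `paginate`, `error_response`, and `handle_search` helpers are hypothetical; the actual API may use a different response shape.

```python
import json
import math

def paginate(items, page=1, per_page=20):
    """Slice items into one page and attach paging metadata."""
    if page < 1 or per_page < 1:
        raise ValueError("page and per_page must be positive integers")
    total = len(items)
    start = (page - 1) * per_page
    return {
        "results": items[start:start + per_page],
        "page": page,
        "per_page": per_page,
        "total": total,
        "total_pages": math.ceil(total / per_page),
    }

def error_response(status, message):
    """Every error uses the same envelope so clients need only one parser."""
    return status, json.dumps({"error": {"status": status, "message": message}})

def handle_search(items, page, per_page):
    """Return (HTTP status, JSON body) for a page of search results."""
    try:
        return 200, json.dumps(paginate(items, page, per_page))
    except ValueError as exc:
        return error_response(400, str(exc))
```

A client can branch on the presence of the top-level `error` key rather than inspecting each endpoint's response individually.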
Navigation
- Quick navigation - Jump to first, middle, or last document
- Random browsing - Discover documents randomly
- Keyboard shortcuts - Power user navigation
- Breadcrumb navigation - Clear page hierarchy
Interface
- Clean design - Minimal, distraction-free interface
- Mobile responsive - Works on all device sizes
- Accessibility - Screen reader and keyboard friendly
- Loading indicators - Clear feedback during operations