API Guide

The Epstein Documents Browser provides a comprehensive REST API for programmatic access to documents, search functionality, and system statistics.

API Overview

Base URL

http://epstein.rizzn.net/

Response Format

All API responses are in JSON format with appropriate HTTP status codes.

Authentication

Most endpoints are publicly accessible. Admin endpoints require authentication.

Rate Limiting

No rate limiting is currently implemented, but please use the API responsibly.

CORS Support

Cross-origin requests are supported for web applications.

Error Handling

Errors return an HTTP status code (400, 404, or 500) and a JSON body with a descriptive error message.

Search API

Endpoint

GET /api/search

Parameters

Parameter  Type     Required  Description
q          string   Yes       Search query
type       string   No        Search type: all, filename, ocr
ocr        string   No        OCR filter: all, with-ocr, without-ocr
sort       string   No        Sort order: relevance, filename, id
page       integer  No        Page number (default: 1)

Example Requests

Basic Search

GET http://epstein.rizzn.net/api/search?q=juror

Advanced Search

GET http://epstein.rizzn.net/api/search?q=testimony&type=ocr&sort=relevance&page=1

Response Format

{
  "results": [
    {
      "id": 123,
      "file_name": "DOJ-OGR-00000008.jpg",
      "file_path": "Prod 01_20250822/VOL00001/IMAGES/IMAGES001/DOJ-OGR-00000008.jpg",
      "directory_path": "Prod 01_20250822/VOL00001/IMAGES/IMAGES001",
      "has_ocr_text": true,
      "match_type": "content",
      "excerpt": "...prospective jurors completed a lengthy questionnaire..."
    }
  ],
  "pagination": {
    "page": 1,
    "per_page": 50,
    "total_count": 2,
    "total_pages": 1,
    "has_next": false,
    "has_prev": false
  }
}
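
The pagination block makes it possible to walk through every page of a result set. The following is a minimal sketch using only the Python standard library; the helper names build_search_url and iter_results are hypothetical, and the injectable fetch_page argument is there only so the loop can be exercised without network access.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "http://epstein.rizzn.net"


def build_search_url(query, page=1, **options):
    """Build a /api/search URL; options may carry type, ocr, or sort."""
    params = {"q": query, "page": page, **options}
    return f"{BASE_URL}/api/search?{urlencode(params)}"


def iter_results(query, fetch_page=None, **options):
    """Yield every result for a query by following pagination.has_next."""
    if fetch_page is None:
        # Default fetcher: plain HTTP GET, JSON-decoded.
        def fetch_page(url):
            with urlopen(url, timeout=30) as response:
                return json.load(response)

    page = 1
    while True:
        data = fetch_page(build_search_url(query, page=page, **options))
        yield from data["results"]
        if not data["pagination"]["has_next"]:
            break
        page += 1
```

Calling iter_results("testimony", type="ocr") would request page 1, then continue until has_next is false. Since no rate limiting is in place, a short pause between pages is a courteous addition for large result sets.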

Statistics API

Endpoint

GET /api/stats

Description

Returns system statistics including total images, OCR progress, and volume breakdown.

Example Request

GET http://epstein.rizzn.net/api/stats

Response Format

{
  "total_images": 24528,
  "images_with_ocr": 12345,
  "ocr_percentage": 50.3,
  "volumes": [
    {
      "volume": "VOL00001",
      "count": 24528
    }
  ]
}
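
As an illustration of consuming this response, the sketch below derives a few figures from the decoded JSON. The function name summarize_stats is hypothetical, and the sketch assumes ocr_percentage is simply images_with_ocr divided by total_images, which matches the sample above but is not stated by the API.

```python
def summarize_stats(stats):
    """Derive OCR progress figures from a decoded /api/stats response."""
    total = stats["total_images"]
    done = stats["images_with_ocr"]
    # Assumption: the percentage is done / total, rounded to one decimal.
    percentage = round(100 * done / total, 1) if total else 0.0
    return {
        "remaining": total - done,
        "ocr_percentage": percentage,
        "by_volume": {v["volume"]: v["count"] for v in stats["volumes"]},
    }


sample = {
    "total_images": 24528,
    "images_with_ocr": 12345,
    "ocr_percentage": 50.3,
    "volumes": [{"volume": "VOL00001", "count": 24528}],
}
summary = summarize_stats(sample)
```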

Image Range API

Endpoint

GET /api/first-image

Description

Returns the first and last available image IDs for navigation purposes.

Example Request

GET http://epstein.rizzn.net/api/first-image

Response Format

{
  "first_id": 1,
  "last_id": 24528
}
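
One way to use this range for navigation is to clamp previous/next links to it. The sketch below is illustrative only: neighbors is a hypothetical helper, and it assumes image IDs are contiguous between first_id and last_id, which the API does not guarantee.

```python
def neighbors(current_id, image_range):
    """Return (previous_id, next_id) for a viewer, or None at either end.

    Assumes IDs run contiguously from first_id to last_id.
    """
    first, last = image_range["first_id"], image_range["last_id"]
    if not first <= current_id <= last:
        raise ValueError(f"image ID {current_id} is outside {first}..{last}")
    previous_id = current_id - 1 if current_id > first else None
    next_id = current_id + 1 if current_id < last else None
    return previous_id, next_id
```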

Image Serving API

Endpoint

GET /image/<path>

Description

Serves image files with automatic format conversion (TIF to JPEG) for browser compatibility.

Parameters

  • path - Relative path to the image file, URL-encoded (for example, the file_path value from a search result)

Example Requests

Direct Image

GET http://epstein.rizzn.net/image/Prod%2001_20250822/VOL00001/IMAGES/IMAGES001/DOJ-OGR-00000008.jpg

Thumbnail

GET http://epstein.rizzn.net/api/thumbnail/123

Response

Returns the image file with appropriate MIME type and headers for browser display.
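
Paths in this collection contain spaces, so they need percent-encoding before being placed in the URL. A minimal sketch, assuming the path comes from the file_path field of a search result; image_url and download_image are hypothetical helper names.

```python
import shutil
from urllib.parse import quote
from urllib.request import urlopen

BASE_URL = "http://epstein.rizzn.net"


def image_url(file_path):
    """Percent-encode a relative path; '/' separators are left intact."""
    return f"{BASE_URL}/image/{quote(file_path)}"


def download_image(file_path, destination):
    """Stream the served image to a local file."""
    with urlopen(image_url(file_path), timeout=60) as response, \
            open(destination, "wb") as out:
        shutil.copyfileobj(response, out)
```

For the search result shown earlier, image_url produces the same address as the Direct Image example above.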

Admin API

Authentication Required

Admin endpoints require authentication via session cookie after login.

Endpoints

Endpoint          Method  Description
/admin            GET     Admin dashboard with analytics
/admin/login      POST    Admin login (password required)
/admin/logout     GET     Admin logout
/admin/analytics  GET     Raw analytics data (JSON)

Analytics Data

The analytics endpoint provides detailed usage statistics including:

  • Request counts and unique visitors
  • Popular search queries
  • Top pages and referrers
  • Hourly usage distribution
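
The login-then-fetch flow can be sketched with a cookie-aware opener from the standard library. This is an assumption-heavy illustration: the guide does not specify how the password is submitted, so the form-encoded body and the field name password are guesses, and build_login_request and fetch_analytics are hypothetical helpers.

```python
import json
from http.cookiejar import CookieJar
from urllib.parse import urlencode
from urllib.request import HTTPCookieProcessor, Request, build_opener

BASE_URL = "http://epstein.rizzn.net"


def build_login_request(password):
    """Build the POST to /admin/login.

    Assumption: a form-encoded body with a field named 'password'.
    """
    body = urlencode({"password": password}).encode("ascii")
    return Request(f"{BASE_URL}/admin/login", data=body, method="POST")


def fetch_analytics(password):
    """Log in, keep the session cookie, then request the raw analytics."""
    opener = build_opener(HTTPCookieProcessor(CookieJar()))
    with opener.open(build_login_request(password), timeout=30):
        pass  # the cookie jar now holds whatever session cookie was set
    with opener.open(f"{BASE_URL}/admin/analytics", timeout=30) as response:
        return json.load(response)
```

Because the admin endpoints are served over plain HTTP in these examples, the password would travel unencrypted; that is worth keeping in mind before scripting against them.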

Integration Examples

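JavaScript/Fetch

// Search for documents
async function searchDocuments(query) {
  const response = await fetch(`http://epstein.rizzn.net/api/search?q=${encodeURIComponent(query)}`);
  // fetch() resolves even on HTTP errors, so check the status explicitly
  if (!response.ok) {
    throw new Error(`Search failed with HTTP ${response.status}`);
  }
  const data = await response.json();
  return data.results;
}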

// Get system statistics
async function getStats() {
  const response = await fetch('http://epstein.rizzn.net/api/stats');
  return await response.json();
}

// Usage
searchDocuments('juror').then(results => {
  console.log('Found', results.length, 'documents');
});
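
Python/Requests

import requests

# Search for documents
def search_documents(query):
    # params= handles URL-encoding of the query string
    response = requests.get('http://epstein.rizzn.net/api/search', params={'q': query}, timeout=30)
    response.raise_for_status()  # raise on 4xx/5xx responses
    return response.json()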

# Get system statistics
def get_stats():
    response = requests.get('http://epstein.rizzn.net/api/stats')
    return response.json()

# Usage
results = search_documents('testimony')
print(f"Found {len(results['results'])} documents")

cURL Examples

# Search for documents
curl "http://epstein.rizzn.net/api/search?q=juror&type=all"

# Get statistics
curl "http://epstein.rizzn.net/api/stats"

# Get image range
curl "http://epstein.rizzn.net/api/first-image"

# Download an image
curl "http://epstein.rizzn.net/image/Prod%2001_20250822/VOL00001/IMAGES/IMAGES001/DOJ-OGR-00000008.jpg" -o image.jpg

Error Handling

HTTP Status Codes

Code  Description            Example
200   Success                Request completed successfully
400   Bad Request            Invalid parameters
404   Not Found              Image or endpoint not found
500   Internal Server Error  Server-side error

Error Response Format

{
  "error": "Error message description",
  "results": []
}
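
A client can turn this error shape into an exception so callers do not mistake an empty results list for a successful search. The sketch below is one possible approach; ApiError and parse_response are hypothetical names, and it assumes every JSON endpoint reports failures through the error field shown above.

```python
import json


class ApiError(Exception):
    """Raised when the API reports a failure."""

    def __init__(self, status, message):
        super().__init__(f"HTTP {status}: {message}")
        self.status = status
        self.message = message


def parse_response(status, body):
    """Decode a JSON response body, raising ApiError on any failure."""
    try:
        payload = json.loads(body)
    except ValueError:
        raise ApiError(status, "response body is not valid JSON") from None
    error = payload.get("error") if isinstance(payload, dict) else None
    if status != 200 or error:
        raise ApiError(status, error or "no error message in response body")
    return payload
```

With urllib, non-2xx responses surface as urllib.error.HTTPError; its code attribute and read() method supply the status and body to pass to parse_response.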