- OCR
Source transcription
The website will describe how raw page images are turned into machine-readable text and where OCR uncertainty enters the workflow.
Version 1 skeleton
This page is intentionally light on live content. Its main job is to establish route structure, layout, and visual language before the site begins consuming generated dataset exports.
This section is reserved for the logic that converts biography text into structured fields such as names, occupations, places, and family relations.
Validation steps, conflict handling, and manual review boundaries will be described here once the content phase begins.
This page will later explain how normalized exports, geocoding, and derived statistics are generated for downstream use.
Coming next
This placeholder makes the page structure explicit now and defers real dataset integration to a later phase.