Download
Current public release
Structured biographies from Degener's Wer Ist's? (1911)
Download the dataset produced by the workflow. Each row describes one person from the source, and each field is one column of information about that person.
The 22 core fields are the main pieces of biographical information the workflow tries to recover: names, birth details, address, education, occupation, career, family relations, publications, memberships, political affiliation, hobbies, collections, and personal notes.
What the Fields Mean
Person
Who is the entry about?
Name, first names, gender, title or profession, and basic birth information.
Work and education
What did they do?
Education, job, career, specialization, and works or publications.
Family and networks
Who were they connected to?
Parents, spouse, children, ancestors, memberships, and political affiliation.
Notes and uncertainty
What needs careful reading?
Hobbies, collections, personal notes, unknown values, and workflow review flags.
Download the Data
Primary data
Start here
These files contain the biography records themselves.
JSONL · Primary data
Normalized JSONL
Best for scripts and reproducible pipelines. Each line is one structured biography.
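As a minimal sketch of how a script might read this format, the snippet below parses one JSON object per line. The file name biographies.jsonl and the field names name and occupation are placeholders, not taken from the release; check the downloaded file for the actual keys.

```python
import io
import json


def read_jsonl(stream):
    """Yield one dict per non-empty line of a JSONL stream."""
    for line in stream:
        line = line.strip()
        if line:  # skip blank lines, e.g. a trailing newline
            yield json.loads(line)


# Stand-in for the downloaded file; the field names here are hypothetical.
sample = io.StringIO(
    '{"name": "Person A", "occupation": "unknown"}\n'
    '{"name": "Person B", "occupation": "Professor"}\n'
)
records = list(read_jsonl(sample))
print(len(records))

# With the real download (the path is an assumption):
# with open("biographies.jsonl", encoding="utf-8") as fh:
#     records = list(read_jsonl(fh))
```

Reading line by line keeps memory use low, since each biography can be processed and discarded before the next one is parsed.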
Excel · Primary data
Normalized Excel
Best for reading, filtering, and sharing the biography table without writing code.
Supplementary data
Derived outputs
These files support specific analysis tasks built from the biographies.
CSV · Supplementary data
Geocoded addresses CSV
Use this for maps and place-based analysis. It is derived from the normalized address field.
Documentation and checks
Read before reuse
These files explain what is present, missing, or flagged for review.
Markdown · Documentation and checks
Biography stats report
Use this to inspect coverage, missing values, geography, occupation classes, and quality flags.
Reproducibility Notes
Versioning
Pin the repository state when citing the data.
The download links point to the current public files in the repository, which may change over time. For formal reuse, cite the repository together with the commit hash and download date.
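One way to keep those three pieces together is a small helper that formats them into a single citation line. This is only a sketch: the repository URL, commit hash, and date below are placeholders, not the project's real details, and the citation layout is an assumption rather than a required style.

```python
from datetime import date


def format_citation(repo_url, commit_hash, downloaded_on):
    """Build a one-line citation that pins the repository state."""
    return (
        f"{repo_url} (commit {commit_hash}, "
        f"downloaded {downloaded_on.isoformat()})"
    )


# All three values below are placeholders, not the real repository details.
citation = format_citation(
    "https://example.org/owner/repo",
    "0123456789abcdef0123456789abcdef01234567",
    date(2025, 1, 15),
)
print(citation)
```

In a local clone, git rev-parse HEAD prints the full commit hash of the checked-out state.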
Missing values
The value "unknown" is part of the data model.
Historical entries rarely contain every detail. An unknown value means that either the source did not state the detail or the workflow could not extract a confident value for that field.
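Before analysis, it can help to count how often each field is unknown. The sketch below assumes the marker is stored as the literal string "unknown" and uses hypothetical field names; verify both against the downloaded data.

```python
from collections import Counter

UNKNOWN = "unknown"  # assumed literal marker; verify against the data


def count_unknown(records):
    """Count, per field, how many records hold the unknown marker."""
    counts = Counter()
    for record in records:
        for field, value in record.items():
            if value == UNKNOWN:
                counts[field] += 1
    return counts


# Hypothetical records; the real field names may differ.
records = [
    {"name": "Person A", "birth_place": "unknown", "hobbies": "unknown"},
    {"name": "Person B", "birth_place": "Berlin", "hobbies": "unknown"},
]
unknown_counts = count_unknown(records)
print(dict(unknown_counts))
```

Counting unknowns rather than dropping those rows keeps the distinction between a detail that is missing and a person who is absent from the data.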