Alexander Roehrkasse, Cornell University
Rapid advances in the availability of historical census data are greatly improving historical social research, but much remains unknown about the quality of these data. Full-count census microdata are compared to a new dataset of county-level vital records of marriages. National census counts of marriage events in 1900 are shown to be 36% lower than counts based on vital records. Multi-year comparisons show discrepancies that are smaller but still greatly in excess of known rates of overall census undercounting. Analysis of exogenous indicators of data quality suggests that the quality of both census and vital records varied widely across counties, and largely corroborates prior research on the political and demographic correlates of measurement error in official statistics. Implications for historical analysis of marriage and divorce and for the sociology of official knowledge are discussed.
Presented in Session 22. Overcoming Limitations in Big Data