Michael R. Haines, Colgate University
Users of the published United States Federal Censuses will encounter frustrations when looking for specific tabulations for research purposes. For example, the censuses from 1790 to 1840 published data by age, sex, and race at the state, county, and town/city level. Partial tabulations of the town/city data with limited detail were published in 1850, 1860, and 1870. Thereafter, only total populations were published until 1930. Similarly, county level populations by age and sex were published for 1800-1860 for whites and for 1830-1860 for slaves and free blacks. The tabulations were not resumed until 1930. Complete count data are now available for 1850, 1880, and 1900-1940, as well as household level data for 1790-1940 (with some missing data because of lost manuscripts). This paper aims to discuss the issues and pitfalls of trying to “fill in the blanks” by using the complete count data to supplement (and correct) the published tabulations. One example is the published data for the city of St. Louis, MO in 1880, which give the results of a re-enumeration in September 1880 rather than the initial enumeration from June 1880, creating comparability issues. The complete count data obtained from Ancestry.com also have some problems. The ultimate release of the complete count data for 1860 and 1870 will allow a great deal more to be done.
No extended abstract or paper available
Presented in Session 22. Overcoming Limitations in Big Data