Big Data in Colonial History: Building an Annual Panel of Households across a Century

Johan Fourie, Stellenbosch University
Erik Green, Lund University
Auke Rijpma, Utrecht University
Dieter von Fintel, Stellenbosch University

We know fairly little about the dynamics of living standards in pre-industrial societies. How socially mobile were settler households in a frontier economy? What were the sources of wealth creation and persistence? How severe was inequality and to what extent did it fluctuate over time? This paper reports estimates of settler income, mobility and inequality from a large new data source – the Cape of Good Hope Panel – that currently spans more than 80 years and will, ultimately, cover close to 150 years. Settled in the seventeenth century, the Cape Colony, administered by the Dutch East India Company and, after 1806, by the British government, was home not only to European settlers but also to slaves from the East Indies and indigenous Khoesan, all of whom are recorded in the annual censuses. The paper highlights the advantages of Big Data and sophisticated matching techniques but also warns against the dangers of data transcription, cleaning and analysis without a rigorous understanding of the historical context.

