Computer-Assisted Transcription by Computer Vision through Citizen-Centered Projects: Crowdsourcing Platforms and Gamesourcing Experiences. The Barcelona Case.

Joana Maria Pujades Mora, Open University of Catalonia & Center for Demographic Studies, Universitat Autònoma de Barcelona
Alícia Fornés, Computer Vision Center
Josep Lladós, Computer Vision Center
Miquel Valls Fígols, Universitat Autònoma de Barcelona
Gabriel Brea-Martinez, Centre for Economic Demography-Economic History Lund University

Citizen-centered projects have become common in recent decades as a way of gathering data for scientific studies, and they have proven to be a productive way of advancing knowledge. The expansion of information technologies, the popularization of handheld devices and the integration of the internet into everyday life have made it possible to run projects of this kind entirely online. The aim of this paper is to present the information technology methodologies, enabled by computer vision, that are used to engage citizens in the research projects CROWDS, TOOLS and NETWORKS, which build demographic databases from parish and civil registers and census material. These projects have been developed by a team composed of researchers from the Geography Department of the Universitat Autònoma de Barcelona, its Center for Demographic Studies and its Computer Vision Center. Information technologies have been adopted to extract (transcribe) information manually on a crowdsourcing platform (a web application) designed ad hoc, and computer vision techniques have been applied to assist that manual transcription. At the same time, the extracted information serves as ground truth for document analysis: it is used to develop and train semantic recognition algorithms that perform automatic transcription, which is in turn validated through ad hoc gamesourcing experiences. In addition, the citizens involved in the projects will be monitored in terms of their sociodemographic profile and their engagement over time, using a multivariable analysis.


Presented in Session 58: Transcription and Data Capture