The development of IPUMS-International involves harmonizing data produced over several decades by different national statistical offices. The original samples vary in quality and differ in their data formats and variable coding schemes. The authors describe the methods developed to deal with the challenges posed by this diversity and unevenness. The first stage of harmonization involves standardizing the data formats and correcting errors: diagnostic routines analyze each data set, and custom computer programs convert the different data structures into a single standard format. The second stage centers on harmonizing the codes for all variables shared across data sets, and includes compiling and integrating all the relevant documentation. © 2003 Taylor & Francis Group, LLC.
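As a rough illustration of the second stage, recoding a shared variable into one scheme might look like the minimal Python sketch below. It is not the authors' implementation: the sample names, the `marst` variable, the code values, and the `UNKNOWN` fallback are all assumptions invented for the example.

```python
# Hypothetical per-sample translation tables: source code -> harmonized code.
# Sample names, the variable, and all code values are invented for illustration.
MARITAL_STATUS_MAP = {
    "sample_a": {"1": 1, "2": 2, "3": 3},  # numeric source codes
    "sample_b": {"S": 1, "M": 2, "W": 3},  # alphabetic source codes
}
UNKNOWN = 9  # assumed harmonized code for source values with no mapping


def harmonize(sample, raw_value, table=MARITAL_STATUS_MAP):
    """Translate one source code into the shared coding scheme."""
    return table[sample].get(str(raw_value).strip(), UNKNOWN)


def harmonize_records(sample, records, variable="marst"):
    """Return copies of the records with the given variable recoded."""
    return [{**rec, variable: harmonize(sample, rec[variable])} for rec in records]
```

Keeping one translation table per sample leaves each source's original codes documented next to the harmonized values, which is one plausible way to tie the recoding to the compiled documentation.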
Publication status: Published, 1 Jan 2003