Extracting data from the IDS
Comparable sets of microdata are the necessary first step towards a ‘European life course history’. One of the tasks of the network is to create and manage software that extracts datasets from the Intermediate Data Structure (IDS). The IDS is the standard data format in which the databases will transfer their data. The left side of the diagram shows the various types of sources included in historical longitudinal databases. The right side shows the data files that researchers require for analyses. The extraction software produces these files in the same way for every database, using the data in that database's IDS.
The extraction programs will be re-usable and transparent. Anyone can contribute an extraction module, and every module will operate on any dataset that contains the data it requires. Extraction programs will also be open to scrutiny by the research community: methodologies can be examined, discussed, and tested, and research results will be reproducible. Since the requirements of every type of analysis differ (fertility, mortality, social mobility, etc.), there will be many specialized extraction programs. However, all extraction programs will start with the IDS, and they will work on any dataset that includes the necessary attribute types. Extraction programs will be modular, and some types of analysis will require “workflows” that link together several extraction services. Because every database delivers its data in the IDS, the information is standardized across databases, and researchers will not need to learn a new set of formats and relational structures for each one. Consequently, data extraction programs can be re-used and adapted to other purposes, and the steps involved in preparing data for analysis will be more open and transparent.
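The idea of modular extraction steps linked into a workflow can be illustrated with a minimal Python sketch. It is not the network's actual software: the row layout, the field names (`person_id`, `type`, `value`), the attribute type names, and the functions `pivot`, `add_age_at_death` and `run_workflow` are all hypothetical and chosen only for illustration. The sketch assumes that the data arrive as one row per person and attribute type, checks that a dataset includes the attribute types a workflow needs, and then applies each step in turn.

```python
from collections import defaultdict
from datetime import date

# Hypothetical IDS-like rows: one row per (person, attribute type, value).
# Field names and type names are illustrative assumptions, not the official schema.
ROWS = [
    {"person_id": 1, "type": "BIRTH_DATE", "value": "1850-03-01"},
    {"person_id": 1, "type": "SEX", "value": "Female"},
    {"person_id": 1, "type": "DEATH_DATE", "value": "1911-07-15"},
    {"person_id": 2, "type": "BIRTH_DATE", "value": "1852-11-20"},
    {"person_id": 2, "type": "SEX", "value": "Male"},
]


def available_types(rows):
    """Return the set of attribute types present in a dataset."""
    return {row["type"] for row in rows}


def pivot(rows, wanted_types):
    """Step 1: turn long rows into one record per person."""
    people = defaultdict(dict)
    for row in rows:
        if row["type"] in wanted_types:
            people[row["person_id"]][row["type"]] = row["value"]
    return dict(people)


def add_age_at_death(people):
    """Step 2: add age at death in completed years, or None if a date is missing."""
    result = {}
    for person_id, record in people.items():
        record = dict(record)  # copy so the input is left unchanged
        birth, death = record.get("BIRTH_DATE"), record.get("DEATH_DATE")
        if birth and death:
            b, d = date.fromisoformat(birth), date.fromisoformat(death)
            # Subtract one year if the birthday had not yet occurred in the year of death.
            record["AGE_AT_DEATH"] = d.year - b.year - ((d.month, d.day) < (b.month, b.day))
        else:
            record["AGE_AT_DEATH"] = None
        result[person_id] = record
    return result


def run_workflow(rows, required_types, steps):
    """Run the steps in order, but only if the dataset has the required attribute types."""
    missing = set(required_types) - available_types(rows)
    if missing:
        raise ValueError(f"dataset lacks attribute types: {sorted(missing)}")
    data = rows
    for step in steps:
        data = step(data)
    return data


MORTALITY_TYPES = {"BIRTH_DATE", "DEATH_DATE"}
mortality_file = run_workflow(
    ROWS,
    required_types=MORTALITY_TYPES,
    steps=[lambda rows: pivot(rows, MORTALITY_TYPES), add_age_at_death],
)
```

Because each step only takes data in and hands data on, the same `pivot` step could be reused in a fertility or social mobility workflow with a different set of attribute types, and a dataset that lacks a required type is rejected before any step runs.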
George Alter and Kees Mandemakers finished the fourth version of the IDS, which is published in the first volume of Historical Life Course Studies. It integrates the results of several IDS meetings and workshops, among them Chicago (2010), Boston (2011) and Vancouver (2012). Since 2011 the enterprise has been part of the European Historical Population Samples network, which has resulted in workshops in Umeå (2012) and Lund (2013).
Version 4 is accompanied by a new version of the metadata table, version 4.01, dated 12 June 2013. Changes and additions to this table must be reported to the Clearing Committee (Luciana Quaranta, George Alter and Kees Mandemakers; firstname.lastname@example.org).
For more information about the scientific background of the IDS, see the article 'Defining and Distributing Longitudinal Historical Data in a General Way Through an Intermediate Structure' by George Alter, Kees Mandemakers and Myron Gutmann, published in Historical Social Research.