Concept: Birth Cohort Registry - Methodology

Concept Description

    This concept provides the methodology, including SAS code (internal access only), for the development of the birth cohort registry at the Manitoba Centre for Health Policy (MCHP). Issues associated with creating birth cohorts are also discussed.

Birth Cohort Registry

    MCHP developed a multigenerational birth cohort registry that has been used in multiple studies to analyze the health and development of children. It enabled longitudinal analyses of subjects from birth until adulthood by providing the ability to distinguish among family members (e.g., siblings, parents, spouses). For approved studies, the birth cohort was created and linked to other databases to examine possible factors related to health including education, social assistance and prescription drugs (Currie et al., 2010; Jutte et al., 2010; and Oreopoulos et al., 2008).

    Various analyses were conducted with this data, including:
    This birth cohort registry database was created solely for the above research projects after receiving the appropriate approvals from the Government of Manitoba - Health Information Privacy Committee (HIPC) and the University of Manitoba - Research Ethics Board (REB).

    The purpose of this concept is to explain the steps that were taken to build this birth cohort registry in order to enable researchers to easily "recreate" it in the future for their own purposes.


    A birth cohort is defined by birth in a particular year, or a range of birth years. The birth cohort registry included Manitoba residents born in the years 1979 through 1989. It contains information on about 98% of all children born in Manitoba over the sample period and "tracks 99 percent of the original sample conditional on remaining in the province until June of their 18th year" (Currie et al., 2010).

    Highlights of the data include:

    • Child information
      • Birth weight
      • Birth date
      • Sibling status
      • Birth order
      • Apgar Score
      • Gestational Age
      • Family size

    • Mother information
      • Age at first birth
      • Marital status at children's births

    • Father information (only available for 85% of the cases)
    • Linkage to additional files - examples include: Education, Social Assistance and Prescription Drugs

Data Sources

    Three main data sources from the MCHP Data Repository were used to create the birth cohort registry:

    1. Manitoba Health Insurance Registry / MCHP Research Registry
      • The Manitoba Health Insurance Registry contains de-identified individual information: birth date, date-specific demographic characteristics (age and sex), location of residence, family composition (marital status), encrypted PHINs, and REGNOs (Roos et al., 1999). This Registry, coordinated with Vital Statistics files, provides information on dates of arrival and departure (births, deaths and moves) for any date since 1970 (Roos & Nicol, 1999).

    2. Hospital Abstracts Data
      • The Hospital Abstracts data provide hospital birth record information, including birth weight, gestational age and Apgar score.

    3. Canada Census Data
      • The Canada Census provides ecological information at the neighbourhood level, not the individual level. The place of residence from the census is merged with census income from enumeration areas, allowing it to be used to determine the socioeconomic status (SES) of individuals living within a certain geographical area.

Steps in the Development of the Birth Cohort

    The following steps describe the process of developing the Birth Cohort.

1. Link Database Records

    Record linkage among the above databases is necessary to create the birth cohort. Encrypted PHINs were used from 1984 onward to specify individual cases and link the Manitoba Health Insurance Registry and the Hospital Abstracts data. For individuals born prior to 1984, records were linked and individuals were specified using a combination of family registration number (REGNO), date of birth, sex, and initials.
    NOTE: Since 1970, a REGNO is assigned to the male member in a marital/common-law relationship and is used to map out family relationships - the spouse and children (until age 18) have the same REGNO as the father. When an individual turns 18 (since 1992) or 19 (prior to 1992), he/she receives his/her own REGNO. Upon marriage, the female receives the REGNO of her husband.
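
    The key-selection rule described above can be sketched in Python. This is an illustrative sketch only, not the MCHP SAS code: the `Person` fields and the function names are hypothetical, the 1984 cut-off is applied to birth year, and the fallback to the composite key when a PHIN is missing is an assumption.

```python
from datetime import date
from typing import NamedTuple, Optional


class Person(NamedTuple):
    # Hypothetical field names chosen for illustration.
    phin: Optional[str]   # encrypted PHIN, if one exists
    regno: str            # family registration number
    birth_date: date
    sex: str
    initials: str


def linkage_key(p: Person) -> tuple:
    """Pick the key used to match a registry record to a hospital record."""
    # Born 1984 or later: link on the encrypted PHIN.
    if p.birth_date.year >= 1984 and p.phin is not None:
        return ("PHIN", p.phin)
    # Born before 1984 (or, by assumption, PHIN missing): composite key.
    return ("COMPOSITE", p.regno, p.birth_date, p.sex, p.initials)


def link(registry, hospital):
    """Return (registry_record, hospital_record) pairs that share a key."""
    index = {linkage_key(h): h for h in hospital}
    return [(r, index[linkage_key(r)]) for r in registry
            if linkage_key(r) in index]
```

    Tagging each key with its type ("PHIN" or "COMPOSITE") keeps the two linkage strategies from colliding in the same lookup table.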

    The publicly available census does not contain individual-level information, so it is linked by area via postal codes. Statistics Canada provides a Postal Code Conversion File (PCCF), which enables cross-walking between postal codes and census areas from 1986 forward (information from Charles Burchill, August 4, 2011). The census areas have been known as Dissemination Areas since 2001; they were known as Enumeration Areas from 1981 to 1996. Oreopoulos et al. (2008) used the census to estimate area SES with the following method: "The postal code from the family head's address identifies the street or building where the family lives. The address of the family is updated about every six months. To proxy for general socio economic background, family income in the 2001 Census was aggregated and averaged over Enumeration Areas, which were in turn matched to corresponding postal code addresses in [the] sample." (p. 6-7).

    Note: Because socioeconomic status from the census is measured at the area level instead of the individual level, the use of this data may be limited; however, there is a substantial correlation between the census information and individual socioeconomic status (Brownell et al., 2006).
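
    The area-level approach quoted from Oreopoulos et al. (2008) can be sketched as two steps: average income within each census area, then attach that average to each family through its postal code. This is a minimal sketch under stated assumptions: the function names, the dictionary-based crosswalk (standing in for the PCCF) and the input shapes are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def area_mean_income(incomes):
    """Average income per census area.

    `incomes` is an iterable of (area_id, income) pairs (hypothetical shape).
    """
    by_area = defaultdict(list)
    for area, income in incomes:
        by_area[area].append(income)
    return {area: mean(values) for area, values in by_area.items()}


def assign_area_income(family_postal, postal_to_area, area_income):
    """Attach the area mean income to each family via its postal code.

    Families whose postal code has no crosswalk entry get None.
    """
    result = {}
    for family_id, postal in family_postal.items():
        area = postal_to_area.get(postal)
        result[family_id] = area_income.get(area)
    return result
```

    Every family in the same area receives the same value, which is why the measure is ecological rather than individual.
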
    Various factors related to health and development of children can be analyzed using the health insurance registry, hospital abstracts data and the census, including:
    • Birth and infant health characteristics (birth weight, gestational age, 5 minute Apgar score) (Jutte, Roos et al., 2010; Currie et al., 2010; Oreopoulos et al., 2008)
    • Hospitalization/physician visits (Jutte, Roos et al., 2010)
    • Mortality (Jutte, Roos et al., 2010)
    • Socio-economic status (Oreopoulos et al., 2008; Jutte, Brownell et al., 2010)
    • Marital status of parents (Jutte, Brownell et al., 2010)
    • Teenage pregnancy (Brownell et al., 2010; Currie et al., 2010; Jutte, Roos et al., 2010; Roos et al., 2006)

2. Distinguish Among Family Members

3. Select Additional Database Linkages

    Depending on the research purpose, the birth cohort can be linked to other databases using the encrypted PHIN to obtain additional information. For example, the following databases have been used with the birth cohort registry to obtain various outcomes for study:

    1. Education Databases
      • Grade retention (Guevremont et al., 2007)
      • Graduation from grade 12 (Currie et al., 2010; Jutte, Roos et al., 2010; Jutte, Brownell et al., 2010)
      • Withdrawal from school (Guevremont et al., 2007; Roos et al., 2006)
      • Scores on provincial tests in grades 3, 9, and 12 (Currie et al., 2010; Jutte, Roos et al., 2010; Guevremont et al., 2007; Oreopoulos et al., 2008; Roos et al., 2006)
      • School changes (Guevremont et al., 2007)

    2. Child and Family Services Application Data
      • Children in care (Brownell et al., 2010; Jutte, Roos et al., 2010)
      • Families receiving services from Child and Family Services (Brownell et al., 2010; Jutte, Roos et al., 2010)
      • Receiving Income Assistance (IA) (available from 1996 onwards) (Brownell et al., 2010; Jutte, Roos et al., 2010)
        • Jutte, Roos et al. (2010) specified individuals with income assistance as those who received income assistance at any point between the ages of 18 and 25 years.
        • Brownell et al. (2010) used "families receiving income assistance for two or more months when the child was 10-17 years" to specify individuals with IA (p.810).

    3. Prescription Drugs (available from 1995/1996 onwards)
      • Use of select prescription drugs (e.g., antibiotics) (Kozyrskyj et al., 2009)
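
    The two income assistance (IA) definitions cited above differ in age window and threshold, which a short sketch makes concrete. The sketch below is illustrative only: the function names and the representation of IA receipt as a list of month dates are hypothetical, and treating both age bounds as inclusive is an assumption.

```python
from datetime import date


def age_on(birth: date, when: date) -> int:
    """Age in completed years on a given date."""
    had_birthday = (when.month, when.day) >= (birth.month, birth.day)
    return when.year - birth.year - (0 if had_birthday else 1)


def ia_brownell(birth: date, ia_months) -> bool:
    """Two or more IA months while the child was aged 10-17 (bounds assumed inclusive)."""
    months = {m for m in ia_months if 10 <= age_on(birth, m) <= 17}
    return len(months) >= 2


def ia_jutte(birth: date, ia_months) -> bool:
    """Any IA receipt between ages 18 and 25 (bounds assumed inclusive)."""
    return any(18 <= age_on(birth, m) <= 25 for m in ia_months)
```

    The same individual can be flagged under one definition and not the other, so the definition should be chosen to match the study being replicated.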

4. Select Birth Years

    The selection of birth years to include in the cohort will depend upon the purpose of the study. Some data, such as education standardized tests, are not available for certain years; those years would not be selected if the birth cohort was going to be linked to the education databases. Please see Issues below for a description of factors that will affect the selection of birth years.

    The birth cohort registry created at MCHP selected individuals who were born in the years 1979 to 1989, as specified within the Manitoba Health Insurance Registry and hospital data. The Manitoba Health Insurance Registry specifies birth dates of all individuals who have been registered with the Manitoba Health Services Insurance Plan at some point since 1970. If available, birthdates are resolved with hospital abstracts which provide the birthdates of all individuals born in hospital since 1970.

    A single study may need to build several cohorts (e.g., more than one birth cohort year) to obtain a larger study group to control for a large number of variables. Using a large study population enables the researcher to study siblings and twins; Oreopoulos et al. (2008), for example, included births from 1978 to 1985 (excluding 1983 because of incomplete data) and were able to track siblings and twins from birth to adulthood. More information on siblings and twins can be found in the concept dictionary.

    In Roos et al. (2011), the birth cohort included births from 1982-1989, excluding 1983 due to incomplete data. For this time period, Vital Statistics reported 121,487 births in Manitoba and the Health Insurance Registry reported 118,636 births; 97.7 per cent of the Vital Statistics total. This difference seems largely due to a delay in dealing with out-of-province migration. Of children born in Manitoba between 1982 and 1989, 24 per cent either died or left the province before age 18. Mobility out-of-province was largely uncorrelated with several measures of health and socioeconomic status, although Oreopoulos et al. (2008) found some evidence of a "healthy mover" effect.
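
    The coverage figure above follows directly from the two birth counts. The short check below reproduces it; the variable names are hypothetical and only the two counts come from the text.

```python
# Birth counts for 1982-1989 (excluding 1983), as reported in the text.
vital_statistics_births = 121_487
registry_births = 118_636

# Births recorded by Vital Statistics but not found in the registry.
missing = vital_statistics_births - registry_births

# Registry coverage as a share of the Vital Statistics total.
coverage = registry_births / vital_statistics_births

print(missing)             # 2851
print(f"{coverage:.1%}")   # 97.7%
```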

Issues Selecting Birth Years for a Cohort

    The following are some of the data availability issues encountered by MCHP that will affect the selection of birth years for a cohort (personal communication with Leslie Roos, July 12, 2011).

    • Dates
      • Health data is recorded by fiscal years, whereas education data is recorded by calendar years. For ease of analysis, it is suggested that April 1 of the desired start year be used as the start date.

    • Education Data
      • Education data became available in academic year 1995/1996 (birth cohort of 1978); however, some of the education variables are incomplete for the 1978 cohort. The quality and completeness of the education data improve with time, as the more recent years capture more standardized test scores than the earlier ones. Data from the 1979 birth cohort onward should generally be used.
      • Standardized test scores were not available for the academic year 1999/2000, thus in the selection of birth cohorts, 1983 should be excluded. (Information from Charles Burchill, August 5, 2011)

    • ICD Codes
      • ICD coding has changed over time: ICDA-8 was used until April 1, 1979, when ICD-9-CM came into use. For ease of analysis, it is suggested that April 1, 1979 be used as the start date.
      • Use of ICD-10-CA began April 1, 2004. The "ACG/ADG system provides a way to automatically cross-walk across this transition" (email from Les, Aug. 13, 2009).

    • Study Files
      • Ideally, variables of interest would be available for 18 years, allowing us to follow the cohort from birth to adulthood. In reality, not all variables are available for the desired 18 years. For example, Child and Family Services Information System (CFSIS) and Employment and Income Assistance (EIA) (also known as Social Allowances Management Information Network (SAMIN)) variables have only been available since 1992 and 1995, respectively (more information can be found in the database summaries provided in the respective links below). These are important variables associated with child health and development, and thus often need to be included in the analysis.

      • The CFS data is most complete in the years 1995-2005. Thus, when birth cohorts are used, each cohort will differ in the number of years for which records are most complete.

        Two approaches to analyzing such information have been tried:
        1. Variable age range
          • all the years available for each individual birth cohort (up to each child's 18th birthday) are used
        2. Fixed age range
          • coverage at fixed ages (13th to 18th birthday) for each birth cohort is used

        "The latter approach 'loses' some children who receive aid from CFS but makes sure that each child is categorized using the same number of years of information. Since these (and other) approaches can be tried with little more work than that associated with using just one, sensitivity testing is especially appropriate here." (This information is from an email from Leslie Roos, Feb. 26, 2009.)

    • Place of Residence
      • The definition of residential area boundaries in Manitoba has changed over time, making it difficult to evaluate upward or downward social mobility over time. MCHP has a cross-walk between postal codes and the 1986 census which enables better comparison of individuals across space and time. The 1981 census can be used with just a few corrections (please see Documentation of Census and Postal Code Data for more information; internal access only).
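
    The two approaches listed under Study Files (variable versus fixed age range) can be compared with a small sketch. It works at calendar-year granularity, which is a simplification, and the function names and the `data_start` / `data_end` arguments are hypothetical. The bounds are passed in rather than fixed because the text gives both an availability date and a most-complete period.

```python
def variable_window(birth_year, data_start, data_end, max_age=18):
    """Variable age range: every available year up to the 18th birthday."""
    start = max(birth_year, data_start)
    end = min(birth_year + max_age, data_end)
    return (start, end) if start <= end else None


def fixed_window(birth_year, data_start, data_end, from_age=13, to_age=18):
    """Fixed age range: the same ages (13th to 18th birthday) for every cohort.

    Returns None when the data do not cover the whole age range.
    """
    start, end = birth_year + from_age, birth_year + to_age
    if start < data_start or end > data_end:
        return None
    return (start, end)


# Compare the two approaches across cohorts for one choice of bounds.
for year in (1979, 1982, 1985):
    print(year, variable_window(year, 1995, 2005), fixed_window(year, 1995, 2005))
```

    Running both functions over the same cohorts is a cheap way to do the sensitivity testing recommended in the quoted email.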

5. Select Inclusion and Exclusion Criteria of Subjects

    It has often been necessary to exclude certain individuals based on characteristics that will affect the analysis of outcomes of interest.
    • The current birth cohort registry includes residents of Manitoba who were born between January 1, 1979 and December 31, 1989, inclusive.
    • Not included are:
      • Stillborn
      • Individuals lost to follow-up through death or moves out of province
      • Individuals with a Child and Family Services postal code
      • Records where data are missing on birth weight and mother's encrypted PHIN

      The flexibility of the registry's construction allows changes, if needed, for particular studies. The following studies have used the birth cohort with additional exclusions:
      • Currie et al. (2010) excluded children with a mental retardation diagnosis (ICD-9 317-319, ICD-10 F70-79)
      • When analyzing educational achievement and socio-economic status of the individuals in the birth cohort, Roos et al. (2006) included only Winnipeg residents "because socio-economic status in neighborhood of residence may be more meaningfully defined in urban areas" (p.687). However, First Nations youth living in Winnipeg were excluded to allow "generalization to other jurisdictions" (p.687). Individuals in private and parochial schools were included.
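
    The baseline inclusion and exclusion criteria listed above can be expressed as a single filter. This is a hedged sketch, not the registry's actual code: the `BirthRecord` fields are hypothetical, and a record is treated as excluded when either birth weight or the mother's encrypted PHIN is missing, which is one reading of the text.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

COHORT_START = date(1979, 1, 1)
COHORT_END = date(1989, 12, 31)


@dataclass
class BirthRecord:
    # Hypothetical field names chosen for illustration.
    birth_date: date
    stillborn: bool
    lost_to_follow_up: bool        # death or move out of province
    cfs_postal_code: bool          # Child and Family Services postal code
    birth_weight: Optional[int]    # grams; None when missing
    mother_phin: Optional[str]     # encrypted PHIN; None when missing


def in_cohort(r: BirthRecord) -> bool:
    """Apply the baseline inclusion and exclusion criteria."""
    if not (COHORT_START <= r.birth_date <= COHORT_END):
        return False
    if r.stillborn or r.lost_to_follow_up or r.cfs_postal_code:
        return False
    # Assumption: exclude when either value is missing.
    if r.birth_weight is None or r.mother_phin is None:
        return False
    return True
```

    Study-specific exclusions, such as those in Currie et al. (2010) or Roos et al. (2006), would be applied as further filters on the records that pass `in_cohort`.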

