Concept: Socioeconomic Factor Index (SEFI) - Based on the 2001 Census Data
Last Updated: 2009-01-05
1. Census Variables
The SEFI components come from the DA/EA level of Census records provided to MCHP through the Statistics Canada Data Liberation Initiative (DLI). Information for each of the following variables was collected at the DA/EA level and used to calculate the SEFI. Some of the variables are available directly from the census and others are calculated from two or more variables on the census.
- Age dependency ratio - expresses the ratio of the dependent population in a region (aged 0-14 plus aged 65 or older) to the population aged 15-64. The age dependency ratio was tested in two ways: 1) Pop 65+ / Pop 15-64 and 2) (Pop 0-14 + Pop 65+) / Pop 15-64. The second method was selected for use, following Statistics Canada's definition. See the SEFI Age Variables and Age Dependency Ratio Development documentation and resulting age dependency ratio values.
- Percent single parent families - is calculated by dividing the number of lone-parent families by the total number of census families. In the past (1996 census and earlier), the percent of single parent households among households with children aged 0-14 was used. This information was not available from the 2001 census. Hence this variable was replaced by the percent of families which are single parent. Census variables are:
- famstat_totcenfam - total number of census families in private households (20% Sample Data)
- famstat_totlpar - total lone-parent families (20% Sample Data)
- Percent female single parent families (see percent single parent families above) - calculated by dividing the number of female lone-parent families by the number of census families. Census variables are:
- famstat_totcenfam - total number of census families in private households (20% Sample Data)
- famstat_totflpar - female parent lone-parent families
- Labour force participation rate female - rate of women working or seeking work on census day, reported directly on the census files. The numerator is the number of females aged 15 or more in the labour force. The denominator for this rate is based on the population of all women aged 15 or more. Census variables are:
- lf_fempoptot15 - total Population Females 15 years and over - Labour force activity
- lf_femprate15 - Participation rate Females 15 years and over*
- lf_fempop15 - Pop of Females 15 years and over In the labour force
Labour force definitions:
- Labour Participation rate = Pop in Labour Force / Population Aged 15 and over * 100
- Employment rate = Employed Labour Force / Population Aged 15 and over * 100
- Unemployment rate = Unemployed Labour Force / Pop in Labour Force * 100
- Labour Force = Employed + Unemployed
- Total Pop = Pop in Labour Force + Pop Not in the labour force
- Unemployment rate 15-24, 25-34, 35-44, 45-54 - reported directly on the census. The unemployed include persons who, during the week prior to the census, were without work, had looked for work in the previous four weeks and were available for work. The denominator for each age group was the count of the total labour force in that age group. See SEFI Unemployment Variables for more information.
- Proportion with a High School Graduation Certificate by Age Group 25-34, 35-44, 45-54 - the numerator is the age-group-specific population minus the number of residents reporting less than a high school graduation certificate on census day. The age-specific rates were computed by dividing this value by the total population in the age group. See SEFI High School Graduation Variables for more information.
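As a rough illustration of how the derived variables above are built from census counts, the following Python sketch computes several of them for a single DA. The family and labour force field names are the census variables listed above. The age-group, unemployment and education field names, and all of the counts, are hypothetical and chosen only for illustration.

```python
def ratio(numerator, denominator, scale=1.0):
    """Return scale * numerator / denominator, or None when the denominator is 0."""
    if not denominator:
        return None
    return scale * numerator / denominator


def derive_sefi_inputs(da):
    """Derive a few SEFI input variables from one DA's census counts."""
    return {
        # Selected method: (Pop 0-14 + Pop 65+) / Pop 15-64
        "age_dependency_ratio": ratio(
            da["pop_0_14"] + da["pop_65_plus"], da["pop_15_64"]),
        "pct_single_parent": ratio(
            da["famstat_totlpar"], da["famstat_totcenfam"], 100),
        "pct_female_single_parent": ratio(
            da["famstat_totflpar"], da["famstat_totcenfam"], 100),
        "female_participation_rate": ratio(
            da["lf_fempop15"], da["lf_fempoptot15"], 100),
        # Unemployed / labour force in the same age group
        "unemployment_rate_15_24": ratio(
            da["unemployed_15_24"], da["labour_force_15_24"], 100),
        # (Population - less than high school) / population in the age group
        "prop_hs_grad_25_34": ratio(
            da["pop_25_34"] - da["less_than_hs_25_34"], da["pop_25_34"]),
    }


# Hypothetical counts for one DA (not real census values).
example_da = {
    "pop_0_14": 150, "pop_15_64": 500, "pop_65_plus": 100,
    "famstat_totcenfam": 200, "famstat_totlpar": 30, "famstat_totflpar": 24,
    "lf_fempoptot15": 400, "lf_fempop15": 240,
    "unemployed_15_24": 10, "labour_force_15_24": 80,
    "pop_25_34": 120, "less_than_hs_25_34": 30,
}
derived = derive_sefi_inputs(example_da)
```

Returning None for a zero denominator keeps undefined rates distinct from true zeros, which matters when missing values are later imputed.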
2. SEFI Index Development
SEFI values were created and assigned in the following way:
- Factor analysis was carried out at the DA level first (without population weighting). The education variables and the unemployment variables were each reduced to a single factor (one education factor and one unemployment factor) using an un-weighted principal component analysis (SAS PRINCOMP procedure). These two factors were used in place of the seven education/unemployment variables in the calculation of the SEFI.
- The SEFI score was defined for each DA/EA as the first principal component from an un-weighted principal components analysis of the group of socioeconomic characteristics previously listed. SEFI values were then assigned to postal codes using the Statistics Canada PCCF (postal code conversion file), which links each postal code to the "best" DA.
- Population-weighted average SEFI values were calculated for each area. For each postal code, a SEFI value was generated for the First Nations reserves in the postal code and for the non-First Nations communities in the postal code. The First Nations DAs are identified by the census by CSD type. The SEFI values were then linked to the 2001 Manitoba population file by postal code and municipality code type (0 = non First Nations municipality, 1 = First Nation Municipality). The geographic area level SEFI values are calculated by taking the mean of the values linked to the population. This method more accurately reflects the SEFI values of districts with First Nations reserve areas within them. Unlike the previous SEFI methodology, these values are NOT standardized to have a weighted mean of 0 and an un-weighted standard deviation of 1. This avoids the problem of changing geographic area SEFI values whenever there is a slight change in the number of geographic areas.
- The mean and standard deviation of the logarithm of the SEFI averages were calculated.
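The principal component step described above can be sketched in plain Python. This is only an illustration, not the MCHP program: it assumes the variables are standardized first (PROC PRINCOMP analyses the correlation matrix by default) and uses power iteration to find the first component. The function names and the toy unemployment rates are hypothetical.

```python
import math


def standardize(rows):
    """Convert each column to z-scores (sample standard deviation, n - 1)."""
    n, k = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(k)]
    sds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
           for j in range(k)]
    # Note: a constant column (sd == 0) is not handled in this sketch.
    return [[(r[j] - means[j]) / sds[j] for j in range(k)] for r in rows]


def first_component(rows, iterations=200):
    """Return (loadings, scores) for the first un-weighted principal component."""
    z = standardize(rows)
    n, k = len(z), len(z[0])
    corr = [[sum(z[i][a] * z[i][b] for i in range(n)) / (n - 1)
             for b in range(k)] for a in range(k)]
    # Power iteration. The start vector is arbitrary; in rare cases a start
    # vector orthogonal to the leading eigenvector would need to be changed.
    v = [float(j + 1) for j in range(k)]
    for _ in range(iterations):
        w = [sum(corr[a][b] * v[b] for b in range(k)) for a in range(k)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    scores = [sum(z[i][j] * v[j] for j in range(k)) for i in range(n)]
    return v, scores


# Hypothetical unemployment rates (%) for four DAs in two age groups.
toy_unemployment = [[12.0, 8.0], [6.0, 5.0], [20.0, 11.0], [9.0, 4.0]]
loadings, unemployment_factor = first_component(toy_unemployment)
```

The sign of a principal component is arbitrary, so in practice the scores would be oriented so that higher values mean less favourable conditions. The same function could be applied a second time to the combined set of variables and factors to obtain a SEFI-like score.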
Calculating SEFI Values for Larger Geographic Areas
Three methods for calculating the SEFI at the Neighbourhood Cluster (NC) level and then grouping the NCs into 4 groups by SEFI were investigated and compared.
- Method 1 aggregates the DA level SEFI values directly to the NC level as a mean weighted by the 2001 census population of Manitoba. No Manitoba Health Insurance Registry (Registry) population values are involved.
- Method 2 assigns the DA level SEFI values to the 2001 Registry population by postal code. The NC was defined from the Manitoba population using municipal code and postal code. The mean value of SEFI is calculated for each NC, which is essentially weighted by the Registry population.
- Method 3 assigns the DA level SEFI values to the 2001 Registry population by postal code. The NC was defined from the Manitoba population using postal code alone. The mean value of SEFI is calculated for each NC weighted by the Registry population. However, method 3 is problematic due to grouping by postal code only, and was dropped from consideration as a viable method.
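The difference between Method 1 and Method 2 can be sketched as follows. This is a simplified illustration under stated assumptions: the record layouts, postal codes, municipal codes and NC labels are all hypothetical, and the separate handling of First Nations communities is omitted.

```python
from collections import defaultdict


def method1_census_weighted(da_records):
    """Method 1: mean of DA SEFI per NC, weighted by census population.

    da_records: iterable of (nc, sefi, census_population) tuples.
    """
    num, den = defaultdict(float), defaultdict(float)
    for nc, sefi, pop in da_records:
        num[nc] += sefi * pop
        den[nc] += pop
    return {nc: num[nc] / den[nc] for nc in num if den[nc] > 0}


def method2_registry_mean(registry, postal_to_sefi, nc_lookup):
    """Method 2: attach DA SEFI to each Registry person by postal code,
    then average per NC (NC defined by municipal code and postal code).

    registry: iterable of (postal_code, municipal_code), one per person.
    """
    total, count = defaultdict(float), defaultdict(int)
    for postal, muni in registry:
        sefi = postal_to_sefi.get(postal)
        nc = nc_lookup.get((muni, postal))
        if sefi is None or nc is None:
            continue  # unlinked records are skipped in this sketch
        total[nc] += sefi
        count[nc] += 1
    return {nc: total[nc] / count[nc] for nc in total}


# Hypothetical data: three DAs, each reached through one postal code.
da_records = [("NC_A", -0.5, 100), ("NC_A", 0.5, 300), ("NC_B", 1.0, 200)]
postal_to_sefi = {"R0A 1A1": -0.5, "R0B 1B1": 0.5, "R0C 1C1": 1.0}
nc_lookup = {("001", "R0A 1A1"): "NC_A", ("001", "R0B 1B1"): "NC_A",
             ("002", "R0C 1C1"): "NC_B"}
registry = ([("R0A 1A1", "001")] * 2 + [("R0B 1B1", "001")] * 2
            + [("R0C 1C1", "002")] * 2)

by_census = method1_census_weighted(da_records)
by_registry = method2_registry_mean(registry, postal_to_sefi, nc_lookup)
```

In this toy data the Registry distributes NC_A residents differently from the census counts, so the two methods give different NC_A values (0.25 and 0.0).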
Comparing the resulting SEFI levels from each of the methods shows some movement within the groupings between methods. Arguments can be made for choosing either Method 1 or Method 2, since both were deemed methodologically sound, but Method 2 was considered the most appropriate method to use for the Inequalities in Child Health study (Brownell et al. (2004)) because it uses the Registry population. This method reflects a small refinement of earlier methods used to calculate SEFI values for larger geographic areas.
Imputation Strategy
To protect the confidentiality of individual responses on the Census, Statistics Canada has adopted a technique known as area suppression, or the deletion of all characteristic data for geographic areas below a specified size. Income distributions and related statistics are suppressed if the total non-institutional population in the area from either the 100% or 20% databases is less than 250. Other characteristics are suppressed if the total non-institutional population in the area from either the 100% or 20% databases is less than 40. Imputation methods were used to provide a value for the SEFI components that were missing at the DA/EA level, according to the strategy below:
- Non-First Nations Communities - socioeconomic characteristics were imputed for missing values at the DA/EA level using records at the census subdivision level if they were not defined as a First Nations community.
- First Nations Communities - socioeconomic characteristics were imputed for missing values at the DA/EA level for First Nations communities from the weighted north or south First Nation community average value according to whether the community was defined as northern or southern (by latitude).
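The suppression rule and the two imputation paths above might be sketched as follows. The record layout is hypothetical, the north/south label is assumed to be already assigned to each community, and weighting the First Nations averages by population is an assumption, since the text does not state the weight used.

```python
def is_suppressed(non_institutional_pop, income_related):
    """Area suppression: threshold is 250 for income statistics, 40 otherwise."""
    threshold = 250 if income_related else 40
    return non_institutional_pop < threshold


def weighted_mean(pairs):
    """Mean of (value, weight) pairs, or None if the total weight is 0."""
    total_weight = sum(w for _, w in pairs)
    if not total_weight:
        return None
    return sum(v * w for v, w in pairs) / total_weight


def impute_missing(das, csd_values):
    """Fill missing DA values (None) using the strategy described above."""
    # Weighted north and south averages from First Nations DAs with data.
    fn_average = {}
    for region in ("north", "south"):
        fn_average[region] = weighted_mean(
            [(d["value"], d["pop"]) for d in das
             if d["first_nations"] and d["region"] == region
             and d["value"] is not None])
    filled = []
    for d in das:
        value = d["value"]
        if value is None:
            if d["first_nations"]:
                value = fn_average[d["region"]]
            else:
                value = csd_values.get(d["csd"])  # census subdivision value
        filled.append({**d, "value": value})
    return filled


# Hypothetical DA records for one variable (e.g. an unemployment rate).
das = [
    {"da": "A", "csd": "C1", "first_nations": False, "region": "south",
     "pop": 500, "value": None},
    {"da": "B", "csd": "C2", "first_nations": True, "region": "north",
     "pop": 100, "value": 20.0},
    {"da": "C", "csd": "C3", "first_nations": True, "region": "north",
     "pop": 300, "value": 40.0},
    {"da": "D", "csd": "C4", "first_nations": True, "region": "north",
     "pop": 30, "value": None},
]
filled = impute_missing(das, {"C1": 12.5})
```

Here DA "A" takes its census subdivision value, while DA "D" takes the weighted northern First Nations average of DAs "B" and "C".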
3. 2001 SEFI Values and 1996/2001 SEFI Comparisons by Geographic Regions
The following provides the SEFI values calculated for a variety of geographic regions using the 2001 Census data:
- SEFI value listings for RHA, RHA district, and Winnipeg NC. See SEFI Values by Geographic Area for more information.
NOTE: SEFI scores less than zero indicate more favourable socioeconomic conditions, while SEFI scores greater than zero indicate less favourable socioeconomic conditions.
The following provides an illustrated comparison of the 1996 SEFI values using the "old" methodology and the "newer" 2001 methodology at the RHA, RHA district, Winnipeg NC and Winnipeg Community Area (CA) levels:
- Comparison of SEFI Values Using Different Methods document.
The R² values at each of the geographic levels are quite high, which indicates a strong correlation between the values generated by the two methods. The R² value at the RHA district level is low relative to the other levels because a few of the districts had been assigned the same SEFI value using the old method. These districts could not be assigned unique SEFI values because they could not be fully defined by municipality code. The new 2001 method allows unique SEFI values to be calculated for all of the RHA districts.
A comparison of the 1996 SEFI (calculated using the 2001 methodology) with the 2001 SEFI was done. There were 2318 DAs in 2001 and 1802 EAs in 1996. Of these, 1785 were comparable for calculating a correlation between the SEFI values for these small areas in 1996 and in 2001. The correlations are as follows:
- Pearson r = .693
- Spearman r = .637
These are quite good correlations for a variable such as this calculated at such a low level of geography. Small changes in income, education or single parent proportions could change values considerably, especially since it is not the same people answering the long form in every DA in both censuses.
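For readers unfamiliar with the two coefficients reported here, the sketch below shows how a Pearson and a Spearman correlation could be computed for areas present in both census years. It is a plain-Python illustration only. The area identifiers and SEFI values are hypothetical, and matching areas on a shared identifier is an assumption about how the comparable areas were paired.

```python
import math


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical SEFI values keyed by a shared small-area identifier.
sefi_1996 = {"a1": -0.8, "a2": 0.1, "a3": 1.4, "a4": 0.3, "a9": 2.0}
sefi_2001 = {"a1": -0.5, "a2": 0.4, "a3": 1.1, "a4": -0.2, "a7": 0.9}

# Keep only areas that can be compared across the two census years.
common = sorted(set(sefi_1996) & set(sefi_2001))
x = [sefi_1996[k] for k in common]
y = [sefi_2001[k] for k in common]
r_pearson, r_spearman = pearson(x, y), spearman(x, y)
```

Because the Spearman coefficient works on ranks, it is less sensitive to a few extreme area values than the Pearson coefficient.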
Additional information
- The sefi_region3.txt conversion file contains the new 2001 ordering of SEFI based on DA level for RHAs, RHA Districts, Winnipeg Community Areas and Winnipeg Neighbourhood Clusters.
- SAS® formats available in the MCHP SAS® Format Library include:
- to create SEFI from Winnipeg NC (Neighbourhood Clusters) values: sefigroup=put(wpg_nc,$sefincg.)
- to create SEFI from RHA district values: sefigroup=put(district,$sefincg.)
Brownell et al. (2004)
In the Manitoba Child Health Atlas deliverable, Brownell et al. (2004) used the SEFI as a measure of socioeconomic status (SES). They calculated SEFI scores for the 1146 dissemination areas (DAs) within Winnipeg and for the 1172 DAs outside of Winnipeg, using publicly available data from the 2001 Census. SEFI scores for each of the 25 Winnipeg Neighbourhood Clusters were calculated using a weighted average of the scores for each DA in that neighbourhood. Likewise, a SEFI score was calculated for each RHA District using a weighted average of the scores for each DA in that district. For ease of presentation, for both Winnipeg and non-Winnipeg areas, they divided the neighbourhoods or districts into four groups based on how different they were from the average score for all neighbourhoods or districts. Thus both Winnipeg and non-Winnipeg areas have four SEFI Groups: Low SES (or most disadvantaged), Low-Mid SES, Middle SES, and High SES.
As shown in the Winnipeg map, the more disadvantaged areas (Low SES areas shown in red on the map) tend to be found in the central part of Winnipeg, with the most advantaged areas (High SES areas in dark green on the map) on the outskirts of the city. The Non-Winnipeg map (RHA Districts) shows where each of these four groups is located in non-Winnipeg areas of the province. It is clear that the more disadvantaged (red) areas tend to be in the northern parts of the province, with the more advantaged areas in the south central parts of Manitoba. It should be noted that the total number of people and the total number of children residing in these SES groups are not equal across groups: the Middle SES category in Winnipeg has almost half of Winnipeg’s total population, and the Middle SES category has just over half of the non-Winnipeg population.
Socioeconomic characteristics for each neighbourhood area of Winnipeg and for each non-Winnipeg RHA district can be found in the following tables and graphs:
- SEFI Variable Values by Region - lists SEFI variable values and identifies the SES of RHA Districts and Winnipeg NCs using SES quartiles.
- Average Number of Children per Family for Winnipeg and Non-Winnipeg by SES (SEFI) Group
For more information on this research, please see the Manitoba Child Health Atlas 2004 Web Site.
Finlayson et al. (2007)
In the Allocating Funds for Healthcare in Manitoba Regional Health Authorities: A First Step--Population-Based Funding deliverable, Finlayson et al. (2007) investigated the SEFI, as a measure of socioeconomic status (SES), as one of the top 5 factors expected to affect the need for and use of health services. The SEFI was used in all statistical models as an independent variable (covariate).
Appendix B: Detailed Results in the deliverable provides the SEFI contribution to the calculations for all predictive models used in this research. Additional information on the parameter estimates used in this research is available in Table A.1 of this same appendix.
One of the important findings of this work was that community-level socioeconomic status is a better predictor of health services utilization than any of the other community characteristics that were considered: Aboriginal population, older population, population density, infant mortality rate, etc. This is valuable information because it indicates that although Aboriginal status and infant mortality rates (for example) may be important in determining health services use, socioeconomic status is able to explain more variability in utilization than the other factors.