Concept: Duplicate Records - Hospital Discharge Abstracts
Concept Description
Last Updated: 2007-05-23
Introduction
In this concept, "duplicate records" refers to records that are duplicated on most of the commonly used variables in the Hospital Abstracts Data.
A record can still be a duplicate despite slight differences in variable values, especially if the hospital record number is coded incorrectly. For example, two records with different hospital record numbers can be (almost) identical in all other variables. In this case, it is up to the programmer and principal investigator to decide whether the records are true duplicates.
Duplicate records can occur both WITHIN fiscal years of data and ACROSS fiscal years of data:
- Across fiscal years - a small number of claims are actually dated in the current and previous fiscal year; these represent corrections to claims from earlier files. Two possible solutions:
  - Keep the record with the most recent claim year. However, this will not happen if each individual file is restricted to April 1 - March 31 of its fiscal year.
  - Eliminate duplicates before any other exclusions.
- Within fiscal years - about half of the duplicates are legitimate independent claims, with the other half representing true duplicates that Manitoba Health did not pick up.
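For the across-fiscal-years case, keeping the record with the most recent claim year could be sketched roughly as follows. This is only an illustration in Python: the field names (`patient_id`, `hospital`, `admit_date`, `sep_date`, `claim_year`) and the choice of identifying key are assumptions, not the actual layout of the hospital abstracts, and the sample records are made up.

```python
# Hypothetical field names; real hospital abstract layouts will differ.
KEY_FIELDS = ("patient_id", "hospital", "admit_date", "sep_date")


def keep_most_recent(records, key_fields=KEY_FIELDS, year_field="claim_year"):
    """Keep one record per key: the one with the latest claim year."""
    latest = {}
    for rec in records:
        key = tuple(rec[f] for f in key_fields)
        # Replace the stored record only if this one has a later claim year.
        if key not in latest or rec[year_field] > latest[key][year_field]:
            latest[key] = rec
    return list(latest.values())


# Made-up records pooled from two fiscal-year files, before other exclusions.
pooled = [
    {"patient_id": 1, "hospital": "H1", "admit_date": "2005-03-28",
     "sep_date": "2005-04-02", "claim_year": 2004, "diagnosis": "X1"},
    # Correction to the claim above, appearing in the later file.
    {"patient_id": 1, "hospital": "H1", "admit_date": "2005-03-28",
     "sep_date": "2005-04-02", "claim_year": 2005, "diagnosis": "X2"},
    {"patient_id": 2, "hospital": "H2", "admit_date": "2005-05-10",
     "sep_date": "2005-05-12", "claim_year": 2005, "diagnosis": "Y1"},
]

deduped = keep_most_recent(pooled)
print(len(deduped))
```

When two records share the same claim year, this sketch keeps whichever appears first, so ties would still need a study-specific rule.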
The total number of duplicate records involving Manitoba hospitals, however, is very small (20-40 duplicates) within a fiscal year; out-of-province hospitals often generate duplicate hospital claims. The most appropriate way of handling duplicates will ultimately depend on the study requirements. A good first step is to see how many there are in your data and how they differ.
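One possible way to take that first step is to group records on a set of identifying variables, count the extra records per fiscal year, and list which fields differ inside each group. The sketch below uses Python and assumes hypothetical field names (`fiscal_year`, `patient_id`, `hospital`, `admit_date`, `record_no`, `diagnosis`); the matching key is an assumption to be adapted to the study, and the sample records are made up.

```python
from collections import Counter, defaultdict

# Hypothetical field names; real hospital abstract layouts will differ.
KEY_FIELDS = ("fiscal_year", "patient_id", "hospital", "admit_date")


def summarize_duplicates(records, key_fields=KEY_FIELDS):
    """Count extra records per fiscal year and list the fields that differ."""
    groups = defaultdict(list)
    for rec in records:
        groups[tuple(rec[f] for f in key_fields)].append(rec)

    extra_per_year = Counter()
    differences = []
    for group in groups.values():
        if len(group) < 2:
            continue  # no candidate duplicate for this key
        # Every record beyond the first counts as one extra record.
        extra_per_year[group[0]["fiscal_year"]] += len(group) - 1
        fields = sorted(set().union(*(rec.keys() for rec in group)))
        # A field "differs" if the group holds more than one distinct value.
        differences.append(
            [f for f in fields if len({rec.get(f) for rec in group}) > 1]
        )
    return extra_per_year, differences


# Made-up records: the first two match on the key but not on every field.
records = [
    {"record_no": "A100", "fiscal_year": 2005, "patient_id": 1,
     "hospital": "H1", "admit_date": "2005-06-01", "diagnosis": "X1"},
    {"record_no": "A101", "fiscal_year": 2005, "patient_id": 1,
     "hospital": "H1", "admit_date": "2005-06-01", "diagnosis": "X2"},
    {"record_no": "B200", "fiscal_year": 2005, "patient_id": 2,
     "hospital": "H2", "admit_date": "2005-07-15", "diagnosis": "Y1"},
]

extra_per_year, differences = summarize_duplicates(records)
print(dict(extra_per_year))
print(differences)
```

The list of differing fields is meant to support the manual review described above: a group that differs only in the record number is a stronger candidate for a true duplicate than one that also differs in clinical fields.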