By Ross Anderson
[21 June 1996] An agreement has been reached between the British Medical Association and the UK's main providers of healthcare analysis information services - CHKS Ltd (a subsidiary of HCIA Inc), the SEMA group, IMG and Reuters - to set minimum standards for the de-identification of medical records. These records are used in analysing hospital readmission rates, referral patterns and casemix, and in epidemiological research generally.
Such studies require records of hospital care episodes to be linked, but the records should still not allow anyone outside the hospital or other care provider to identify the patient (or else patient consent must be sought). A problem had arisen in that some (though not all) healthcare information companies had been identifying patients, and linking episodes, by their postcode and date of birth. This combination is enough to identify over 99% of UK residents.
It has therefore been agreed that in future, de-identified medical records will contain neither the last two symbols of the postcode nor the day and month of birth. Thus, for example,
CB5 9HF 15/09/1956
will become
CB59 56
This is sufficient information for age related casemix studies, and to identify deprived areas. However, it is very rarely enough to identify individuals; there are on average six individuals with each combination of year of birth and postcode sector.
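The truncation rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `deidentify` is hypothetical, the input is assumed to be a UK postcode plus a date written as DD/MM/YYYY, and the two-digit year in the output simply follows the worked example rather than any format fixed by the agreement.

```python
def deidentify(postcode: str, date_of_birth: str) -> str:
    """Drop the last two postcode symbols and the day and month of birth.

    Assumes date_of_birth is written DD/MM/YYYY (an assumption, not part
    of the agreement) and returns the postcode sector plus a 2-digit year.
    """
    # Remove the internal space so the last two symbols are easy to strip.
    compact = postcode.replace(" ", "").upper()
    sector = compact[:-2]  # "CB59HF" -> "CB59"

    # Keep only the year; day and month are discarded.
    _day, _month, year = date_of_birth.split("/")
    return f"{sector} {year[-2:]}"


print(deidentify("CB5 9HF", "15/09/1956"))  # prints "CB59 56"
```

Every patient in the same postcode sector who was born in the same year maps to the same output, which is what makes the result hard to tie back to one individual.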
So that episodes can still be linked, each record will also carry a pseudonym for the patient, such as a hospital number, a practice number, or a cryptographic hash of the patient's name and date of birth (in which case the hash will be keyed, with a key unique to each provider).
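The hashed-pseudonym option might look something like the following sketch. The agreement does not name a hash construction, so the use of HMAC with SHA-256 here is an assumption, as are the function name `pseudonym`, the normalisation of the name, and the placeholder keys and patient details, which are invented purely for illustration.

```python
import hashlib
import hmac


def pseudonym(name: str, date_of_birth: str, provider_key: bytes) -> str:
    """Return a keyed hash of name and date of birth (HMAC-SHA256, assumed)."""
    # Normalise case and whitespace so trivial differences in how the
    # name was typed do not break linkage between episodes.
    normalised_name = " ".join(name.split()).lower()
    message = f"{normalised_name}|{date_of_birth}".encode("utf-8")
    return hmac.new(provider_key, message, hashlib.sha256).hexdigest()


# Placeholder keys and patient details, invented for this example.
key_a = b"provider-A-secret-key"
key_b = b"provider-B-secret-key"

first_episode = pseudonym("Jane Doe", "15/09/1956", key_a)
second_episode = pseudonym("jane  doe", "15/09/1956", key_a)
other_provider = pseudonym("Jane Doe", "15/09/1956", key_b)

print(first_episode == second_episode)   # same provider: episodes link
print(first_episode == other_provider)   # different key: no linkage
```

Because the key is unique to each provider, the same patient receives unrelated pseudonyms from different providers, and someone without the key cannot recompute a pseudonym from a known name and date of birth.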
This arrangement is not entirely sufficient for the secure handling of health information - further access control and statistical security measures are needed to foil inference and other attacks. However, it brings the following immediate benefits: