In accordance with HIPAA regulations, patients' personal information is typically removed or generalized prior to being released as public data files. However, it is not known if the standard method of de-identification is sufficient to prevent re-identification by an intruder. The authors conducted analytical processing to identify security vulnerabilities in the protocols to de-identify hospital data. Their techniques for discovering privacy leakage utilized three disclosure channels: (1) data inter-dependency, (2) biomedical domain knowledge, and (3) suppression algorithms and partial suppression results. One state's inpatient discharge data set was used to represent the current practice of de-identification of health care data, where a systematic approach had been employed to suppress certain elements of the patient's record. Of the 1,098 records for which the hospital ID was suppressed, the original hospital ID was recovered for 616 records, leading to a nullification rate of 56.1%. Utilizing domain knowledge based on the patient's Diagnosis Related Group (DRG) code, the authors recovered the real age of 64 patients, the gender of 83 male patients and 713 female patients. They also successfully identified the ZIP code of 1,219 patients. The procedure used to de-identify hospital records was found to be inadequate to prevent disclosure of patient information. As the masking procedure described was found to be reversible, this increases the risk that an intruder could use this information to re-identify individual patients.
|Number of pages||11|
|Journal||International Journal of Healthcare Information Systems and Informatics|
|Publication status||Published - 1 Oct 2012|
- Data analytic processing
- Data privacy
- Diagnosis Related Group (DRG)
- Health Insurance Portability and Accountability Act (HIPAA)