Sickle cell disease affects more than 100,000 individuals in the United States, among whom disease severity varies considerably. One factor that influences disease severity is the sickle cell disease genotype. For this reason, clinical prevention and treatment guidelines tend to differentiate between genotypes. However, previous research suggests caution when using a claims-based determination of sickle cell disease genotype in healthcare quality studies. The objective of this study was to describe the extent of miscoding for the major sickle cell disease genotypes in hospital discharge data. Individuals with sickle cell disease, along with their sickle cell disease genotypes, were identified through newborn screening results or hemoglobinopathy specialty care centers. These genotypes were compared to the diagnosis codes listed in hospital discharge data to assess the accuracy of the hospital codes in determining sickle cell disease genotype. Eighty-three percent (sickle cell anemia), 23% (Hemoglobin SC), and 31% (Hemoglobin Sβ+ thalassemia) of hospitalizations contained a diagnosis code that correctly reflected the individual's true sickle cell disease genotype. The accuracy of the sickle cell disease genotype coding was indeterminate in 11% (sickle cell anemia), 12% (Hemoglobin SC), and 7% (Hemoglobin Sβ+ thalassemia) and incorrect in 3% (sickle cell anemia), 61% (Hemoglobin SC), and 52% (Hemoglobin Sβ+ thalassemia) of the hospitalizations. The use of ICD-9-CM codes from hospital discharge data for determining specific sickle cell disease genotypes is problematic. Research based solely on these or other types of administrative data could lead to an incorrect understanding of the disease.
Keywords: Administrative data; Genotype; ICD-9-CM codes; Sickle cell anemia; Sickle cell disease; Surveillance.