Table 2

Performance of multinomial regression and a rule-based algorithm using ICD-10 codes for determining diabetes status and type using SEARCH cohort status as gold standard

                                   Multinomial regression*   Rule-based algorithm
Diabetes (n = 5,308)
  Se                               0.964                     0.991
  Sp                               0.982                     0.966
  PPV                              0.987                     0.969
  NPV                              0.918                     0.983
Type 1 diabetes (n = 4,732)
  Se                               0.953                     0.978
  Sp                               0.963                     0.968
  PPV                              0.978                     0.980
  NPV                              0.957                     0.967
Type 2 diabetes (n = 400)
  Se                               0.573                     0.899
  Sp                               0.992                     0.975
  PPV                              0.778                     0.642
  NPV                              0.977                     0.995
Other diabetes type, e.g., medication-induced, monogenic (n = 176)
  Se                               0.381                     0.496
  Sp                               0.981                     0.996
  PPV                              0.512                     0.698
  NPV                              0.986                     0.988
κ statistic                        0.870                     0.910
Accuracy                           0.936                     0.955

Accuracy = number correctly classified / N. Positive (LR+) and negative (LR−) likelihood ratios may be calculated with the following formulas: LR+ = Se / (1 − Sp), LR− = (1 − Se) / Sp.
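The likelihood-ratio formulas in this note can be applied directly to any Se/Sp pair in the table. The following is a minimal sketch, not part of the original analysis: the function name `likelihood_ratios` is hypothetical, and returning infinity when a denominator is zero is an assumption made here for convenience.

```python
import math


def likelihood_ratios(se: float, sp: float) -> tuple[float, float]:
    """Return (LR+, LR-) from sensitivity and specificity given as proportions."""
    if not (0.0 <= se <= 1.0 and 0.0 <= sp <= 1.0):
        raise ValueError("se and sp must be proportions in [0, 1]")
    # LR+ = Se / (1 - Sp); treated as infinite when Sp == 1 (assumption)
    lr_pos = se / (1.0 - sp) if sp < 1.0 else math.inf
    # LR- = (1 - Se) / Sp; treated as infinite when Sp == 0 (assumption)
    lr_neg = (1.0 - se) / sp if sp > 0.0 else math.inf
    return lr_pos, lr_neg


# Example input: rule-based algorithm, any diabetes (Se 0.991, Sp 0.966 in Table 2)
lr_pos, lr_neg = likelihood_ratios(0.991, 0.966)
print(f"LR+ = {lr_pos:.1f}, LR- = {lr_neg:.3f}")
```

Because the table reports Se and Sp rounded to three decimals, ratios computed this way are approximate, particularly LR+ when Sp is close to 1.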

*Variables in the final multinomial regression model included the following: most common diabetes type–specific code, maximum HbA1c, proportion of type 2 diabetes codes, any elevated outpatient glucose, any metformin, any antidiabetes medicine, age, proportion of type 1 diabetes codes, multiple elevations in outpatient random glucose, obesity, any diabetic ketoacidosis, ethnicity, any contraceptive medication, count of type 1 diabetes codes, proportion of other diabetes codes, and polycystic ovarian syndrome.
