Background: For a fracture classification to be useful, it must provide prognostic significance, interobserver reliability, and intraobserver reproducibility. Most studies have found reliability and reproducibility to be poor for fracture classification schemes. The purpose of this study was to evaluate the interobserver and intraobserver reliability of the Sanders and Crosby-Fitzgibbons classification systems, two commonly used methods for classifying intra-articular calcaneal fractures.

Methods: Twenty-five CT scans of intra-articular calcaneal fractures treated at one trauma center were reviewed. The CT images were presented to eight observers (two orthopaedic surgery chief residents, two foot and ankle fellows, two fellowship-trained orthopaedic trauma surgeons, and two fellowship-trained foot and ankle surgeons) on two separate occasions 8 weeks apart. At each viewing, observers were asked to classify the fractures according to both the Sanders and Crosby-Fitzgibbons systems. Interobserver reliability and intraobserver reproducibility were assessed with computer-generated kappa statistics (SAS software; SAS Institute Inc., Cary, North Carolina).

Results: Total unanimity (all eight observers assigning the same fracture classification) was achieved in only 24% (six of 25) of cases with the Sanders system and 36% (nine of 25) of cases with the Crosby-Fitzgibbons scheme. Interobserver reliability for the Sanders classification reached a moderate level of agreement (kappa = 0.48, 0.50) when the subclasses were included. Agreement increased but remained in the moderate range (kappa = 0.55, 0.55) when the subclasses were excluded. Interobserver agreement reached a substantial level (kappa = 0.63, 0.63) with the Crosby-Fitzgibbons system. Intraobserver reproducibility was better for both schemes. The Sanders system with subclasses included reached moderate agreement (kappa = 0.57), while ignoring the subclasses brought agreement into the substantial range (kappa = 0.77). The overall intraobserver agreement was substantial (kappa = 0.74) for the Crosby-Fitzgibbons system.

Conclusions: Although intraobserver kappa values reached substantial levels and the Crosby-Fitzgibbons system generally showed greater agreement, we were unable to demonstrate excellent interobserver or intraobserver reliability with either classification scheme. While a system with perfect agreement would be impossible, our results indicate that these classifications lack the reproducibility to be considered ideal.
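For readers unfamiliar with the kappa statistic, the following is a minimal Python sketch of how chance-corrected agreement can be computed and labelled. It is not the study's analysis: the study used SAS, and the abstract does not state which multi-rater kappa variant was applied. The sketch assumes the two-rater Cohen's kappa, which fits the intraobserver case (one observer, two viewings), and assumes the descriptive bands of Landis and Koch, which are consistent with the "moderate" and "substantial" labels reported above. The function names and the example ratings are hypothetical.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two sets of labels assigned to the same cases."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)
    # Observed agreement: share of cases given the same class both times.
    p_observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: sum over classes of the product of marginal frequencies.
    freq_a = Counter(ratings_a)
    freq_b = Counter(ratings_b)
    p_chance = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if p_chance == 1.0:
        return 1.0  # only one class was ever used, by both sets of ratings
    return (p_observed - p_chance) / (1.0 - p_chance)


def agreement_band(kappa):
    """Descriptive label for a kappa value (assumed Landis and Koch bands)."""
    if kappa < 0.0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


if __name__ == "__main__":
    # Hypothetical Sanders types assigned by one observer at two viewings.
    first_viewing = ["II", "II", "III", "III"]
    second_viewing = ["II", "II", "III", "II"]
    kappa = cohen_kappa(first_viewing, second_viewing)
    print(f"kappa = {kappa:.2f} ({agreement_band(kappa)})")
```

In this hypothetical example the observer agrees with themselves on three of four fractures (75%), but half of that agreement would be expected by chance alone, which is why kappa is lower than the raw agreement rate.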