A study published in The Spine Journal has shown that agreement on degenerative pathology findings in magnetic resonance imaging (MRI) of the cervical spine can vary significantly, even in an idealised study setting. The researchers compared the inter- and intra-rater agreement of the MRI findings and found that both varied substantially.
The study, from Yale University School of Medicine, New Haven, USA, compared the findings of two spine surgeons and four musculoskeletal radiologists, who independently reviewed 48 sets of T2-weighted axial and sagittal MRI sequences to determine inter-rater agreement. Each panellist re-reviewed the first 10 studies to determine intra-rater agreement. The study used a set of de novo standardised criteria to grade the degree of degeneration on the MRI sequences; disc hydration, for example, was graded as normal, partially reduced or completely black. The researchers also graded disc space height, central stenosis, end plate changes, spondylolisthesis and cord signal change. As well as comparing the absolute inter- and intra-rater agreement, the researchers undertook a modified analysis that ignored disagreements between the two lowest severity grades. This analysis was performed to distinguish disagreements that might prove clinically significant from those that would likely prove benign in terms of surgical planning. For disc hydration, for example, a disagreement between ‘normal’ and ‘partially reduced’ would be ignored.
The overall inter-rater absolute agreement across all findings was 75.7% (95% confidence interval [CI], 74.7-77%). This was calculated as the percentage of all assessments where two particular panellists agreed on the same severity grading. When stratified according to MRI finding, agreement ranged from 54.6% for disc hydration to 95% for spondylolisthesis. In the modified analysis, overall agreement was higher at 87% (95% CI, 86.1-87.9%), and the stratified results ranged from 79% for end plate changes to 95% for spondylolisthesis.
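The percentage calculation described above can be illustrated with a short Python sketch. It is not the authors' analysis code: the function name `pairwise_agreement`, the integer coding of grades (0 as the lowest severity), the pooling of comparisons across every pair of raters, and the toy ratings are all assumptions made for illustration, and the numbers are not study data.

```python
from itertools import combinations


def pairwise_agreement(ratings, merge_lowest=False):
    """Percentage of pairwise comparisons in which two raters gave the same grade.

    ratings: dict mapping a rater name to a list of integer grades,
             one per assessed item, with 0 as the lowest severity grade.
    merge_lowest: if True, grades 0 and 1 are treated as the same grade,
             mimicking the "modified" analysis described in the article.
    """
    def normalise(grade):
        # Collapse the two lowest grades into one when requested.
        return max(grade, 1) if merge_lowest else grade

    agreed = 0
    total = 0
    # Compare every pair of raters on every item (an assumed pooling scheme).
    for rater_a, rater_b in combinations(ratings, 2):
        for grade_a, grade_b in zip(ratings[rater_a], ratings[rater_b]):
            total += 1
            if normalise(grade_a) == normalise(grade_b):
                agreed += 1
    return 100.0 * agreed / total


# Hypothetical disc hydration grades for four discs from three raters
# (0 = normal, 1 = partially reduced, 2 = completely black).
toy_ratings = {
    "rater_a": [0, 1, 2, 0],
    "rater_b": [1, 1, 2, 0],
    "rater_c": [0, 1, 1, 0],
}

absolute = pairwise_agreement(toy_ratings)
modified = pairwise_agreement(toy_ratings, merge_lowest=True)
print(f"absolute agreement: {absolute:.1f}%")
print(f"modified agreement: {modified:.1f}%")
```

In this toy example, 8 of the 12 pairwise comparisons match exactly (66.7%). Once ‘normal’ and ‘partially reduced’ are treated as the same grade, 10 of 12 match (83.3%), which shows why the modified figures in the study are higher than the absolute ones.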
The overall intra-rater absolute agreement was, as expected, higher, at 81.6% (95% CI, 78.9-84.3%). The stratified results ranged from 74.2% agreement for disc hydration to 94.7% for spondylolisthesis. The overall modified intra-rater agreement was 88.3% (95% CI, 86.5-90%), while the stratified results ranged from 85% agreement for disc hydration to 94.8% for spondylolisthesis.
The biggest limitation of this study is its idealised setting, which may have affected the way that assessments were made. The reported agreements therefore represent a “best-case” scenario, according to the study authors. Although the study found that the reviewer's clinical specialty had no significant impact on agreement, the authors note that this may well not hold true for other specialties, which may interpret the findings differently.
In spite of the standardised assessment criteria and the “best-case” setting of the study, significant variability was observed for both inter- and intra-rater agreement, in both absolute and modified terms. The authors speculate that in a clinical setting agreement could be lower and the range of agreement wider. “When reading imaging reports, physicians should be cognisant of inconsistencies inherent in the interpretation of cervical MRI findings,” the authors urge. They also highlight “the importance of correlating imaging findings with clinical findings when providing patient care.”