TY - JOUR
T1 - Reliability of a computer-aided system in the evaluation of indeterminate ultrasound images of thyroid nodules
AU - Reverter, Jordi L.
AU - Ferrer Estopiñan, Laura
AU - Vázquez, Federico
AU - Ballesta Purroy, Sílvia
AU - Batule, Sol
AU - Pérez-Montes de Oca, Alejandra
AU - Puig-Jové, Carlos
AU - Puig Domingo, Manuel
PY - 2021
Y1 - 2021
N2 - Computer-aided diagnostic (CAD) programs for malignancy risk stratification from ultrasound (US) imaging of thyroid nodules are being validated both experimentally and in real-world practice. However, they have not been tested for reliability in analyzing difficult or unclear images. US images with indeterminate characteristics were evaluated by five observers with different levels of experience in US examination and by a commercial CAD program. Nodules on which the observers widely agreed were considered concordant; those with little agreement were considered non-concordant, or difficult to assess. The diagnostic performance of the readers and the CAD program was calculated and compared in both groups of nodule images. In the group of concordant thyroid nodules (n = 37), the clinicians and the CAD system obtained similar levels of accuracy (77.0% vs 74.2%, respectively; P = 0.7), and no differences were found in sensitivity (SEN) (95.0% vs 87.5%, respectively; P = 0.2), specificity (SPE) (45.5% vs 49.4%, respectively; P = 0.7), positive predictive value (PPV) (75.2% vs 77.7%, respectively; P = 0.8), or negative predictive value (NPV) (85.6% vs 77.7%, respectively; P = 0.3). When analyzing the non-concordant nodules (n = 43), the CAD system presented a decrease in accuracy of 4.2%, which was significantly smaller than that observed for the experts (19.9%, P = 0.02). Clinical observers perform similarly to the CAD system in the US assessment of the risk of thyroid nodules. However, the AI system for thyroid nodules, AmCAD-UT®, showed greater reliability in the analysis of unclear or misleading images.
AB - Computer-aided diagnostic (CAD) programs for malignancy risk stratification from ultrasound (US) imaging of thyroid nodules are being validated both experimentally and in real-world practice. However, they have not been tested for reliability in analyzing difficult or unclear images. US images with indeterminate characteristics were evaluated by five observers with different levels of experience in US examination and by a commercial CAD program. Nodules on which the observers widely agreed were considered concordant; those with little agreement were considered non-concordant, or difficult to assess. The diagnostic performance of the readers and the CAD program was calculated and compared in both groups of nodule images. In the group of concordant thyroid nodules (n = 37), the clinicians and the CAD system obtained similar levels of accuracy (77.0% vs 74.2%, respectively; P = 0.7), and no differences were found in sensitivity (SEN) (95.0% vs 87.5%, respectively; P = 0.2), specificity (SPE) (45.5% vs 49.4%, respectively; P = 0.7), positive predictive value (PPV) (75.2% vs 77.7%, respectively; P = 0.8), or negative predictive value (NPV) (85.6% vs 77.7%, respectively; P = 0.3). When analyzing the non-concordant nodules (n = 43), the CAD system presented a decrease in accuracy of 4.2%, which was significantly smaller than that observed for the experts (19.9%, P = 0.02). Clinical observers perform similarly to the CAD system in the US assessment of the risk of thyroid nodules. However, the AI system for thyroid nodules, AmCAD-UT®, showed greater reliability in the analysis of unclear or misleading images.
KW - CAD system
KW - Risk classification
KW - Thyroid nodules
U2 - 10.1530/ETJ-21-0023
DO - 10.1530/ETJ-21-0023
M3 - Article
C2 - 34981749
SN - 2235-0640
VL - 11
JO - European Thyroid Journal
JF - European Thyroid Journal
ER -