Conventional radiography and correlated factors of enthesopathies of the Achilles tendon and plantar fascia in patients with axial spondyloarthritis

Ebru Yılmaz; Özge Pasin; Loiane Cristina de Souza; Guilherme Torres Vilarino; Alexandro Andrade; Camila Gusmão Vicente de Carvalho; Barbara Bayeh; Fernando Henrique Carlos de Souza; Renata Miossi; Pleiades Tiharu Inaoka; Takashi Matsushita; Naoki Mugii; Samuel Katsuyuki Shinjo; Società Italiana di Reumatologia

doi:10.4081/reumatismo.2024.1709

Background. Large Language Models (LLMs) are increasingly applied in medicine and show promise. In idiopathic inflammatory myopathies (IIM), a strong correlation was recently demonstrated between an LLM (Claude-2) and expert assessors on the Myositis Disease Activity Assessment Tool-Visual Analogue Scale (MDAAT-VAS). LLM performance in the cutaneous domain of dermatomyositis (DM) has not been specifically evaluated. Objectives. This study aimed to evaluate the LLM 'Claude v. 3.5 Sonnet' in scoring cutaneous manifestations of DM against expert assessors, with implications for automated clinical trial screening, where competing trials often limit recruitment from an already restricted patient pool.

Methods. Twenty-seven DM cases with standardized clinical photographs were identified through a systematic PubMed review. Two rheumatologists with expertise in Cutaneous Dermatomyositis Disease Area and Severity Index (CDASI) scoring and trial recruitment independently assessed the images. The LLM 'Claude' analysed the same images using chain-of-thought prompting and was asked to score the CDASI domains: erythema, scaling, erosion/ulceration, poikiloderma and calcinosis. Hand lesions were scored with specific attention to papules (requiring doubled erythema scores) and periungual changes. Intraclass Correlation Coefficient (ICC) analysis was performed using two-way random effects modelling (Stata 18).
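As a rough illustration of the agreement statistic named in the Methods, the sketch below computes a two-way random-effects ICC from a cases-by-raters table. It assumes the absolute-agreement, single-rater form, ICC(2,1); the abstract does not state which form was used, and the study's own analysis was run in Stata 18, not Python. The function name `icc2_1` and the example scores are hypothetical and are not study data.

```python
def icc2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    ratings: list of rows, one row per case, one column per rater.
    Assumption: the abstract does not say which ICC form was used.
    """
    n = len(ratings)       # number of cases
    k = len(ratings[0])    # number of raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Partition the total sum of squares into cases, raters and residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)               # between-case mean square
    msc = ss_cols / (k - 1)               # between-rater mean square
    mse = ss_err / ((n - 1) * (k - 1))    # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Hypothetical scores for five cases (columns: expert 1, expert 2, LLM).
example_scores = [
    [4, 5, 4],
    [10, 9, 11],
    [2, 2, 3],
    [7, 8, 7],
    [12, 12, 13],
]
print(round(icc2_1(example_scores), 3))
```

The absolute-agreement form penalises a rater who is systematically higher or lower than the others, which matters when an automated score is meant to stand in for an expert score rather than merely rank cases in the same order.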

Results. Global ICC analysis demonstrated excellent agreement between Claude and expert assessors (0.92, 95% CI: 0.89-0.94), comparable to inter-expert reliability (0.87, 95% CI: 0.82-0.91). Domain-specific analysis revealed:

1. Moderate agreement for core features:
   - Erythema (0.61, 95% CI: 0.27-0.81)
   - Scaling (0.57, 95% CI: 0.19-0.79)
   - Erosions (0.57, 95% CI: 0.21-0.79)
   - Poikiloderma (0.47, 95% CI: 0.10-0.73)
2. Strong concordance for hand assessment, a ubiquitous and specific feature of the disease:
   - Global hand score (0.95, 95% CI: 0.91-0.97)
   - Hand erythema (0.78, 95% CI: 0.24-0.95)
   - Perfect agreement for periungual vasculitis (ICC 1.0)
3. Lower reliability for damage assessment:
   - Hand damage (0.37, 95% CI: 0.10-0.85)

Time efficiency analysis:
   - Expert assessors: mean 8.4 minutes per case (range 6-12 minutes)
   - LLM assessment: mean 42 seconds per case (range 35-50 seconds)
   - Total time saved: 93% reduction in scoring time
   - Additional efficiency: the LLM can process cases simultaneously in batches, whereas expert assessment is sequential.
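The time saving can be checked with a short calculation. The sketch below applies the usual relative-reduction formula to the two reported means; the helper name `percent_reduction` is hypothetical. Using the means alone gives roughly 92%, slightly below the 93% quoted in the abstract, which may have been derived from per-case timings that are not reported here.

```python
def percent_reduction(baseline_seconds, new_seconds):
    """Time saved relative to the baseline, as a percentage."""
    return 100.0 * (baseline_seconds - new_seconds) / baseline_seconds


expert_mean_s = 8.4 * 60   # 8.4 minutes per case, as reported
llm_mean_s = 42.0          # 42 seconds per case, as reported

reduction = percent_reduction(expert_mean_s, llm_mean_s)
print(f"{reduction:.1f}% reduction in mean scoring time")
```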

Conclusions. The LLM demonstrates excellent reliability for global disease assessment (ICC 0.92) and for objective features such as periungual changes (ICC 1.0), with a substantial time saving (93% reduction in scoring time) and batch processing capabilities. This could enhance clinical trial recruitment workflows in DM, where competing trials often limit recruitment from an already restricted patient pool. However, important limitations persist in assessing subtle features (poikiloderma ICC 0.47, damage ICC 0.37), alongside technical constraints such as dependence on image quality. At present, the LLM is therefore best used as a screening tool that supports, rather than replaces, expert assessment.

How to Cite

PO:24:060 | Evaluation of a Large Language Model Performance in Rating Cutaneous Manifestations of Dermatomyositis: A Comparison with Expert Assessors: Gianmarco Roselli1, Marco Fornaro1, Swapnasha Panigrahi2, Sara Sabbagh3, Florenzo Iannone1, Vincenzo Venerito1, Latika Gupta4 | 1Università degli Studi di Bari, Unità di Reumatologia, DiMePRe-J, Bari, Italy; 2University of Birmingham, Birmingham, United Kingdom; 3Medical College of Wisconsin, Unit of Rheumatology, Department of Paediatrics, Milwaukee, USA; 4University of Manchester, Manchester Academic Health Centre, Division of Musculoskeletal and Dermatological Sciences, Manchester, United Kingdom. Reumatismo [Internet]. 2026 Mar. 18 [cited 2026 May 27];77(s1). Available from: https://www.reumatismo.org/reuma/article/view/2349

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
