Methods

Through a collaborative effort among 11 hospitals across Norway, we have collected NCS data from more than 220 000 subjects, and these data are now available for research purposes. The project also draws on a database of 92 000 NCS from Uppsala University Hospital.

Development and validation of the AI-based decision-support system

We plan to use artificial neural networks (ANNs) as our machine-learning model architecture. ANNs have grown in popularity following recent successful applications, ChatGPT being one well-known example. We will include NCS measures, clinical data (e.g. neurological symptoms and signs) and other relevant patient data (e.g. age, height, sex) as model input features.

First, we will use split-sample validation. The data set will be divided into a development set (80%) and an internal validation set (20%). During each iterative training step, we will split the development set into a training set and a tuning set. The model will be fitted on the training set and evaluated on the tuning set. We will repeat this process until classification accuracy on the tuning set, and thereby the model's ability to generalize to data it was not fitted on, no longer improves.
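The nested split described above can be sketched as follows. This is a minimal illustration, not the project's implementation: the function name `split_indices`, the seeds, the example cohort size of 1000 and the 10% tuning fraction are all assumptions (only the 80/20 ratio is specified here). The sketch splits at subject level, on the assumption that all studies from one subject should stay in the same partition.

```python
import random

def split_indices(n_items, holdout_fraction, seed):
    """Shuffle item positions and split off a held-out fraction.

    Returns (kept, held_out) as two disjoint lists of positions.
    """
    positions = list(range(n_items))
    random.Random(seed).shuffle(positions)
    n_holdout = round(n_items * holdout_fraction)
    return positions[n_holdout:], positions[:n_holdout]

# Hypothetical cohort of 1000 subjects, identified by index.
n_subjects = 1000

# Step 1: 80% development set, 20% internal validation set.
development, internal_validation = split_indices(n_subjects, 0.20, seed=0)

# Step 2: within the development set, split off a tuning set.
# The 10% tuning fraction is an assumption chosen for illustration.
train_pos, tune_pos = split_indices(len(development), 0.10, seed=1)
training = [development[i] for i in train_pos]
tuning = [development[i] for i in tune_pos]

print(len(training), len(tuning), len(internal_validation))
```

Drawing a new seed for step 2 at each training iteration would give a different training/tuning partition each time, while the internal validation set stays untouched until development is finished.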

Second, we will test the model on the withheld internal validation set, which is not used in the development phase. Third, we will compare the output of our model with clinical assessments from a large single-center external validation set of more than 40 000 nerve conduction studies from St. Olavs Hospital. Finally, we will establish a panel of ten experts from different European countries, trained at different institutions, who have not contributed data to the development dataset. The experts will independently label NCS from a new, independent and representative multi-center external validation set (to be collected during the project with the help of our collaborators), using the same output categories as the machine-learning model. We will then calculate the agreement between the model and the human experts, as well as the interrater agreement among the experts, using the majority consensus of the raters as the reference standard.
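As an illustration of the agreement calculation, the sketch below derives a majority-consensus label per study and compares it with the model output using Cohen's kappa. The choice of Cohen's kappa is an assumption, since no agreement statistic is named here, and the labels, the three-rater panel and the function names are hypothetical. With ten raters, tied votes are possible; the sketch only flags them by returning `None` and leaves the tie-breaking rule open.

```python
from collections import Counter

def majority_consensus(ratings):
    """Return the most frequent label among raters, or None on a tie."""
    counts = Counter(ratings).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None  # tied vote: needs a tie-breaking rule
    return counts[0][0]

def cohen_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two label sequences."""
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both sequences use one identical label throughout
    return (observed - expected) / (1.0 - expected)

# Hypothetical example: four studies, three raters each.
expert_labels = [
    ["normal", "normal", "abnormal"],
    ["abnormal", "abnormal", "abnormal"],
    ["normal", "abnormal", "abnormal"],
    ["normal", "normal", "normal"],
]
model_labels = ["normal", "abnormal", "normal", "normal"]

consensus = [majority_consensus(r) for r in expert_labels]
kappa = cohen_kappa(model_labels, consensus)
print(consensus, kappa)
```

The same `cohen_kappa` function could be applied to each pair of experts to summarize interrater agreement; a multi-rater statistic such as Fleiss' kappa would be an alternative.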

Real-life validation

We will follow the most recent principles for external validation of machine-learning-based algorithms on real-world clinical data, using a prospective diagnostic cohort design. Patients will be included on the basis of a clinical suspicion of entrapment that has resulted in a referral to our outpatient clinics. Neurophysiology technicians will be asked to reach a diagnostic conclusion based on the combination of the presenting symptoms and the NCS, supported by the clinical decision-support system. Next, the clinical neurophysiologist will make his or her own assessment, blinded to the technician's initial conclusion. The two diagnostic conclusions will then be compared using a non-inferiority analytic approach.
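One possible form of such a non-inferiority comparison is sketched below. It rests on several assumptions that are not specified here: that each conclusion can be scored as correct or incorrect against some reference diagnosis, that the comparison is made on paired data (both assessors see the same patients), that a Wald confidence interval for the difference in paired proportions is adequate, and that the margin is 10 percentage points. The function name and the counts are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def paired_noninferiority(tech_correct, phys_correct, margin, alpha=0.025):
    """Test whether technician accuracy is non-inferior to physician accuracy.

    Inputs are per-patient booleans (True = conclusion was correct).
    Returns (accuracy difference, lower confidence bound, verdict).
    """
    n = len(tech_correct)
    # Discordant pairs: only one of the two assessors was correct.
    tech_only = sum(t and not p for t, p in zip(tech_correct, phys_correct))
    phys_only = sum(p and not t for t, p in zip(tech_correct, phys_correct))
    diff = (tech_only - phys_only) / n
    # Wald standard error for a difference in paired proportions.
    se = sqrt(tech_only + phys_only - (tech_only - phys_only) ** 2 / n) / n
    z = NormalDist().inv_cdf(1 - alpha)
    lower = diff - z * se
    return diff, lower, lower > -margin

# Hypothetical cohort of 200 patients:
# 170 both correct, 10 technician only, 12 physician only, 8 neither.
tech = [True] * 170 + [True] * 10 + [False] * 12 + [False] * 8
phys = [True] * 170 + [False] * 10 + [True] * 12 + [False] * 8

diff, lower, non_inferior = paired_noninferiority(tech, phys, margin=0.10)
print(diff, lower, non_inferior)
```

Non-inferiority is concluded only if the lower confidence bound of the accuracy difference lies above the negative margin, so the verdict depends strongly on a margin that would have to be fixed in advance.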