Dr Naglis Ramanauskas is a researcher at Oxipit. Trained as a radiologist, he aims to use his first-hand clinical experience to maximize the added value of Oxipit medical imaging products.
In this talk with Dr Ramanauskas we discuss how AI productization and workflow alignment are no less important than statistical model performance.
So your AI software is able to detect nodules with 100% sensitivity. What happens next?
Clickbaity headlines are a byproduct of trying to reduce the performance of AI radiology software to a single end metric, or of comparing whether it outperforms a radiologist. Sensitivity is nothing without specificity: you could end up with lots of false positives, or miss important findings while trying to fine-tune the model performance.
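To illustrate why sensitivity alone says little, here is a minimal sketch in Python. The labels and predictions are invented toy data, not results from any Oxipit product, and the helper `sensitivity_specificity` is a hypothetical name used only for this example. A "model" that flags every scan reaches 100% sensitivity while having 0% specificity.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels (1 = finding present)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


# Toy data (assumption): 2 scans with a nodule, 8 without.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]

# A degenerate "model" that flags every scan as suspicious.
flag_everything = [1] * len(y_true)
sens_all, spec_all = sensitivity_specificity(y_true, flag_everything)

# A more selective toy model: finds one nodule, raises one false alarm.
selective = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]
sens_sel, spec_sel = sensitivity_specificity(y_true, selective)

print(f"flag everything: sensitivity={sens_all:.2f}, specificity={spec_all:.2f}")
print(f"selective:       sensitivity={sens_sel:.2f}, specificity={spec_sel:.2f}")
```

The first model would top a sensitivity leaderboard yet be useless in practice, since every normal scan would need manual review.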
What I am trying to say is that creating AI-based software for radiology is not a Kaggle competition. Even if you manage to hit a certain benchmark, this will not automatically translate into value for the end users: medical institutions, radiologists or patients.
I would say it is fair to assume that most leading AI medical imaging software companies either already are or soon will be in the same ballpark in terms of accuracy metrics for the common radiological findings. Certainly, one company will be better than others at detecting nodules or signs of tuberculosis, for example, but for many use cases these marginal differences in AUC values will not directly translate into value in the clinical workflow from a radiologist's standpoint.
This does not mean model performance is not improving or everyone should stop the research and development altogether. There are certainly still some benchmarks to be achieved and for certain applications incremental differences in accuracy might be of critical importance.
I think that at a certain point the “wrapper” around the AI models becomes more important than the raw accuracy of the models themselves. The wrapper covers the way you utilize your AI model: defining exactly how the software is to be implemented and used in the clinical workflow, selecting appropriate thresholds and solving the UX/UI challenges.
I think at this point we should focus on AI adoption workflows and in what scenarios AI medical imaging applications can maximize the added value.
ChestEye Quality is the first AI CXR double reading tool. The product analyses the final radiologist report and corresponding chest X-ray for any missed findings. It utilizes a hypersensitized nodule model in combination with supervision by a second human radiologist. What is unique about this approach?
This is a way to utilize AI radiology software as a value-added tool rather than as a silver-bullet solution.
We developed this product as an efficient retrospective audit tool, where you can feed tens or hundreds of thousands of images into the platform. After ChestEye Quality analysis, only suspicious cases are flagged, leaving you with far fewer cases for manual review.
The platform can also operate in a prospective setting, meaning that any discrepancies are flagged as soon as the platform is provided with a CXR image and the corresponding radiologist report. Operating in near real time, the platform can identify clinically impactful findings that were missed. This makes it possible to change treatment decisions before, for instance, the patient is discharged from the hospital.
We apply a hypersensitive nodule model with a complex threshold system to minimize false positives. The extra layer of a human radiologist helps to efficiently review flagged cases and only submit confirmed cases for additional radiologist review at the medical institution.
I would also say that the ‘extra human in the loop’ is not there to compensate for model performance. The model already creates substantial value. By only flagging suspicious cases, it enables a near-real-time double reading audit, a quality check which would otherwise be impossible.
How would ChestEye Quality operate in a medical institution workflow?
In our current deployments, up to 10% of all CXR cases may be flagged for additional review. Up to 5% of flagged cases may have clinically impactful missed findings.
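As a rough back-of-the-envelope sketch of what these figures mean for review workload, the Python snippet below applies the two upper-bound rates quoted above to a hypothetical archive of 100,000 chest X-rays. The archive size and the function name `review_workload` are assumptions for illustration, and because both rates are "up to" figures, the results are upper bounds rather than expected values.

```python
def review_workload(total_cases, flag_rate=0.10, confirmed_rate=0.05):
    """Upper-bound estimate of flagged cases and clinically impactful misses.

    flag_rate: share of all CXR cases flagged for additional review (up to 10%).
    confirmed_rate: share of flagged cases with impactful missed findings (up to 5%).
    """
    flagged = round(total_cases * flag_rate)
    confirmed = round(flagged * confirmed_rate)
    return flagged, confirmed


# Hypothetical retrospective audit of 100,000 studies (assumed size).
total = 100_000
flagged, confirmed = review_workload(total)

print(f"Flagged for manual review: up to {flagged} of {total}")
print(f"Clinically impactful missed findings: up to {confirmed}")
print(f"Share of all cases: up to {confirmed / total:.1%}")
```

Under these assumptions, manual review shrinks from the full archive to at most a tenth of it, and the confirmed misses amount to at most 0.5% of all cases.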
Our framework enables a quick review of suspicious cases. The suspicious CXR area is automatically highlighted with heatmaps and the potentially missed pathology is identified, so the reviewing radiologist instantly knows where to focus their attention. Our Analytical page user interface, case layout and filtering tools further help to review the cases efficiently. Email notifications are sent to the corresponding radiologist if suspicions are validated.
Because the platform operates in near real time, the medical institution can assign an in-house radiologist to review suspicious cases, which would call for few additional resources. As we are identifying clinically impactful cases on a daily basis, this AI workflow already contributes to improved patient care.