Understanding Zero-Shot Clinical Trial Patient Matching with LLMs

This paper presents a zero-shot system that uses large language models (LLMs) for clinical trial patient matching. Given a patient's medical history as unstructured clinical text, the system evaluates whether the patient meets the eligibility criteria of a clinical trial. A two-stage retrieval pipeline reduces the number of tokens the LLM processes while maintaining high performance. The system achieves state-of-the-art results on the n2c2 2018 cohort selection challenge, the largest public benchmark for clinical trial patient matching, and it is more efficient in data and cost than the status quo. Performance is evaluated using precision, recall, and F1 scores. Because the approach is zero-shot, the system can scale to arbitrary trials and patient records with minimal reconfiguration.

The system can provide coherent explanations for 97% of its correct decisions and 75% of its incorrect ones, and the paper demonstrates the potential for human-in-the-loop deployment by having clinicians evaluate the natural language justifications generated by the LLM. The paper also discusses limitations, including the need for preliminary retrieval-based pipelines to reduce the amount of text processed by the LLM and the impact of criterion specificity on performance. Taken together, the results establish the feasibility of using LLMs to accelerate clinical trial operations.
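The summary reports performance as precision, recall, and F1. As a reminder of how these metrics relate, here is a minimal sketch in Python. It is not the paper's evaluation code: the function name `precision_recall_f1` and the example labels are hypothetical, and the sketch scores a single binary eligibility criterion without assuming any particular averaging scheme across criteria.

```python
from typing import Sequence, Tuple


def precision_recall_f1(
    y_true: Sequence[bool], y_pred: Sequence[bool]
) -> Tuple[float, float, float]:
    """Compute precision, recall, and F1 for binary eligibility decisions.

    y_true: reference labels (True = patient meets the criterion).
    y_pred: system decisions for the same patients, in the same order.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Count true positives, false positives, and false negatives.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)

    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return precision, recall, f1


# Hypothetical labels for five patients on one criterion (illustration only).
reference = [True, True, False, True, False]
predicted = [True, False, False, True, True]
precision, recall, f1 = precision_recall_f1(reference, predicted)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

In this made-up example there are two true positives, one false positive, and one false negative, so precision, recall, and F1 all equal 2/3.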

Zero-Shot Clinical Trial Patient Matching with LLMs

10 Apr 2024 | Michael Wornow*, Alejandro Lozano*, Dev Dash, Jenelle Jindal, Kenneth W. Mahaffey, Nigam H. Shah
