Large language models and multimodal foundation models for precision oncology

2024 | Daniel Truhn, Jan-Niklas Eckardt, Dyke Ferber & Jakob Nikolas Kather
Recent advances in artificial intelligence (AI), particularly large language models (LLMs) and multimodal foundation models, are transforming precision oncology. Since 2022, progress in AI has accelerated markedly, with LLMs now achieving human-level text processing. These models, built on transformer architectures, enable multimodal AI systems that integrate diverse data types, such as text, images, and numerical data, marking a shift away from narrowly specialized models. The integration of AI in oncology has evolved since 2012, when convolutional neural networks (CNNs) revolutionized image processing, to the current era of LLMs and multimodal models.

LLMs, trained on vast amounts of text including medical data, process and generate text-based data. They show promise in healthcare, particularly for medical text mining and clinical outcome prediction; models such as BioBERT and Med-PaLM have demonstrated success in medical applications. In addition, retrieval-augmented generation (RAG) allows LLMs to draw on domain-specific knowledge for medical tasks. Multimodal AI systems can interpret multiple data types, such as text and images, and are being evaluated for precision oncology applications, including outcome prediction. Foundation models, pre-trained on diverse tasks, reduce the need for specialized training data and improve performance in disease detection and drug discovery.

Despite this progress, challenges such as data quality, model interpretability, and regulatory approval must be addressed to fully realize the potential of AI in oncology. This editorial highlights the transformative potential of LLMs and multimodal models in precision oncology, emphasizing the need for interdisciplinary collaboration and rigorous validation.
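The retrieval-augmented generation (RAG) approach mentioned above can be illustrated with a minimal, self-contained sketch: a snippet of domain knowledge is retrieved for a query and prepended to the prompt before it is sent to an LLM. The corpus, the word-overlap retrieval scoring, and the function names here are illustrative assumptions, not part of any specific RAG system described in the editorial; production systems typically use dense vector embeddings rather than word overlap.

```python
# Toy RAG sketch: retrieve the most relevant snippet from a small
# domain corpus, then build a grounded prompt for an LLM.
# Corpus contents and scoring method are illustrative only.

def retrieve(query: str, corpus: list[str]) -> str:
    """Return the corpus snippet sharing the most words with the query."""
    q_words = set(query.lower().split())
    return max(corpus, key=lambda doc: len(q_words & set(doc.lower().split())))

def build_prompt(query: str, corpus: list[str]) -> str:
    """Prepend retrieved domain knowledge so the model's answer is grounded."""
    context = retrieve(query, corpus)
    return f"Context: {context}\n\nQuestion: {query}\nAnswer:"

# Hypothetical oncology knowledge snippets for demonstration.
corpus = [
    "HER2-positive breast cancer may be treated with trastuzumab.",
    "EGFR mutations in lung cancer predict response to tyrosine kinase inhibitors.",
]

prompt = build_prompt("Which therapy targets HER2-positive tumors?", corpus)
print(prompt)
```

The resulting prompt string would then be passed to any LLM API; the key idea is that the model answers from retrieved, curated medical text rather than from its parametric memory alone.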