Imagine a world where researchers can simulate complex biological processes with just a few lines of text. This is not science fiction, but the exciting potential of a new approach using large language models (LLMs) like GPT-4.
Traditionally, simulating biological systems has been a complex task requiring deep scientific expertise and specialized software. This often hinders the pace of biomedical research. However, a recent study published in the journal Computers in Biology and Medicine demonstrates an innovative approach: using LLMs as powerful biomedical simulators.
A GPT-4-based simulator developed at the Medical University of Vienna (MedUni Vienna) and the CeMM Research Center for Molecular Medicine of the Austrian Academy of Sciences classifies the importance of genes in cancer cells and predicts the prognosis of cancer patients more accurately than using the language model directly.
The power of language: LLMs as problem solvers
LLMs are AI systems trained on massive amounts of text data. They excel at tasks such as answering complex questions, developing step-by-step arguments, and generating text in a variety of creative formats. This capability extends to biomedicine, where LLMs have achieved remarkable results. For instance, GPT-4 has surpassed the passing score on the U.S. medical licensing exam, a test used to evaluate medical competence.
But LLMs go beyond just answering questions. They can explain their reasoning and even design laboratory experiments, all through the power of language.
Unlocking the potential of simulations
Computational simulations play a crucial role in scientific discovery. They allow researchers to test hypotheses, predict outcomes, and guide experiments. However, traditional simulation methods often struggle with the intricate complexity of biological systems.
This is where LLMs come in. By leveraging their vast knowledge base and sequential reasoning capabilities, LLMs offer a complementary approach to traditional simulation methods. They potentially enhance both the performance and interpretability of simulation results.
The power of words: SimulateGPT
The study tests the hypothesis that simulating biological and medical processes step by step with GPT-4 leads to better results. This is relevant both for future applications in biomedical research and for understanding these new models.
The researchers developed “SimulateGPT,” a system that harnesses the vast knowledge contained in LLMs to simulate biological processes. Users interact with SimulateGPT by providing specific prompts (essentially, written instructions describing the desired simulation).
SimulateGPT then uses its language understanding to generate realistic predictions about how biological systems might behave under different conditions. This opens a new avenue for researchers to explore various scenarios without the need for elaborate laboratory setups.
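To make the idea of a prompt as "written instructions about the desired simulation" more concrete, here is a minimal sketch of how such a prompt might be assembled. It is an illustration only: the names `SimulationRequest` and `build_simulation_prompt`, the field names, and the wording of the instructions are all assumptions made for this example, not the prompt or code used in the study.

```python
from dataclasses import dataclass


@dataclass
class SimulationRequest:
    """Hypothetical container for one text-based simulation query."""
    system: str         # biological system being simulated
    perturbation: str   # condition or intervention applied to it
    question: str       # outcome the researcher wants predicted
    max_steps: int = 5  # upper bound on simulated steps


def build_simulation_prompt(req: SimulationRequest) -> str:
    """Assemble a prompt asking a model to simulate step by step.

    The wording is illustrative; it is NOT the prompt from the study.
    """
    lines = [
        "You are simulating a biological process step by step.",
        f"System: {req.system}",
        f"Perturbation: {req.perturbation}",
        f"Simulate at most {req.max_steps} steps.",
        "For each step, describe the current state of the system and "
        "what changes next.",
        f"After the final step, answer: {req.question}",
    ]
    return "\n".join(lines)


# Example request (invented for illustration).
request = SimulationRequest(
    system="cancer cell line",
    perturbation="knockout of a single gene",
    question="Is the gene essential for cell survival?",
)
prompt = build_simulation_prompt(request)
print(prompt)
```

The resulting text would then be sent to a language model of the researcher's choice; that call is deliberately left out here, since it depends on the provider and is not described in this article.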
Superior performance: outperforming traditional methods
The researchers compared SimulateGPT with conventional LLM inference, in which the model is asked directly, across several scenarios: experiments with mice, support for sepsis treatment, prediction of essential genes in cancer cells, and progression-free survival of cancer patients.
Experts assessed and validated the results in these scenarios. The method is designed for basic research and not for clinical use.
Biomedical experts rated SimulateGPT’s predictions as significantly more accurate than those obtained using LLMs directly. Moreover, in quantitative tests of gene essentiality and patient survival prediction, SimulateGPT was substantially more accurate than direct LLM inference.
The future of AI in biomedical research
The use of artificial intelligence in health care is an active area of research, extending even to tasks such as drafting medical discharge documents. This study paves the way for a future where LLMs play a transformative role in biomedical research. LLMs have the potential to:
- Simplify complex simulations: Text-based simulations powered by LLMs could make complex biological simulations more accessible to a wider range of researchers.
- Improve interpretability: Unlike traditional methods, LLM simulations can offer clear explanations of their reasoning, helping researchers understand the “why” behind the results.
- Accelerate discovery: LLMs could streamline the research process by enabling rapid testing of various scenarios and hypotheses.
Conclusion
“This study shows that large language models (LLMs) like GPT-4 could enable a new class of biomedical simulators,” explains Matthias Samwald from the Institute of Artificial Intelligence at the Medical University of Vienna. “Text-based simulations are particularly well-suited for modeling and understanding living systems, as text and language provide the necessary flexibility and interpretability to describe the complexity of biology. For further development of LLM-based biomedical simulators, we propose several directions, including the integration of biological databases and mathematical modeling, as well as training new AI models with experimental data.”
In this sense, the study offers a vision of a future where LLMs revolutionize biomedical research. Text-based simulations powered by LLMs are immensely promising for understanding complex biological systems, especially those that challenge traditional physics-based simulations.
The main conclusions of the study were:
- Surprisingly effective simulations: Even without specific biomedical training, GPT-4 achieved impressive results in simulating real-world biological scenarios.
- The power of prompts: The study found that structured instructions (prompts) significantly improved the accuracy of LLM simulations compared with querying the model directly. These prompts guide the LLM to consider the evolving state of the simulation, leading to more realistic results.
- A roadmap for the future: The researchers propose ten intriguing pathways for further development. Among them are:
  - Interactive simulations: Imagine asking follow-up questions or exploring hypothetical scenarios within the simulation!
  - Smarter LLMs: The ability to access external information, such as scientific articles, could enhance accuracy and reduce errors in LLM results.
  - Integrated math and programming: Allowing LLMs to integrate mathematical models and even execute basic code would open doors to more complex simulations.
  - Self-checking models: The ability of LLMs to critically analyze their own simulations would further improve their reliability.
  - Multimodal integration: Incorporating data such as medical images into simulations could provide a richer picture of biological processes.
  - Real-world data tuning: Training LLMs on real-world biomedical data could significantly enhance the accuracy of their simulations.
  - Real-world feedback loop: Using results from actual experiments to refine simulations can create a powerful feedback loop for even more accurate predictions.
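The idea that prompts guide the model to consider the evolving state of the simulation can be pictured as a simple loop in which each step sees the state produced by the previous one. The following is a rough sketch under stated assumptions, not the study's implementation: `run_stepwise_simulation`, `next_step`, and `toy_model` are hypothetical names, and `toy_model` is a placeholder that returns canned text instead of calling an LLM.

```python
from typing import Callable, List


def run_stepwise_simulation(
    initial_state: str,
    next_step: Callable[[str, List[str]], str],
    max_steps: int = 3,
) -> List[str]:
    """Advance a text-described state one step at a time.

    `next_step` receives the current state and the full history and
    returns a description of the next state. In a real setup it would
    wrap an LLM call; here it is any plain Python callable.
    """
    history: List[str] = [initial_state]
    for _ in range(max_steps):
        new_state = next_step(history[-1], history)
        if new_state == history[-1]:
            break  # state no longer changes, so stop early
        history.append(new_state)
    return history


def toy_model(current: str, history: List[str]) -> str:
    """Placeholder for an LLM call (assumption): returns canned text."""
    return f"state after step {len(history)}"


trajectory = run_stepwise_simulation("initial state", toy_model, max_steps=3)
for state in trajectory:
    print(state)
```

Keeping the model behind a plain callable means the same loop could later be combined with the proposed extensions, for example a self-check on each new state or a comparison against experimental results, without changing the loop itself.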
Overall, this study demonstrates the immense potential of LLMs to simulate biological systems. By continuing to develop these AI tools, researchers can unlock new possibilities for scientific discovery and accelerate progress in biomedicine.
Contact
Matthias Samwald
Medical University of Vienna, Institute of Artificial Intelligence, Center for Medical Data Science
Währingerstraße 25a, 1090, Vienna, Austria
Email: matthias.samwald@meduniwien.ac.at
Reference (open access)
Schaefer, M., Reichl, S., Ter Horst, R., Nicolas, A. M., Krausgruber, T., Piras, F., Stepper, P., Bock, C., & Samwald, M. (2024). GPT-4 as a biomedical simulator. Computers in Biology and Medicine, 178, 108796. https://doi.org/10.1016/j.compbiomed.2024.108796