TOWARDS A PERSONAL HEALTH LARGE LANGUAGE MODEL

June 11, 2024 | Justin Cosentino*, Anastasiya Belyaeva§§, Xin Liu§§, Nicholas A. Furlotte*, Zhun Yang†, Chace Lee†, Erik Schenck†, Yojan Patel†, Jian Cui†, Logan Douglas Schneider†, Robby Bryant, Ryan G. Gomes, Allen Jiang, Roy Lee, Yun Liu, Javier Perez, Jameson K. Rogers, Cathy Speed, Shyam Tailor, Megan Walker, Jeffrey Yu, Tim Althoff, Conor Heneghan, John Hernandez, Mark Malhotra, Leor Stern, Yossi Matias, Greg S. Corrado, Shwetak Patel, Shravya Shetty, Jiening Zhan, Shruthi Prabhakara, Daniel McDuff†§, and Cory Y. McLean†§
The paper introduces the Personal Health Large Language Model (PH-LLM), a version of the Gemini model fine-tuned for understanding and reasoning over numerical time-series personal health data, particularly for sleep and fitness applications. The authors create three novel benchmark datasets to evaluate PH-LLM's performance in generating personalized insights and recommendations, assessing expert domain knowledge, and predicting self-reported sleep quality outcomes. Through comprehensive human and automatic evaluations, they find that PH-LLM approaches expert performance in fitness tasks and shows significant improvements in sleep coaching when fine-tuned. Additionally, PH-LLM achieves high scores on multiple-choice exams in sleep medicine and fitness, exceeding average human expert performance.

The model also successfully predicts self-reported sleep quality using multimodal sensor data, demonstrating the effectiveness of integrating continuous physiological and behavioral signals into health monitoring. The study highlights the broad knowledge base and capabilities of Gemini models and the importance of contextualizing physiological data for personal health applications.