Leveraging LLMs for Clearer Health Reporting: Guiding Journalists to Extract and Communicate Risk Information Effectively
Requirements
- Required: Completion of the lectures on "Human-Computer Interaction" or "Data Visualization"
- Preferred: Completion of the seminar on "Interactive Intelligent Systems" and the lecture on "Wissenschaftliches Arbeiten in der Informatik" (Scientific Writing in Computer Science)
Contents
During the COVID-19 pandemic, many people were confused by health news. A big reason? The presentation of medical data, also referred to as risk information, was often unclear or misleading. To avoid such misunderstandings in the future, journalists need to use clear numbers, for example by reporting both the benefits and the side effects of a medical intervention in absolute terms, so the two are easy to compare. But here's the problem: when journalists write about risks (say, of a new drug or vaccine), they often rely on press releases. These are usually incomplete, present only one side of the story, or leave out key context. And even journalists who want to dig deeper are often racing against deadlines and may lack the statistical skills to decode dense scientific studies.
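Why do absolute numbers matter so much? A small worked example, using made-up trial numbers purely for illustration, shows how the same result sounds very different in relative and absolute terms:

```python
# Hypothetical trial numbers for illustration only (not real data):
# 10 of 1,000 people in the control group experience the outcome,
# versus 5 of 1,000 in the treatment group.
treated_events, treated_n = 5, 1000
control_events, control_n = 10, 1000

control_risk = control_events / control_n  # 1.0% baseline risk
treated_risk = treated_events / treated_n  # 0.5% risk under treatment

arr = control_risk - treated_risk  # absolute risk reduction
rrr = arr / control_risk           # relative risk reduction
nnt = 1 / arr                      # number needed to treat

print(f"Absolute risk reduction: {arr:.3%}")  # 0.500 percentage points
print(f"Relative risk reduction: {rrr:.0%}")  # the "50% lower risk!" headline
print(f"Number needed to treat:  {nnt:.0f}")  # 200 people treated per outcome avoided
```

A press release quoting only the "50% lower risk" figure is technically true but hides that the benefit amounts to 5 fewer cases per 1,000 people, which is exactly the kind of context clear risk reporting should surface.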
That’s where Large Language Models (LLMs) could step in. The goal of this thesis is to explore how LLMs can be used to extract risk information from scientific articles through a guided process.
Such a solution would save journalists time, help them ask better questions, and enable them to explain numbers more clearly in their articles. The result? More accurate risk reporting and a better-informed public. And that matters: whether the topic is vaccines, treatments, or lifestyle choices, understanding risk empowers people to make better decisions about their health and lives.
General Research Process
- Analyze Technical Possibilities: Conduct a thorough review of current LLM-based methods for extracting data from text that could be applied in this context. Compare these methods and select the most promising one.
- Understand the Key Components of Risk Communication: Familiarize yourself with research on risk communication and its essential components.
- Design a Guided Process (Task Flow): Specify how users should be guided through the process of identifying and extracting risk information from text.
- Create a High-Fidelity Prototype: Develop an LLM-based high-fidelity user interface prototype as a minimum viable product (MVP).
- Conduct a Study to Assess Human-AI Interaction: Design and execute a study to evaluate the human-AI interaction, recruiting 12 participants.
- Analyze, Present, and Discuss Results: Perform a qualitative analysis of the study results using human-AI interaction heuristics. Identify key insights and implications for future research.
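To make the extraction step above more concrete, here is a minimal sketch of schema-guided extraction. The schema fields, prompt wording, and function names are illustrative assumptions, not a prescribed design; a mock response stands in for a real model call so the sketch runs without any API:

```python
import json

# Illustrative schema of the risk information a journalist needs.
# Field names are assumptions for this sketch, not a fixed design.
RISK_SCHEMA = {
    "intervention": "name of the treatment or vaccine",
    "baseline_risk": "risk in the control group, in absolute terms (e.g. '10 in 1,000')",
    "treated_risk": "risk in the intervention group, in absolute terms",
    "side_effects": "list of reported harms, each in absolute terms",
}

def build_extraction_prompt(article_text: str) -> str:
    """Ask the model to answer with a single JSON object matching RISK_SCHEMA."""
    fields = "\n".join(f'- "{key}": {desc}' for key, desc in RISK_SCHEMA.items())
    return (
        "Extract the following risk information from the study below.\n"
        "Answer with one JSON object containing exactly these keys:\n"
        f"{fields}\n"
        "Use null for anything the study does not report.\n\n"
        f"Study:\n{article_text}"
    )

def parse_extraction(llm_response: str) -> dict:
    """Validate that the model's reply is JSON and covers all schema fields."""
    data = json.loads(llm_response)
    missing = set(RISK_SCHEMA) - set(data)
    if missing:
        raise ValueError(f"LLM response is missing fields: {missing}")
    return data

# Stand-in for a real LLM call, so the sketch is self-contained:
mock_response = json.dumps({
    "intervention": "Drug X",
    "baseline_risk": "10 in 1,000",
    "treated_risk": "5 in 1,000",
    "side_effects": ["headache: 20 in 1,000"],
})
extracted = parse_extraction(mock_response)
print(extracted["baseline_risk"])  # prints: 10 in 1,000
```

In a guided task flow, each schema field could become one step of the interface, with the journalist confirming or correcting the model's answer before moving on; the validation step also gives the UI a natural hook for flagging information the study simply does not report.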
References
- Dagdelen, J., Dunn, A., Lee, S., et al. (2024). Structured information extraction from scientific text with large language models. Nature Communications, 15, 1418. https://doi.org/10.1038/s41467-024-1418-0
- Lühnen, J., Albrecht, M., Mühlhauser, I., & Steckelberg, A. (2017). Leitlinie Evidenzbasierte Gesundheitsinformation [Guideline: Evidence-based health information]. Retrieved from https://www.leitlinie-gesundheitsinformation.de
- Wegwarth, O. (2024). "Wir brauchen eine Revolution zur Förderung transparenter Informationen" [We need a revolution to promote transparent information; Editorial]. InFo Hämatologie + Onkologie, 27(9), 3. https://doi.org/10.1007/s15004-024-0697-8
- Wegwarth, O., & Gigerenzer, G. (2014). Improving evidence-based practices through health literacy—in reply. JAMA Internal Medicine, 174(8), 1413–1414.