Biomedical articles are often characterized by structural and technical complexity, making them inaccessible to non-expert readers. Large language models (LLMs) have shown promise in the task of biomedical text simplification, but existing techniques fail to adapt to readers' varying literacy needs. We address this by tailoring medical text simplification to two distinct English proficiency levels: Home Language (HL) and First Additional Language (FAL). We evaluate a range of open-source instruction models, including the Llama-3 family and Mistral-7B, across multiple prompting strategies, including zero-shot, few-shot, and in-context learning. We further investigate instruction fine-tuning on synthetic data generated via Self-Instruct, and examine the role of domain-specific knowledge by comparing a biomedical LLM (BioMistral) against prompt-based external knowledge injection. Our results reveal two key findings. First, we identify a clear trade-off: general-purpose models such as Mistral excel at preserving semantic content, while in-context prompting of models such as Llama-3.1-8B achieves the highest readability scores. Second, and most notably, we demonstrate that domain-specific models underperform at simplification due to their bias towards retaining complex terminology. Conversely, a small Self-Instruct-tuned model (Llama-3.2-3B) achieves comparable readability for FAL audiences, showing that targeted, efficient tuning can surpass both larger general-purpose and specialized models.
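To make the audience tailoring concrete, the sketch below illustrates zero-shot, proficiency-conditioned prompting of an instruction model, one of the strategies the abstract describes. It is a minimal sketch, not the paper's exact setup: the Hugging Face checkpoint id, the chat-template usage, and the prompt wording are all illustrative assumptions.

```python
# Minimal sketch: audience-tailored zero-shot simplification prompts.
# Assumptions (not from the paper): the checkpoint id, the chat message
# format, and the instruction wording below are illustrative only.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="meta-llama/Llama-3.1-8B-Instruct",  # assumed checkpoint id
)

LEVEL_INSTRUCTIONS = {
    # HL: English as Home Language -- fluent readers, preserve nuance.
    "HL": "Rewrite the passage in plain English for a fluent adult reader "
          "with no medical background. Keep all clinical facts.",
    # FAL: English as First Additional Language -- simpler vocabulary
    # and shorter sentences.
    "FAL": "Rewrite the passage using short sentences and common everyday "
           "words for a reader whose first language is not English. "
           "Explain any medical term you must keep.",
}

def simplify(passage: str, level: str) -> str:
    """Zero-shot simplification conditioned on one proficiency level."""
    messages = [
        {"role": "system", "content": LEVEL_INSTRUCTIONS[level]},
        {"role": "user", "content": passage},
    ]
    out = generator(messages, max_new_tokens=256)
    # The pipeline returns the full chat; the last message is the reply.
    return out[0]["generated_text"][-1]["content"]

sentence = ("Metformin attenuates hepatic gluconeogenesis via AMPK-mediated "
            "inhibition of mitochondrial glycerophosphate dehydrogenase.")
print(simplify(sentence, "FAL"))
```

The same interface extends naturally to the other strategies mentioned above: few-shot and in-context variants would prepend worked simplification examples to the message list rather than changing the model or decoding.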