GatorTron: A Large Clinical Language Model to Unlock Patient Information from Unstructured Electronic Health Records

2022-02-28, health informatics, medRxiv

Objective: To develop a large pretrained clinical language model from scratch using the transformer architecture, and to systematically examine how transformer models of different sizes could help 5 clinical natural language processing (NLP) tasks at different linguistic levels.

Methods: We created a large corpus with >90 billion words from clinical narratives (>82 billion words), scientific literature (6 billion words), and general English text (2.5 billion words). We developed GatorTron models from scratch ...
