We have been working with BERT[1], a natural language processing (NLP) AI model that Google released a few years ago. BERT can be used for a number of NLP tasks, including multi-label classification. I have been using BERT to determine the tissue type from the gross pathology report (part of determining case complexity for scheduling), and I am getting good results.
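For readers who want a feel for the setup, here is a minimal sketch of fine-tuning BERT for multi-label classification with the Hugging Face transformers library. This is not our exact pipeline; the `bert-base-uncased` checkpoint, the sample report text, and the label indexing are all placeholder assumptions.

```python
# Sketch: multi-label classification with BERT (Hugging Face transformers).
# Checkpoint, report text, and label map are hypothetical placeholders.
import torch
from transformers import BertTokenizer, BertForSequenceClassification

NUM_LABELS = 38  # one per tissue-type label

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained(
    "bert-base-uncased",
    num_labels=NUM_LABELS,
    problem_type="multi_label_classification",  # uses BCEWithLogitsLoss
)

# Hypothetical example: one gross pathology report and its label vector.
report = "Received in formalin is a 3.0 cm portion of colonic mucosa ..."
labels = torch.zeros(NUM_LABELS)
labels[4] = 1.0  # index of the "colon" label in this made-up label map

inputs = tokenizer(report, truncation=True, return_tensors="pt")
outputs = model(**inputs, labels=labels.unsqueeze(0))
outputs.loss.backward()  # a real run would use Trainer or an optimizer loop

# At inference time, sigmoid each logit and threshold (0.5 here) to pick labels.
probs = torch.sigmoid(outputs.logits)
predicted = (probs > 0.5).nonzero()
```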
We are able to achieve 95% accuracy on 38 labels using just 200 reports per label. People are starting to augment BERT models with clinical text[2] (that is, continuing to train them on unlabeled domain text), as we have done with surgical pathology reports, which in my case seems to improve results by 2-5%. It would be reasonable for us to train models on specific types of reports from other areas (radiology, neurology, etc.), which might further improve accuracy. This is not fine-tuning; rather, it adapts the underlying transformer model itself to better handle the domain vocabulary. Hypothetically (and in my limited experience), a Pathology-BERT fine-tuned on pathology tasks performs better than a fine-tuned base BERT model.
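As I read it, this augmentation step amounts to continued masked-language-model pre-training on unlabeled reports. Here is a sketch of what that looks like, again assuming Hugging Face transformers; the file path, epoch count, and output directory are illustrative, not the values we used.

```python
# Sketch: continued MLM pre-training on unlabeled domain text.
# File path and hyperparameters are hypothetical.
from transformers import (
    BertForMaskedLM,
    BertTokenizer,
    DataCollatorForLanguageModeling,
    LineByLineTextDataset,
    Trainer,
    TrainingArguments,
)

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased")

# Unlabeled surgical pathology reports, one report per line (placeholder path).
dataset = LineByLineTextDataset(
    tokenizer=tokenizer, file_path="path_reports.txt", block_size=128
)

# Randomly mask 15% of tokens, the standard BERT pre-training objective.
collator = DataCollatorForLanguageModeling(
    tokenizer=tokenizer, mlm=True, mlm_probability=0.15
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="pathology-bert", num_train_epochs=3),
    data_collator=collator,
    train_dataset=dataset,
)
trainer.train()
trainer.save_model("pathology-bert")  # fine-tune this checkpoint instead of base BERT
```

The resulting checkpoint is then used as the starting point for the classification fine-tuning shown earlier, which is where the 2-5% improvement would show up.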