The promise of AI in healthcare is vast, and plenty of ink has been spilled on the rapid application of technology to improve patient outcomes. The research and clinical sides of healthcare have seen everything from AI-assisted drug discovery and design to minimally invasive robotic surgery, and from AI-based anomaly detection in medical image analysis to the collection of enormous amounts of data through affordable, easily accessible consumer wearable devices.
Compared to rapid advancements on the research and clinical sides, progress against the administrative burden of managing healthcare information has seemed like the polar opposite: lumbering, bureaucratic, and slow. In an era where a TikTok of a dancing cat can reach hundreds of millions of viewers in a heartbeat, it’s disheartening that an estimated 75% of all medical communication happens over fax. The porch light can turn on automatically when your smart home senses your car pulling into the driveway, but an unimaginable number of patient referral forms are still re-keyed by hand into an Electronic Health Record (EHR) system from a piece of paper.
Some progress has been made towards fully electronic integration of EHRs, including the development of standards such as Fast Healthcare Interoperability Resources (FHIR), Continuity of Care Documents (CCD), Digital Imaging and Communications in Medicine (DICOM), and Health Information Exchanges (HIE). However, most of these standards focus on system-to-system integration between applications that already adhere to them. Not only do many mid-market healthcare information management systems not follow these standards, but the standards themselves do little to address the flow of information from documents designed for human use.
Clearly, an interim solution is needed while the industry works toward the long-term promise of ubiquitous interorganizational data integration. This short- to mid-term solution is something we call healthcare document interoperability. It treats the document as the lowest common denominator in how information is passed from one organization to another, even when those organizations have not agreed on a structured data model and communication protocol for that conversation. These documents, ubiquitous in the healthcare environment, are universal because they are designed for human interaction, whether that means filling out a patient intake form with a ballpoint pen or reading a letter of introduction from a referring physician.
The good news is that healthcare document interoperability is not a vision that depends on innovation decades into the future; it is achievable with the approaches and technologies we have at our fingertips today. Advances in Artificial Intelligence (AI), coupled with the creative and intelligent application of that evolving technology to existing challenges, promise to alleviate some of that pain in both the short run and the long term. AI promises to turn the unstructured, highly variable, difficult-to-interpret world of human-centric documents into the structured, consistent, analyzable world that machines are more comfortable operating in. At the heart of this transformation lies a set of key techniques:
Optical Character Recognition
The first hurdle with faxes and scanned documents, some of the lowest-common-denominator mechanisms for managing information in the healthcare industry, is that they are simply digitized images of documents transmitted over digital or even analog communication lines. Worse still, a significant proportion of faxes and scanned documents contain handwritten text that is a critical component of the information being transmitted.
Optical Character Recognition (OCR) is based on Machine Learning (ML) models trained on massive datasets containing millions of images of handwritten and typeset characters. OCR can be used in a healthcare setting to translate scanned or faxed documents into raw text capable of being read by a machine. Once a document is translated into this more accessible format, it is transformed from an isolated, unusable artifact into information that can be managed, processed, and integrated into a greater body of knowledge.
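To make this concrete, here is a minimal sketch of what that first translation step might look like, using the open-source Tesseract engine through the pytesseract Python wrapper. The file name and the assumption that Tesseract is installed locally are purely illustrative, not a recommendation of any particular OCR stack.

```python
# Minimal OCR sketch: convert a scanned or faxed page into machine-readable text.
# Assumes the Tesseract engine is installed locally and that "referral_fax.png"
# (an illustrative file name) is a scanned page image.
from PIL import Image
import pytesseract

def extract_text(image_path):
    """Run OCR over a page image and return the recognized text."""
    page = Image.open(image_path)
    return pytesseract.image_to_string(page)

if __name__ == "__main__":
    raw_text = extract_text("referral_fax.png")
    print(raw_text)
```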
Electronic Health Records Classification
Not all documents are created equal. A pile of faxes, scanned documents, or files sitting in a secure electronic inbox could contain anything from critical information for an upcoming heart transplant to the menu from the pizzeria around the corner. AI-based document classification models, with the assistance of Natural Language Processing (NLP) techniques, can look at the contents of incoming files, even ones they have never encountered before, and compare them to a library of documents the models were trained on to identify shared characteristics. Based on how similar an incoming document is to known examples, it can be classified by type, urgency, and sentiment. This classification is the key to prioritizing the document appropriately, routing it through the correct workflow, and raising it to the attention of a person for additional evaluation.
The main advantages of ML-based classification models over traditional rules-based expert systems are that they can be faster and easier to train, don’t require a pre-defined set of explicit rules articulated by a human subject matter expert, and can be made more adaptable over time.
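As a rough illustration of the shape such a classifier can take, the sketch below trains a simple TF-IDF plus logistic regression model on a handful of made-up labeled snippets. A real system would use far more training data and likely a more capable model; this only shows the general pattern.

```python
# Illustrative document-type classifier: TF-IDF features plus logistic regression.
# The training snippets and labels below are fabricated purely for demonstration.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

train_texts = [
    "Referring physician: Dr. Smith. Reason for referral: cardiology consult.",
    "Patient intake form. Insurance provider and policy number listed below.",
    "Lab results: hemoglobin A1c 6.1%, fasting glucose 105 mg/dL.",
    "Two large pepperoni pizzas and garlic knots, delivery in 30 minutes.",
]
train_labels = ["referral", "intake", "lab_report", "junk"]

# Fit the vectorizer and classifier together in a single pipeline
classifier = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
classifier.fit(train_texts, train_labels)

# Classify a previously unseen document and inspect the model's confidence
incoming = "Urgent cardiology referral for evaluation prior to transplant surgery."
print(classifier.predict([incoming])[0])
print(dict(zip(classifier.classes_, classifier.predict_proba([incoming])[0])))
```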
Key-Value Pair Parsing
The majority of documents in a healthcare setting are forms containing a set of fields, each consisting of a field name (often called a key) and its contents (its value). Converting the raw text of a document, which can look like an unstructured blob, into a table of key-value pairs often makes the contents far easier to use.
Such a table is easier to scan visually for errors or key information, to validate with rules that improve data accuracy, or simply to copy and paste into other applications. Accurate key-value pair parsing can improve health outcomes by ensuring that critical patient information is correctly interpreted and used.
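A minimal sketch of the idea follows, assuming the OCR output uses simple “Label: value” lines. The field names are hypothetical, and production parsers typically combine layout analysis with learned models rather than plain pattern matching.

```python
# Key-value pair parsing sketch: turn "Label: value" lines from OCR'd text into a
# dictionary. The field names are hypothetical examples of common healthcare fields.
import re

def parse_key_value_pairs(raw_text):
    """Extract lines shaped like 'Field Name: value' into a key-value dictionary."""
    pairs = {}
    for line in raw_text.splitlines():
        match = re.match(r"\s*([A-Za-z #/]+?)\s*[:=]\s*(.+)", line)
        if match:
            pairs[match.group(1).strip()] = match.group(2).strip()
    return pairs

ocr_text = """Patient Name: Jane Doe
DOB: 04/12/1968
MRN: 00128734
Referring Provider: Dr. A. Patel"""

print(parse_key_value_pairs(ocr_text))
# {'Patient Name': 'Jane Doe', 'DOB': '04/12/1968', 'MRN': '00128734', 'Referring Provider': 'Dr. A. Patel'}
```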
Context-based Data Extraction using Natural Language Processing
Key-value pair parsing can be sufficient for forms designed explicitly for data extraction, for situations in which the structure and layout of the form are known in advance, or where the variability of layouts is low. This might occur in applications designed to read a single type of document, such as a W-2 form, a credit card, or a US passport. Unfortunately, healthcare is an environment in which incoming documents, even documents of the same type, might come from a wide variety of sources, be structured in a wide variety of formats, and contain a wide range of terminology. Think “Medical Record Number” vs “MRN” vs “Patient ID #” and you’ll get the picture.
Worse still, documents not designed with automated data extraction in mind may confuse a simpler key-value pair parser. For example, a simple referral form may have three names on it: the name of the referring doctor, the name of the specialist that the patient is being referred to, and the patient’s own name. Each of the names might also have addresses, phone numbers, and additional identifying information. A simple form parser might just produce three name fields, three phone number fields, and three address fields, with no ability to distinguish among them.
Context-based data extraction uses more sophisticated models that pay attention to the context surrounding a particular field, rather than just the field’s name and value. If a name appears within a section headed “Referring Doctor,” such a model understands how to distinguish the doctor’s name from the patient’s name. Better still, because these models are trained on a much larger body of text, they are equally adept at recognizing similar concepts whether the form uses the term referring doctor, referrer, or primary care physician.
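One way to approximate this behavior with off-the-shelf tools is to pose the extraction as extractive question answering, as in the sketch below using the Hugging Face transformers library. The referral text and the questions are invented, and a production system would be fine-tuned and validated on healthcare documents.

```python
# Context-aware extraction sketch: an extractive question-answering model uses the
# surrounding context to tell apart three different "name" fields. The referral
# text is invented; the default general-purpose QA model is used for illustration.
from transformers import pipeline

qa = pipeline("question-answering")

referral_text = (
    "Referring Doctor: Dr. Maria Alvarez, Lakeside Family Practice, (555) 201-3344. "
    "Patient: John Carter, DOB 02/14/1975, MRN 00128734. "
    "Referred to: Dr. Wei Chen, Cardiology Associates."
)

for question in [
    "Who is the referring doctor?",
    "What is the patient's name?",
    "Which specialist is the patient being referred to?",
]:
    result = qa(question=question, context=referral_text)
    print(question, "->", result["answer"])
```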
Document Summarization
Oftentimes the purpose of an incoming document is self-explanatory, especially for single-purpose forms that are largely key-value pairs. In clinical practice, however, text-heavy documents such as patient histories or more complex medical referrals may contain key information that cannot be reduced to a simple table. Document summarization models can provide the “CliffsNotes” version of these documents, a handy pre-read before a more thorough review.
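As a sketch of how such a pre-read might be generated, the example below runs a general-purpose summarization model from the Hugging Face transformers library over a fabricated clinical note. Real deployments would add de-identification, domain-tuned models, and clinical review.

```python
# Document summarization sketch: a general-purpose abstractive summarizer produces
# a short pre-read of a longer note. The clinical note below is fabricated.
from transformers import pipeline

summarizer = pipeline("summarization")

history = (
    "The patient is a 58-year-old with a ten-year history of type 2 diabetes, "
    "hypertension, and a prior myocardial infarction in 2019. Recent labs show "
    "worsening renal function, and the patient reports increasing shortness of "
    "breath on exertion over the past three months. Current medications include "
    "metformin, lisinopril, and atorvastatin. Referred for cardiology and "
    "nephrology evaluation to coordinate ongoing management."
)

summary = summarizer(history, max_length=60, min_length=20, do_sample=False)
print(summary[0]["summary_text"])
```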
Document Inquiry
Just as summarizing a document is convenient for longer pieces of content, being able to ask targeted questions of a document can be useful; think of it as having a conversation with your PDF. A well-trained model can understand the data represented in a single document and provide contextual answers to natural language questions asked of it.
Being able to inquire about past hospitalizations, medications used, or tests taken when confronted with a patient’s long medical history can help clinical staff pinpoint facts more quickly and more accurately. Techniques such as Retrieval Augmented Generation (RAG) can not only provide succinct answers to questions asked of a long document, but also provide citations linking to the most relevant portions of the document for additional research and follow-up.
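The sketch below shows the retrieval half of a very simple RAG loop: the record is split into chunks, the chunk most relevant to the question is retrieved with TF-IDF similarity, and a citation-bearing prompt is assembled for a generative model (the model call itself is omitted). The record text and chunking strategy are illustrative assumptions; real systems typically use semantic embeddings so that vocabulary mismatches don’t break retrieval.

```python
# Retrieval sketch for a simple RAG loop: find the chunk of a long record most
# relevant to a question, then assemble a prompt with numbered citations. The
# generative model call is intentionally omitted; the record text is fabricated.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

record_chunks = [
    "2018: Admitted for community-acquired pneumonia, treated with azithromycin.",
    "2021: Started metformin 500 mg twice daily for type 2 diabetes.",
    "2023: Stress echocardiogram normal; no evidence of inducible ischemia.",
]

def retrieve(question, chunks, top_k=1):
    """Return (index, text) for the chunk(s) most lexically similar to the question."""
    matrix = TfidfVectorizer().fit_transform(chunks + [question])
    scores = cosine_similarity(matrix[-1], matrix[:-1]).ravel()
    return [(int(i), chunks[i]) for i in scores.argsort()[::-1][:top_k]]

question = "What medication was started for the patient's diabetes?"
citations = retrieve(question, record_chunks)

prompt = (
    "Answer using only the excerpts below and cite the excerpt number.\n"
    + "\n".join(f"[{i}] {text}" for i, text in citations)
    + f"\nQuestion: {question}"
)
print(prompt)  # this prompt, plus the citations, would be sent to a generative model
```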
Generative Workflow Automation
Generative AI is arguably the sector that has seen the most explosive growth and innovation in the past year or two. Foundation models such as those behind OpenAI’s ChatGPT, Google’s Gemini, and Meta’s LLaMA have made automated content generators, chatbots, and language translators widely available. Similar models are being extended into specialized tasks such as image creation, code generation, and content moderation.
One of the up-and-coming applications of generative AI in the healthcare space is generative workflow automation. By describing the tasks they want to accomplish in natural language prompts, users can generate workflows that automate simple repetitive tasks such as sending email reminders, logging appointments in calendars, and updating patient records. As the technology advances, the tasks that generative workflow automation systems can accomplish will extend beyond simple administrative steps to become more complex and more valuable, saving time and ensuring compliance in the process.
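A sketch of what this could look like in practice: a plain-language request is translated by a generative model into a structured list of workflow steps that an automation engine then executes. The JSON below is hand-written to stand in for a model response, and the step names are hypothetical rather than a real integration API.

```python
# Generative workflow automation sketch: a plain-language request becomes a
# structured list of steps an automation engine can execute. The model call is
# omitted, and model_response is hand-written to illustrate the expected output;
# the step names are hypothetical placeholders, not a real integration API.
import json

request = "When a referral is accepted, email the patient a reminder and log the appointment."

# In practice, `request` would be sent to a generative model along with a prompt
# describing the allowed steps; the JSON below stands in for the model's reply.
model_response = """
[
  {"step": "send_email", "to": "patient", "template": "appointment_reminder"},
  {"step": "create_calendar_event", "calendar": "clinic", "source": "referral"},
  {"step": "update_record", "system": "EHR", "field": "appointment_status"}
]
"""

workflow = json.loads(model_response)
for task in workflow:
    step = task.pop("step")
    # A real engine would dispatch each step to the matching integration here
    print(f"Executing {step} with parameters {task}")
```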
Conclusion
With advancements in available technologies, the expansion of affordable compute resources, and the clever application of solutions to real-world problems, AI has the potential to reduce the wasted effort and frustration in much of healthcare administrivia, allowing healthcare professionals to focus on higher-value activities. It can improve patient outcomes by reducing lag in communications, improving the accuracy of data exchanged from provider to provider, and separating information that needs to be prioritized from the noise. Finally, AI can help take cost out of the aspects of the healthcare system that produce the least value, allowing limited investment to be focused on the areas that most directly impact quality of life for patients and their caregivers.