Using AI to Unlock Hidden Insights Within Unstructured Clinical Trial Data

Clinical trials rely on large volumes of data to demonstrate the safety and efficacy of new therapies. Much of this data is recorded using standard coding systems such as NDC (National Drug Code) and ICD (International Classification of Diseases), making actionable information relatively easy to access. Yet a great deal of critical data remains unstructured, encompassing handwritten notes, image files, and various other formats that have traditionally required more time and effort to analyze.

Understanding and effectively leveraging this unstructured data is crucial for optimizing clinical trial recruitment, as it can help researchers develop more accurate pictures of the patient populations they are seeking to enroll. Newer approaches, powered by artificial intelligence (AI) models built specifically for clinical trial recruitment, can help researchers unlock the important insights within unstructured data.

What is Unstructured Clinical Trial Data?

Unstructured data refers to information that does not follow a predefined model or format, making it more challenging to process and analyze using conventional data systems. In clinical trials, this data often includes:

  • Handwritten notes from physicians and other healthcare providers
  • Medical images and videos such as X-rays, MRI scans, CT scans, and pathology slides
  • Free-text entries in electronic health records (EHRs) such as personal and family history, appointment reasons, progress notes, and discharge notes
  • Third-party reports and files, either as machine-readable text or as images of text (e.g., faxed reports)
  • Patient-reported outcomes including surveys, questionnaires, and diaries
  • Audio recordings of patient interviews and consultations
  • Wearable medical device data from smart watches, fitness trackers, and other IoT-compatible medical devices

Unlike structured data, which fits neatly into databases (e.g., numeric lab results, coded diagnoses), unstructured data is rich in detail but complex to manage.

The Challenges of Unstructured Data in Clinical Trial Recruitment

Inclusion and exclusion criteria for clinical trials often require detailed patient information, much of which may be buried in unstructured data. According to recent independent research by BEKhealth:

  • Patients generate approximately 19,000 words of unstructured text per year, equivalent to a novella’s worth of information annually.
  • Each patient generates roughly 38 unstructured notes per year, including files and other documents. This spans short free-text notes, long-form reports, and everything in between.
  • Patients average 19 diagnoses per year, and 36% of those diagnosis events are documented solely in unstructured notes.
  • Patients average 177 medication exposures/prescriptions each year, and 12% of those exposures are documented solely in unstructured notes.
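
To put those figures in per-patient terms, the short calculation below multiplies the quoted averages by the quoted percentages. It is a back-of-the-envelope sketch that uses only the numbers above. The variable and function names are illustrative, and the words-per-note estimate assumes the 19,000 words and the roughly 38 notes describe the same body of text, which the research summary does not state explicitly.

```python
# Per-patient, per-year figures quoted in the list above.
DIAGNOSES_PER_YEAR = 19
SHARE_DX_ONLY_UNSTRUCTURED = 0.36
MEDICATIONS_PER_YEAR = 177
SHARE_RX_ONLY_UNSTRUCTURED = 0.12
WORDS_PER_YEAR = 19_000
NOTES_PER_YEAR = 38


def events_missed_by_structured_search(events_per_year, share_only_unstructured):
    """Events per year that appear nowhere except in unstructured notes."""
    return events_per_year * share_only_unstructured


missed_dx = events_missed_by_structured_search(
    DIAGNOSES_PER_YEAR, SHARE_DX_ONLY_UNSTRUCTURED
)
missed_rx = events_missed_by_structured_search(
    MEDICATIONS_PER_YEAR, SHARE_RX_ONLY_UNSTRUCTURED
)

# Assumption: all 19,000 words are spread across the ~38 notes.
words_per_note = WORDS_PER_YEAR / NOTES_PER_YEAR

print(f"Diagnosis events only in notes: {missed_dx:.1f}")      # 6.8
print(f"Medication exposures only in notes: {missed_rx:.1f}")  # 21.2
print(f"Rough words per note: {words_per_note:.0f}")           # 500
```

Put differently, a recruitment search limited to structured fields would miss roughly 7 diagnosis events and 21 medication exposures per patient each year.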

This sheer volume of data, coupled with its unstructured nature, poses significant challenges:

  1. Data Overload: With such a high volume of unstructured notes and documents, manual review is impractical. Data growth, which BEKhealth’s own team has seen running at a ~4% compound annual growth rate (CAGR) over the past three years, exacerbates this issue. This is even more daunting considering that healthcare providers are routinely expected to manage care with a paltry 15 minutes per patient.
  2. Complexity: Extracting relevant information from unstructured data requires advanced techniques. Traditional keyword searches or simple algorithms often fall short, as they cannot comprehend context, nuances in medical terminology, acronyms, synonyms, homonyms, and accidental misspellings. 
  3. Time and Resource Intensive: Manual data extraction and analysis are labor-intensive and prone to errors. This can delay patient recruitment, increase costs, and potentially impact the trial’s success.
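
The gap between a plain keyword search and a search that tolerates acronyms, synonyms, and misspellings can be illustrated with a small sketch. It uses only Python's standard library (`re` and `difflib`) and a tiny vocabulary invented for this example. `CONCEPT_TERMS`, `find_concepts`, and the 0.9 cutoff are assumptions made for illustration, not a description of any vendor's system.

```python
import difflib
import re

# Hypothetical mini-vocabulary: each concept maps to a few surface forms.
CONCEPT_TERMS = {
    "type 2 diabetes": ["type 2 diabetes", "type ii diabetes", "t2dm"],
    "hypertension": ["hypertension", "high blood pressure", "htn"],
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def naive_keyword_match(note, keyword):
    """Plain substring search, the baseline that misses variants."""
    return keyword.lower() in note.lower()


def find_concepts(note, cutoff=0.9):
    """Return concepts whose known surface forms approximately appear in the note."""
    tokens = tokenize(note)
    found = set()
    for concept, terms in CONCEPT_TERMS.items():
        for term in terms:
            term_text = " ".join(tokenize(term))
            width = len(term_text.split())
            # Slide a window with the same word count across the note.
            for start in range(len(tokens) - width + 1):
                window = " ".join(tokens[start:start + width])
                score = difflib.SequenceMatcher(None, window, term_text).ratio()
                if score >= cutoff:
                    found.add(concept)
    return found


note = "Pt w/ T2DM and hypertenson, started on metformin."
print(naive_keyword_match(note, "type 2 diabetes"))  # False
print(naive_keyword_match(note, "hypertension"))     # False
print(sorted(find_concepts(note)))  # ['hypertension', 'type 2 diabetes']
```

Even this version has clear limits. The cutoff is deliberately strict because "hypotension" and "hypertension" differ by only a few characters, and nothing here handles negation ("denies hypertension") or family history, which is where context-aware NLP becomes necessary.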

The Advantage of AI and NLP in Analyzing Unstructured Data

AI and natural language processing (NLP) technologies represent new approaches for overcoming these challenges. For example, purpose-built AI and NLP systems for clinical trial recruitment, such as the solution developed by BEKhealth, can analyze unstructured data efficiently, accurately, and comprehensively. Here’s how:

  1. Automated Data Extraction: AI can process vast amounts of unstructured data quickly, identifying and extracting relevant information such as diagnoses, treatment histories, and patient demographics from diverse sources.
  2. Contextual Understanding: NLP algorithms are designed to understand the context and subtleties of human language, making them adept at interpreting medical notes and records. This capability helps ensure that crucial information is not missed and that the extracted data is accurate and relevant.
  3. Enhanced Patient Matching: By comprehensively analyzing patient data, AI systems can match patients to clinical trials more effectively. This helps ensure that eligible patients are identified promptly, reducing recruitment times and improving trial outcomes.
  4. Scalability: AI solutions can handle the increasing volume of unstructured data, scaling effortlessly to accommodate the growing demands of clinical trials. This scalability is essential as the amount of unstructured data continues to rise.
  5. Cost Efficiency: Automating the data analysis process reduces the need for extensive manual labor, cutting down costs and allowing clinical trial coordinators to focus on more strategic tasks.
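
The patient-matching step can be sketched as a check of each patient against a trial's inclusion and exclusion criteria, run once with structured diagnoses only and once with diagnoses extracted from notes added in. This is a simplified illustration under stated assumptions: the `PatientProfile` and `TrialCriteria` classes, their fields, and the sample patients are hypothetical, and a real system would weigh many more criteria than age and diagnosis.

```python
from dataclasses import dataclass, field


@dataclass
class PatientProfile:
    patient_id: str
    age: int
    structured_diagnoses: set = field(default_factory=set)
    # Diagnoses found only in free text (hypothetical output of an NLP step).
    extracted_diagnoses: set = field(default_factory=set)


@dataclass
class TrialCriteria:
    min_age: int
    max_age: int
    required_diagnoses: set
    excluded_diagnoses: set


def is_eligible(patient, criteria, use_unstructured=True):
    """Apply inclusion criteria first, then exclusion criteria."""
    diagnoses = set(patient.structured_diagnoses)
    if use_unstructured:
        diagnoses |= patient.extracted_diagnoses
    if not criteria.min_age <= patient.age <= criteria.max_age:
        return False
    if not criteria.required_diagnoses <= diagnoses:
        return False  # a required diagnosis is missing
    if criteria.excluded_diagnoses & diagnoses:
        return False  # an exclusion diagnosis is present
    return True


trial = TrialCriteria(
    min_age=18,
    max_age=75,
    required_diagnoses={"type 2 diabetes"},
    excluded_diagnoses={"heart failure"},
)

# Qualifying diagnosis is documented only in notes.
patient_a = PatientProfile("A", 54, {"hypertension"}, {"type 2 diabetes"})
# Exclusion diagnosis is documented only in notes.
patient_b = PatientProfile("B", 61, {"type 2 diabetes"}, {"heart failure"})

for patient in (patient_a, patient_b):
    print(
        patient.patient_id,
        is_eligible(patient, trial, use_unstructured=False),
        is_eligible(patient, trial, use_unstructured=True),
    )
# A False True
# B True False
```

The two sample patients show both failure modes of a structured-only search: patient A is wrongly skipped, and patient B is wrongly flagged as eligible until the note-derived exclusion is taken into account.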

Pressure to accelerate clinical trial timelines, while simultaneously making studies more accessible to broader, more diverse groups of patients, underscores the importance of using all the data at our disposal to improve clinical trial recruitment. AI and NLP technologies offer solutions that can handle the complexity and volume of unstructured clinical data efficiently so that researchers can get insights more quickly.

As the amount of unstructured data continues to grow, technologies like BEKhealth’s purpose-built patient recruitment AI model will become increasingly indispensable, driving faster, more accurate patient recruitment and, ultimately, reducing operational risk and improving the likelihood of successful clinical trials.

