What Does Clinical Trial Tokenization Mean for Real-World Data and Long-Term Patient Follow-Up?

Novo Nordisk’s Real World Data Strategy Lead, Tom Dougherty, walks through clinical trial tokenization: where it is being used, why it’s such a potential value-add to trials and programs, and what life science organizations must consider when preparing for the inclusion of tokenization.

December 10, 2025
Why is tokenization important for the future of medicine development?

A tremendous amount of data is collected throughout a clinical trial, but prior to tokenization, the life science industry did not have a way to continuously capture data in the real-world setting. Integrating clinical trial data and real-world data through tokenization is a significant advancement in clinical research: it unlocks the ability to capture long-term follow-up data on patients. By linking diverse datasets with patient consent and while maintaining patient privacy, researchers can realize invaluable insights into patient journeys, treatment outcomes, and healthcare utilization.

The use of tokenization can also help validate real-world endpoints against the gold-standard trial data, which then improves decision-making and increases confidence in future integration of real-world evidence. 


What’s the value of tokenization to a trial or a program? 

It can be difficult to put a number on the return on investment, but trial tokenization generates a lot of utility and value. From a study team perspective, you can confirm clinical and medication history to help fill in the gaps. With increasing demand for post-marketing commitments, post-marketing requirements, and long-term safety data, tokenization supports comprehensive evidence collection across trial participants. I also see value in long-term patient follow-up: tokenization and real-world data (RWD) linkage reduce site and patient burden by allowing sponsors to study efficacy and safety outcomes through passive data collection beyond the life cycle of a trial.


How does tokenization work? 

Tokenization transforms personally identifiable information (PII), which typically includes first name, last name, gender, date of birth, and zip code, into a unique identifier referred to as a token. Real-world data providers also have access to PII and generate the same token from it, which enables a probabilistic match of trial data to real-world data. By matching tokens, researchers can link disparate data sources such as clinical trial data, EHRs, claims data, and registries, creating a more comprehensive view of patient journeys without ever exposing the patient's sensitive information.
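
To make the mechanics concrete, here is a minimal Python sketch of the general idea, not any vendor's actual method. It assumes a keyed hash (HMAC-SHA256) over normalized PII fields, and the names `SECRET_KEY`, `normalize`, `make_token`, and `link_by_token` are hypothetical, introduced only for illustration. It performs a simple exact match on tokens; the probabilistic matching mentioned above is more involved and is not reproduced here.

```python
import hashlib
import hmac
import unicodedata

# Hypothetical shared secret. In practice the key would be managed by the
# tokenization provider and never hard-coded; this value is an assumption.
SECRET_KEY = b"example-key-not-for-production"


def normalize(value: str) -> str:
    """Lowercase, drop accents, and keep only letters and digits so that
    formatting differences between data sources do not change the token."""
    decomposed = unicodedata.normalize("NFKD", value)
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    return "".join(ch for ch in ascii_only.lower() if ch.isalnum())


def make_token(first_name, last_name, gender, date_of_birth, zip_code,
               key: bytes = SECRET_KEY) -> str:
    """Turn the five PII fields named in the article into one opaque token."""
    fields = (first_name, last_name, gender, date_of_birth, zip_code)
    message = "|".join(normalize(f) for f in fields).encode("utf-8")
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def link_by_token(trial_tokens, rwd_rows):
    """Join trial subjects to real-world records that carry the same token.

    trial_tokens: dict mapping subject id -> token
    rwd_rows: iterable of (token, record) pairs from an RWD source
    Returns a dict mapping subject id -> list of matching records.
    """
    by_token = {}
    for token, record in rwd_rows:
        by_token.setdefault(token, []).append(record)
    return {subject: by_token.get(token, [])
            for subject, token in trial_tokens.items()}


if __name__ == "__main__":
    # The trial site and the RWD provider each tokenize their own copy of
    # the PII; only the tokens are compared, never the PII itself.
    trial_tokens = {
        "SUBJ-001": make_token("José", "O'Neil", "M", "1980-04-12", "02139"),
    }
    rwd_rows = [
        (make_token(" jose ", "ONEIL", "m", "1980/04/12", "02139"),
         {"source": "claims", "event": "hospitalization"}),
        (make_token("Ana", "Silva", "F", "1975-01-30", "10001"),
         {"source": "ehr", "event": "office visit"}),
    ]
    print(link_by_token(trial_tokens, rwd_rows))
```

In this sketch the two parties only agree on the normalization rules and the key, so records for the same person yield the same token even when names or dates are formatted differently, while the linked output contains no PII.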


What types of trials see immediate benefit from tokenization? 

From what I have seen over the years across the life science industry, the majority of trials being tokenized are later phase, with larger participant populations. However, there is a trend of adoption in Phase I and Phase II programs. There is likely an immediate benefit from tokenization regardless of the trial, but rare disease trials are a notable case: they face unique challenges due to their smaller patient populations. In these trials, every enrolled patient is a critical source of data, so by tokenizing and linking to RWD, sponsors can evaluate disease progression and treatment outcomes without adding burden on the site and patient.


What is the actual usage of tokenization in trials today? 

What I've seen at recent conferences is that some organizations are in the early stages, piloting 1-2 trials, while other companies are going all in and tokenizing every trial they have. With hundreds of trials likely having utilized tokenization to date, I am eager to see publications on how sponsors are using the tokenized data and whether it is suitable for regulatory submission.


What would be your advice for others trying to implement tokenization into their trial?

Start planning at least 3-6 months before your trial starts. This will enable you to form the right team to drive the activity, create the required documents such as the informed consent and protocol, and develop tools such as site-facing and patient-facing materials. Engage stakeholders across the organization, such as legal, privacy, trial operations, RWE teams, data management, and IT, to ensure everyone is on the same page.
