Medidata Blog

Don't Let Data Queries Slip Through The Cracks



Recent findings from Medidata Trial Assurance highlight the power of machine learning algorithms and their impact on pharmaceutical companies seeking greater capabilities to ensure data quality.

Trial Assurance is a service from Medidata’s team of former FDA statistical reviewers, who use Medidata Centralized Statistical Analytics (CSA) technology to deliver analyses on clinical trials. The analyses are driven by CSA’s machine learning algorithms, which raise data quality to levels rarely seen in drug development.

When CSA is demonstrated to potential clients, their immediate impressions are of a technology that can transform the way they manage and clean data. They see that CSA identifies data anomalies faster than anything they can do manually – and that it picks up on the types of issues that are nearly impossible to catch when using today’s processes. 

We often find that CSA picks up on data issues that existing data management and query processes have already flagged, yet those standard query processes never actually cleaned the data. The traditional system is broken. Let’s look at two examples.

One of the major features of CSA is that users can see all queries at the patient level. This is a game changer for a statistical analytics tool because it gives reviewers direct visibility into the eCRF data within Rave, so they can tell whether an issue is new or one that has already been handled, and how it was resolved.

In our first example, CSA recently found that a value for one trial subject’s temperature reading (98) was much higher than the other temperature readings in the study.  In the graph below, the red dot shows a temperature reading value that is clearly an outlier compared to all other values on the graph.


Using CSA, the reviewer was able to view the queries for this subject and found that the data point had been queried for being out of range. But the query history showed that the site responded that the value was correct and did not change it, and the query was closed. The data had not been changed when it clearly should have been. One possible explanation is that the site entered the temperature in Fahrenheit when it should have been entered in Celsius.
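To make the temperature example concrete, here is a minimal Python sketch of one common way to flag such a reading. It is not CSA's algorithm, which this post does not describe. It uses a median-based (modified z-score) check and then tests whether the flagged value becomes plausible once converted from Fahrenheit. The readings other than 98, the 3.5 threshold, the 35 to 42 degree Celsius range, and all function names are assumptions made for illustration.

```python
from statistics import median


def flag_outliers(values, threshold=3.5):
    """Return indices of values whose modified z-score exceeds the threshold.

    Uses the median and the median absolute deviation (MAD), which are
    less distorted by the outlier itself than the mean and standard deviation.
    """
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        # All typical values are identical; anything different stands out.
        return [i for i, v in enumerate(values) if v != med]
    return [
        i for i, v in enumerate(values)
        if 0.6745 * abs(v - med) / mad > threshold
    ]


def fahrenheit_to_celsius(f):
    return (f - 32) * 5 / 9


def looks_like_fahrenheit(value, low_c=35.0, high_c=42.0):
    """True if the value is implausible as Celsius but plausible as Fahrenheit.

    The 35-42 range is an assumed plausibility window, not a clinical rule.
    """
    plausible_as_celsius = low_c <= value <= high_c
    plausible_if_converted = low_c <= fahrenheit_to_celsius(value) <= high_c
    return (not plausible_as_celsius) and plausible_if_converted


# Made-up study readings in Celsius, with one value of 98 as in the example.
readings = [36.6, 36.8, 37.0, 36.5, 98.0, 37.2, 36.9]

for i in flag_outliers(readings):
    hint = "possible Fahrenheit entry" if looks_like_fahrenheit(readings[i]) else "unexplained"
    print(f"reading {i}: {readings[i]} flagged ({hint})")
```

A check like this only suggests a likely cause. The value still has to be confirmed and corrected at the site, which is exactly the step that was missed here.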

In our next example, a CSA graph of respiration vital sign readings clearly shows that two data points are outliers in the data.  Both data points were from the same site, and both were queried.


Unfortunately, the site did not change the value of the data points in response to the query. Instead, the site wrote a query response: “Done Data.” What exactly does that mean? And again in this example, the query was closed with no improvement to the data quality.
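Both examples share a pattern: a query was raised, then closed, and the data stayed the same. As a rough illustration, that pattern can be expressed in a few lines of Python. The `QueryRecord` structure, its field names, and the sample values below are hypothetical and do not reflect the actual Rave or CSA data model. Only the "Done Data." response comes from the example above.

```python
from dataclasses import dataclass


@dataclass
class QueryRecord:
    # Hypothetical shape of a query history entry, for illustration only.
    subject_id: str
    field_name: str
    value_before: float
    value_after: float
    status: str
    response: str


def closed_without_change(queries):
    """Return closed queries whose data value was left unchanged."""
    return [
        q for q in queries
        if q.status == "closed" and q.value_before == q.value_after
    ]


# Made-up records: two respiration queries closed with no change,
# and one query where the site corrected the value.
history = [
    QueryRecord("S-001", "respiration", 60.0, 60.0, "closed", "Done Data."),
    QueryRecord("S-002", "respiration", 58.0, 58.0, "closed", "Done Data."),
    QueryRecord("S-003", "respiration", 55.0, 16.0, "closed", "Corrected."),
]

for q in closed_without_change(history):
    print(f"{q.subject_id} {q.field_name}: closed with no change, response: {q.response!r}")
```

Not every unchanged value is an error, since a site may legitimately confirm an unusual reading. A list like this is a starting point for review, especially when the value is also a statistical outlier.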

In both of these examples, the findings are straightforward, and traditional data management practices can often find such errors fairly easily. The value of CSA is that it is easy to see when a value is an outlier, and easy to follow the query history and resolution for each data variable.

The purpose of current data management practices and the use of a tool such as CSA is to increase the quality of clinical trial data.  If standard practices are not working to achieve this outcome, then taking an innovative approach like CSA can help get you there.

Trial Assurance allows Medidata’s team of former FDA reviewers to conduct analysis directly in CSA. This way, you can see the final results and recommendations that CSA will bring to your organization.

Join us for dinner in Philadelphia or London where our industry experts will share how Centralized Statistical Analytics (CSA) is being used to detect data anomalies and outliers that may indicate procedural errors, protocol compliance issues, investigator deficiencies or fraud in clinical trials.


Jacob Angevine