A researcher’s worst nightmare is to realize that they cannot use their data. This realization often comes right after data has been collected, teams have returned from the field, and resources have been spent. At that point it is too late and the damage is done: data with inconsistent or invalid fields cannot be corrected and must be discarded.
‘Data quality’ is an umbrella term that refers to the degree to which a piece of qualitative or quantitative information can be used in “operations, decision making and planning”. Researchers strive for data of the highest quality, with as few errors, missing fields and inconsistencies as possible.
With computer-assisted personal interviewing (CAPI) or electronic data capture (EDC) applications, researchers can improve the quality of their data. These applications aim to reduce entry errors through automated data validation and to eliminate transcription errors, two issues that have plagued paper-based forms for years.
By implementing data validation rules, researchers can confirm that their data matches specific criteria or that required fields have not been left empty. However, data validation can go wrong in two ways:
- Team members who are collecting data see a warning that a particular field does not match the validation criteria, but they are still able to save the incorrect data.
- Team members are prevented from advancing or saving data until an out-of-range field is changed, even though the value they entered was in fact correct.
Still, data validation is a solid first step in data accuracy. The second step is query management, and this functionality, found in many data capture tools, helps close the feedback loop in data quality.
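To make the two failure modes concrete, here is a minimal sketch of range and required-field validation. The `FieldRule` class, the `validate` function, the field names and the ranges are all hypothetical and are not taken from any particular data capture tool. The `blocking` flag shows the trade-off: a warning-only rule lets bad data through, while a blocking rule can stop a value that is unusual but true.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional, Tuple


@dataclass
class FieldRule:
    """Hypothetical validation rule for one numeric form field."""
    name: str
    required: bool = True
    minimum: Optional[float] = None
    maximum: Optional[float] = None
    blocking: bool = False  # False: warn only; True: refuse to save


def validate(record: Dict[str, Any], rules: List[FieldRule]) -> List[Tuple[str, str, bool]]:
    """Return (field, message, blocking) for every rule the record violates."""
    problems = []
    for rule in rules:
        value = record.get(rule.name)
        if value is None or value == "":
            if rule.required:
                problems.append((rule.name, "required field is empty", rule.blocking))
            continue  # nothing to range-check on an empty field
        if rule.minimum is not None and value < rule.minimum:
            problems.append((rule.name, f"{value} is below {rule.minimum}", rule.blocking))
        if rule.maximum is not None and value > rule.maximum:
            problems.append((rule.name, f"{value} is above {rule.maximum}", rule.blocking))
    return problems


# Illustrative ranges only; real limits depend on the study protocol.
rules = [
    FieldRule("age", minimum=0, maximum=120, blocking=True),
    FieldRule("systolic_bp", minimum=70, maximum=200),  # warn only
]
problems = validate({"age": 210, "systolic_bp": None}, rules)
can_save = not any(blocking for _, _, blocking in problems)
```

In this sketch the record cannot be saved because the age rule is blocking, while the empty blood pressure field only produces a warning.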
What is a query management system?
Query management is the ability of a data collection system to identify data entries with issues and isolate them in a report. For every out-of-range or inconsistent value, the data capture tool generates a data query. Each data issue becomes an entity in itself and can therefore be tracked over time to see whether it is still open or has been resolved by someone on the team. A query management system substantially reduces the risk of invalid data going unnoticed.
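The idea of a query as a trackable entity can be sketched as a small record with a status. Everything here (`DataQuery`, `QueryStatus`, `open_queries`, the record IDs and user name) is a hypothetical illustration, not the data model of any specific platform.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum
from typing import List, Optional


class QueryStatus(Enum):
    OPEN = "open"
    CORRECTED = "corrected"  # the value was wrong and has been fixed
    CONFIRMED = "confirmed"  # the value looked odd but was verified as accurate


@dataclass
class DataQuery:
    """Hypothetical record of one flagged value."""
    record_id: str
    field_name: str
    value: object
    reason: str
    status: QueryStatus = QueryStatus.OPEN
    opened_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
    resolved_by: Optional[str] = None

    def resolve(self, status: QueryStatus, user: str) -> None:
        """Close the query as either corrected or confirmed."""
        if status is QueryStatus.OPEN:
            raise ValueError("a query can only be resolved as corrected or confirmed")
        self.status = status
        self.resolved_by = user


def open_queries(queries: List[DataQuery]) -> List[DataQuery]:
    """The report an administrator would review: everything still unresolved."""
    return [q for q in queries if q.status is QueryStatus.OPEN]


queries = [
    DataQuery("P-001", "age", 210, "outside 0-120"),
    DataQuery("P-002", "systolic_bp", 230, "outside 70-200"),
]
# A team member verifies that the blood pressure reading was accurate.
queries[1].resolve(QueryStatus.CONFIRMED, user="field_nurse")
remaining = open_queries(queries)
```

Because each query keeps its own status, the open list shrinks as the team works through it, and a confirmed value is not flagged as an error again.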
Without a query module, researchers often rely on Excel or SPSS to scan their collected data for erroneous entries. This quality control usually occurs after data collection has finished, which presents a real problem in humanitarian projects where returning to a subject to corroborate a suspicious data entry may not be feasible.
A query management system allows data quality control to take place while data is being entered. If invalid data is collected, the project administrator can quickly spot the issue and confirm with the team that is still in the field whether the information is correct.
Query management can be found on most validated electronic data capture (EDC) platforms, such as REDCap and our technology, Teamscope.
Key benefits of using data queries
A data query management system offers three key benefits:
- Data queries can be triggered by an issue in a single variable (e.g. an age of “210”) and also across variables to catch impossible combinations (e.g. age “4” and education level “university”).
- Once a query is open, a user can either correct the mistake, for example an out-of-range blood pressure, or confirm that the value is indeed accurate.
- In medical research, data queries can be used to initiate patient referral workflows. For example, a validation rule can be built to trigger whenever a patient matches specific criteria. If an entry matches those criteria, the case can easily be singled out within the data collection platform.
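The single-variable, cross-variable and referral cases above can all be expressed as rules over a record. The sketch below is an assumption-laden illustration: the `Rule` type, the `triggered` helper, the field names and every threshold (age under 16, systolic pressure of 180 or more) are hypothetical and chosen only to mirror the examples in the list.

```python
from typing import Any, Callable, Dict, List, NamedTuple


class Rule(NamedTuple):
    """Hypothetical rule: fires when is_triggered(record) returns True."""
    name: str
    is_triggered: Callable[[Dict[str, Any]], bool]
    kind: str  # "query" for data issues, "referral" for clinical follow-up


RULES = [
    # Single-variable check: the value alone is implausible.
    Rule("age_out_of_range", lambda r: not 0 <= r["age"] <= 120, "query"),
    # Cross-variable check: each value is plausible, the combination is not.
    Rule("child_with_university_education",
         lambda r: r["age"] < 16 and r["education"] == "university", "query"),
    # Referral rule: the data may be valid, but the patient needs follow-up.
    Rule("high_blood_pressure_referral", lambda r: r["systolic_bp"] >= 180, "referral"),
]


def triggered(record: Dict[str, Any], kind: str) -> List[str]:
    """Names of all rules of the given kind that fire for this record."""
    return [rule.name for rule in RULES
            if rule.kind == kind and rule.is_triggered(record)]


record = {"age": 4, "education": "university", "systolic_bp": 185}
data_queries = triggered(record, "query")
referrals = triggered(record, "referral")
```

Keeping data queries and referrals as separate kinds means a referral does not imply that the entry is wrong, only that the case should be singled out.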
Please use data validation and data queries
We often hear “What isn’t measured, cannot be improved”. I argue that this is no longer true: how things are measured, and the quality of that information, is more critical. Invalid or inaccurate data hinders a researcher’s ability to make decisions and reduces the data’s future usability.
Query management within data collection tools is an effective way to support research teams and to flag a value that could be wrong as it is being entered. On its own it is not enough, but a query management system is a useful tool within a researcher’s data quality arsenal.
By Diego Menchaca, founder of Teamscope, a data capture application for clinical research