Data integrity is a significant and common bottleneck in research, with the volume of data outpacing the ability to identify and process it. One of the issues faced by clinical trials stems from the inaccuracy and incompleteness of the data collected, both at the point of collection and over the course of its lifecycle.
Studies can be rendered ineffective by just a handful of important data points being collected or stored badly, so the importance of data integrity cannot be overstated. In the 1990s, the FDA introduced a set of key principles for data collection and recording that aim to minimize errors and the loss of data points, increasing the power of the data available to researchers and reducing the bottlenecks and cost of bad data.
These principles are known by the acronym ALCOA, and have since been expanded upon. In this article, we’re going to go over the ALCOA meaning, how it applies to the data integrity lifecycle, and how ALCOA clinical research best practices have expanded to ALCOA+.
The ALCOA Meaning and ALCOA Data Integrity Principles
ALCOA, or ALCOA-C, is the acronym applied to the principle of supplying good data quality from source documents in clinical trials. This covers both electronic and paper sources of data and is applied as a set of standards by the FDA and the EMA.
The acronym ALCOA, meaning Attributable, Legible, Contemporaneous, Original, and Accurate, strives to cover all bases of data integrity but has been expanded upon, as we will discuss below. First, here are the components of ALCOA in detail:
- Attributable: This point is about making sure the person who created or edited the document is recorded, so that the data it holds can be attributed to someone and there is a paper trail. There should be enough information for auditors to see when and why any changes were made.
- Legible: Documents must be filled out clearly and with the intent to be understood. Numbers and letters must be distinguishable from each other, and the clarity of the data must be ensured into the future, too.
- Contemporaneous: Data must be recorded at the time it is generated or observed, along with a timestamp attributing it to that moment in time. Where this is impossible, there must be a note attached explaining why it was entered late.
- Original: Maintain and report original documents or copies that have been authorized. Informal records are a potential security or integrity breach and cannot be counted. With paper documents, this means keeping them in their original state.
- Accurate: Data entered must be error-free and accurately represent findings. Any amendments need to be accompanied by notes explaining when and why the amendment happened. Data needs to be a thorough and honest expression of the conduct of the study itself, acting as evidence of the events that took place.
These are the fundamental tenets of data integrity for clinical trials, and from there, other governing body standards have been added for a more comprehensive practice. These cover even more bases of data security over the long term and employ a greater focus on longevity and accessibility.
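For electronic records, the five ALCOA attributes map fairly naturally onto the fields of a record plus a few checks at entry time. The following is a minimal sketch in Python, not a validated system: the class name `SourceRecord`, its field names, and the 15-minute lateness threshold are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional

# Hypothetical threshold: entries made later than this after the
# observation need an explanatory note (Contemporaneous).
LATE_ENTRY_THRESHOLD = timedelta(minutes=15)


@dataclass(frozen=True)  # frozen: a stored record cannot be edited in place
class SourceRecord:
    value: str                  # the observation itself (Accurate, Legible)
    recorded_by: str            # person, device, or system (Attributable)
    observed_at: datetime       # when the measurement was taken
    entered_at: datetime        # when it was written down
    is_original: bool = True    # False only for an authorized copy (Original)
    late_entry_reason: Optional[str] = None

    def __post_init__(self) -> None:
        if not self.recorded_by.strip():
            raise ValueError("record must be attributable to someone")
        for stamp in (self.observed_at, self.entered_at):
            if stamp.tzinfo is None:
                raise ValueError("timestamps must carry a time zone")
        if self.entered_at < self.observed_at:
            raise ValueError("entry cannot precede the observation")
        late = self.entered_at - self.observed_at > LATE_ENTRY_THRESHOLD
        if late and not self.late_entry_reason:
            raise ValueError("late entry needs a note explaining the delay")
```

Keeping the observation time and the entry time as separate fields makes a late entry visible rather than hidden, and the record refuses to exist without an author or an explanation for the delay.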
ALCOA+ or ALCOA-C?
ALCOA-C refers to the first addition to the ALCOA principle: “Complete”. From there, even more points have been developed, to the point where many now follow the ALCOA+ principle, which we will detail below:
- Complete – This is an extension of the contemporaneous point, in that data should be filled in to completion at the time of recording. It goes on to state that there should be no deletions after the fact. Again, any changes that are made need to follow this and the above principles.
- Consistent – Another addendum to the above points, consistent data should be arranged in chronological order with no missing pieces. They should follow all of the ALCOA+ principles, all of the time.
- Enduring – This point provides an emphasis on the long-term reliability of the data. Readability and processing should be maintained for the duration of the requirement of the data, sometimes years or even decades into the future.
- Available – There should be no need for a lengthy hunt for the data or the files it’s kept on. Data should be stored in a way that is easily accessible and can be found when needed, in a short amount of time. For example, FDA audits commonly involve a 30-minute time limit from the request of certain documents to their presentation for audit. If the time limit expires, the file is logged as missing.
The above are the four official additions to ALCOA that make up ALCOA+. Some sources will include the following:
- Credible – The data should come from a reliable source, within a reliable time frame from measurement. There is also a need to store data in the most efficient way possible, without resorting to multiple entry points that increase the risk of error.
- Corroborated – Where possible, all data should be backed up with evidence. The attached printouts should match and support the submitted data.
As you can see, there is significant overlap between some of these additional points, and different practitioners may modify the principles with their own addenda. However, the approach is set up to cover all bases of data entry and ensure its lasting security.
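Of these additions, Complete and Consistent lend themselves most readily to automated checks. As a rough sketch, assuming each entry is a dictionary and that the field names in `REQUIRED_FIELDS` are hypothetical stand-ins for whatever a real case report form requires, a check might look like this:

```python
from datetime import datetime

# Hypothetical set of fields every entry must carry.
REQUIRED_FIELDS = ("subject_id", "value", "recorded_by", "timestamp")


def find_problems(entries: list[dict]) -> list[str]:
    """Return human-readable findings; an empty list means no issue found."""
    problems = []
    previous = None
    for index, entry in enumerate(entries):
        # Complete: no required field is missing or left blank.
        for name in REQUIRED_FIELDS:
            if entry.get(name) in (None, ""):
                problems.append(f"entry {index}: missing {name}")
        # Consistent: timestamps run in chronological order.
        stamp = entry.get("timestamp")
        if isinstance(stamp, datetime):
            if previous is not None and stamp < previous:
                problems.append(f"entry {index}: out of chronological order")
            previous = stamp
    return problems
```

The function only reports findings and never alters the entries, so any fix still has to go through a documented correction.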
In a moment, we’ll look at the official points in ALCOA+ and go into some ways in which they can be applied. First, though, it’s a good idea to consider the data integrity lifecycle, and how it’s affected by the ALCOA+ principles.
ALCOA Data Integrity Lifecycle
The data integrity lifecycle follows a series of stages that can span decades. Its length is determined by the requirements of the data; with clinical trials commonly spanning 15 years or more, it’s not uncommon for data in clinical research to remain relevant for 30+ years.
In research, there are two standard phases of data: the active phase and the inactive phase. The active phase comprises the following stages:
- Data acquisition – this is the moment of creation for the data point and represents its conception. From generation to collection, this step covers the direct effects of whatever is being measured, its measurement, and the recording of that measurement.
- Data storage – The collected data is kept safe and accessible in order for it to function as intended. It is stored in a way that prevents decay and manipulation or any alteration so that it is accurate for use.
- Data usage – Data then goes from its raw form into something more useful, via data wrangling, or cleaning. In some cases, the data is also compressed or encrypted, or otherwise processed. This is the stage where the data are viewed and analyzed and can be used for decision-making.
- Data reporting – Sharing and publishing the data with stakeholders, employees, or patients represents a vulnerable point in the integrity lifecycle. Depending on the stakeholder involved, the data needs to be handled with caution and under strict security guidelines.
- Short-term retention – Around this time, the data needs to be kept safe for easy and rapid access while it’s still highly relevant to the needs of the researchers. This is the final stage of the active phase of its lifecycle before it’s securely archived.
From here, the data enters the second, inactive phase. This involves the following stages:
- Archiving – The data has served its purpose, but it must now be kept for auditing and future reference, for assessing the integrity of processes, and for access by relevant bodies. Archiving maintains the integrity of the data for its long-term use under these circumstances and needs to be set up with this in mind.
- Auditing – Toward the end of its lifecycle, the data may be referred to for audits and tied to other processes to check the accuracy of record keeping and to monitor adherence to the research processes. Numerous stakeholders are involved in ensuring policies and processes were followed as agreed upon, and the data serves as a guarantee that this is the case.
- Destruction – Once the data reaches the point at which it is no longer relevant, it has completed its lifecycle. For security and privacy reasons, as well as resource-savings, the data must be destroyed securely.
Each of these stages is simplified and can be broken into subcategories for a more detailed coverage of the integrity lifecycle, but it should become clear that ALCOA+ represents the facilitation of this integrity throughout the lifecycle of the data. To understand exactly how this might happen, take a look at some of the best practices of ALCOA+ principles.
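The two phases and their stages can be modeled as a simple ordered sequence. The sketch below assumes a strictly linear lifecycle, which is a simplification: in practice stages can overlap or repeat (data may be audited more than once, for instance). The names `Stage`, `phase`, and `next_stage` are illustrative only.

```python
from enum import IntEnum


class Stage(IntEnum):
    ACQUISITION = 1
    STORAGE = 2
    USAGE = 3
    REPORTING = 4
    SHORT_TERM_RETENTION = 5
    ARCHIVING = 6
    AUDITING = 7
    DESTRUCTION = 8


def phase(stage: Stage) -> str:
    """Active phase runs through short-term retention; the rest is inactive."""
    return "active" if stage <= Stage.SHORT_TERM_RETENTION else "inactive"


def next_stage(stage: Stage) -> Stage:
    """Advance one step; destroyed data has no further stage."""
    if stage is Stage.DESTRUCTION:
        raise ValueError("destruction is the end of the lifecycle")
    return Stage(stage + 1)
```

Tagging each dataset with its current stage makes it easier to decide which storage and access rules apply to it at a given time.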
ALCOA Clinical Research Best Practices as ALCOA+
The pointers above are relatively simple. But how does ALCOA in clinical research actually look? Here, we’ll go into some of the best practices for implementing ALCOA+ in the trial process, broken down in the same order as above:
- Attributable – There has to be a delegation of authority log established for attributable data logging to be ensured. These can be simplified using tools such as eBinders, and by making the process fully electronic, but where this isn’t possible, it’s still necessary to record the identity of the person, sensor, system, or device that collected the data, the specific source, and the date and time of recording. These are necessities regardless of how the data was generated. Most systems and apps have this capacity already, so for electronic data, this shouldn’t be an issue. However, paperwork templates should all factor in space for these details.
- Legible – The long-term legibility of data is also a lot easier with electronic systems. For handwritten records, documents must be second-checked for clarity and stored in a manner that ensures the maintenance of that clarity. Using robust materials for long-term legibility on paper is essential: indelible ink, suitable paper, and other considerations for not only the clarity of the data in the immediate term but its longevity in storage. Another factor to consider is the standardization of language and dating systems. Local jargon especially may change over time, so consider simplifying and future-proofing the language throughout the entire organization. This applies both to digital and manual data collection and recording.
- Contemporaneous – Again, electronic systems can automatically add a time stamp, which makes this point a lot simpler with digitized recording methods. However, clocks need to be accurate and maintained, time zones need to be considered and added where necessary, and time stamps on data must not be affected by pending processes; they must represent the time that the data point was taken, not the time that it was recorded. For manual record keeping, this has even more relevance. The principle should ensure that corrections are minimized and data is never created or updated after the fact, since late entries increase the possibility of error.
- Original – For both electronic and handwritten documents, all data should be recorded onto an original document. On paper, there should be no note keeping on scraps of paper, and all original documents should be made available in time for the contemporaneous recording of data. If copies need to be made, they must be authorized as legitimate copies, clearly distinguishable from original documents, and never stored as a replacement for an original document. Original data should be secured in these ways to ensure that it cannot be changed by any means or for any reason without the appropriate procedures and notifications attached to the modification.
- Accurate – This leads on from the previous point; that any data changes must be appropriately documented in a manner that leads back to the original document or data point. Where mistakes are made, nothing should be blocked out or erased; instead, a correction should be clearly made and a reason for the correction, along with its attribution to the person making the correction and the time and date, should be present. For example, if a letter or number is entered incorrectly or in an unclear manner, a single strikethrough shows the data as it was entered, and clearly shows that it has been corrected. The corresponding correction can be written alongside the strikethrough, with an explanation where possible, or a numbered asterisk leading to the corresponding asterisk on the designated space for corrections. Therefore, this accuracy also applies to correction; under no circumstances should the original error be scrubbed out; it must remain legible.
- Complete – This is a relatively simple point to make, but the focus should include the paper trail, and this can be tested quite simply. Each document should be complete enough to lead an inquiry directly to the point of data collection and the person or system recording it. This includes the trail of changes where necessary, the changes to templates for paperwork, and so on. For example, if a data point was recorded on a machine that no longer exists, the paperwork should be complete enough for an auditor to follow version number changes back to the machine logs and calibration records.
- Consistent – Consistency expands upon some of the practices described in the contemporaneous section: the data must be entered in a way that is uniform across the sites and organizations involved. Machines and paperwork templates, as well as handwriting policies, should be uniform, and individuals should be trained in the correct way to fill out paperwork. This involves setting a unified date format (e.g., 01/JAN/2022), which eliminates conflicting formats, and ensuring that the time stamps on documents follow a chronological order. Changes to data should also be timestamped using the same policy.
- Enduring – Data backups, cyber security, and disaster recovery plans are all part of ensuring the longevity of electronic data. The same scrutiny should be placed upon the storage of physical data, with archives being designed in a way that preserves the integrity of files. Consistent power supplies, temperature control, and stringent access restrictions are all common ways to address the long-term storage of physical data.
- Available – While the storage must be secure, it cannot substantially restrict the access to documents where needed. For any justifiable purpose, the data must be accessible within a reasonable time frame. This means recording, storing, and logging the location of the documents as they are secured. This log itself needs to be user-friendly and qualified people need to be competent in using it to recover files quickly and effectively where necessary. Documents are sometimes stored in more complex systems as they age; with the most relevant and urgently accessible files being stored in organized folders, and the more distant and less-relevant being archived. Either way, they need to remain available until the duration of their storage requirements has passed.
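The single-strikethrough rule described under Accurate has a direct electronic analogue: corrections are appended to a history and nothing is overwritten. The following Python sketch illustrates the idea under stated assumptions; the `DataPoint` and `Version` classes and their method names are hypothetical and not taken from any particular system.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class Version:
    value: str
    author: str
    timestamp: datetime
    reason: str  # empty for the original entry


@dataclass
class DataPoint:
    history: list[Version] = field(default_factory=list)

    def record(self, value: str, author: str) -> None:
        if self.history:
            raise ValueError("already recorded; use correct() instead")
        self.history.append(
            Version(value, author, datetime.now(timezone.utc), ""))

    def correct(self, value: str, author: str, reason: str) -> None:
        if not self.history:
            raise ValueError("nothing to correct")
        if not reason.strip():
            raise ValueError("a correction needs a documented reason")
        # Append only: the earlier value stays in the history, still readable.
        self.history.append(
            Version(value, author, datetime.now(timezone.utc), reason))

    @property
    def current(self) -> str:
        return self.history[-1].value
```

For example, after `point.record("73.2", "nurse_01")` followed by `point.correct("78.2", "nurse_01", "transposed digit")`, the current value is the corrected one while the original entry, its author, and its timestamp remain in `point.history` for an auditor to read.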
All of these practices promote the integrity of data throughout the integrity lifecycle. With proper management and additional focus on ALCOA and ALCOA+ as data is being recorded, the efficiency of storage, accuracy, and recovery of these data is significantly increased.
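One of the simplest of these practices to enforce in software is the unified date format mentioned under Consistent. The sketch below formats and parses dates in the 01/JAN/2022 style using a fixed month table, so the result does not depend on the locale of the machine; the function names are illustrative assumptions.

```python
from datetime import date

# Fixed English abbreviations, so output does not vary with system locale.
MONTHS = ("JAN", "FEB", "MAR", "APR", "MAY", "JUN",
          "JUL", "AUG", "SEP", "OCT", "NOV", "DEC")


def format_date(d: date) -> str:
    """Render a date as DD/MON/YYYY, e.g. 01/JAN/2022."""
    return f"{d.day:02d}/{MONTHS[d.month - 1]}/{d.year:04d}"


def parse_date(text: str) -> date:
    """Parse DD/MON/YYYY; raises ValueError for any other format."""
    day, month, year = text.strip().split("/")
    return date(int(year), MONTHS.index(month.upper()) + 1, int(day))
```

Spelling the month out avoids the ambiguity of purely numeric dates such as 03/04/2022, which reads as 3 April in some countries and 4 March in others; the parser above rejects that form outright.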
Conclusion
With ALCOA+, clinical research stands to vastly improve the integrity of its data, allowing for longer and more accurate use of the information coming out of trials.
ALCOA+ represents an effort to catch up to the modern standard of data integrity, and reduce bottlenecks and the subsequent costs associated with them, arising from poor collection, recording, or storage of data.
With a move to more electronic means of managing data, many of the principles outlined in ALCOA+ are easy to implement. However, it’s still important that everyone involved understands and agrees with the principles to create the consistency needed for the scaling up of research to come.